On 22/09/14 15:06, Mark Ramm-Christensen (Canonical.com) wrote:
> I think we need to make sure that we do the best error reporting we can, so
> if Juju isn't working because of Azure issues, we should find some way to
> let users know that so that they can try another cloud, contact microsoft,
> or otherwise find another way forward.
>
> --Mark Ramm

We have to take responsibility for the experience of the user. That means
it's a bug in Juju if cloud-level failures are obscured behind Juju errors
that are hard to debug.

The measure of quality in software is "how it deals with the unexpected".
That means anticipating and handling errors in a way which is appropriate:

 * retry a few times if that might help (if it's transient, the user
   experiences delays but not a failure)
 * fail gracefully, meaning:
    * don't leave unclean bits that the user has to tidy up
    * provide a clear guide as to where the problem is and how to fix it

Please review code landings from that perspective. Look at implicit
assumptions that a call-out (especially external call-outs to things like
cloud services) worked, and insist on appropriate handling, to the standard
above, of all the cases where it actually failed, including:

 * explicit failures (500 etc.)
 * responses that are unexpected (OK with data that doesn't validate to
   expectations)
 * hangs and timeouts (no response at all)
 * crashes / exceptions (for local calls)

Mark
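
To make the checklist above concrete, here is a minimal sketch of a wrapper around an external call-out. It is written in Python for brevity and is not Juju code. Every name in it (CloudError, TransientError, call_with_handling, and the call, validate and cleanup callables) is hypothetical, and it assumes the caller maps explicit failures such as a 500 onto TransientError and that a hang surfaces as the built-in TimeoutError.

```python
import time


class CloudError(Exception):
    """Hypothetical error type whose message says where the problem is."""


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (for example a 500)."""


def call_with_handling(call, validate, cleanup, attempts=3, delay=1.0,
                       sleep=time.sleep):
    """Run call(), retrying transient failures; clean up before failing."""
    last = None
    for attempt in range(1, attempts + 1):
        try:
            result = call()
        except (TransientError, TimeoutError) as err:
            # Explicit failure or timeout: retry, so a transient problem
            # costs the user a delay rather than a failure.
            last = err
            if attempt < attempts:
                sleep(delay)
            continue
        except Exception as err:
            # Crash in the call itself: retrying is unlikely to help.
            cleanup()
            raise CloudError(
                f"unexpected error in provider call: {err!r}") from err
        if not validate(result):
            # "OK" response whose data does not match expectations.
            cleanup()
            raise CloudError(
                f"provider returned unexpected data: {result!r}")
        return result
    # All attempts used up: tidy up, then point the user at the cloud.
    cleanup()
    raise CloudError(
        f"provider call failed after {attempts} attempts: {last!r}; "
        "check the provider's status or try another cloud") from last
```

A caller would pass the real request as call, a schema check as validate, and a function that removes any half-created resources as cleanup. The point of the single wrapper is that a reviewer can see all four failure cases handled in one place instead of hunting for them at every call site.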

--
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju