Hi Bogdan

On Thu, Nov 26, 2015 at 1:29 PM, Bogdan Teleaga <
btele...@cloudbasesolutions.com> wrote:

> This has been a WIP for a while now so maybe some of you have heard
> about it.
>
> It all started out with us needing to have hooks retried after a random
> reboot and it evolved into retrying hooks upon any kind of failure.
>
> So as of now failing hooks will be retried automatically after a
> certain time. The minimum wait time will be 20 seconds, while the
> maximum will be 20 minutes, and it's going to increase by a factor of
> 2 for every failure. A small jitter is also introduced for a bit of
> randomness. Using juju resolved will override this timer and cause it
> to restart from the beginning.
>
> I've tested it for a while and it has proven to be relatively robust
> in my tests. Probably having a CI test soonish would be recommended.
>
> The wait times have been chosen relatively arbitrarily so if anyone
> has comments or ideas for that, I'm open to suggestions. The
> discussion for that should go
> here(https://github.com/juju/juju/pull/3835), since apparently I
> merged the branch with some values I used in testing and did not
> change them back to the intended ones.
>
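
To make the quoted retry schedule concrete, here is a minimal sketch of
the backoff as described: a 20 second minimum, a 20 minute maximum, and
a doubling on each failure. It is an illustration only, not the code in
the pull request. The function names are hypothetical, and the 10%
jitter range is an assumption, since the post only says "a small
jitter".

```python
import random

MIN_DELAY = 20.0       # seconds; minimum wait stated in the post
MAX_DELAY = 20 * 60.0  # seconds; maximum wait stated in the post
FACTOR = 2.0           # growth factor per failure stated in the post
JITTER = 0.1           # ASSUMPTION: +/-10%; the post gives no figure


def base_delay(failures):
    """Un-jittered wait after `failures` consecutive failures (>= 1)."""
    if failures < 1:
        raise ValueError("failures must be at least 1")
    delay = MIN_DELAY
    for _ in range(failures - 1):
        delay *= FACTOR
        if delay >= MAX_DELAY:
            return MAX_DELAY  # stop growing once the cap is reached
    return delay


def retry_delay(failures, rng=random):
    """Jittered wait, clamped so it stays within [MIN_DELAY, MAX_DELAY]."""
    base = base_delay(failures)
    jittered = base * (1.0 + rng.uniform(-JITTER, JITTER))
    return min(max(jittered, MIN_DELAY), MAX_DELAY)
```

Under these assumptions the un-jittered waits run 20, 40, 80, 160, 320
and 640 seconds, and reach the 20 minute cap on the seventh consecutive
failure. Restarting the timer, as juju resolved is described as doing,
corresponds to starting the failure count over.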

In the daily deluge of email I managed to miss your post to the list, and
stumbled upon this feature whilst exercising 1.26 alpha3 with some
development work this week and assumed it was a bug:

  https://bugs.launchpad.net/juju-core/+bug/1535711

I think this is a dangerous behaviour to introduce to Juju. A hook error
should be a signal to an end user that something really bad happened, and
that they need to dig in further (preferably with pointers from status
messages). If the function that a hook is performing is retryable, that
needs to be handled in the charm and not by Juju IMHO.
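
As an illustration of what handling a retryable operation in the charm
could look like, here is a minimal sketch of a bounded retry helper that
hook code might wrap around a call to a service that is still starting
up. The helper name, its parameters and the choice of ConnectionError
are all assumptions made for the example. It is not taken from the
odl-controller charm or from any charm library.

```python
import time


def retry_call(operation, attempts=5, delay=2.0,
               retry_on=(ConnectionError,), sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Only exceptions listed in `retry_on` are retried. Anything else,
    and the final failure, propagates so the hook still errors out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # give up: let the hook fail visibly
            sleep(delay)
```

Because the retries are bounded and limited to named exception types, a
persistent failure still surfaces as a hook error rather than being
retried indefinitely.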

Specifically I was testing some changes to the odl-controller charm; this
feature covered up a race in the charm hook code accessing the API of ODL,
which I failed to notice the first few times I deployed (not paying
attention due to multi-tasking), and then had me scratching my head as to
what was going on when I started to notice the hook failure.

Cheers

James
-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev
