Re: relation departure timing changes

William Reade Fri, 23 Aug 2013 08:34:14 -0700

On Fri, Aug 23, 2013 at 11:59 AM, Gustavo Niemeyer <[email protected]> wrote:

> Also, and again, this isn't about dying vs. not dying. This is about
> broken relations in general, isn't it?
>

True -- but I don't think the macro effects can actually be distinguished.
Until the mooted second stage, the effect of a dying *relation* will stay
the same: all units just run for the exit without paying any attention to
each other :).

> On Fri, Aug 23, 2013 at 7:56 AM, Gustavo Niemeyer <[email protected]>
> wrote:
> > I agree it's suboptimal, but unless we render the alternative scenario
> > into a pretty strong possibility rather than mere chance, there would
> > be little point in investing much time on this. Saying "oh, perhaps it
> > will run if you're in a good day" isn't any better in terms of API
> > than "you cannot depend on this". It would make people spend time
> > trying to follow the pattern, and then wonder why it doesn't work.
>

I agree: the impact on a well-written charm, one that is already prepared for
possible silent breakage of related units at any time, will be nil. But
even a perfect and complete implementation of everything discussed here
will not free charms from that responsibility. I don't think that's *in
itself* a strong argument against making a minor change that moves us a
step towards a clearer model; but I agree that it would be deeply unhelpful
to misrepresent the guarantees we do make, even by implication. I think
that's about messaging, not about implementation, though.

> > If we're investing on this, it definitely sounds like we should at
> > least have a clear and well-defined behavior for when it will or
> > will not run, and the cases where it will not run should be mapped to
> > understandable events, otherwise we're not improving the situation.
>

Part of what we're doing here is deciding whether it's worth investing in,
which will then feed into the question of when :). I hope it's clear that
while I'm cautiously for this change, I'm not promising anything about the
timeliness of its delivery; we can only start to make representations about
that once we have made decisions based on the actual costs and benefits...
but we can't fairly determine those without having this discussion.

To be clear, then, the ultimate proposal would be to ensure that the
following statements hold:

1) peers notify peers of their own imminent departure as soon as they're
aware of it
2) peers do not break their relations until all their peers have themselves
departed in response
3) providers notify requirers of their own imminent departure as soon as
they're aware of it
4) providers do not break their relations until all their requirers have
themselves departed in response
5) requirers do not notify providers of their departure until they have
broken their relations

As it stands today, only (5) is accurate.
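
For anyone who finds the ordering easier to read as code, here is a minimal
sketch of statements (1) to (5). It is not how the uniter is implemented: the
step names, `departure_steps`, and `may_break` are all hypothetical and exist
only for illustration.

```python
from enum import Enum


class Role(Enum):
    PEER = "peer"
    PROVIDER = "provider"
    REQUIRER = "requirer"


# Hypothetical step names, used only for this illustration.
NOTIFY = "notify counterparts of departure"
WAIT = "wait for counterparts to depart"
BREAK = "break relation"


def departure_steps(role):
    """Return the order of steps a dying unit would follow under the proposal."""
    if role in (Role.PEER, Role.PROVIDER):
        # Statements (1)-(4): announce first, break only once everyone has left.
        return [NOTIFY, WAIT, BREAK]
    # Statement (5): requirers break first and only then announce.
    return [BREAK, NOTIFY]


def may_break(role, counterparts_still_present):
    """Return True if the unit may break its relation right now.

    For a peer the counterparts are its peers; for a provider, its requirers.
    """
    if role is Role.REQUIRER:
        return True
    # Under (2) and (4), a counterpart that never departs would keep this
    # False indefinitely.
    return counterparts_still_present == 0
```

For example, `may_break(Role.PROVIDER, 1)` is false for as long as one requirer
remains, whereas a requirer is never held up by its provider.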

It would be extremely cheap to implement (1) and (3) today, but that work
would not allow us to guarantee any new properties about the system.

There are legitimate concerns that, in the absence of extremely clear
messaging, implementing (1) and (3) alone may lead charmers to infer the
existence of guarantees that do not in fact exist.

(2) and (4) would be reasonably cheap to implement even today, but to do so
would be to exacerbate the impact of pre-existing bugs triggerable by the
complete failure of a single unit: in addition to preventing the removal of
a provider service (which is, to be sure, inconvenient, but does not cause
underlying cloud resources to be tied up indefinitely), it would prevent
remote service *units* from completing their shutdown, and hence run the
risk of indefinitely tying up limited and/or costly cloud resources.

This fact renders implementation of (2) and (4) unwise at this stage, until
we implement `destroy-unit --force`, which is known to be important but is
not currently considered more important than the non-trivial feature
workload we're already looking at for the remainder of the cycle.

The absolute smallest change we could make would be to the documentation,
by making it clear that the guaranteed lack of reconfiguration window for
requirers is an accidental property of the current system, and that we do
not intend to honour it in future. However, even this change remains
contingent on confirmation (or, at least, persistent silence) from charm
authors, indicating that the loss of this guarantee would not have a
serious impact on their work.

Given the current absence of active dissent, I'm starting to consider the
change to be a Good Idea. I remain very keen to hear from anyone who
expects or fears that any of the above suggestions would materially impact
their ability to write useful charms.

Thank you, Gustavo, for your cautionary advice, which was instrumental in
helping me to lay out the foregoing more clearly than I would otherwise
have managed :).

Cheers
William

-- 
Juju mailing list
[email protected]
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju
