To be clear: none of this is unexpected (apart from the distinct agent-
state issue noted by jamespage that looks like lp:1205451). Juju
communicates state changes to units as a series of single-change deltas,
effectively; when a new unit comes up in a relation with several others,
it will first see 0, then 1, then 2, then... related units -- and so,
really, this bug reduces to "when I tell juju there's an error, it acts
like there's an error", and that's a straight-up WONTFIX in juju-core.
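
To make the delta sequence concrete, here's a rough simulation (plain
Python, not juju code; the unit names, replay_joins, strict_hook and the
"expected" count are all made up for illustration). It just shows a hook
that insists on seeing the full peer set failing on the early deltas:

```python
# Illustrative simulation only; none of these names are juju APIs.

def replay_joins(peers):
    """Yield the related units a new unit can see after each delta."""
    seen = []
    yield tuple(seen)          # the first hook runs before any peer is known
    for peer in peers:
        seen.append(peer)      # one single-change delta at a time
        yield tuple(seen)


def strict_hook(visible, expected):
    """A hook that reports an error until the full peer set is visible."""
    return "ok" if len(visible) >= expected else "error"


# Hypothetical unit names; "expected" is an assumption about what the
# charm believes the full cluster size to be.
views = list(replay_joins(["rabbit/0", "rabbit/1"]))
results = [strict_hook(v, expected=2) for v in views]
```

The hook only stops erroring once the last delta has arrived, which is
the behaviour being reported here.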

BUT if you *didn't* return an error in that situation, and you really
were in a temporarily failed active/active configuration, you'd also
have problems, because you can't reasonably configure and start that
unit without splitting the service's brain. We can mitigate this
situation in a number of ways:

* we could restore the pyjuju behaviour, and just not inform you of
units that are not really present. This won't work, because you'll just
configure in standalone mode, and thus split-brain your intended
active/active deployment.

* we could expose some independent mechanism by which a unit could
report whether it's actually working or not; this would allow you to
leave rabbit unconfigured and exit the hook without error, having made
sure to set the "not actually working" flag. This'd (1) give users a way
to observe this condition from outside, and (2) allow the unit to
recover once more units came online (by having set the "not working"
flag, rather than setting a unit error state).

* we could add new hooks: let's strawman "<name>-relation-idle", which
would fire whenever one of the named relation's hook queues emptied.
This would let you defer *all* relation setup work until the whole
picture was available to you -- at the cost of maybe waiting a long time
before it actually ran -- or at least let you avoid putting the unit
in an error state before you're really sure it's in one (because you
always know you'll have at least one more hook in which to correct
yourself).
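
Here's how the second and third options might fit together, again as a
plain Python simulation. To be loud about it: neither the "working" flag
nor a relation-idle hook exists in juju today, and Unit, run_queue and
expected_peers are hypothetical names; how a unit would learn the
expected peer count is an open question that this sketch just assumes
away.

```python
# Hypothetical sketch: the "working" flag and relation_idle hook are the
# strawman proposals from this thread, not existing juju features.

class Unit:
    def __init__(self, expected_peers):
        self.expected_peers = expected_peers  # assumed to be known up front
        self.peers = set()
        self.working = False      # stand-in for the proposed status flag
        self.configured = False

    def relation_joined(self, peer):
        # Record the delta only; defer setup and exit without error.
        self.peers.add(peer)

    def relation_idle(self):
        # Would fire when the relation's hook queue empties.
        if len(self.peers) >= self.expected_peers:
            self.configured = True
            self.working = True
        else:
            # Observable from outside, but not a unit error state, so a
            # later idle hook can still recover.
            self.working = False


def run_queue(unit, queue):
    """Drain the queued join events, then fire the idle hook once."""
    for peer in queue:
        unit.relation_joined(peer)
    unit.relation_idle()
    return unit.working


unit = Unit(expected_peers=2)
first = run_queue(unit, ["rabbit/0"])    # queue empties with one peer
second = run_queue(unit, ["rabbit/1"])   # a later join completes the set
```

The point is that the first idle hook leaves the unit unconfigured and
flagged as not working, without an error, and the second one recovers
on its own once the remaining peer shows up.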

Both the second and third options have some glimmers of value to them,
but even together I'm not sure they're a complete solution. I'd like to
hear your thoughts.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to juju-core in Ubuntu.
https://bugs.launchpad.net/bugs/1267913

Title:
  juju relation-list doesn't report full units list when unit is down

