On Tue, Nov 18, 2014 at 7:19 AM, Kapil Thangavelu <[email protected]> wrote:
>
> for clusters... its not a question of futures but being informed of known
> unit count to establish quorum. ie 1 to 3 or n+1. leader election helps, but
> actually knowing the unit count is critical to being able to establish a
> clear state without throwing away data (aka race on peer knowing quorum and
> leader) as adhoc leader election has to throw away data from non leaders who
> may already be serving clients due to lack of quorum knowledge.
> ...
> status per future impl helps, as does explicitly marking units.. but pending
> cluster count is a missing and important property to properly establish
> quorum in a peer rel from one to n that is only resolved by knowing recorded
> units count for a svc.

Two things:

1) I'm not sure numbers are good enough in general compared to sets of units.
2) I'm not sure a single number|set gives us all the information we need.

Leaving (1) aside for now, I *think* that each of the following numbers|sets is potentially relevant:

i) "goal": the units that juju expects to be part of the relation once everything's converged (including those not yet running, and not including those that are dying)
ii) "active": the units that are in scope for the relation (but might be dying)
iii) "current": the units that are *locally known to be* in scope for the relation

Today, we only expose "current" -- ie relation-list returns the "current" units, and it might be a complete lie, but it's a consistent lie that is useful for many purposes, and we're not going to remove it.

If we expose "active" but not "goal", we don't help anything very much -- the first unit of a cluster to come up will still think it's alone in the world, and we still have all the original problems.

If we expose "goal" but not "active", we create new problems when we try to scale: going from 1 unit to 3 puts that first unit in an apparent minority, and is thus likely to effectively take the whole service down.

So: I think we definitely need to expose both "goal" and "active" information. The interesting question is whether we can just expose numbers, or whether we need to expose actual sets of units (as we do for "current")... and I *think* we need sets, not just numbers, because:

    u/0:current=[u/1,u/2]
    u/0:active=[u/3,u/4]
    u/0:goal=[u/3,u/4]

...is legitimate, when 3,4 were created a while ago (and have just come up, but 0 has not yet run their joined hooks) and 1,2 were *just* destroyed (and have themselves left scope, but 0 has not yet run their departed hooks)...

...and if all we expose is numbers, there's no way for the charm to tell the difference between that state and a stable 0,1,2 cluster (or any of the other combinations with sets of the given sizes...)
at least until more hooks fire. *Maybe* this doesn't matter, but I'm loath to assume that it *never* matters.

Thoughts?

Cheers
William

-- 
Juju-dev mailing list
[email protected]
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
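
For illustration only, here is a rough Python sketch of the numbers-versus-sets point made in the message above. The RelationView class and its field names are hypothetical (they are not juju APIs or hook tools); they just model the "current", "active" and "goal" views as u/0 would see them, excluding u/0 itself as relation-list does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationView:
    """Hypothetical model of what one unit might be told about its peers."""
    current: frozenset  # peers locally known to be in scope
    active: frozenset   # peers actually in scope (possibly dying)
    goal: frozenset     # peers juju expects once everything has converged

    def counts(self):
        """What the charm would see if only numbers were exposed."""
        return (len(self.current), len(self.active), len(self.goal))


# A stable 0,1,2 cluster, seen from u/0.
stable = RelationView(
    current=frozenset({"u/1", "u/2"}),
    active=frozenset({"u/1", "u/2"}),
    goal=frozenset({"u/1", "u/2"}),
)

# The churning case: 1,2 just destroyed, 3,4 just came up, and u/0 has
# not yet run any of the departed/joined hooks.
churning = RelationView(
    current=frozenset({"u/1", "u/2"}),
    active=frozenset({"u/3", "u/4"}),
    goal=frozenset({"u/3", "u/4"}),
)

# With numbers alone, the two states look identical.
same_counts = stable.counts() == churning.counts()

# With sets, the charm can see which known peers are already gone and
# which expected peers it has not met yet.
stale_peers = churning.current - churning.active
unseen_peers = churning.goal - churning.current

print(same_counts, sorted(stale_peers), sorted(unseen_peers))
```

In the stable case both set differences are empty, so only the set view lets the charm distinguish the two situations before more hooks fire.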
