If A can't talk to C but B can talk to both, the outcome depends on the state before the partition and on what happens afterwards. If the leader is in A, all of the members in C will go into a disconnected state, but they may also try to elect a leader of their own since they can still talk to B. You might see some weird thrashing of election state. If the leader is in B you might be fine, but honestly I've never tested that as far as I can recall.

Really, if one site loses contact with one or more of the others, you probably just want to kill all the connections in that site until connectivity comes back. The best thing to do when faced with this question is to actually run a test that simulates it, since these things always have a ton of nuance. It is unlikely that you will lose any data (the basic rules of the protocol account for this fairly well), but performance might degrade in unexpected ways.

I think data loss could happen in a very bad case, where quorum is made between A and B, then flips to C and B because of some network issue, and data gets truncated. If I were implementing such a setup, I would put pretty aggressive monitoring around this and kill one of the partitions if it happened, given the byzantine nature of the edge cases.
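As a rough illustration of why A and C can end up contending through B, here is a minimal sketch that only counts how many voters each site can reach directly when a given link is down. It assumes the 3x3 layout from the question and strictly point-to-point links. The names (SITES, reachable_voters, sites_able_to_lead) are made up for this example, and it does not model ZooKeeper's actual leader election, only the quorum arithmetic.

```python
# Servers per facility, taken from the layout in the question (assumption: 3 each).
SITES = {"A": 3, "B": 3, "C": 3}

# Majority quorum: 9 voters means 5 are needed.
QUORUM = sum(SITES.values()) // 2 + 1


def reachable_voters(site, down_links):
    """Count voters a server in `site` can reach directly, its own site included.

    `down_links` is a set of frozensets, each naming the two sites whose
    point-to-point link is down. No routing through a third site is assumed.
    """
    total = 0
    for other, count in SITES.items():
        if other == site or frozenset((site, other)) not in down_links:
            total += count
    return total


def sites_able_to_lead(down_links):
    """Return the sites whose servers can reach enough voters for a quorum."""
    down = {frozenset(link) for link in down_links}
    return [s for s in SITES if reachable_voters(s, down) >= QUORUM]


if __name__ == "__main__":
    # A-C link down: A reaches A+B, C reaches C+B, B reaches everyone.
    print(sites_able_to_lead([("A", "C")]))
    # A cut off from both B and C: A only reaches its own 3 servers.
    print(sites_able_to_lead([("A", "B"), ("A", "C")]))
```

With only the A-C link down, every site can still reach at least 5 voters, so a leader in A and a would-be leader in C can both assemble a quorum by way of B's servers. That is the situation where I would expect the election thrashing, and it is why a real test is still the only way to know how the ensemble behaves.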
C

On Wed, May 21, 2014 at 11:36 PM, Steven Bower <[email protected]> wrote:

> I am contemplating setting up a zookeeper ensemble across multiple
> facilities. I know the docs warn against multi-facility ensembles, but for
> the sake of discussion can we assume that all are connected with the same
> reliability/performance you'd expect if they were all in the same LAN.
>
> Imagine an ensemble with three facilities (A, B and C). Within each facility
> there are 3 instances of zookeeper. So total 9 members of the ensemble
> which gives us quorum at 5 instances. All facilities are connected with
> point-to-point connections between each other (by point-to-point I'm
> implying that if the connection between A and C went down that A could not
> talk to C via B).
>
> With this environment what behaviors would I see if for example the link
> between A and B went down?
>
> Any other recommendations?
>
> thanks,
>
> steve
