Hi Peter,

It's the second option. The servers can't tell whether the leader failed or was merely partitioned from them. So each group of 3 servers in your scenario can't distinguish its situation from one in which no server failed but these 3 are partitioned from the other 4. To prevent split brain in an asynchronous network, a leader must have the support of a quorum. With majority quorums that means at least 4 of the 7 configured servers, not a majority of whichever servers happen to still be alive, so neither group of 3 can elect a leader.
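
To make the quorum arithmetic concrete, here is a minimal sketch in Python. It is only an illustration, not ZooKeeper's election code: the function names `quorum_size` and `can_elect_leader` are made up for this example, and it assumes plain majority quorums counted against the full configured ensemble.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured ensemble (assumes majority quorums)."""
    return ensemble_size // 2 + 1


def can_elect_leader(reachable: set, ensemble_size: int) -> bool:
    """A group can elect a leader only if it holds a majority of ALL configured
    servers, including ones that are dead or unreachable."""
    return len(reachable) >= quorum_size(ensemble_size)


# The scenario from the question: 7 servers, A has died,
# and {B, C, D} is partitioned from {E, F, G}.
ensemble = {"A", "B", "C", "D", "E", "F", "G"}
group_1 = {"B", "C", "D"}
group_2 = {"E", "F", "G"}

print(quorum_size(len(ensemble)))                # 4
print(can_elect_leader(group_1, len(ensemble)))  # False
print(can_elect_leader(group_2, len(ensemble)))  # False

# If the partition heals enough for one more server to join a group,
# that group reaches a quorum and can elect a leader again.
print(can_elect_leader(group_1 | {"E"}, len(ensemble)))  # True
```

The point of the sketch is that the threshold is fixed by the configured ensemble size (7), so it does not shrink when A dies. Two disjoint groups can never both reach 4 out of 7, which is what rules out two leaders.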
Alex

> -----Original Message-----
> From: cheetah [mailto:[email protected]]
> Sent: Tuesday, August 30, 2011 12:23 AM
> To: [email protected]
> Subject: How zab avoid split-brain problem?
>
> Hi folks,
> I am reading the zab paper, but a bit confusing how zab handle split
> brain problem.
> Suppose there are A, B, C, D, E, F and G seven servers, now A is the
> leader. When A dies and at the same time, B,C,D are isolated from E, F
> and G.
> In this case, will Zab continue working like this: if B>C>D and E>F>G,
> so the two groups are both voting and electing B and E as their leaders
> separately. Thus, there is a split brain problem.
> Or Zookeeper just stop working, because there were original 7 servers,
> after 1 failure, a new leader still expects to have a quorum of 3
> servers voting for it as the leader. And because the two groups are
> separate from each other, no leader can be elected out.
>
> If it is the first case, Zookeeper will have a split brain problem,
> which probably is not the case. But in the second case, a 7-node
> Zookeeper service can only handle a node failure and a network
> partition failure.
>
> Am I understanding wrongly? Looking forward to your insights.
>
> Thanks,
> Peter
