I created a link off of the main wiki and the page itself: http://wiki.apache.org/hadoop/ZooKeeper/FailureScenarios
Would someone please review it? Specifically, I am curious to know about this:

"if the leader is in the non-quorum side of the partition, that side of the partition will recognize that it no longer has a quorum of the ensemble. The leader will be demoted to being a regular ZooKeeper server and those nodes will no longer accept reads or writes."

I just wanted to clarify - in the time it takes the non-quorum side to recognize that it no longer has a quorum, will there ever be writes that get through? Is it guaranteed that it won't accept writes after the partition? I don't think that guarantee can exist, but wondered how to handle that.

On Dec 9, 2010, at 2:04 PM, Mahadev Konar wrote:

> Hi Jeremy,
> Responses in line below:
>
> On 12/9/10 11:53 AM, "Jeremy Hanna" <[email protected]> wrote:
>
> I looked around on the wiki and in the user list archives and couldn't find
> something definitive about certain failure scenarios.
>
> A partition splits the ensemble where a quorum is on one side of the partition
> -- if the leader is on the quorum side of the partition, what happens to
> reads/writes that go to the non-quorum side? I assume writes return errors
> because it can't get to the leader. Reads?
>
>> The reads will also fail on all the quorum nodes until a new quorum is
>> elected.
>
> -- if the leader is on the non-quorum side of the partition, I would assume
> that the quorum side of the partition would elect a new leader for those
> clients on its side of the partition. However, is there the possibility for
> the leader on the non-quorum side to accept writes before it realizes that
> there's no longer a quorum? Just wondering about the possibility of
> corruption and then when the cluster syncs back up how the cluster would
> handle that data.
>
>> No there isn't. The leader relinquishes its right as a leader as soon as it
>> realizes a quorum isn't committing the changes it proposed.
>
> (I would be happy to create a wiki page for failure scenarios if one doesn't
> exist that people could add to, but maybe this is just common knowledge.)
>
>> Please do!
>
> thanks
> mahadev
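
P.S. To make the "quorum side" versus "non-quorum side" wording above concrete, here is a minimal sketch of the majority arithmetic, assuming the default majority quorum. The function names are hypothetical and this is not ZooKeeper code: it only counts servers and says nothing about leader election or how long a leader takes to notice it has lost its followers.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """True if a partition side with `reachable` servers (counting itself)
    still holds a strict majority of the ensemble."""
    return reachable >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # A 5-server ensemble split 3/2: only the 3-server side keeps a quorum.
    for side in (3, 2):
        print(f"side with {side} of 5 servers has quorum: {has_quorum(side, 5)}")
```

With an even ensemble, such as 4 servers split 2/2, neither side holds a majority under this rule.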
