Hi Srini,
The ZooKeeper service is available as long as a quorum of servers is running,
where a quorum is a simple majority of the ensemble.
As I see it, one reason for requiring a majority vote is to avoid the
"split-brain" problem. After a network failure we don't want both parts of the
system to continue as usual. Only one part should continue; the other must
recognize that it is out of the cluster and keep quiet.
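To illustrate the split-brain point, here is a minimal sketch (not ZooKeeper code; the function names are made up for this example) that checks which side of a network partition still holds a majority. It assumes the ensemble is split into exactly two groups.

```python
def has_quorum(partition_size: int, ensemble_size: int) -> bool:
    """True if this side of the partition holds a strict majority."""
    return partition_size > ensemble_size // 2


def sides_with_quorum(ensemble_size: int, side_a: int) -> tuple[bool, bool]:
    """Split the ensemble into two sides and report quorum for each."""
    side_b = ensemble_size - side_a
    return (has_quorum(side_a, ensemble_size),
            has_quorum(side_b, ensemble_size))


# 5 servers split 3/2: only the side with 3 servers keeps serving.
print(sides_with_quorum(5, 3))
# 4 servers split 2/2: neither side has a majority, so both keep quiet.
print(sides_with_quorum(4, 2))
```

Because two disjoint groups cannot both contain more than half of the servers, at most one side can ever continue.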
The main reason for suggesting an odd number is that an even number adds little
to the number of tolerated failures. With 3 and 4 servers the majority is 2 and
3 respectively, but in both cases the number of tolerated failures is 1.
Quorum = Leader + Followers.
An ensemble of (2n+1) nodes can tolerate the failure of 'n' nodes.
For example,
n=0, (2*0+1) -> 1 server  = standalone. There is no quorum majority here.
             -> 2 servers = majority is 2, so at least 2 servers are needed to
                form a quorum. Tolerated failures: 0; any failure drops the
                quorum automatically.
n=1, (2*1+1) -> 3 servers = majority is 2, so at least 2 servers are needed to
                form a quorum. Tolerated failures: 1; more than 1 failure drops
                the quorum automatically.
             -> 4 servers = majority is 3, so at least 3 servers are needed to
                form a quorum. Tolerated failures: 1; more than 1 failure drops
                the quorum automatically.
n=2, (2*2+1) -> 5 servers = majority is 3, so at least 3 servers are needed to
                form a quorum. Tolerated failures: 2; more than 2 failures drop
                the quorum automatically.
             -> 6 servers = majority is 4, so at least 4 servers are needed to
                form a quorum. Tolerated failures: 2; more than 2 failures drop
                the quorum automatically.
n=3, (2*3+1) -> 7 servers = majority is 4, so at least 4 servers are needed to
                form a quorum. Tolerated failures: 3; more than 3 failures drop
                the quorum automatically.
             -> 8 servers = majority is 5, so at least 5 servers are needed to
                form a quorum. Tolerated failures: 3; more than 3 failures drop
                the quorum automatically.
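The same arithmetic can be checked with a few lines of Python. This is only a sketch of the majority rule described above, not ZooKeeper code, and the function names are my own.

```python
def majority(servers: int) -> int:
    """Smallest number of servers that is more than half of the ensemble."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """Number of servers that can fail while a majority is still running."""
    return servers - majority(servers)


# Print the majority and tolerated failures for ensembles of 1 to 8 servers.
for servers in range(1, 9):
    print(f"{servers} servers: majority={majority(servers)}, "
          f"tolerated failures={tolerated_failures(servers)}")
```

Each even size tolerates the same number of failures as the odd size just below it, so the extra server buys no additional fault tolerance. Note that with 1 server the formula gives a majority of 1, which is just the standalone case.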
-Rakesh
-----Original Message-----
From: Srinivasan Veerapandian [mailto:[email protected]]
Sent: 13 July 2015 11:48
To: [email protected]
Subject: ZooKeeper ensemble. Size and Impact ?
Hi,
We know ZK demands an odd number of servers to provide reliability.
My requirement for having ZooKeeper in my application is to "know the
application status" from all the clients (max 100).
Today my application supports deployments from 1+1 (=2) to N+1 (=100). Given
this, I would like to go with 2 ZK servers on two different instances, because
adding one more server for this purpose would demand one more instance in my
1+1 deployment model.
Questions:
1. What would happen to the ensemble formed? Would the service go down
automatically?
2. What would be the impact if the number of ZK server instances is even (e.g.
2)?
How do I size a Zoo Keeper ensemble (cluster)?
https://cwiki.apache.org/confluence/display/ZOOKEEPER/FAQ
Designing a Zoo Keeper Deployment
http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html
Thanks,
Srini