That's great! Thanks for sharing.

> Added benefit is that we can also control which data center gets the
> quorum in case of a network outage between the two.
Can you explain how this works? In case of a network outage between two
DCs, one of them has a quorum of participants and the other doesn't. The
participants in the smaller set should not be operational at this time,
since they can't reach quorum, no?

Thanks,
Alex

On Wed, Aug 21, 2019 at 7:55 AM Cee Tee <c.turks...@gmail.com> wrote:

> We have solved this by implementing a 'zookeeper cluster balancer'. It
> calls the admin server API of each zookeeper to get the current status
> and issues dynamic reconfigure commands to change dead servers into
> observers, so the quorum is not in danger. Once the dead servers
> reconnect, they take the observer role and are then reconfigured into
> participants again.
>
> Added benefit is that we can also control which data center gets the
> quorum in case of a network outage between the two.
>
> Regards
> Chris
>
> On 21 August 2019 16:42:37 Alexander Shraer <shra...@gmail.com> wrote:
>
> > Hi,
> >
> > Reconfiguration, as implemented, is not automatic. In your case, when
> > failures happen, this doesn't change the ensemble membership.
> > When 2 of 5 fail, this is still a minority, so everything should work
> > normally; you just won't be able to handle an additional failure. If
> > you'd like to remove them from the ensemble, you need to issue an
> > explicit reconfiguration command to do so.
> >
> > Please see details in the manual:
> > https://zookeeper.apache.org/doc/r3.5.5/zookeeperReconfig.html
> >
> > Alex
> >
> > On Wed, Aug 21, 2019 at 7:29 AM Gao,Wei <wei....@arcserve.com> wrote:
> >
> >> Hi,
> >> I encounter a problem which blocks my development of load balancing
> >> using ZooKeeper 3.5.5. I have a ZooKeeper cluster which comprises
> >> five zk servers.
> >> And the dynamic configuration file is as follows:
> >>
> >> server.1=zk1:2888:3888:participant;0.0.0.0:2181
> >> server.2=zk2:2888:3888:participant;0.0.0.0:2181
> >> server.3=zk3:2888:3888:participant;0.0.0.0:2181
> >> server.4=zk4:2888:3888:participant;0.0.0.0:2181
> >> server.5=zk5:2888:3888:participant;0.0.0.0:2181
> >>
> >> The zk cluster works fine if every member works normally. However,
> >> if, say, two of them suddenly go down without notice, the dynamic
> >> configuration file shown above is not synchronized dynamically,
> >> which causes the zk cluster to fail to work normally.
> >> I think this is a very common case which may happen at any time. If
> >> so, how can we resolve it?
> >> Really looking forward to hearing from you!
> >> Thanks
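Concretely, the explicit reconfiguration Alex Shraer mentions is issued
from the ZooKeeper CLI (zkCli.sh), per the reconfig manual linked above.
A session for Gao,Wei's scenario might look like the transcript below
(hostnames taken from the posted config; assuming servers 4 and 5 are the
failed ones):

```
# Remove the two failed servers from the ensemble so the remaining
# three form the full voting set again:
[zk: zk1:2181(CONNECTED) 0] reconfig -remove 4,5

# Once they are back, add them in again as participants:
[zk: zk1:2181(CONNECTED) 1] reconfig -add server.4=zk4:2888:3888:participant;0.0.0.0:2181
[zk: zk1:2181(CONNECTED) 2] reconfig -add server.5=zk5:2888:3888:participant;0.0.0.0:2181
```

Each successful `reconfig` rewrites the dynamic configuration file on all
members, which is exactly the synchronization that does not happen
automatically when servers merely crash.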
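For what it's worth, the role-planning part of the balancer Chris describes
can be sketched roughly as below. This is a hypothetical illustration, not
Chris's actual implementation: the `plan_roles` function, its input shape
(server id -> (reachable, data center)), and the `dc_split` flag are all
made up here; the actual status polling (e.g. via ZooKeeper's AdminServer)
and the dynamic `reconfig` call are only indicated in comments.

```python
# Hypothetical sketch of the "cluster balancer" idea from the thread.
# Assumption (not from the thread): 'servers' maps a server id to a
# (reachable, data_center) pair, and 'preferred_dc' is the data center
# that should keep the quorum if the inter-DC link fails.

def plan_roles(servers, preferred_dc, dc_split=False):
    """Decide which servers should be participants vs. observers.

    Dead servers are demoted to observers so they no longer count
    against the quorum. During a detected DC split, live servers
    outside the preferred data center are also demoted, so the
    preferred DC's participants alone form the quorum.
    """
    participants, observers = [], []
    for sid, (alive, dc) in sorted(servers.items()):
        if not alive:
            observers.append(sid)      # dead -> observer, protects quorum
        elif dc_split and dc != preferred_dc:
            observers.append(sid)      # split: preferred DC keeps quorum
        else:
            participants.append(sid)
    return participants, observers


if __name__ == "__main__":
    # Five servers matching Gao,Wei's config: three in dc1, two in dc2;
    # server 4 is currently unreachable.
    servers = {
        1: (True, "dc1"),
        2: (True, "dc1"),
        3: (True, "dc1"),
        4: (False, "dc2"),
        5: (True, "dc2"),
    }
    print(plan_roles(servers, "dc1"))                 # ([1, 2, 3, 5], [4])
    print(plan_roles(servers, "dc1", dc_split=True))  # ([1, 2, 3], [4, 5])
    # A real balancer would now poll each server's AdminServer for its
    # actual state and issue a dynamic 'reconfig' so the ensemble's
    # roles match the plan, then promote observers back to participants
    # once they reconnect.
```

The point of keeping the planning step pure is that the balancer can
recompute the desired membership on every poll and only issue a
`reconfig` when the plan differs from the current ensemble.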