>> There was recently a post here from someone who has implemented this
Maybe this one?
http://zookeeper-user.578899.n2.nabble.com/About-ZooKeeper-Dynamic-Reconfiguration-td7584271.html

On Wed, Sep 25, 2019 at 9:19 PM Alexander Shraer <shra...@gmail.com> wrote:

> There was recently a post here from someone who has implemented this, but
> I couldn't find it for some reason.
>
> Essentially I think that you'd need to monitor the "health" and
> connectivity of servers to the leader, and issue reconfig commands to
> remove them when you suspect that they're down or add them back when you
> think they're up.
> Notice that you always have to have at least a quorum of the ensemble, so
> issuing a reconfig command (or any other command) if a quorum is lost won't
> work.
> You could use the information exposed in ZK's 4 letter commands to decide
> whether you think a server is up and connected to the quorum or down.
> Ideally we could also use the leader's view on who is connected
> but it doesn't look like this is being exposed right now. You can also
> periodically issue test read/write operations on various servers to check
> if they're really operational.
>
> https://github.com/apache/zookeeper/blob/1ca627b5a3105d80ed4d851c6e9f1a1e2ac7d64a/zookeeper-docs/src/main/resources/markdown/zookeeperAdmin.md#sc_4lw
>
> As accurate failure detection is impossible in asynchronous systems, you'll
> need to decide how sensitive you are to potential failures vs false
> suspicions.
>
> Hope this helps...
>
> Alex
>
> On Wed, Sep 25, 2019 at 6:00 PM Gao,Wei <wei....@arcserve.com> wrote:
>
> > Hi Alexander Shraer,
> > Could you please tell me how to implement automation on top?
> > Thank you very much!
> >
> > -----Original Message-----
> > From: Alexander Shraer (Jira) <j...@apache.org>
> > Sent: Thursday, September 26, 2019 1:27 AM
> > To: issues@zookeeper.apache.org
> > Subject: [jira] [Commented] (ZOOKEEPER-3556) Dynamic configuration file
> > can not be updated automatically after some zookeeper servers of zk
> > cluster are down
> >
> >
> > [
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_ZOOKEEPER-2D3556-3Fpage-3Dcom.atlassian.jira.plugin.system.issuetabpanels-3Acomment-2Dtabpanel-26focusedCommentId-3D16937925-23comment-2D16937925&d=DwIFaQ&c=ZmK7amRlbztwfC_NTU_hNw&r=bTmnMF5RGYcfg4qOcKQAYjkGGUtOB2jR22ryrk8hNWk&m=UNFnO3kfjtUL8Jievmh9VMXf_nTLKBCfuJsaxe6FshU&s=XxgusqUbHgFrxTfTTcYuxMWxol3W-1dJ7WVzUqh1HAE&e=
> > ]
> >
> > Alexander Shraer commented on ZOOKEEPER-3556:
> > ---------------------------------------------
> >
> > The described behavior is not a bug – currently reconfiguration requires
> > explicit action by an operator. One could implement automation on top. We
> > should consider this as a feature, since it sounds like several adopters
> > have implemented such automation. Perhaps one of them could contribute
> > this upstream.
> >
> > > Dynamic configuration file can not be updated automatically after some
> > > zookeeper servers of zk cluster are down
> > > ----------------------------------------------------------------------
> > > -----------------------------------------
> > >
> > >                 Key: ZOOKEEPER-3556
> > >                 URL:
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_ZOOKEEPER-2D3556&d=DwIFaQ&c=ZmK7amRlbztwfC_NTU_hNw&r=bTmnMF5RGYcfg4qOcKQAYjkGGUtOB2jR22ryrk8hNWk&m=UNFnO3kfjtUL8Jievmh9VMXf_nTLKBCfuJsaxe6FshU&s=NQvX26JbBDNMmEtQhirmYk7ELe46vCjn4kbm1VqcNsA&e=
> > >             Project: ZooKeeper
> > >          Issue Type: Wish
> > >          Components: java client
> > >    Affects Versions: 3.5.5
> > >            Reporter: Steven Chan
> > >            Priority: Major
> > >   Original Estimate: 12h
> > >  Remaining Estimate: 12h
> > >
> > > I encountered a problem which blocks my development of load balancing
> > > using ZooKeeper 3.5.5.
> > > I have a ZooKeeper cluster which comprises five zk servers, and the
> > > dynamic configuration file is as follows:
> > >
> > > server.1=zk1:2888:3888:participant;0.0.0.0:2181
> > > server.2=zk2:2888:3888:participant;0.0.0.0:2181
> > > server.3=zk3:2888:3888:participant;0.0.0.0:2181
> > > server.4=zk4:2888:3888:participant;0.0.0.0:2181
> > > server.5=zk5:2888:3888:participant;0.0.0.0:2181
> > >
> > > The zk cluster can work fine if every member works normally.
> > > However, if say two of them are suddenly down without previously being
> > > notified, the dynamic configuration file shown above will not be
> > > synchronized dynamically, which leads to the zk cluster failing to work
> > > normally.
> > > As far as I am concerned, if server 1 and server 5 suddenly go down, the
> > > dynamic configuration file should be modified as follows:
> > >
> > > server.2=zk2:2888:3888:participant;0.0.0.0:2181
> > > server.3=zk3:2888:3888:participant;0.0.0.0:2181
> > > server.4=zk4:2888:3888:participant;0.0.0.0:2181
> > >
> > > But in this case, the dynamic configuration file will never change
> > > automatically unless you manually revise it.
> > > I think this is a very common case which may happen at any time. If
> > > so, how can we handle it?
> > >
> > >
> > --
> > This message was sent by Atlassian Jira
> > (v8.3.4#803005)
> >
>
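
For readers following this thread, the automation Alex outlines above (probe each server with a four-letter command, then issue a reconfig only while a quorum is still alive) might be sketched roughly as follows in Python. This is an illustration under stated assumptions, not anyone's actual implementation: the function names are made up for this example, the health check is a simple heuristic on the "Mode:" line of the `srvr` reply, a plain majority quorum with all servers as participants is assumed, and the result is only the text of a zkCli `reconfig -remove` command that an operator or wrapper script would still have to run against a live server.

```python
import socket

# The five-server dynamic configuration quoted in ZOOKEEPER-3556.
EXAMPLE_CONFIG = """\
server.1=zk1:2888:3888:participant;0.0.0.0:2181
server.2=zk2:2888:3888:participant;0.0.0.0:2181
server.3=zk3:2888:3888:participant;0.0.0.0:2181
server.4=zk4:2888:3888:participant;0.0.0.0:2181
server.5=zk5:2888:3888:participant;0.0.0.0:2181
"""


def parse_dynamic_config(text):
    """Map server id -> spec string for every 'server.N=...' line."""
    servers = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("server.") and "=" in line:
            key, spec = line.split("=", 1)
            servers[int(key[len("server."):])] = spec
    return servers


def client_endpoint(spec):
    """Return (quorum host, client port) taken from one server spec."""
    quorum_part, client_part = spec.split(";", 1)
    host = quorum_part.split(":", 1)[0]
    port = int(client_part.rsplit(":", 1)[-1])
    return host, port


def four_letter_word(host, port, cmd="srvr", timeout=2.0):
    """Send a four-letter command; return the reply text, or None on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(cmd.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # the server closes the connection after replying
                    break
                chunks.append(data)
        return b"".join(chunks).decode("utf-8", "replace")
    except OSError:
        return None


def looks_healthy(reply):
    """Assumed heuristic: the server answered and reports a quorum role."""
    return reply is not None and (
        "Mode: leader" in reply or "Mode: follower" in reply
    )


def plan_removal(servers, suspected_down):
    """Build a 'reconfig -remove' command, or None if there is nothing to
    remove or the remaining servers no longer form a majority quorum."""
    down = sorted(set(suspected_down) & set(servers))
    alive = len(servers) - len(down)
    quorum = len(servers) // 2 + 1  # assumes majority quorums, no observers
    if not down or alive < quorum:
        return None
    return "reconfig -remove " + ",".join(str(i) for i in down)
```

A monitoring loop would call `four_letter_word(*client_endpoint(spec))` for each server, collect the ids for which `looks_healthy` is false (ideally over several consecutive probes, to limit false suspicions), and pass them to `plan_removal`. With the configuration from the issue and servers 1 and 5 suspected down, three of five servers remain, so a removal is still possible; with three servers down, the function returns None because no command can succeed without a quorum.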