Exactly, thank you Michael :)

On Wed, Sep 25, 2019 at 9:32 PM Michael Han <h...@apache.org> wrote:
> >> There was recently a post here from someone who has implemented this
>
> Maybe this one?
>
> http://zookeeper-user.578899.n2.nabble.com/About-ZooKeeper-Dynamic-Reconfiguration-td7584271.html
>
> On Wed, Sep 25, 2019 at 9:19 PM Alexander Shraer <shra...@gmail.com> wrote:
>
> > There was recently a post here from someone who has implemented this, but
> > I couldn't find it for some reason.
> >
> > Essentially I think that you'd need to monitor the "health" and
> > connectivity of servers to the leader, and issue reconfig commands to
> > remove them when you suspect that they're down or add them back when you
> > think they're up.
> > Notice that you always have to have at least a quorum of the ensemble, so
> > issuing a reconfig command (or any other command) when a quorum is lost
> > won't work.
> > You could use the information exposed in ZK's four-letter commands to
> > decide whether you think a server is up and connected to the quorum or
> > down. Ideally we could also use the leader's view on who is connected,
> > but it doesn't look like this is being exposed right now. You can also
> > periodically issue test read/write operations on various servers to check
> > if they're really operational.
> >
> > https://github.com/apache/zookeeper/blob/1ca627b5a3105d80ed4d851c6e9f1a1e2ac7d64a/zookeeper-docs/src/main/resources/markdown/zookeeperAdmin.md#sc_4lw
> >
> > As accurate failure detection is impossible in asynchronous systems,
> > you'll need to decide how sensitive you are to potential failures vs.
> > false suspicions.
> >
> > Hope this helps...
> >
> > Alex
> >
> > On Wed, Sep 25, 2019 at 6:00 PM Gao,Wei <wei....@arcserve.com> wrote:
> >
> > > Hi Alexander Shraer,
> > > Could you please tell me how to implement automation on top?
> > > Thank you very much!
> > >
> > > -----Original Message-----
> > > From: Alexander Shraer (Jira) <j...@apache.org>
> > > Sent: Thursday, September 26, 2019 1:27 AM
> > > To: iss...@zookeeper.apache.org
> > > Subject: [jira] [Commented] (ZOOKEEPER-3556) Dynamic configuration file
> > > can not be updated automatically after some zookeeper servers of zk
> > > cluster are down
> > >
> > > [ https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_ZOOKEEPER-2D3556-3Fpage-3Dcom.atlassian.jira.plugin.system.issuetabpanels-3Acomment-2Dtabpanel-26focusedCommentId-3D16937925-23comment-2D16937925&d=DwIFaQ&c=ZmK7amRlbztwfC_NTU_hNw&r=bTmnMF5RGYcfg4qOcKQAYjkGGUtOB2jR22ryrk8hNWk&m=UNFnO3kfjtUL8Jievmh9VMXf_nTLKBCfuJsaxe6FshU&s=XxgusqUbHgFrxTfTTcYuxMWxol3W-1dJ7WVzUqh1HAE&e= ]
> > >
> > > Alexander Shraer commented on ZOOKEEPER-3556:
> > > ---------------------------------------------
> > >
> > > The described behavior is not a bug – currently reconfiguration requires
> > > explicit action by an operator. One could implement automation on top. We
> > > should consider this as a feature, since it sounds like several adopters
> > > have implemented such automation. Perhaps one of them could contribute
> > > this upstream.
> > >
> > > > Dynamic configuration file can not be updated automatically after some
> > > > zookeeper servers of zk cluster are down
> > > > ----------------------------------------------------------------------
> > > >
> > > > Key: ZOOKEEPER-3556
> > > > URL: https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_ZOOKEEPER-2D3556&d=DwIFaQ&c=ZmK7amRlbztwfC_NTU_hNw&r=bTmnMF5RGYcfg4qOcKQAYjkGGUtOB2jR22ryrk8hNWk&m=UNFnO3kfjtUL8Jievmh9VMXf_nTLKBCfuJsaxe6FshU&s=NQvX26JbBDNMmEtQhirmYk7ELe46vCjn4kbm1VqcNsA&e=
> > > > Project: ZooKeeper
> > > > Issue Type: Wish
> > > > Components: java client
> > > > Affects Versions: 3.5.5
> > > > Reporter: Steven Chan
> > > > Priority: Major
> > > > Original Estimate: 12h
> > > > Remaining Estimate: 12h
> > > >
> > > > *I encountered a problem which blocks my development of load balancing
> > > > using ZooKeeper 3.5.5.*
> > > > *I have a ZooKeeper cluster which comprises five zk servers, and the
> > > > dynamic configuration file is as follows:*
> > > >
> > > > server.1=zk1:2888:3888:participant;0.0.0.0:2181
> > > > server.2=zk2:2888:3888:participant;0.0.0.0:2181
> > > > server.3=zk3:2888:3888:participant;0.0.0.0:2181
> > > > server.4=zk4:2888:3888:participant;0.0.0.0:2181
> > > > server.5=zk5:2888:3888:participant;0.0.0.0:2181
> > > >
> > > > *The zk cluster can work fine if every member works normally.
> > > > However, if, say, two of them go down suddenly without prior
> > > > notice,* *the dynamic configuration file shown above will not be
> > > > synchronized dynamically, which causes the zk cluster to fail to work
> > > > normally.*
> > > > *In my view, if server 1 and server 5 go down suddenly, the dynamic
> > > > configuration file should be changed to the following:*
> > > >
> > > > server.2=zk2:2888:3888:participant;0.0.0.0:2181
> > > > server.3=zk3:2888:3888:participant;0.0.0.0:2181
> > > > server.4=zk4:2888:3888:participant;0.0.0.0:2181
> > > >
> > > > *But in this case, the dynamic configuration file will never change
> > > > automatically unless you manually revise it.*
> > > > *I think this is a very common case which may happen at any time. If
> > > > so, how can we handle it?*
> > > >
> > >
> > > --
> > > This message was sent by Atlassian Jira
> > > (v8.3.4#803005)
> > >
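
For readers wondering what the "automation on top" discussed in the quoted thread might look like, here is a minimal Python sketch of the approach Alex outlines: probe each server with a four-letter command, treat servers that do not report a serving mode as suspected down, and propose a removal only while a majority of the ensemble is still up. The function names (`four_letter_word`, `parse_mode`, `has_quorum`, `plan_removals`, `reconfig_command`) are hypothetical, not part of ZooKeeper. The sketch assumes every member is a voting participant, that the `srvr` command is permitted by the server's four-letter-word whitelist, and that reconfiguration is enabled on the ensemble. It only builds the `reconfig -remove` line one would type into zkCli; it does not run it, and it makes no attempt to tune the trade-off between missed failures and false suspicions.

```python
import socket


def four_letter_word(host, port, cmd="srvr", timeout=2.0):
    """Send a four-letter command; return the reply text, or None if unreachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(cmd.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # the server closes the connection after replying
                    break
                chunks.append(data)
    except OSError:  # refused, timed out, name resolution failure, ...
        return None
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Return the value of the 'Mode:' line (e.g. 'leader', 'follower'), else None.

    A server that is running but not part of a quorum answers without a
    'Mode:' line, so it is treated the same as an unreachable server.
    """
    if not reply:
        return None
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def has_quorum(ensemble_size, up_count):
    """Majority quorum, assuming all members are voting participants."""
    return up_count >= ensemble_size // 2 + 1


def plan_removals(modes):
    """modes maps server id -> mode string, or None for a suspected-down server.

    Returns the sorted ids to remove, or [] when nothing is down or when the
    remaining servers no longer form a quorum (a reconfig could not commit).
    """
    down = sorted(sid for sid, mode in modes.items() if mode is None)
    up_count = len(modes) - len(down)
    if not down or not has_quorum(len(modes), up_count):
        return []
    return down


def reconfig_command(server_ids):
    """Build the zkCli line for an incremental removal; the caller decides whether to run it."""
    return "reconfig -remove " + ",".join(str(sid) for sid in server_ids)
```

With the five-server ensemble from the issue, a majority is 3, so losing servers 1 and 5 still leaves a quorum: `plan_removals` would return `[1, 5]` and `reconfig_command` would produce `reconfig -remove 1,5`. If three servers were down instead, `plan_removals` would return an empty list, matching Alex's point that no reconfig can succeed once a quorum is lost. In practice one would probably require several consecutive failed probes before suspecting a server, to limit false suspicions.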