Hi Jordan,

Correct, I know that the internal leader election has nothing to do with
the leader election of my application through Curator.

What we are observing is that restarting (or killing) one or two servers
of a five-node Zookeeper ensemble triggers a leader election in my
application. Our expectation is that this should not happen, since the
Zookeeper ensemble still has quorum. Is that the correct expectation?
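For reference, here is roughly how we build the client and run the recipe.
This is a minimal sketch; the connect string, the /myapp/leader path, and
the timeout values are placeholders, not our real settings:

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.leader.LeaderSelector;
    import org.apache.curator.framework.recipes.leader.LeaderSelectorListenerAdapter;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class AppLeaderElection {
        public static void main(String[] args) throws Exception {
            // The session timeout needs to outlive the ensemble's internal
            // re-election; otherwise the client sees SUSPENDED/LOST and the
            // recipe relinquishes leadership.
            CuratorFramework client = CuratorFrameworkFactory.builder()
                .connectString("zk1:2181,zk2:2181,zk3:2181,zk4:2181,zk5:2181")
                .sessionTimeoutMs(30000)      // illustrative value
                .connectionTimeoutMs(15000)   // illustrative value
                .retryPolicy(new ExponentialBackoffRetry(1000, 5))
                .build();
            client.start();

            LeaderSelector selector = new LeaderSelector(client, "/myapp/leader",
                new LeaderSelectorListenerAdapter() {
                    @Override
                    public void takeLeadership(CuratorFramework c) throws Exception {
                        // our "expensive" startup task runs here; we then hold
                        // leadership until this thread is interrupted
                        Thread.currentThread().join();
                    }
                });
            selector.autoRequeue();   // re-contend after leadership is lost
            selector.start();
        }
    }

As I understand it, the recipe only gives up leadership when the connection
state goes to SUSPENDED or LOST, so as long as the session survives the
ensemble's internal re-election, application leadership should not change.
Please correct me if I have that wrong.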
On Wed, Apr 18, 2018 at 6:08 PM, Jordan Zimmerman
<jor...@jordanzimmerman.com> wrote:

> The term "leader election" has two meanings here. The kind of leader
> election that your application uses with Apache Curator is different from
> the internal leader election that ZooKeeper does amongst its nodes. For
> clarity, the internal leader election should probably be renamed to
> "master election" or something. In a ZooKeeper ensemble one instance is
> always chosen as the leader/master. All writes, etc. go through this
> master. If this master instance goes down (due to a crash, restart, chaos
> monkey, etc.) then the ensemble must choose a new leader/master. This is
> simply how ZooKeeper works.
>
> > - If the Zookeeper leader node fails, are all sessions lost?
>
> No. Sessions are transactions in the ZK database like any other. When a
> new ZK leader/master is elected, the sessions continue. In fact, the
> session timer is reset, as the new leader/master restarts the clock at
> time "0".
>
> > - What parameters control how quickly the Zookeeper nodes elect a new
> > leader?
>
> I believe "initLimit" is the most important one here (others can correct
> me).
>
> > - Can I have longer timeouts in my application before giving up
> > leadership than those of the Zookeeper nodes?
>
> I don't totally understand this question. The internal leader/master
> election has nothing whatever to do with Apache Curator leaders.
>
> -Jordan
>
>
> On Apr 19, 2018, at 7:32 AM, Tecno Brain <cerebrotecnolog...@gmail.com>
> wrote:
> >
> > Hi,
> > I have a cluster of five Zookeeper nodes.
> >
> > I have an application deployed on two other servers that executes a
> > leader election process using the Curator recipe
> > (https://curator.apache.org/curator-recipes/leader-election.html).
> >
> > My DevOps team has been running a ChaosMonkey type of test, and they
> > complained that my application triggered a change in leadership when
> > they *restarted two of the Zookeeper* nodes (the leader node and one
> > extra node).
> >
> > I find this normal, but they claim that the application should let the
> > Zookeeper nodes elect their own new leader, and that my application
> > should not change leadership because the current leader did not fail;
> > the failure was in the Zookeeper cluster.
> >
> > So, my questions are:
> > - If the Zookeeper leader node fails, are all sessions lost?
> > - What parameters control how quickly the Zookeeper nodes elect a new
> > leader?
> > - Can I have longer timeouts in my application before giving up
> > leadership than those of the Zookeeper nodes?
> >
> > My application currently runs an "expensive" task when taking
> > leadership, so we want to minimize changes of leadership, especially
> > when they are caused not by an application failure but by instability
> > in the Zookeeper cluster.
> >
> > I want to understand Zookeeper's own leader election process so that I
> > can either modify the Curator recipe or have a solid argument to
> > explain that what I am being asked to do is not possible.
> > Any pointers are welcome.
> >
> > -J
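P.S. For anyone finding this thread in the archives: the ensemble-side
settings Jordan mentions live in each server's zoo.cfg. An illustrative
example (these values and hostnames are placeholders, not recommendations):

    # base time unit, in milliseconds
    tickTime=2000
    # ticks a follower may take to connect and sync to a newly elected leader
    initLimit=10
    # ticks a follower may fall behind the leader before being dropped
    syncLimit=5
    server.1=zk1.example.com:2888:3888
    server.2=zk2.example.com:2888:3888
    server.3=zk3.example.com:2888:3888
    server.4=zk4.example.com:2888:3888
    server.5=zk5.example.com:2888:3888

If I read the admin guide correctly, the servers also clamp a client's
requested session timeout to between 2 and 20 ticks by default
(minSessionTimeout / maxSessionTimeout), so a longer timeout on the
Curator side only takes effect if maxSessionTimeout is raised as well.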