It is not possible to achieve the level of consistency you're after in an 
eventually consistent system such as ZooKeeper. There will always be an edge 
case where two ZooKeeper clients believe they are leaders at the same time
(though only for a short period). For how this affects Apache Curator, see
our Tech Note on the subject:
https://cwiki.apache.org/confluence/display/CURATOR/TN10 (the description is
true for any ZooKeeper client, not just Curator clients). If you still
intend to use a ZooKeeper lock/leader, I suggest you try Apache Curator, as
writing these "recipes" is not trivial and has many gotchas that aren't
obvious.
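
To make the edge case concrete, here is a minimal sketch in plain Java. It is
a toy timeline model, not ZooKeeper or Curator code, and every name in it is
hypothetical. It assumes leader A stops responding (for example, a long GC
pause) right after its last heartbeat, that the ensemble expires A's session
once the session timeout passes and removes its ephemeral node, and that B
takes over immediately (notification latency is ignored). A only finds out
once it resumes, so both believe they are leader in between.

```java
// Hypothetical model for illustration only; no ZooKeeper or Curator APIs are used.
final class LeaderOverlapSketch {

    // Times are milliseconds since A's last successful heartbeat.

    /** Does the old leader A still believe it is leader at time `now`? */
    static boolean aBelievesLeader(long now, long sessionTimeoutMillis, long pauseMillis) {
        if (pauseMillis < sessionTimeoutMillis) {
            return true; // A resumed in time, its session survived, it really is leader
        }
        // A is frozen until the pause ends, so it cannot notice the expiry earlier.
        return now < pauseMillis;
    }

    /** Does the next candidate B believe it is leader at time `now`? */
    static boolean bBelievesLeader(long now, long sessionTimeoutMillis, long pauseMillis) {
        // B takes over only if A's session actually expired, and only after it expired.
        return pauseMillis >= sessionTimeoutMillis && now >= sessionTimeoutMillis;
    }

    /** Length of the window in which both A and B believe they are leader. */
    static long overlapMillis(long sessionTimeoutMillis, long pauseMillis) {
        return Math.max(0, pauseMillis - sessionTimeoutMillis);
    }

    public static void main(String[] args) {
        long sessionTimeout = 10_000; // assumed session timeout
        long pause = 12_500;          // assumed length of A's pause
        for (long t = 0; t <= 15_000; t += 2_500) {
            System.out.printf("t=%5d ms  A=%b  B=%b%n", t,
                    aBelievesLeader(t, sessionTimeout, pause),
                    bBelievesLeader(t, sessionTimeout, pause));
        }
        System.out.println("overlap = " + overlapMillis(sessionTimeout, pause) + " ms");
    }
}
```

In this model a shorter session timeout only moves the moment at which B takes
over. The overlap still depends on how long A stays unaware, which is why the
window can be narrowed but not removed.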

-Jordan

http://curator.apache.org


> On Dec 5, 2018, at 6:20 PM, Michael Borokhovich <michael...@gmail.com> wrote:
> 
> Hello,
> 
> We have a service that runs on 3 hosts for high availability. However, at
> any given time, exactly one instance must be active. So we are thinking of
> using leader election with ZooKeeper.
> To this end, on each service host we also start a ZK server, so we have a
> 3-node ZK cluster and each service instance is a client to its dedicated
> ZK server.
> Then we implement leader election on top of ZooKeeper using a basic
> recipe:
> https://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_leaderElection.
> 
> I have the following questions regarding the approach:
> 
> 1. It seems like we can run into inconsistency issues when a network
> partition occurs. The ZooKeeper documentation says that the inconsistency
> period may last “tens of seconds”. Am I understanding correctly that during
> this time we may have 0 or 2 leaders?
> 2. Is it possible to reduce this inconsistency time (let's say to 3
> seconds) by tweaking tickTime and syncLimit parameters?
> 3. Is there a way to guarantee exactly one leader all the time? Should we
> implement a more complex leader election algorithm than the one suggested
> in the recipe (using ephemeral_sequential nodes)?
> 
> Thanks,
> Michael.
