[ https://issues.apache.org/jira/browse/CURATOR-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768091#comment-13768091 ]
Germán Blanco edited comment on CURATOR-3 at 9/16/13 6:04 AM:
--------------------------------------------------------------

Isn't there a relation between this issue and CURATOR-45?

was (Author: abranzyck):
Isn't there a relation between this issue and CURATOR-54?

> LeaderLatch race condition causing extra nodes to be added in Zookeeper Edit
> ----------------------------------------------------------------------------
>
>                 Key: CURATOR-3
>                 URL: https://issues.apache.org/jira/browse/CURATOR-3
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>   Affects Versions: 2.0.0-incubating
>            Reporter: Jordan Zimmerman
>
> From https://github.com/Netflix/curator/issues/265
> Looks like there's a race condition in LeaderLatch. If LeaderLatch.close() is
> called at the right time while the latch's watch handler is running, the
> latch will place another node in Zookeeper after the latch is closed.
> Basically how it happens is this:
> 1) I have two processes contesting a LeaderLatch, ProcessA and ProcessB.
> ProcessA is leader.
> 2) ProcessA loses leadership somehow (it releases, its connection goes down,
> etc.)
> 3) This causes ProcessB's watch to get called, check the state is still
> STARTED, and if so the LeaderLatch will re-evaluate if it is leader.
> 4) While the watch handler is running, close() is called on the LeaderLatch
> on ProcessB. This sets the LeaderLatch state to CLOSED, removes the znode
> from ZK and closes off the LeaderLatch.
> 5) The watch handler has already checked that the state is STARTED, so it
> does a getChildren() on the latch path, and finds the latch's znode is
> missing. It goes ahead and calls reset(), which places a new znode in
> Zookeeper.
> Result: The LeaderLatch is closed, but there is still a node in Zookeeper
> that isn't associated with any LeaderLatch and won't go away until the
> session goes down. Subsequent LeaderLatches at this path can never get
> leadership while that session is up.
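
For readers following the thread, the interleaving in steps 3 to 5 can be replayed deterministically in a small model. This is only a sketch under stated assumptions: `LatchRaceSketch`, `FakeZk`, `SketchLatch`, and the `recheck` flag are hypothetical names made up for illustration, the in-memory set merely stands in for the children of the latch path, and nothing here is Curator's actual LeaderLatch code. The `recheck` variant shows one possible mitigation (re-validating the state inside `reset()` under the same lock as `close()`); it is not necessarily the fix that was adopted.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicReference;

public class LatchRaceSketch {
    enum State { STARTED, CLOSED }

    // Hypothetical stand-in for the children of the latch path in ZooKeeper.
    static final class FakeZk {
        final Set<String> children = new ConcurrentSkipListSet<>();
    }

    // Hypothetical model of a latch; NOT Curator's LeaderLatch.
    static final class SketchLatch {
        private final FakeZk zk;
        private final String id;
        private final boolean recheck;
        private final AtomicReference<State> state =
                new AtomicReference<>(State.STARTED);

        SketchLatch(FakeZk zk, String id, boolean recheck) {
            this.zk = zk;
            this.id = id;
            this.recheck = recheck;
            zk.children.add(id); // the latch's own znode
        }

        // Step 3: the watch handler checks that the latch is still STARTED.
        boolean watchCheck() {
            return state.get() == State.STARTED;
        }

        // Step 5: the handler lists children and resets if its node is gone.
        void watchAct() {
            if (!zk.children.contains(id)) {
                reset();
            }
        }

        // With recheck enabled, reset() refuses to recreate the node once
        // close() has run; both methods synchronize on the same monitor.
        synchronized void reset() {
            if (recheck && state.get() != State.STARTED) {
                return;
            }
            zk.children.add(id);
        }

        // Step 4: close() marks the latch CLOSED and removes its znode.
        synchronized void close() {
            state.set(State.CLOSED);
            zk.children.remove(id);
        }
    }

    // Replays steps 3, 4, 5 in the problematic order and returns how many
    // nodes remain under the latch path afterwards.
    static int leftoverNodes(boolean recheck) {
        FakeZk zk = new FakeZk();
        SketchLatch latchB = new SketchLatch(zk, "latch-B", recheck);
        boolean stillStarted = latchB.watchCheck(); // step 3
        latchB.close();                             // step 4
        if (stillStarted) {
            latchB.watchAct();                      // step 5
        }
        return zk.children.size();
    }

    public static void main(String[] args) {
        System.out.println("without re-check: " + leftoverNodes(false));
        System.out.println("with re-check:    " + leftoverNodes(true));
    }
}
```

Without the re-check the model ends with one orphaned node even though the latch is closed, which matches the "Result" paragraph in the report; with the re-check the path is left empty.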