[
https://issues.apache.org/jira/browse/CURATOR-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909734#comment-13909734
]
Jordan Zimmerman commented on CURATOR-3:
----------------------------------------
Is this related to CURATOR-73? Please test with the most recent push.
> LeaderLatch race condition causing extra nodes to be added in Zookeeper
> -----------------------------------------------------------------------
>
> Key: CURATOR-3
> URL: https://issues.apache.org/jira/browse/CURATOR-3
> Project: Apache Curator
> Issue Type: Bug
> Components: Recipes
> Affects Versions: 2.0.0-incubating
> Reporter: Jordan Zimmerman
> Fix For: TBD
>
>
> From https://github.com/Netflix/curator/issues/265
> There appears to be a race condition in LeaderLatch. If LeaderLatch.close()
> is called while the latch's watch handler is running, the latch can place
> another node in ZooKeeper after the latch has been closed.
> The sequence is:
> 1) I have two processes contesting a LeaderLatch, ProcessA and ProcessB.
> ProcessA is leader.
> 2) ProcessA loses leadership somehow (it releases, its connection goes down,
> etc.)
> 3) This triggers ProcessB's watch handler, which checks that the state is
> still STARTED and, if so, re-evaluates whether the LeaderLatch is leader.
> 4) While the watch handler is running, close() is called on the LeaderLatch
> on ProcessB. This sets the LeaderLatch state to CLOSED, removes the znode
> from ZK and closes off the LeaderLatch.
> 5) The watch handler has already checked that the state is STARTED, so it
> does a getChildren() on the latch path, and finds the latch's znode is
> missing. It goes ahead and calls reset(), which places a new znode in
> ZooKeeper.
> Result: The LeaderLatch is closed, but there is still a node in ZooKeeper
> that isn't associated with any LeaderLatch and won't go away until the
> session goes down. Subsequent LeaderLatches at this path can never get
> leadership while that session is up.
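The interleaving in steps 3 to 5 is a check-then-act race. It can be sketched in plain Java without a ZooKeeper connection. This is a simulation only: the class LatchRaceSketch, its in-memory znode set, and the methods watchCheck(), watchAct() and guardedWatchAct() are hypothetical names invented for illustration, not Curator's actual internals. The guarded variant shows one possible mitigation (re-checking the state under the same lock that close() takes); it is not necessarily the fix that Curator adopted.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a latch; this is NOT Curator's LeaderLatch code.
class LatchRaceSketch {
    enum State { STARTED, CLOSED }

    // In-memory stand-in for the children of the latch path in ZooKeeper.
    final Set<String> znodes = new HashSet<>();
    private State state = State.STARTED;
    private String ourPath = "latch-1";
    private int sequence = 1;

    LatchRaceSketch() {
        znodes.add(ourPath);
    }

    // Step 4: close() marks the latch CLOSED and removes its znode.
    synchronized void close() {
        state = State.CLOSED;
        znodes.remove(ourPath);
    }

    // reset() creates a fresh znode for this latch.
    private void reset() {
        ourPath = "latch-" + (++sequence);
        znodes.add(ourPath);
    }

    // Step 3: the watch handler checks the state...
    synchronized boolean watchCheck() {
        return state == State.STARTED;
    }

    // Step 5: ...and later acts on that (possibly stale) answer.
    synchronized void watchAct() {
        if (!znodes.contains(ourPath)) {
            reset();
        }
    }

    // One possible mitigation: check and act atomically under the close() lock.
    synchronized void guardedWatchAct() {
        if (state != State.STARTED) {
            return; // closed in the meantime, so do not recreate the znode
        }
        if (!znodes.contains(ourPath)) {
            reset();
        }
    }

    // Forces the problematic interleaving deterministically (no threads needed).
    static int leftoverAfterUnguardedRace() {
        LatchRaceSketch latch = new LatchRaceSketch();
        boolean started = latch.watchCheck(); // handler sees STARTED
        latch.close();                        // close() runs in between
        if (started) {
            latch.watchAct();                 // handler continues on stale state
        }
        return latch.znodes.size();
    }

    static int leftoverAfterGuardedRace() {
        LatchRaceSketch latch = new LatchRaceSketch();
        boolean started = latch.watchCheck();
        latch.close();
        if (started) {
            latch.guardedWatchAct();          // re-checks the state before acting
        }
        return latch.znodes.size();
    }

    public static void main(String[] args) {
        System.out.println("unguarded leftover znodes: " + leftoverAfterUnguardedRace());
        System.out.println("guarded leftover znodes:   " + leftoverAfterGuardedRace());
    }
}
```

In the unguarded run, one znode remains after close(), which corresponds to the orphaned node described in the result above. In the guarded run, none remain.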