GitHub user Randgalt opened a pull request:
https://github.com/apache/curator/pull/299
[CURATOR-498] - Fix protection mode race with ephemeral nodes
"Protection" mode has a potential race. If the connection is lost for long
enough, Curator will want to kill the session. Session deletions must be
handled by the Leader ZK instance. While the session kill is being processed,
Curator's protection mode handling could be asking the follower it is
connected to for the current list of children. The follower can serve that
read directly, without needing to call the leader. So, in this scenario, the
client can get a list of children that still includes the ZNode that is about
to be deleted as part of killing the session.
This bug has been in Curator since the protection feature was added more than
6 years ago. The fix is to include the session ID in the protection ID that is
generated for the node name when the create mode is an ephemeral type. Then,
if findProtectedNodeInForeground() finds the node in the scenario described
above, it can compare the session ID embedded in the node name to the current
ZooKeeper handle's session ID and disregard the found node if they don't
match.
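The idea can be illustrated with a small standalone sketch. This is not the
code in the pull request: the class name ProtectedIdSketch, the "@" separator,
the hex encoding of the session ID, and both method names are assumptions made
for illustration only. It shows a session ID being appended to the protection
ID for ephemeral nodes, and a lookup that skips nodes created by a different
session.

```java
import java.util.List;
import java.util.Optional;

public class ProtectedIdSketch {
    // Hypothetical naming scheme, for illustration only.
    static final String PREFIX = "_c_";
    static final String SESSION_SEPARATOR = "@";

    /** Builds a node name; ephemeral nodes also carry the creating session's ID. */
    public static String protectedName(String baseName, String protectionId,
                                       long sessionId, boolean ephemeral) {
        String id = ephemeral
            ? protectionId + SESSION_SEPARATOR + Long.toHexString(sessionId)
            : protectionId;
        return PREFIX + id + "-" + baseName;
    }

    /** Returns the matching child, ignoring ephemeral nodes owned by another session. */
    public static Optional<String> findProtectedNode(List<String> children,
                                                     String protectionId,
                                                     long currentSessionId) {
        String marker = PREFIX + protectionId;
        for (String child : children) {
            if (!child.startsWith(marker)) {
                continue;
            }
            String rest = child.substring(marker.length());
            if (rest.startsWith("-")) {
                return Optional.of(child); // no session ID: not an ephemeral node
            }
            if (rest.startsWith(SESSION_SEPARATOR)) {
                int dash = rest.indexOf('-');
                if (dash < 0) {
                    continue; // malformed name, skip it
                }
                String hex = rest.substring(SESSION_SEPARATOR.length(), dash);
                try {
                    long owner = Long.parseUnsignedLong(hex, 16);
                    if (owner == currentSessionId) {
                        return Optional.of(child);
                    }
                    // Otherwise the node belongs to an old session that is
                    // being killed, so it is disregarded.
                } catch (NumberFormatException e) {
                    // Not a session ID after all; skip this child.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String stale = protectedName("lock", "abc", 0x1fL, true);
        // Same protection ID, but the client now holds a new session.
        System.out.println(findProtectedNode(List.of(stale), "abc", 0x20L));
    }
}
```

In the race described above, the stale child would still show up in the
follower's list of children, but the session ID comparison keeps the client
from treating it as its own node.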
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/curator CURATOR-498
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/curator/pull/299.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #299
----
commit 4b0bc85d066f8582b55d76092c391bad04bd48a5
Author: randgalt <randgalt@...>
Date: 2018-12-31T11:24:02Z
CURATOR-498 - include session ID in log message for injecting session
expiration
commit dafd091412a834a128c9882d2b9534d1a0ff7735
Author: randgalt <randgalt@...>
Date: 2019-01-02T03:34:41Z
CURATOR-498
----
---