[ https://issues.apache.org/jira/browse/ZOOKEEPER-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927470#comment-16927470 ]
Hudson commented on ZOOKEEPER-3145:
-----------------------------------
FAILURE: Integrated in Jenkins build ZooKeeper-trunk #690 (See
[https://builds.apache.org/job/ZooKeeper-trunk/690/])
ZOOKEEPER-3145: Fix potential watch missing issue due to stale pzxid
(enrico.olivelli: rev 42ea26b75105484ef0504396332c276952224158)
* (edit) zookeeper-jute/src/main/resources/zookeeper.jute
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/util/SerializeUtils.java
* (edit) zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/FuzzySnapshotRelatedTest.java
* (add) zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/CloseSessionTxnTest.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/PrepRequestProcessor.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java
* (edit) zookeeper-server/src/test/java/org/apache/zookeeper/server/PrepRequestProcessorTest.java
> Potential watch missing issue due to stale pzxid when replaying CloseSession
> txn with fuzzy snapshot
> ----------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3145
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3145
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.4, 3.6.0, 3.4.13
> Reporter: Fangmin Lv
> Assignee: Fangmin Lv
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.6.0
>
> Time Spent: 7h 20m
> Remaining Estimate: 0h
>
> This is another issue I found recently. We haven't seen this problem in
> production (or perhaps we haven't noticed it).
>
> Currently, CloseSession is not idempotent: executing CloseSession twice
> does not produce the same result.
>
> The problem is that closeSession only checks which ephemeral nodes are
> associated with that session based on the current state. Nodes deleted while
> a fuzzy snapshot is being taken won't be deleted again when the txn is
> replayed.
>
> This looks fine, since the nodes are already gone, but there is a problem
> with the pzxid of the parent node. The snapshot is taken fuzzily, so it's
> possible that the parent had already been serialized while the nodes were
> being deleted by the closeSession txn. When the closeSession txn is
> replayed, the pzxid loaded from the snapshot will not be updated, because
> the replay doesn't know which paths were deleted, so it won't patch the
> pzxid the way we did for deleteNode in ZOOKEEPER-3125.
>
> The inconsistent pzxid can lead to missed watch notifications when a
> client reconnects with setWatches, because the pzxid is stale.
>
> This JIRA fixes the issue by adding a CloseSessionTxn that records all the
> nodes deleted in that CloseSession txn, so that we know which nodes to
> update when replaying the txn.
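
The scenario in the description can be sketched with a toy model. This is not the ZooKeeper implementation: the Tree class, its two maps, the method names, the /app/lock path, and the zxid values are all hypothetical and exist only for illustration. It compares replaying a close-session txn against a fuzzy snapshot in two ways: deriving the paths from current state (the behavior described as broken) and using paths recorded in the txn (the approach the description proposes).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CloseSessionReplaySketch {
    // Hypothetical, heavily simplified stand-in for a data tree.
    static final class Tree {
        final Map<String, Long> ephemeralOwner = new HashMap<>(); // path -> session id
        final Map<String, Long> pzxid = new HashMap<>();          // parent path -> pzxid

        static String parentOf(String path) {
            int i = path.lastIndexOf('/');
            return i <= 0 ? "/" : path.substring(0, i);
        }

        // Delete a node and patch the parent's pzxid even if the node is
        // already gone; the pzxid never moves backwards.
        void deleteNode(String path, long zxid) {
            ephemeralOwner.remove(path);
            pzxid.merge(parentOf(path), zxid, Math::max);
        }

        // State-based replay: only nodes still present for the session are deleted.
        void replayCloseSessionFromState(long sessionId, long zxid) {
            List<String> paths = new ArrayList<>();
            for (Map.Entry<String, Long> e : ephemeralOwner.entrySet()) {
                if (e.getValue() == sessionId) {
                    paths.add(e.getKey());
                }
            }
            for (String p : paths) {
                deleteNode(p, zxid);
            }
        }

        // Path-recording replay: the txn itself lists the deleted paths.
        void replayCloseSessionTxn(List<String> recordedPaths, long zxid) {
            for (String p : recordedPaths) {
                deleteNode(p, zxid);
            }
        }
    }

    // Assumed fuzzy snapshot: the parent /app was serialized with pzxid 5
    // before the delete was applied, while the child /app/lock was
    // serialized after it, so the child is absent from the snapshot.
    static Tree fuzzySnapshot() {
        Tree t = new Tree();
        t.pzxid.put("/app", 5L);
        return t;
    }

    public static void main(String[] args) {
        long sessionId = 42L;
        long closeZxid = 10L;

        Tree stateBased = fuzzySnapshot();
        stateBased.replayCloseSessionFromState(sessionId, closeZxid);

        Tree pathRecording = fuzzySnapshot();
        pathRecording.replayCloseSessionTxn(
                Collections.singletonList("/app/lock"), closeZxid);

        System.out.println("state-based replay pzxid(/app) = " + stateBased.pzxid.get("/app"));
        System.out.println("path-recording replay pzxid(/app) = " + pathRecording.pzxid.get("/app"));
    }
}
```

In this toy model the state-based replay leaves the parent's pzxid at 5, while the path-recording replay advances it to 10. A client that reconnects having last seen a zxid between those two values would be compared against the stale pzxid in the first case, which is the kind of missed notification the description is concerned with.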
--
This message was sent by Atlassian Jira
(v8.3.2#803003)