[
https://issues.apache.org/jira/browse/ZOOKEEPER-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thawan Kooburat updated ZOOKEEPER-1624:
---------------------------------------
Attachment: ZOOKEEPER-1624.patch
This is a direct port of the C client test. However, I found that it cannot
detect the bug because of a timing issue. I think this is because the server
and client run in the same process in the Java unit test, so each multi request
is committed before the next one arrives and the bug never occurs.
With the C client test, on the other hand, the bug is always reproducible on my
box.
I would appreciate any help with the Java test.
> PrepRequestProcessor abort multi-operation incorrectly
> ------------------------------------------------------
>
> Key: ZOOKEEPER-1624
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1624
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Reporter: Thawan Kooburat
> Assignee: Thawan Kooburat
> Priority: Critical
> Labels: zk-review
> Fix For: 3.5.0, 3.4.6
>
> Attachments: ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch,
> ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch
>
>
> We found this issue when issuing multiple instances of the following
> multi-op concurrently:
> multi {
> 1. create sequential node /a-
> 2. create node /b
> }
> The expected result is that only the first multi-op request succeeds and the
> remaining requests fail because /b already exists.
> However, the reported result is that a subsequent multi-op fails because the
> sequential node creation failed, which should not be possible.
> Below are the return codes for each sub-op when issuing 3 instances of the
> above multi-op asynchronously:
> 1. ZOK, ZOK
> 2. ZOK, ZNODEEXISTS
> 3. ZNODEEXISTS, ZRUNTIMEINCONSISTENCY
> After adding more debug logging, I found the cause: PrepRequestProcessor
> rolls back the outstandingChanges of the second multi-op incorrectly, which
> makes sequential node name generation incorrect. Below are the sequential
> node names generated by PrepRequestProcessor:
> 1. create /a-0001
> 2. create /a-0003
> 3. create /a-0001
> The bug is in the getPendingChanges() method. It fails to copy the
> ChangeRecord for the parent node ("/"), so rollbackPendingChanges() cannot
> restore the correct previous change record of the parent node when aborting
> the second multi-op.
> The impact of this bug is that sequential node creation under the same parent
> node may fail until the previous one is committed. I am not sure whether
> there are other implications.
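
The rollback behaviour described above can be illustrated with a small toy
model. This is only a sketch and not the actual PrepRequestProcessor code: the
class MultiRollbackSketch, its Change type, the copyParent flag, the starting
cversion of 1 for "/" and the 4-digit name suffix are all assumptions made for
illustration. It also assumes that nothing is committed while the three
multi-ops are being prepared, and that the fix is to snapshot the parent's
record as well.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only: none of these types are the real ZooKeeper classes.
class MultiRollbackSketch {
    // Stand-in for ChangeRecord; cversion only matters for the parent "/".
    static final class Change {
        final long zxid; final String path; final int cversion;
        Change(long zxid, String path, int cversion) {
            this.zxid = zxid; this.path = path; this.cversion = cversion;
        }
    }

    private final List<Change> outstanding = new ArrayList<>();
    private final Map<String, Change> outstandingForPath = new HashMap<>();
    private final boolean copyParent; // true = also snapshot the parent record

    MultiRollbackSketch(boolean copyParent) { this.copyParent = copyParent; }

    // Pending record for "/" if present, else committed state (assumed cversion 1).
    private Change root() {
        Change c = outstandingForPath.get("/");
        return c != null ? c : new Change(-1, "/", 1);
    }

    private void add(long zxid, String path, int cversion) {
        Change c = new Change(zxid, path, cversion);
        outstanding.add(c);
        outstandingForPath.put(path, c);
    }

    // Creating a child bumps the parent's cversion, then records the child.
    private void create(long zxid, String path) {
        add(zxid, "/", root().cversion + 1);
        add(zxid, path, 0);
    }

    // Prepares multi { create sequential /a-, create /b }; returns the /a- name.
    String multi(long zxid) {
        // Snapshot existing records for the paths the multi touches.
        Map<String, Change> pending = new HashMap<>();
        List<String> toCopy = new ArrayList<>(List.of("/a-", "/b"));
        if (copyParent) toCopy.add("/");
        for (String p : toCopy) {
            Change c = outstandingForPath.get(p);
            if (c != null) pending.put(p, c);
        }
        String name = String.format(Locale.ROOT, "/a-%04d", root().cversion);
        boolean ok = !outstandingForPath.containsKey(name);
        if (ok) {
            create(zxid, name);
            ok = !outstandingForPath.containsKey("/b");
            if (ok) create(zxid, "/b");
        }
        if (!ok) rollback(zxid, pending);
        return name;
    }

    private void rollback(long zxid, Map<String, Change> pending) {
        // Drop every record produced by the aborted multi, newest first.
        while (!outstanding.isEmpty()
                && outstanding.get(outstanding.size() - 1).zxid == zxid) {
            Change c = outstanding.remove(outstanding.size() - 1);
            outstandingForPath.remove(c.path);
        }
        // Restore only what was snapshotted before the multi started.
        outstandingForPath.putAll(pending);
    }

    // Prepares three multi-ops back to back, with no commit in between.
    static List<String> run(boolean copyParent) {
        MultiRollbackSketch s = new MultiRollbackSketch(copyParent);
        List<String> names = new ArrayList<>();
        for (long zxid = 1; zxid <= 3; zxid++) names.add(s.multi(zxid));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("parent not copied: " + run(false));
        System.out.println("parent copied:     " + run(true));
    }
}
```

In this model, run(false) yields /a-0001, /a-0003, /a-0001, matching the names
listed in the description: aborting the second multi-op drops the pending
record for "/", so the third one falls back to the committed cversion. With
run(true) the third name is /a-0003, and that multi-op then fails only because
/b already exists.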