[
https://issues.apache.org/jira/browse/ZOOKEEPER-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768874#comment-13768874
]
Flavio Junqueira commented on ZOOKEEPER-1624:
---------------------------------------------
@thawan, could you give me some feedback here as well, please?
As for the Java test: to make a multi-op transaction fail reliably, you could
include a check() op with a (znode, version) pair that doesn't match. If the
same transaction also creates a sequential znode, you should be able to
trigger this bug, no?
> PrepRequestProcessor abort multi-operation incorrectly
> ------------------------------------------------------
>
> Key: ZOOKEEPER-1624
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1624
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Reporter: Thawan Kooburat
> Assignee: Thawan Kooburat
> Priority: Critical
> Labels: zk-review
> Fix For: 3.5.0, 3.4.6
>
> Attachments: ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch,
> ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch
>
>
> We found this issue when issuing multiple instances of the following
> multi-op concurrently:
> multi {
> 1. create sequential node /a-
> 2. create node /b
> }
> The expected result is that only the first multi-op request succeeds and
> the rest fail because /b already exists.
> However, the reported result is that a subsequent multi-op fails because
> the sequential node creation failed, which should not be possible.
> Below are the return codes for each sub-op when issuing 3 instances of the
> above multi-op asynchronously:
> 1. ZOK, ZOK
> 2. ZOK, ZNODEEXISTS
> 3. ZNODEEXISTS, ZRUNTIMEINCONSISTENCY
> After adding more debug logging, I found the cause: PrepRequestProcessor
> rolls back the outstandingChanges of the second multi-op incorrectly, which
> makes sequential node name generation incorrect. Below are the sequential
> node names generated by PrepRequestProcessor:
> 1. create /a-0001
> 2. create /a-0003
> 3. create /a-0001
> The bug is in the getPendingChanges() method. It fails to copy the
> ChangeRecord for the parent node ("/"), so rollbackPendingChanges() cannot
> restore the correct previous change record of the parent node when aborting
> the second multi-op.
> The impact of this bug is that sequential node creation on the same parent
> node may fail until the previous one is committed. I am not sure whether
> there are other implications.
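
The rollback problem described above can be illustrated with a small
stand-alone model. This is only a sketch under stated assumptions, not the
actual PrepRequestProcessor code: the class MultiRollbackModel, its Change
type, the copyParent flag, the 4-digit suffix, and the committed cversion of 1
for "/" are all hypothetical and chosen to mirror the numbers in the report.
The model keeps the latest uncommitted change per path, derives the sequential
suffix from the parent's pending cversion, and on failure drops the multi's
changes and restores whatever was saved beforehand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of pending-change bookkeeping; not the real PrepRequestProcessor. */
class MultiRollbackModel {
    /** Uncommitted state of one znode, tagged with the zxid that produced it. */
    static final class Change {
        final long zxid;
        final int cversion;
        Change(long zxid, int cversion) { this.zxid = zxid; this.cversion = cversion; }
    }

    // Latest uncommitted change per path (stand-in for the outstanding changes).
    private final Map<String, Change> pending = new HashMap<>();
    private final boolean copyParent; // true = parent record is saved before the multi
    final List<String> generatedNames = new ArrayList<>();
    final List<Integer> failedOps = new ArrayList<>(); // -1 = multi succeeded

    MultiRollbackModel(boolean copyParent) { this.copyParent = copyParent; }

    /** Child version of "/": pending value if any, else the committed value (assumed 1). */
    private int rootCversion() {
        Change c = pending.get("/");
        return c != null ? c.cversion : 1;
    }

    /** Records a create under "/"; returns false if the path is already pending. */
    private boolean create(long zxid, String path, boolean sequential) {
        int cversion = rootCversion();
        if (sequential) {
            path = path + String.format("%04d", cversion); // suffix from parent cversion
            generatedNames.add(path);
        }
        if (pending.containsKey(path)) {
            return false;
        }
        pending.put("/", new Change(zxid, cversion + 1));
        pending.put(path, new Change(zxid, 0));
        return true;
    }

    /** Runs multi { create sequential /a- ; create /b } and rolls back on failure. */
    void multi(long zxid) {
        // Save the records a rollback may need to restore.
        Map<String, Change> saved = new HashMap<>();
        List<String> paths = new ArrayList<>(Arrays.asList("/a-", "/b"));
        if (copyParent) {
            paths.add("/");
        }
        for (String p : paths) {
            if (pending.containsKey(p)) {
                saved.put(p, pending.get(p));
            }
        }
        int failed = -1;
        if (!create(zxid, "/a-", true)) {
            failed = 0;
        } else if (!create(zxid, "/b", false)) {
            failed = 1;
        }
        if (failed >= 0) {
            pending.values().removeIf(c -> c.zxid == zxid); // drop this multi's changes
            pending.putAll(saved);                            // restore earlier records
        }
        failedOps.add(failed);
    }

    /** Issues the same multi three times, as in the report. */
    static MultiRollbackModel runThree(boolean copyParent) {
        MultiRollbackModel m = new MultiRollbackModel(copyParent);
        for (long zxid = 1; zxid <= 3; zxid++) {
            m.multi(zxid);
        }
        return m;
    }

    public static void main(String[] args) {
        MultiRollbackModel buggy = runThree(false);
        MultiRollbackModel fixed = runThree(true);
        System.out.println("parent not copied: " + buggy.generatedNames + " " + buggy.failedOps);
        System.out.println("parent copied:     " + fixed.generatedNames + " " + fixed.failedOps);
    }
}
```

When the parent record is not saved, the model generates /a-0001, /a-0003,
/a-0001, and the third multi fails on its sequential create, which matches
the sequence in the report. When the parent record is saved and restored, the
third multi generates /a-0003 and fails on /b, as expected. The real rollback
also compares zxids before restoring a record; that detail is left out here.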