[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620009#comment-13620009
 ] 

Alexander Shraer commented on ZOOKEEPER-1629:
---------------------------------------------

Looking at the log above, I think snapshotting is the problem: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1439//testReport/org.apache.zookeeper.server/TruncateCorruptionTest/testTransactionLogCorruption/

zk1 still sees /test2 (39:59), even after its log is truncated (39:57). This is 
because zk1 managed to take a snapshot earlier (39:17). I think the explanation 
is this: truncate doesn't touch the snapshot. When we load the database after 
the truncate, we start from the snapshot and then apply the truncated log. 
The snapshot already contains /test2, so /test2 still shows up, and we end up 
with inconsistent state between the servers (only zk1 sees /test2). 
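
To make the load sequence above concrete, here is a toy Java model of it. This 
is only a sketch and not ZooKeeper's actual classes or file formats: 
ToyTxnStore and all of its methods are hypothetical names made up for 
illustration. It assumes a snapshot is just the set of paths at some zxid, and 
the log is a map from zxid to the path created by that transaction.

```java
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only (hypothetical names); not ZooKeeper's real storage code.
class ToyTxnStore {
    // Transaction log: zxid -> path created by that transaction.
    private final TreeMap<Long, String> log = new TreeMap<>();
    // Last snapshot: the paths it contains and the zxid it covers.
    private Set<String> snapshotPaths = new TreeSet<>();
    private long snapshotZxid = 0;

    void create(long zxid, String path) {
        log.put(zxid, path);
    }

    // Take a snapshot of everything visible right now.
    void snapshot() {
        snapshotPaths = load();
        if (!log.isEmpty()) {
            snapshotZxid = Math.max(snapshotZxid, log.lastKey());
        }
    }

    // Truncate drops log entries after zxid. The snapshot is left untouched.
    void truncate(long zxid) {
        log.tailMap(zxid, false).clear();
    }

    // Load: start from the snapshot, then replay log entries newer than it.
    Set<String> load() {
        Set<String> paths = new TreeSet<>(snapshotPaths);
        paths.addAll(log.tailMap(snapshotZxid, false).values());
        return paths;
    }

    public static void main(String[] args) {
        ToyTxnStore zk1 = new ToyTxnStore();
        zk1.create(1, "/test1");
        zk1.create(2, "/test2");
        zk1.snapshot();   // snapshot already includes /test2
        zk1.truncate(1);  // log no longer has the /test2 transaction
        System.out.println(zk1.load()); // prints [/test1, /test2]
    }
}
```

In this model, if the snapshot() call is dropped, load() after truncate(1) 
returns only /test1, which is what the test seems to expect from zk1.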

The reason I think this is a bug in the test rather than in ZooKeeper is that 
the test explicitly erases the state of 2 out of 3 servers in order to force a 
truncate of zk1's log. This situation was not considered possible in ZooKeeper 
(for good reasons), and, as the log shows, it doesn't work.

I suggest we first figure out whether this is something we can fix. If not, 
there is no point in explaining this test better :)
                
> testTransactionLogCorruption occasionally fails
> -----------------------------------------------
>
>                 Key: ZOOKEEPER-1629
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1629
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: tests
>            Reporter: Alexander Shraer
>         Attachments: TruncateCorruptionTest-patch.patch
>
>
> It seems that testTransactionLogCorruption is very flaky. For example, it 
> fails here:
> https://builds.apache.org/job/ZooKeeper-trunk-jdk7/500/
> https://builds.apache.org/job/ZooKeeper-trunk-jdk7/502/
> https://builds.apache.org/job/ZooKeeper-trunk-jdk7/503/#showFailuresLink
> It also fails for older builds (no longer on the website), for example all 
> builds from 381 to 399.

