[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493763#comment-14493763
 ] 

Yasuhito Fukuda commented on ZOOKEEPER-1797:
--------------------------------------------


I tried to reproduce this issue on ZooKeeper 3.4.5 using the unit test from the 
attached patch (testPurgeWhenLogRollingInProgress), but I could not reproduce 
it.

So I tried to reproduce the bug with Earthquake instead.
Earthquake is a framework of fault injectors and implementation-level model 
checkers. It aims to find bugs in distributed systems that are related to 
hardware faults and non-determinism.
For more information, please refer to the GitHub repository:
https://github.com/osrg/earthquake

As a result, the znode loss was reproduced successfully. I'd like to share the 
test case with the ZooKeeper community because it is useful for catching 
regressions that may be introduced during development.
The test case can be found in the following branch of our ZooKeeper repository:
https://github.com/osrg/zookeeper/tree/zookeeper-1797

Information on using the test case and reproducing the bug can be found in the 
following file:
https://github.com/osrg/zookeeper/tree/zookeeper-1797/earthquake/zookeeper-1797/README.txt

Because Earthquake is still immature, the test code isn't user-friendly. But 
the test case itself is useful, and Earthquake could potentially be used to 
test ZooKeeper and find other bugs, so I'd like to hear your comments.

Regards,
Fukuda


> PurgeTxnLog may delete data logs during roll
> --------------------------------------------
>
>                 Key: ZOOKEEPER-1797
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1797
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.5
>            Reporter: Derek Dagit
>            Assignee: Rakesh R
>            Priority: Blocker
>             Fix For: 3.4.7, 3.5.0
>
>         Attachments: ZOOKEEPER-1797.patch, ZOOKEEPER-1797.patch, 
> ZOOKEEPER-1797.patch, ZOOKEEPER-1797.patch
>
>
> org.apache.zookeeper.server.PurgeTxnLog deletes old data logs and snapshots, 
> keeping the newest N snapshots and any data logs that have been written since 
> the snapshot.
> It does this by listing the available snapshots & logs and creating a 
> blacklist of snapshots and logs that should not be deleted.  Then, it 
> searches for and deletes all logs and snapshots that are not in this list.
> It appears that if logs are rolling or a new snapshot is created during this 
> process, then these newer files will be unintentionally deleted.
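
The race described in the issue summary above can be illustrated with a small 
in-memory sketch. This is not the actual PurgeTxnLog code: the class name 
PurgeRaceSketch, its methods, the simplified retention rule, and the file names 
are all hypothetical, and the data directory is modelled as a plain set of file 
names. It only shows the pattern of building the keep-list from one listing and 
then deleting everything that is not in it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names and the retention rule are assumptions.
class PurgeRaceSketch {

    // Parse the hex zxid suffix of a name such as "log.200" or "snapshot.200".
    static long zxid(String name) {
        return Long.parseLong(name.substring(name.indexOf('.') + 1), 16);
    }

    // Step 1: from the listing taken now, keep the newest N snapshots and
    // (simplified) every log whose zxid is >= the oldest kept snapshot.
    static Set<String> filesToRetain(Set<String> dir, int snapRetainCount) {
        List<String> snaps = new ArrayList<>();
        for (String f : dir) {
            if (f.startsWith("snapshot.")) snaps.add(f);
        }
        snaps.sort(Comparator.comparingLong(PurgeRaceSketch::zxid).reversed());
        List<String> kept =
            snaps.subList(0, Math.min(snapRetainCount, snaps.size()));
        Set<String> retain = new HashSet<>(kept);
        if (!kept.isEmpty()) {
            long oldest = zxid(kept.get(kept.size() - 1));
            for (String f : dir) {
                if (f.startsWith("log.") && zxid(f) >= oldest) retain.add(f);
            }
        }
        return retain;
    }

    // Step 2: delete everything currently in the directory that is not in
    // the keep-list computed earlier.
    static void deleteAllExcept(Set<String> dir, Set<String> retain) {
        dir.removeIf(f -> !retain.contains(f));
    }

    // Runs the two steps with a log roll happening in between.
    static Set<String> simulate() {
        Set<String> dir = new TreeSet<>();
        dir.add("snapshot.100");
        dir.add("log.100");
        dir.add("snapshot.200");
        dir.add("log.200");

        Set<String> retain = filesToRetain(dir, 1);

        // The server rolls to a new log after the keep-list was built.
        dir.add("log.300");

        deleteAllExcept(dir, retain);
        return dir;
    }

    public static void main(String[] args) {
        System.out.println("Files left after purge: " + simulate());
    }
}
```

In this sketch the newest log, log.300, is removed because it did not exist 
when the keep-list was computed, even though it is newer than every retained 
snapshot.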



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
