[
https://issues.apache.org/jira/browse/ZOOKEEPER-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493763#comment-14493763
]
Yasuhito Fukuda commented on ZOOKEEPER-1797:
--------------------------------------------
I tried to reproduce this issue on ZooKeeper 3.4.5 using the unit test from the
attached patch (testPurgeWhenLogRollingInProgress), but I could not reproduce it.
So I tried to reproduce the bug with Earthquake instead.
Earthquake is a framework of fault injectors and implementation-level model
checkers. It aims to find bugs in distributed systems that are related to
hardware faults and non-determinism.
For more information, please refer to the GitHub repository:
https://github.com/osrg/earthquake
As a result, the znode loss was reproduced successfully. I'd like to share the
test case with the ZooKeeper community because it is useful for checking for
regressions that may be introduced during development.
The test case can be found in the following branch of our ZooKeeper repository:
https://github.com/osrg/zookeeper/tree/zookeeper-1797
Instructions for using the test case and reproducing the bug can be found in
the following file:
https://github.com/osrg/zookeeper/tree/zookeeper-1797/earthquake/zookeeper-1797/README.txt
Because Earthquake is still immature, the test code isn't user-friendly. But the
test case itself is useful, and Earthquake could potentially be used to test
ZooKeeper and find other bugs, so I'd like to hear your comments.
Regards,
Fukuda
> PurgeTxnLog may delete data logs during roll
> --------------------------------------------
>
> Key: ZOOKEEPER-1797
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1797
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.5
> Reporter: Derek Dagit
> Assignee: Rakesh R
> Priority: Blocker
> Fix For: 3.4.7, 3.5.0
>
> Attachments: ZOOKEEPER-1797.patch, ZOOKEEPER-1797.patch,
> ZOOKEEPER-1797.patch, ZOOKEEPER-1797.patch
>
>
> org.apache.zookeeper.server.PurgeTxnLog deletes old data logs and snapshots,
> keeping the newest N snapshots and any data logs that have been written since
> the snapshot.
> It does this by listing the available snapshots & logs and creating a
> blacklist of snapshots and logs that should not be deleted. Then, it
> searches for and deletes all logs and snapshots that are not in this list.
> It appears that if logs are rolling or a new snapshot is created during this
> process, then these newer files will be unintentionally deleted.
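
The race described in the issue summary can be sketched as follows. This is a
minimal in-memory model, not the actual PurgeTxnLog code: the class and method
names (PurgeRaceSketch, racyPurge, zxidBoundedPurge) are hypothetical, file
names are modelled as prefix.hexZxid, and the retention rule is simplified to
"keep everything at or above the oldest retained snapshot's zxid". The second
method shows one possible mitigation (delete only files whose zxid is below
that bound); whether it matches the attached patch is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class PurgeRaceSketch {

    // Hypothetical helper: parse the hex zxid suffix of "log.<zxid>" or "snapshot.<zxid>".
    static long zxidOf(String name) {
        return Long.parseLong(name.substring(name.lastIndexOf('.') + 1), 16);
    }

    // zxid of the oldest snapshot among the newest `keep` snapshots in the listing.
    // Assumes keep >= 1; returns Long.MIN_VALUE when the listing has no snapshot.
    static long oldestRetainedSnapshotZxid(Set<String> listing, int keep) {
        List<Long> snaps = new ArrayList<>();
        for (String f : listing) {
            if (f.startsWith("snapshot.")) {
                snaps.add(zxidOf(f));
            }
        }
        if (snaps.isEmpty()) {
            return Long.MIN_VALUE;
        }
        snaps.sort(Comparator.reverseOrder());
        return snaps.get(Math.min(keep, snaps.size()) - 1);
    }

    // Racy variant: the retain list is built from an EARLIER listing, then every
    // file currently on disk that is not in that list is deleted.
    static Set<String> racyPurge(Set<String> listedEarlier, Set<String> onDiskNow, int keep) {
        long least = oldestRetainedSnapshotZxid(listedEarlier, keep);
        Set<String> retain = new TreeSet<>();
        for (String f : listedEarlier) {
            if (zxidOf(f) >= least) {
                retain.add(f);
            }
        }
        Set<String> deleted = new TreeSet<>(onDiskNow);
        deleted.removeAll(retain); // files created after the listing are not in `retain`
        return deleted;
    }

    // Possible mitigation: delete only files whose zxid is below the retention bound,
    // so files that appear after the listing (with higher zxids) are never touched.
    static Set<String> zxidBoundedPurge(Set<String> listedEarlier, Set<String> onDiskNow, int keep) {
        long least = oldestRetainedSnapshotZxid(listedEarlier, keep);
        Set<String> deleted = new TreeSet<>();
        for (String f : onDiskNow) {
            if (zxidOf(f) < least) {
                deleted.add(f);
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        Set<String> listedEarlier = new TreeSet<>(Arrays.asList(
                "snapshot.10", "log.10", "snapshot.20", "log.20", "snapshot.30", "log.30"));
        // A log roll and a new snapshot happen between listing and deletion.
        Set<String> onDiskNow = new TreeSet<>(listedEarlier);
        onDiskNow.add("log.40");
        onDiskNow.add("snapshot.40");

        System.out.println(racyPurge(listedEarlier, onDiskNow, 2));
        // prints [log.10, log.40, snapshot.10, snapshot.40]
        System.out.println(zxidBoundedPurge(listedEarlier, onDiskNow, 2));
        // prints [log.10, snapshot.10]
    }
}
```

In the racy variant the newly rolled log.40 and the new snapshot.40 are deleted
even though they are the newest files, which corresponds to the znode loss
discussed in this issue.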