[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311816#comment-15311816
 ] 

Ed Rowe commented on ZOOKEEPER-2420:
------------------------------------

My original description was not 100% accurate. I've updated it to point out 
that loading the DB from logs only looks at a prior log if there isn't a log 
with the same zxid as the snapshot. However, whether a log's zxid equals the 
snapshot's zxid is timing dependent, so the == case at best reduces how often 
the bug occurs.
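
To illustrate the restore behavior described above, here is a minimal sketch, 
not the actual ZooKeeper code. It assumes each log file is identified by the 
zxid in its name; the class and method names (RestoreStartSketch, 
startingLogZxid) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class RestoreStartSketch {
    // Hypothetical helper: returns the zxid of the log file a restore would
    // start reading from, i.e. the largest log zxid that is <= the snapshot
    // zxid. Returns -1 if no such log exists.
    static long startingLogZxid(List<Long> logZxids, long snapZxid) {
        long start = -1L;
        for (long z : logZxids) {
            if (z <= snapZxid && z > start) {
                start = z;
            }
        }
        return start;
    }

    public static void main(String[] args) {
        List<Long> logs = Arrays.asList(100L, 200L, 300L);
        // A log exists with the snapshot's zxid: restore starts at that log.
        System.out.println(startingLogZxid(logs, 200L));
        // No log matches the snapshot's zxid: restore starts at the prior log.
        System.out.println(startingLogZxid(logs, 250L));
    }
}
```

In both calls the starting log is the one named 200, but only in the second 
case does that log precede the snapshot, which is the case where purging it 
matters.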

I'm not sure why this issue hasn't been found before now. I discovered it by 
reading the code and then empirically reproduced it. 

The patch adds a new test case, testSnapFilesEqualsToRetainWithPrecedingLog, 
to test this specific case in isolation, and updates the existing test cases 
testSnapFilesGreaterThanToRetain and testSnapFilesLessThanToRetain (all in 
PurgeTxnTest.java) to test the corrected behavior. If these new/updated tests 
are run against the original code (modulo the refactoring I did in the patch), 
they all fail. 

Re: patch naming convention for updated patches - I hadn't noticed that. Will 
follow it in the future.


> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> --------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2420
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>            Reporter: Ed Rowe
>            Assignee: Ed Rowe
>         Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxids are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss, autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.
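
The difference between the two retention rules in the description can be 
sketched as follows. This is an illustration under stated assumptions, not the 
patch itself: log files are represented only by the zxid in their names, the 
class and method names are hypothetical, and the option of inspecting the 
prior log's contents to see whether it can be dropped is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PurgeRetentionSketch {
    // Rule as described for the original behavior: keep only logs whose
    // zxid is >= the zxid of the oldest retained snapshot.
    static List<Long> retainOriginal(List<Long> logZxids, long oldestSnapZxid) {
        List<Long> retained = new ArrayList<>();
        for (long z : logZxids) {
            if (z >= oldestSnapZxid) {
                retained.add(z);
            }
        }
        return retained;
    }

    // Adjusted rule: also keep the log immediately preceding the snapshot,
    // unless a log with exactly the snapshot's zxid exists.
    static List<Long> retainWithPrecedingLog(List<Long> logZxids, long oldestSnapZxid) {
        // Largest log zxid that is <= the snapshot zxid (-1 if there is none).
        long floor = -1L;
        for (long z : logZxids) {
            if (z <= oldestSnapZxid && z > floor) {
                floor = z;
            }
        }
        List<Long> retained = new ArrayList<>();
        for (long z : logZxids) {
            if (z >= floor) {
                retained.add(z);
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        List<Long> logs = Arrays.asList(100L, 200L, 300L);
        // Oldest retained snapshot at zxid 250: the original rule drops log 200,
        // although that log may hold transactions 251..299.
        System.out.println(retainOriginal(logs, 250L));
        System.out.println(retainWithPrecedingLog(logs, 250L));
    }
}
```

When a log with the snapshot's own zxid exists (for example, a snapshot at 
zxid 200 with the logs above), both rules retain the same files.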



