[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-11-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662934#comment-15662934
 ] 

Hadoop QA commented on ZOOKEEPER-2420:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12806310/ZOOKEEPER-2420.patch_v3
  against trunk revision 73e102a58d01b27bc6208bbfbde2d12f0deba1f4.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 20 new Findbugs (version 
3.0.1) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3532//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3532//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3532//console

This message is automatically generated.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-11-13 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662891#comment-15662891
 ] 

Rakesh R commented on ZOOKEEPER-2420:
-

Thanks [~EdRowe] for the good work.

Hi All,  

Like [~hanm] pointed out, both this jira and ZOOKEEPER-2574 are addressing same 
case. Could you please look at [my comments in ZK-2574 
jira|https://issues.apache.org/jira/browse/ZOOKEEPER-2574?focusedCommentId=15662873=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15662873]
 and let me know your thoughts. Based on your feedback will take an action. 
Thanks!

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-14 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331098#comment-15331098
 ] 

Patrick Hunt commented on ZOOKEEPER-2420:
-

Agree - kudos Ed!

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-02 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313341#comment-15313341
 ] 

Michael Han commented on ZOOKEEPER-2420:


Yes it's 3 and restore only uses the latest snapshot.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-02 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313185#comment-15313185
 ] 

Flavio Junqueira commented on ZOOKEEPER-2420:
-

My recollection is that the minimum is a value greater than 1, I think it is 3.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-02 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313124#comment-15313124
 ] 

Michael Han commented on ZOOKEEPER-2420:


bq. However, in most cases, the snapshot that matters is the latest and not the 
oldest, so even if we screw up with respect to the oldest, we will very likely 
have all txn logs to make it work.
My guess is in practice production clusters usually retain more than one 
snapshot; otherwise, with one snapshot it's both the old and the new one, which 
might make the issue easier to appear.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-02 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313099#comment-15313099
 ] 

Flavio Junqueira commented on ZOOKEEPER-2420:
-

I'm not saying we shouldn't fix this, in fact, [~EdRowe] has been doing a 
pretty good job at spotting corner cases with the snapshot/txn log code. 
However, in most cases, the snapshot that matters is the latest and not the 
oldest, so even if we screw up with respect to the oldest, we will very likely 
have all txn logs to make it work. I'm clearly putting aside the issue 
described in ZOOKEEPER-1549, which is the king of the blockers for me. 

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-02 Thread Ed Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311816#comment-15311816
 ] 

Ed Rowe commented on ZOOKEEPER-2420:


My original description was not 100% accurate - I've updated it to point out 
that loading the DB from logs actually only looks at a prior log if there isn't 
a log with the same zxid as the snapshot. However, whether there is equivalency 
between log zxid and snapshot zxid is timing dependent so the == case at best 
reduces occurrences of the bug.

I'm not sure why this issue hasn't been found before now. I discovered it by 
reading the code and then empirically reproduced it. 

The patch adds new test case testSnapFilesEqualsToRetainWithPrecedingLog to 
test this specific case in isolation, and updates existing test cases 
testSnapFilesGreaterThanToRetain and testSnapFilesLessThanToRetain (all in 
PurgeTxnTest.java) to test the corrected behavior. If these new/updated tests 
are run against the original code (modulo the refactoring I did in the patch) 
they will all fail. 

Re: patch naming convention for updated patches - I hadn't noticed that. Will 
follow it in the future.


> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). However, unless there is a log file with the same 
> zxid as the oldest snapshot file being retained (and whether log file and 
> snapshot file zxids are equal is timing dependent), loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid. 
> Thus, to avoid data loss autopurge should retain the log file prior to the 
> oldest retained snapshot as well, unless it verifies that it contains no 
> zxids beyond what the snapshot contains or there is a log file whose zxid == 
> snapshot zxid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-01 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311349#comment-15311349
 ] 

Patrick Hunt commented on ZOOKEEPER-2420:
-

ps. please follow the naming convention for patches defined in htc: 
https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute as it 
simplifies downloading/review.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-06-01 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311345#comment-15311345
 ] 

Patrick Hunt commented on ZOOKEEPER-2420:
-

Hm. I'm trying to understand why is this just showing up now, given the 
severity of the problem (fwiw I believe we should make this a blocker). iirc 
this code has been in place forever, even before "autopurge" existed - it was 
used for the external "clean" mechanism. It seems unusual that this is just 
cropping up now - why?

One reason I can think of, is it related to the fact that sending of diffs was 
fixed around 3.4.(6?) timeframe? ([~fpj]?)

Does the test fail w/o the fix? In the sense, does the test recreate the 
situation you are trying fix in this jira.


> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>Assignee: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-26 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302318#comment-15302318
 ] 

Michael Han commented on ZOOKEEPER-2420:


+1 on v3 patch. Thanks [~EdRowe]!

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-26 Thread Ed Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301716#comment-15301716
 ] 

Ed Rowe commented on ZOOKEEPER-2420:


v3 patch uploaded. [~hanm] it adds the test (augments existing test for 
findNRecentSnapshots), fixes the == and revises the comment as you suggested.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2, 
> ZOOKEEPER-2420.patch_v3
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-25 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301216#comment-15301216
 ] 

Michael Han commented on ZOOKEEPER-2420:


Hi [~EdRowe]. The v2 patch LGTM. A few more comments:

L184: Good finding and fix on the bug in findNRecentSnapshots. Do you think if 
it's easy to add a test that explicitly cover the bug you fixed?

L189: (nit) missing spaces around '=='. This is not introduced in your patch, 
but since you are here we may just fix it so it's consistent with rest of code 
base :)

L355-356 : Should we revise the comment by adding one case, so it looks like: 
"get the snapshot logs that are greater than, or equal to, or just before the 
given zxid' ?

Also FYI, after you uploaded a new patch, you can click the "Submit Patch" 
patch in the JIRA issue, this will change the JIRA status to "Patch Available" 
(some folks may monitoring that list to pick items to review), also it will 
trigger pre-commit builds that will apply your patch and run tests. 




> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-25 Thread Ed Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299714#comment-15299714
 ] 

Ed Rowe commented on ZOOKEEPER-2420:


New patch uploaded. [~hanm] as you suggested I am now using 
FileTxnLog.getLogFiles() to find the zxid of the oldest log file we need to 
retain. This change keeps the logic about "prior logs" confined to 
getLogFiles() which is definitely cleaner. I did not remove the filter or the 
comparison by zxid because it is required to prevent incorrectly deleting new 
snapshots/logs in a race condition - I added a comment in the new patch to 
discuss this case. I also refactored/renamed retainNRecentSnapshots() -> 
purgeOlderSnapshots() and changed it to take a single snapshot file rather than 
a list because it makes the interface cleaner and more obvious (the list was 
solely used to provide the oldest snapshot).

As part of this patch I also fixed a bug in FileSnap.findNRecentSnapshots() - 
if there were fewer than n snapshots available then it would also return log 
files.  

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch, ZOOKEEPER-2420.patch_v2
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-19 Thread Ed Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290613#comment-15290613
 ] 

Ed Rowe commented on ZOOKEEPER-2420:


Thanks Michael. I'm looking into your suggestion. In the process I think I've 
discovered another bug. Stay tuned.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-16 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284878#comment-15284878
 ] 

Michael Han commented on ZOOKEEPER-2420:


Hi [~EdRowe], thanks for the patch! 
retainNRecentSnapshots will miss an edge case because the current selection 
logic rely on a filter like this:
{code:title=retainNRecentSnapshots.java|borderStyle=solid}
File snapShot = snaps.get(snaps.size() -1);
final long leastZxidToBeRetain = Util.getZxidFromName(
snapShot.getName(), PREFIX_SNAPSHOT);

class MyFileFilter implements FileFilter{
private final String prefix;
MyFileFilter(String prefix){
this.prefix=prefix;
}
public boolean accept(File f){
if(!f.getName().startsWith(prefix + "."))
return false;
long fZxid = Util.getZxidFromName(f.getName(), prefix);
if (fZxid >= leastZxidToBeRetain) {
return false;
}
return true;
}
}
{code}
This logic will miss one case when the zxid we need to retain is contained in a 
transaction log file whose name (fZxid here) is smaller than 
leastZxidToBeRetain. This log file is the 'prior' log file you mentioned.

Looking more around in the code, I think we already have a method that we can 
use to safely pick up the list of transaction log files we are about to retain:
{code:title=FileTxnLog.java.java|borderStyle=solid}
/**
 * Find the log file that starts at, or just before, the snapshot. Return
 * this and all subsequent logs. Results are ordered by zxid of file,
 * ascending order.
 * @param logDirList array of files
 * @param snapshotZxid return files at, or before this zxid
 * @return
 */
public static File[] getLogFiles(File[] logDirList,long snapshotZxid) {
List files = Util.sortDataDir(logDirList, "log", true);
long logZxid = 0;
// Find the log file that starts before or at the same time as the
// zxid of the snapshot
for (File f : files) {
long fzxid = Util.getZxidFromName(f.getName(), "log");
if (fzxid > snapshotZxid) {
continue;
}
// the files
// are sorted with zxid's
if (fzxid > logZxid) {
logZxid = fzxid;
}
}
List v=new ArrayList(5);
for (File f : files) {
long fzxid = Util.getZxidFromName(f.getName(), "log");
if (fzxid < logZxid) {
continue;
}
v.add(f);
}
return v.toArray(new File[0]);

}
{code}

This method essentially returns the full list of transaction log files a given 
snapshot depends on for recovery purpose. How about we remove the filter in 
retainNRecentSnapshots and use this method instead? What do you think about 
this?

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-13 Thread Ed Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283034#comment-15283034
 ] 

Ed Rowe commented on ZOOKEEPER-2420:


I've posted a patch for review. Please have a look.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
> Attachments: ZOOKEEPER-2420.patch
>
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-04 Thread Ed Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270317#comment-15270317
 ] 

Ed Rowe commented on ZOOKEEPER-2420:


Sure. I won't be able to get to it for a few days but I can put something 
together.

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2420) Autopurge deletes log file prior to oldest retained snapshot even though restore may need it

2016-05-03 Thread Arshad Mohammad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268693#comment-15268693
 ] 

Arshad Mohammad commented on ZOOKEEPER-2420:


Very important issue, it leads to data loss. Thanks [~EdRowe] for reporting it. 
are u willing to submit a patch?

> Autopurge deletes log file prior to oldest retained snapshot even though 
> restore may need it
> 
>
> Key: ZOOKEEPER-2420
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Ed Rowe
>
> Autopurge retains all log files whose zxid are >= the zxid of the oldest 
> snapshot file that it is going to retain (in PurgeTxnLog 
> retainNRecentSnapshots()). Given that loading the database from 
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid, 
> autopurge should retain the log file prior to the oldest retained snapshot as 
> well, unless it verifies that it contains no zxids beyond what the snapshot 
> contains. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)