[ https://issues.apache.org/jira/browse/ZOOKEEPER-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284878#comment-15284878 ]
Michael Han commented on ZOOKEEPER-2420:
----------------------------------------

Hi [~EdRowe], thanks for the patch!

retainNRecentSnapshots will miss an edge case because the current selection logic relies on a filter like this:

{code:title=retainNRecentSnapshots.java|borderStyle=solid}
File snapShot = snaps.get(snaps.size() -1);
final long leastZxidToBeRetain = Util.getZxidFromName(
        snapShot.getName(), PREFIX_SNAPSHOT);
class MyFileFilter implements FileFilter{
    private final String prefix;
    MyFileFilter(String prefix){
        this.prefix=prefix;
    }
    public boolean accept(File f){
        if(!f.getName().startsWith(prefix + "."))
            return false;
        long fZxid = Util.getZxidFromName(f.getName(), prefix);
        if (fZxid >= leastZxidToBeRetain) {
            return false;
        }
        return true;
    }
}
{code}

This logic misses the case where a zxid we need to retain is contained in a transaction log file whose name-derived zxid (fZxid here) is smaller than leastZxidToBeRetain. This log file is the 'prior' log file you mentioned.

Looking around the code some more, I think we already have a method that we can use to safely pick the list of transaction log files to retain:

{code:title=FileTxnLog.java|borderStyle=solid}
/**
 * Find the log file that starts at, or just before, the snapshot. Return
 * this and all subsequent logs. Results are ordered by zxid of file,
 * ascending order.
 * @param logDirList array of files
 * @param snapshotZxid return files at, or before this zxid
 * @return
 */
public static File[] getLogFiles(File[] logDirList,long snapshotZxid) {
    List<File> files = Util.sortDataDir(logDirList, "log", true);
    long logZxid = 0;
    // Find the log file that starts before or at the same time as the
    // zxid of the snapshot
    for (File f : files) {
        long fzxid = Util.getZxidFromName(f.getName(), "log");
        if (fzxid > snapshotZxid) {
            continue;
        }
        // the files
        // are sorted with zxid's
        if (fzxid > logZxid) {
            logZxid = fzxid;
        }
    }
    List<File> v=new ArrayList<File>(5);
    for (File f : files) {
        long fzxid = Util.getZxidFromName(f.getName(), "log");
        if (fzxid < logZxid) {
            continue;
        }
        v.add(f);
    }
    return v.toArray(new File[0]);
}
{code}

This method essentially returns the full list of transaction log files that a given snapshot depends on for recovery. How about we remove the filter in retainNRecentSnapshots and use this method instead? What do you think?

> Autopurge deletes log file prior to oldest retained snapshot even though
> restore may need it
> --------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2420
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2420
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>            Reporter: Ed Rowe
>         Attachments: ZOOKEEPER-2420.patch
>
>
> Autopurge retains all log files whose zxids are >= the zxid of the oldest
> snapshot file that it is going to retain (in PurgeTxnLog
> retainNRecentSnapshots()). Given that loading the database from
> snapshots/logs will start with the log file _prior_ to the snapshot's zxid,
> autopurge should retain the log file prior to the oldest retained snapshot as
> well, unless it verifies that it contains no zxids beyond what the snapshot
> contains.
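
To make the difference between the two selection rules concrete, here is a minimal, self-contained sketch. It does not use any ZooKeeper classes: each log file is modelled only by the zxid in its name, and the class and method names (RetainedLogsSketch, retainedByFilter, retainedByGetLogFiles) are hypothetical, chosen for illustration. It assumes the log zxids are passed in ascending order, and it is not the patch itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of ZooKeeper.
class RetainedLogsSketch {

    // Filter-style rule: retain only logs whose starting zxid is
    // at or after the zxid of the oldest retained snapshot.
    static List<Long> retainedByFilter(List<Long> logStartZxids, long snapshotZxid) {
        List<Long> kept = new ArrayList<>();
        for (long start : logStartZxids) {
            if (start >= snapshotZxid) {
                kept.add(start);
            }
        }
        return kept;
    }

    // getLogFiles-style rule: find the latest log that starts at or before
    // the snapshot zxid, then retain that log and every later one.
    static List<Long> retainedByGetLogFiles(List<Long> logStartZxids, long snapshotZxid) {
        long logZxid = 0;
        for (long start : logStartZxids) {
            if (start <= snapshotZxid && start > logZxid) {
                logZxid = start;
            }
        }
        List<Long> kept = new ArrayList<>();
        for (long start : logStartZxids) {
            if (start >= logZxid) {
                kept.add(start);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> logs = Arrays.asList(100L, 200L, 300L);
        long oldestRetainedSnapshot = 250L;
        System.out.println(retainedByFilter(logs, oldestRetainedSnapshot));      // prints [300]
        System.out.println(retainedByGetLogFiles(logs, oldestRetainedSnapshot)); // prints [200, 300]
    }
}
```

With logs starting at zxids 100, 200 and 300 and the oldest retained snapshot at zxid 250, the filter-style rule keeps only the log starting at 300, so the log starting at 200 would be purged even though it may hold transactions after 250. The getLogFiles-style rule keeps that log as well.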