Todd Lipcon created HDFS-3653:
---------------------------------
Summary: 1.x: Add a retention period for purged edit logs
Key: HDFS-3653
URL: https://issues.apache.org/jira/browse/HDFS-3653
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 1.1.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Occasionally a bug causes something to go wrong with the edits files. Even
more rarely, the bug causes the namenode to delete an {{edits}} file without
properly merging it into {{fsimage}}, e.g. if the bug mistakenly writes an
OP_INVALID at the top of the log.
In trunk/2.0 we retain many edit log segments going back in time to be more
robust to this kind of error. I'd like to implement something similar (but much
simpler) in 1.x, which would be used only by HDFS developers in root-causing or
recovering from these rare scenarios. The NN should never directly delete an
edit log file. Instead, it should rename the file into some kind of "trash"
directory inside the name dir and associate it with a timestamp. Then,
periodically, a separate thread should scan the trash dirs and delete any logs
older than a configurable time.
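
A minimal sketch of the rename-then-purge idea, for discussion only and not
taken from any patch. The class name EditLogTrash, the trash directory name
"edits.trash", the "<name>.<timestampMillis>" naming scheme, and all method
names are assumptions made up for illustration. It uses java.nio.file and
lambdas for brevity; an actual 1.x change would presumably have to use
java.io.File#renameTo and the existing NN storage classes instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: moves edit logs to a trash dir instead of deleting them. */
class EditLogTrash {
    // Assumed name for the trash directory inside the name dir.
    static final String TRASH_DIR_NAME = "edits.trash";

    private final Path nameDir;
    private final long retentionMillis;

    EditLogTrash(Path nameDir, long retentionMillis) {
        this.nameDir = nameDir;
        this.retentionMillis = retentionMillis;
    }

    /** Renames the file to <nameDir>/edits.trash/<name>.<nowMillis> rather than deleting it. */
    Path moveToTrash(Path editsFile, long nowMillis) throws IOException {
        Path trash = nameDir.resolve(TRASH_DIR_NAME);
        Files.createDirectories(trash);
        Path target = trash.resolve(editsFile.getFileName() + "." + nowMillis);
        return Files.move(editsFile, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Deletes trashed logs whose timestamp suffix is older than the retention period. */
    int purgeExpired(long nowMillis) throws IOException {
        Path trash = nameDir.resolve(TRASH_DIR_NAME);
        if (!Files.isDirectory(trash)) {
            return 0;
        }
        int deleted = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(trash)) {
            for (Path p : entries) {
                String name = p.getFileName().toString();
                int dot = name.lastIndexOf('.');
                if (dot < 0) {
                    continue; // no timestamp suffix: not ours, leave it alone
                }
                long trashedAt;
                try {
                    trashedAt = Long.parseLong(name.substring(dot + 1));
                } catch (NumberFormatException e) {
                    continue; // suffix is not a timestamp: leave it alone
                }
                if (nowMillis - trashedAt > retentionMillis) {
                    Files.delete(p);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    /** Starts a daemon thread that scans the trash dir at a fixed interval. */
    ScheduledExecutorService startPurger(long intervalMillis) {
        ScheduledExecutorService purger = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "edits-trash-purger");
            t.setDaemon(true);
            return t;
        });
        purger.scheduleWithFixedDelay(() -> {
            try {
                purgeExpired(System.currentTimeMillis());
            } catch (IOException e) {
                System.err.println("Failed to purge edits trash: " + e);
            }
        }, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return purger;
    }
}
```

Encoding the timestamp in the file name, rather than relying on mtime, keeps
the retention decision independent of anything that later touches the file. A
real implementation would also need one trash dir per configured name dir.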