[ https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831259#action_12831259 ]
Todd Lipcon commented on HDFS-955:
----------------------------------
Whoops, I somehow managed to mangle my edit buffer there. Repasting, sorry for
the spam.
{noformat}
loadFSImage:
  - find a pair of EDITS and IMAGE that have the same checkpoint time and are
    from the latest checkpointTime
    (this ignores *_NEW)
  - recoverInterruptedCheckpoint:
      if there is an IMAGE_NEW:
        if there is EDITS_NEW:
          delete IMAGE_NEW (since we assume we can replay from IMAGE + EDITS +
          EDITS_NEW?)
        else:
          replace IMAGE with IMAGE_NEW, delete IMAGE_NEW
  - load IMAGE
  - load EDITS
  - load EDITS_NEW
  - if need to save:
      saveFSImage:
        save IMAGE_NEW
        truncate EDITS
        if EDITS_NEW exists:
          truncate EDITS_NEW
        rollFSImage:
          purgeEditLog:
            replace EDITS with EDITS_NEW
          renameCheckpoint:
            replace IMAGE with IMAGE_NEW
{noformat}
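To make the sequence above easier to poke at, here is a rough toy model of it in Python. It is only a sketch, not the HDFS code: a storage directory is assumed to be a plain dict keyed by the file names used above, an image or edit log is assumed to be a list of entries, and every function name and the crash_before_roll flag are hypothetical. It assumes the process stops after the truncations but before rollFSImage, which is one way (possibly not the only one) to end up with the lost edits this issue describes.

```python
# Toy in-memory model of one storage directory. File names follow the
# pseudocode above; the dict layout and function names are hypothetical.

def recover_interrupted_checkpoint(d):
    """Resolve a leftover IMAGE_NEW as the pseudocode describes."""
    if "IMAGE_NEW" in d:
        if "EDITS_NEW" in d:
            # Assumes IMAGE + EDITS + EDITS_NEW can still be replayed.
            del d["IMAGE_NEW"]
        else:
            # Replace IMAGE with IMAGE_NEW (which removes IMAGE_NEW).
            d["IMAGE"] = d.pop("IMAGE_NEW")

def load_fs_image(d):
    """Rebuild the namespace from IMAGE, then EDITS, then EDITS_NEW."""
    recover_interrupted_checkpoint(d)
    namespace = list(d["IMAGE"])
    namespace += d.get("EDITS", [])
    namespace += d.get("EDITS_NEW", [])
    return namespace

def roll_fs_image(d):
    # purgeEditLog: replace EDITS with EDITS_NEW
    if "EDITS_NEW" in d:
        d["EDITS"] = d.pop("EDITS_NEW")
    # renameCheckpoint: replace IMAGE with IMAGE_NEW
    d["IMAGE"] = d.pop("IMAGE_NEW")

def save_fs_image(d, namespace, crash_before_roll=False):
    d["IMAGE_NEW"] = list(namespace)  # save IMAGE_NEW
    d["EDITS"] = []                   # truncate EDITS
    if "EDITS_NEW" in d:
        d["EDITS_NEW"] = []           # truncate EDITS_NEW
    if crash_before_roll:
        return                        # simulated crash before rollFSImage
    roll_fs_image(d)

def fresh_dir():
    # One image entry plus one entry in each edit log.
    return {"IMAGE": ["a"], "EDITS": ["b"], "EDITS_NEW": ["c"]}

# Save that runs to completion, followed by a restart.
clean = fresh_dir()
save_fs_image(clean, load_fs_image(clean))
clean_result = load_fs_image(clean)

# Save interrupted before rollFSImage, followed by a restart.
crashed = fresh_dir()
save_fs_image(crashed, load_fs_image(crashed), crash_before_roll=True)
crash_result = load_fs_image(crashed)

print(clean_result)  # ['a', 'b', 'c']
print(crash_result)  # ['a']
```

In this model the interrupted save leaves a complete IMAGE_NEW next to two truncated edit logs. On restart, recovery sees EDITS_NEW, discards IMAGE_NEW, and replays the old IMAGE against empty logs, so entries b and c are gone.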
> FSImage.saveFSImage can lose edits
> ----------------------------------
>
> Key: HDFS-955
> URL: https://issues.apache.org/jira/browse/HDFS-955
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.21.0, 0.22.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
>
> This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage
> function (implementing dfsadmin -saveNamespace) can corrupt the NN storage
> such that all current edits are lost.