[ https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831201#action_12831201 ]

Todd Lipcon commented on HDFS-955:
----------------------------------

Reproducing the comment from HDFS-909:

FSImage.saveFSImage has this code:
{noformat}
1240         if (dirType.isOfType(NameNodeDirType.IMAGE))
1241           saveFSImage(getImageFile(sd, NameNodeFile.IMAGE_NEW));
1242         if (dirType.isOfType(NameNodeDirType.EDITS)) {
1243           editLog.createEditLogFile(getImageFile(sd, NameNodeFile.EDITS));
1244           File editsNew = getImageFile(sd, NameNodeFile.EDITS_NEW);
1245           if (editsNew.exists())
1246             editLog.createEditLogFile(editsNew);
1247         }
{noformat}

On line 1243 we truncate EDITS. Then if EDITS_NEW exists, we truncate it on 1246. All of this happens when the NN is in safe mode, so there shouldn't be any new edits coming in in the first place.

I'm contending that line 1243 and 1245 should both be deleted. We should always create the image as IMAGE_NEW (line 1241). Touching EDITS seems incorrect - what if the order of storage dirs is EDITS then IMAGE, so we run line 1243, kill our current edit log, and then crash before saving the current image?

> FSImage.saveFSImage can lose edits
> ----------------------------------
>
>                 Key: HDFS-955
>                 URL: https://issues.apache.org/jira/browse/HDFS-955
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>
> This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage
> function (implementing dfsadmin -saveNamespace) can corrupt the NN storage
> such that all current edits are lost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
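
The ordering hazard described in the comment above (storage dirs visited as EDITS then IMAGE, with a crash in between) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: SaveOrderSketch, StorageDir, saveTruncating, saveImageOnly and recoverable are hypothetical names invented for illustration, not part of FSImage or any Hadoop API, and the model ignores real file I/O entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of the save loop; not Hadoop code. */
public class SaveOrderSketch {
    enum DirType { IMAGE, EDITS }

    /** Stand-in for one storage directory. */
    static final class StorageDir {
        final DirType type;
        boolean hasNewImage = false; // models IMAGE_NEW having been written
        int editCount;               // models the number of edits held in EDITS

        StorageDir(DirType type, int editCount) {
            this.type = type;
            this.editCount = editCount;
        }
    }

    /** Builds the problematic ordering: an EDITS dir first, then an IMAGE dir. */
    static List<StorageDir> editsThenImage(int edits) {
        List<StorageDir> dirs = new ArrayList<>();
        dirs.add(new StorageDir(DirType.EDITS, edits));
        dirs.add(new StorageDir(DirType.IMAGE, 0));
        return dirs;
    }

    /** Mimics the quoted loop: truncates EDITS while iterating.
     *  Returning early after crashAfter dirs simulates a crash. */
    static void saveTruncating(List<StorageDir> dirs, int crashAfter) {
        int processed = 0;
        for (StorageDir sd : dirs) {
            if (processed == crashAfter) return;              // simulated crash
            if (sd.type == DirType.IMAGE) sd.hasNewImage = true;
            if (sd.type == DirType.EDITS) sd.editCount = 0;   // the truncation
            processed++;
        }
    }

    /** Variant that only writes the new image and leaves EDITS untouched. */
    static void saveImageOnly(List<StorageDir> dirs, int crashAfter) {
        int processed = 0;
        for (StorageDir sd : dirs) {
            if (processed == crashAfter) return;              // simulated crash
            if (sd.type == DirType.IMAGE) sd.hasNewImage = true;
            processed++;
        }
    }

    /** In this model the namespace is recoverable if a new image exists,
     *  or if the edits that complement the old image are still present. */
    static boolean recoverable(List<StorageDir> dirs) {
        boolean newImage = false;
        boolean editsIntact = false;
        for (StorageDir sd : dirs) {
            if (sd.type == DirType.IMAGE && sd.hasNewImage) newImage = true;
            if (sd.type == DirType.EDITS && sd.editCount > 0) editsIntact = true;
        }
        return newImage || editsIntact;
    }

    public static void main(String[] args) {
        List<StorageDir> a = editsThenImage(5);
        saveTruncating(a, 1); // crash after the EDITS dir, before the IMAGE dir
        System.out.println("truncating save recoverable: " + recoverable(a));

        List<StorageDir> b = editsThenImage(5);
        saveImageOnly(b, 1);  // same crash point, EDITS left alone
        System.out.println("image-only save recoverable: " + recoverable(b));
    }
}
```

In this model, a crash after the first directory leaves the truncating variant with neither a new image nor any edits, while the image-only variant still has the old image plus its edit log. How the real code should finalize IMAGE_NEW and roll the edit log afterwards is outside what this sketch covers.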