[ https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835992#action_12835992 ]
Todd Lipcon commented on HDFS-955:
----------------------------------

bq. I don't understand why you mixed in this patch the code from HDFS-957. It does not help the test cases, right?

I was just working on the two in the same tree - with a bit of modification to recoverInterruptedCheckpoint I think the second test can be fixed using the functionality from HDFS-957. I can demonstrate this with a patch if you would like.

bq. The criteria that IMAGE_NEW was written completely and successfully is the existence of EDITS_NEW

I think you misspoke here - EDITS_NEW exists _before_ IMAGE_NEW is saved. In my opinion the cleanest way of knowing that IMAGE_NEW is complete is the HDFS-957 patch. You may be able to know that info from the state of some other files, but why not be explicit about it to avoid some classes of errors?

bq. I don't know what you were trying to achieve with this. I don't see the expected exception thrown.

Ah, sloppy copy paste there on my part. I don't expect an exception to actually be caught there. The failed restart with corrupted edits is indeed the failure I expected to provoke with that test.

> FSImage.saveFSImage can lose edits
> ----------------------------------
>
>                 Key: HDFS-955
>                 URL: https://issues.apache.org/jira/browse/HDFS-955
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-955-moretests.txt, hdfs-955-unittest.txt, PurgeEditsBeforeImageSave.patch
>
>
> This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage function (implementing dfsadmin -saveNamespace) can corrupt the NN storage such that all current edits are lost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
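
To illustrate the point in the comment about being explicit that IMAGE_NEW is complete, rather than inferring it from the state of other files, here is a minimal sketch in Java. It is not the HDFS-957 patch and does not use any Hadoop API: the class name, the `TRAILER` bytes, the file names, and the `crashBeforeTrailer` flag are all hypothetical, chosen only to show the general idea of writing a completion marker as the very last step.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class CompletionMarkerSketch {
    // Hypothetical trailer bytes; not part of any real HDFS image format.
    static final byte[] TRAILER = "IMG_DONE".getBytes(StandardCharsets.US_ASCII);

    /** Writes the payload, syncs it, then appends the trailer as the last step. */
    public static void writeImage(Path file, byte[] payload, boolean crashBeforeTrailer)
            throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(file.toFile(), "rw")) {
            out.setLength(0);          // start from an empty file
            out.write(payload);
            out.getFD().sync();        // payload is on disk before the marker is written
            if (crashBeforeTrailer) {
                return;                // simulate a crash in the middle of the save
            }
            out.write(TRAILER);
            out.getFD().sync();
        }
    }

    /** Returns true only if the file exists and ends with the trailer. */
    public static boolean isComplete(Path file) throws IOException {
        if (!Files.exists(file)) {
            return false;
        }
        try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
            long len = in.length();
            if (len < TRAILER.length) {
                return false;
            }
            byte[] tail = new byte[TRAILER.length];
            in.seek(len - TRAILER.length);
            in.readFully(tail);
            return Arrays.equals(tail, TRAILER);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("image-sketch");
        byte[] payload = "namespace-bytes".getBytes(StandardCharsets.US_ASCII);

        Path partial = dir.resolve("image.new.partial");
        Path full = dir.resolve("image.new.full");
        writeImage(partial, payload, true);   // interrupted save
        writeImage(full, payload, false);     // successful save

        System.out.println("partial complete? " + isComplete(partial));
        System.out.println("full complete? " + isComplete(full));
    }
}
```

With a marker like this, recovery code can decide whether a new image is usable by inspecting the image file alone. A plain trailer is only a sketch: a payload that happens to end in the same bytes would be misread as complete, so a real format would more likely pair the marker with a length or checksum.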