[ https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852660#action_12852660 ]
Konstantin Shvachko commented on HDFS-955: ------------------------------------------ > loadFSImage() - why do we need check for directory type, while comparing > latest checkpoint times? IMAGE latest checkpoint can be larger than EDITS latest checkpoint only if the two directories are image-only and edits-only respectively. If this is one directory which is IMAGE_AND_EDITS then the condition should be treated as an error. This could be an assert in current code, but being overprotective seems to be appropriate here. > saveCurrent(), moveCurrent(), moveLastCheckpoint() could all be static. > Better still, they should all be methods in StorageDirectory. saveCurrent() cannot be static, it uses editLog member. All three methods are very much name-node specific, while StorageDirectory is intended to have methods common for all storage types data-node or name-node. So I decided to keep them in FSImage. > TestSaveNamespace - we should add other tests I believe these cases are covered in updated TestDFSStorageStateRecovery. The test emulates failures on different stages by creating corresponding storage directory configurations and verifies that the nodes recover or fail correctly. The rest is corrected / updated. Suresh, thank you for the meticulous review. > FSImage.saveFSImage can lose edits > ---------------------------------- > > Key: HDFS-955 > URL: https://issues.apache.org/jira/browse/HDFS-955 > Project: Hadoop HDFS > Issue Type: Bug > Affects Versions: 0.20.1, 0.21.0, 0.22.0 > Reporter: Todd Lipcon > Assignee: Konstantin Shvachko > Priority: Blocker > Attachments: FSStateTransition7.htm, hdfs-955-moretests.txt, > hdfs-955-unittest.txt, PurgeEditsBeforeImageSave.patch, > saveNamespace-0.20.patch, saveNamespace-0.21.patch, saveNamespace.patch, > saveNamespace.patch, saveNamespace.txt > > > This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage > function (implementing dfsadmin -saveNamespace) can corrupt the NN storage > such that all current edits are lost. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.