[ 
https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831806#action_12831806
 ] 

Todd Lipcon commented on HDFS-955:
----------------------------------

This issue also occurs if there are multiple name/edit dirs and an exception is 
thrown while saving any fsimage other than the first.

To reproduce, I added:
{code}
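    // Fail every save after the first, but only while /tmp/inject exists.
    if (savedCount++ > 0 && new File("/tmp/inject").exists()) {
      throw new IOException("Injected fault");
    } else {
      // The IOException here is only logged, to record the caller's stack trace.
      LOG.warn("Not injecting", new IOException());
    }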
{code}

along with a static int savedCount = 0 in FSImage.
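
For anyone who wants to try the injection logic outside a NameNode, here is a 
minimal standalone sketch of the same hook. The class name, the 
maybeInjectFault helper, the temp marker file, and the "name1"/"name2" 
directory names are all assumptions made for illustration; only the 
savedCount counter and the marker-file check come from the snippet above.

```java
import java.io.File;
import java.io.IOException;

/** Standalone sketch of the injection hook; this is not the actual FSImage code. */
final class FaultInjectionSketch {
  // Counts save attempts, mirroring the static savedCount field added to FSImage.
  static int savedCount = 0;

  /** Throws on every call after the first, but only while the marker file exists. */
  static void maybeInjectFault(File marker) throws IOException {
    if (savedCount++ > 0 && marker.exists()) {
      throw new IOException("Injected fault");
    }
  }

  public static void main(String[] args) throws IOException {
    // A temp file stands in for /tmp/inject so the sketch cleans up after itself.
    File marker = File.createTempFile("inject", ".marker");
    marker.deleteOnExit();

    String[] dirs = {"name1", "name2"};  // hypothetical storage directories
    for (String dir : dirs) {
      try {
        maybeInjectFault(marker);
        System.out.println("saved image in " + dir);
      } catch (IOException e) {
        System.out.println("failed saving image in " + dir + ": " + e.getMessage());
      }
    }
  }
}
```

Because the counter is post-incremented, the first save always passes and 
every later save fails for as long as the marker file is present.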

I started a NN, created some directories, and shut it down. I then created 
/tmp/inject and started the NN again. It failed while saving the second image, 
as planned. This left the edit dir in such a state that starting the NN up 
again recovered from the first name dir, where the edits had been blown away.

This should be unit testable with mockito. I'll try to form such a test.
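
As a rough idea of the shape such a test could take, here is a sketch that 
uses a hand-written stub in place of Mockito, so it runs with the JDK alone. 
The ImageSaver interface, failOnCall, and saveAll are hypothetical names 
invented for this sketch and do not exist in HDFS; a real test would stub the 
actual FSImage save path instead.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java stand-in for a mock-based test; every name here is hypothetical. */
final class SaveFailureTestSketch {
  /** Hypothetical seam: one call per storage directory. */
  interface ImageSaver {
    void save(String dir) throws IOException;
  }

  /** Stub that fails on the n-th call (1-based) and records successful saves. */
  static ImageSaver failOnCall(int n, List<String> saved) {
    int[] calls = {0};
    return dir -> {
      if (++calls[0] == n) {
        throw new IOException("Injected fault in " + dir);
      }
      saved.add(dir);
    };
  }

  /** Saves to each directory in order; the first failure propagates to the caller. */
  static void saveAll(List<String> dirs, ImageSaver saver) throws IOException {
    for (String dir : dirs) {
      saver.save(dir);
    }
  }

  public static void main(String[] args) {
    List<String> saved = new ArrayList<>();
    try {
      saveAll(Arrays.asList("name1", "name2"), failOnCall(2, saved));
    } catch (IOException e) {
      System.out.println("save aborted: " + e.getMessage());
    }
    System.out.println("directories saved before the failure: " + saved);
  }
}
```

A real test would then restart from the resulting storage state and check 
that the edits are still recoverable, which this sketch does not attempt.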

> FSImage.saveFSImage can lose edits
> ----------------------------------
>
>                 Key: HDFS-955
>                 URL: https://issues.apache.org/jira/browse/HDFS-955
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>
> This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage 
> function (implementing dfsadmin -saveNamespace) can corrupt the NN storage 
> such that all current edits are lost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
