[ 
https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16407124#comment-16407124
 ] 

Daryn Sharp commented on HDFS-13314:
------------------------------------

I think Rushabh thought the "don't exit" option didn't delete prior images and 
edits after checkpointing, based on the claim +"The purge step is skipped if a 
bad image was written"+.  However, the code appears to skip the purge only if 
it's configured to shut down on corruption.

{quote}
bq. I'm inclined to agree with Arpit that we should not change the default 
behavior, though. In the extreme case where someone really wants the checkpoint 
done (e.g. has not checkpointed for a long time so lots of edits, etc.), 
keeping the old behavior seems better - you cannot let them reconfigure and do 
it again.
Yes, this is a good explanation. Not changing the default, and ensuring we 
write a new image is the safe choice.
{quote}

No, that is a terrible explanation.  How is the "safe" choice to knowingly 
write a corrupt image?  One that renders the NN incapable of starting up?  
You can't "reconfigure" your way out of that.  How is it safe to allow the NN 
to start obliterating data?  (see HDFS-9406, where 9300 blocks were invalidated)

bq. I think it may also be possible if the workflow deletes a bunch of stuff 
(e.g. the problematic snapshot, parent dir, etc.), and checkpoint, the 
corruption may not happen at all - although this is an untested guess.

Wishful thinking + data durability = Russian data roulette.  I'd predict data 
loss due to incorrect invalidations, further corruption of the in-memory state, 
probably corrupted edits, and an eventual crash.

We need to immediately do a full stop anytime data structures are known to be 
corrupt.


> NameNode should optionally exit if it detects FsImage corruption
> ----------------------------------------------------------------
>
>                 Key: HDFS-13314
>                 URL: https://issues.apache.org/jira/browse/HDFS-13314
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>            Priority: Major
>         Attachments: HDFS-13314.01.patch, HDFS-13314.02.patch, 
> HDFS-13314.03.patch
>
>
> The NameNode should optionally exit after writing an FsImage if it detects 
> the following kinds of corruptions:
> # INodeReference pointing to non-existent INode
> # Duplicate entries in snapshot deleted diff list.
> This behavior is controlled via an undocumented configuration setting, and 
> disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
