[ 
https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409895#comment-16409895
 ] 

Arpit Agarwal commented on HDFS-13314:
--------------------------------------

bq. Yes, no config option. Detected corruption = unconditional hard stop.
Ok, will do.

bq. The in-memory state is corrupt but the edit stream (hopefully) isn't. Which 
is easier to do: Hack up the NN to attempt to load the bad image? Or replay a 
partial edit stream perhaps w/o the snapshot removal? 
I'd suggest that the former is safer, as it exposes the problem sooner, e.g. if 
the administrator has configured auto-restart, which many of our customers do. 
If we don't write an image, the NN shuts down, but it can easily be restarted 
and will continue to run with corrupted state (note that we don't yet know how 
to detect the corruption when replaying edit logs).

> NameNode should optionally exit if it detects FsImage corruption
> ----------------------------------------------------------------
>
>                 Key: HDFS-13314
>                 URL: https://issues.apache.org/jira/browse/HDFS-13314
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>            Priority: Major
>         Attachments: HDFS-13314.01.patch, HDFS-13314.02.patch, 
> HDFS-13314.03.patch, HDFS-13314.04.patch
>
>
> The NameNode should optionally exit after writing an FsImage if it detects 
> the following kinds of corruptions:
> # INodeReference pointing to non-existent INode
> # Duplicate entries in snapshot deleted diff list.
> This behavior is controlled via an undocumented configuration setting, and 
> disabled by default.
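
For illustration only, here is a minimal Java sketch of the two checks listed in the description and of the "write the image, then optionally exit" decision. It is not taken from any of the attached patches. The class name, the method names, and the representation of inodes as plain long ids are all assumptions made for the example; the real NameNode data structures are different, and the actual configuration key is not named in this issue.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the HDFS-13314 patches. */
final class FsImageCorruptionSketch {

    /** Counts references whose target inode id is absent from the known inode ids. */
    static int countDanglingReferences(Set<Long> inodeIds, List<Long> referenceTargets) {
        int dangling = 0;
        for (long target : referenceTargets) {
            if (!inodeIds.contains(target)) {
                dangling++;
            }
        }
        return dangling;
    }

    /** Counts entries that repeat an inode id already seen in one deleted diff list. */
    static int countDuplicateDeletedEntries(List<Long> deletedDiffList) {
        Set<Long> seen = new HashSet<>();
        int duplicates = 0;
        for (long id : deletedDiffList) {
            // Set.add returns false when the id was already present.
            if (!seen.add(id)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    /**
     * The caller is assumed to have written the image already; this only decides
     * whether to exit afterwards. exitOnCorruption stands in for the (unnamed)
     * configuration setting, which is off by default per the description.
     */
    static boolean shouldExitAfterSave(boolean exitOnCorruption, int dangling, int duplicates) {
        return exitOnCorruption && (dangling + duplicates) > 0;
    }
}
```

In this sketch the counts are gathered while the image is being saved, and the exit decision is made only after the save completes, which matches the ordering described above: the image is written first, then the process may stop.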



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

