[jira] [Commented] (HDFS-13314) NameNode should optionally exit if it detects FsImage corruption

Xiao Chen (JIRA) Mon, 19 Mar 2018 23:57:39 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405890#comment-16405890
 ]


Xiao Chen commented on HDFS-13314:
----------------------------------

Thanks [~arpitagarwal] and all for the effort here. Also pinging [~yzhangal], 
who may be interested.

I share the difficulty, and sometimes frustration, of not being able to 
reproduce a corruption. The idea here sounds good.

I'm inclined to agree with Arpit that we should not change the default 
behavior, though. In the extreme case where someone really wants the checkpoint 
done (e.g. they have not checkpointed for a long time and have accumulated lots 
of edits), keeping the old behavior seems better: we should not force them to 
reconfigure and checkpoint again. It may also be that if the workflow deletes a 
bunch of stuff (e.g. the problematic snapshot, parent dir, etc.) and then 
checkpoints, the corruption does not happen at all, although this is an 
untested guess.

> NameNode should optionally exit if it detects FsImage corruption
> ----------------------------------------------------------------
>
>                 Key: HDFS-13314
>                 URL: https://issues.apache.org/jira/browse/HDFS-13314
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>            Priority: Major
>         Attachments: HDFS-13314.01.patch, HDFS-13314.02.patch
>
>
> The NameNode should optionally exit after writing an FsImage if it detects 
> the following kinds of corruptions:
> # INodeReference pointing to non-existent INode
> # Duplicate entries in snapshot deleted diff list.
> This behavior is controlled via an undocumented configuration setting, and 
> disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

