[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420978#comment-16420978 ]
Yongjun Zhang commented on HDFS-13314:
--------------------------------------

{quote}
Hi Yongjun, thanks for looking at the Jira! Please post your comments in the Jira also for support.

# Yes we saw duplicate entries.
# The crash we saw was a NPE due to the referred INode being absent. The check looks for such dangling references. I don’t think we have seen a crash at the location you pointed out.

    private INodeReference loadINodeReference(
        INodeReferenceSection.INodeReference r) throws IOException {
      long referredId = r.getReferredId();
      INode referred = fsDir.getInode(referredId);
      *WithCount withCount = (WithCount) referred.getParentReference();  <<<<<< Crashes here as referred is null.*

# We have not seen misordered entries yet. Also, the *!misordered* check was deliberate. Once there is one such entry the whole list is compromised.
# The Assertion actually results in a runtime exception which fails the request. However we suspect that the list was somehow corrupted by other means, not the insert call. We are not sure how it happened.

Let me know if you have any concerns or ideas for improving the checks. We can certainly do a follow up jira.
{quote}

> NameNode should optionally exit if it detects FsImage corruption
> ----------------------------------------------------------------
>
>                 Key: HDFS-13314
>                 URL: https://issues.apache.org/jira/browse/HDFS-13314
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>            Priority: Major
>             Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.2
>
>         Attachments: HDFS-13314.01.patch, HDFS-13314.02.patch,
>                      HDFS-13314.03.patch, HDFS-13314.04.patch, HDFS-13314.05.patch
>
>
> The NameNode should optionally exit after writing an FsImage if it detects
> the following kinds of corruptions:
> # INodeReference pointing to non-existent INode
> # Duplicate entries in snapshot deleted diff list.
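
For readers following the thread, the two corruption checks discussed above (a reference whose referred INode is missing, and duplicate or misordered entries in a list that should be sorted) can be illustrated roughly as follows. This is only a sketch under assumptions: the class FsImageChecks, both method names, and the use of plain maps and string lists are hypothetical stand-ins, not the code in the HDFS-13314 patches or the NameNode's real data structures. It does mirror one point from the quoted reply: once a misordered entry is found, the list is treated as compromised and duplicates are no longer reported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the NameNode implementation.
final class FsImageChecks {

    /**
     * Returns the referred ids that have no matching inode in the map,
     * i.e. the dangling references that would otherwise surface later
     * as a NullPointerException when the reference is loaded.
     */
    static List<Long> findDanglingReferences(Map<Long, String> inodesById,
                                             List<Long> referredIds) {
        List<Long> dangling = new ArrayList<>();
        for (long id : referredIds) {
            if (!inodesById.containsKey(id)) {
                dangling.add(id);
            }
        }
        return dangling;
    }

    /**
     * Scans a list that is expected to be strictly sorted.
     * Returns "misordered" as soon as an out-of-order pair is seen,
     * "duplicate" if only equal neighbours were found, otherwise "ok".
     */
    static String checkSortedList(List<String> names) {
        boolean duplicate = false;
        for (int i = 1; i < names.size(); i++) {
            int cmp = names.get(i - 1).compareTo(names.get(i));
            if (cmp > 0) {
                return "misordered"; // the whole list is suspect from here on
            }
            if (cmp == 0) {
                duplicate = true;
            }
        }
        return duplicate ? "duplicate" : "ok";
    }
}
```

A caller could run both checks after an image is written and decide, based on a configuration switch of its own, whether to log the findings or to exit.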