[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405475#comment-16405475 ]
Tsz Wo Nicholas Sze commented on HDFS-13314:
--------------------------------------------

Thanks [~arpitagarwal], some comments on the patch:
- Also print the fsimage file name in the log messages below.
{code:java}
+      LOG.error("Detected " + numErrors + " errors while saving FsImage.");
{code}
{code:java}
+      LOG.fatal("NameNode process will exit now... The saved FsImage is " +
+          "potentially corrupted.");
{code}
- Add numErrors to the log message below.
{code:java}
+    long numErrors = saveInternal(fout, compression, file.getAbsolutePath());
     LOG.info("Image file {} of size {} bytes saved in {} seconds.", file,
         file.length(), (monotonicNow() - startTime) / 1000);
+    return numErrors;
{code}
- Print the full path in the log message below.
{code:java}
+          FSImage.LOG.error("FSImageFormatPBSnapshot: Missing referred INodeId " +
+              ref.getId() + " for INodeReference index " + refIndex);
{code}
- Let's check all INodes, not only INodeReference. Also, let's use compareTo so that out-of-order cases are detected as well as repeated names.
{code:java}
INode previous = null;
for (INode d : deleted) {
  if (previous != null) {
    // Compare each entry with its predecessor: 0 means a repeated name,
    // a negative value means the list is out of order.
    final int cmp = d.compareTo(previous.getLocalNameBytes());
    if (cmp <= 0) {
      final String err = cmp == 0? "repeated": "out-of-order";
      FSImage.LOG.error("Names " + err + " in the 'deleted' difflist of directory " ...);
      ++numImageErrors;
    }
  }
  previous = d;
}
{code}

> NameNode should optionally exit if it detects FsImage corruption
> ----------------------------------------------------------------
>
>                 Key: HDFS-13314
>                 URL: https://issues.apache.org/jira/browse/HDFS-13314
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>            Priority: Major
>         Attachments: HDFS-13314.01.patch, HDFS-13314.02.patch
>
>
> The NameNode should optionally exit after writing an FsImage if it detects
> the following kinds of corruptions:
> # INodeReference pointing to non-existent INode
> # Duplicate entries in snapshot deleted diff list.
> This behavior is controlled via an undocumented configuration setting, and
> disabled by default.
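
The ordering check suggested in the review comment can be tried outside the NameNode with a minimal stand-alone sketch. Everything in it is hypothetical and not part of HDFS: the class DeletedListOrderCheck, the methods countOrderErrors, compareBytes and name, and the use of plain byte arrays in place of INode objects. It also assumes names are ordered by unsigned lexicographic byte comparison, which may differ from what INode.compareTo actually does.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical stand-alone helper for illustration; not part of the HDFS code base. */
final class DeletedListOrderCheck {
  private DeletedListOrderCheck() {}

  /** Lexicographic comparison treating bytes as unsigned (an assumption of this sketch). */
  static int compareBytes(byte[] a, byte[] b) {
    final int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      final int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  /**
   * Counts entries that are not strictly greater than their predecessor,
   * i.e. names that are repeated or out of order.
   */
  static int countOrderErrors(List<byte[]> names) {
    int errors = 0;
    byte[] previous = null;
    for (byte[] current : names) {
      if (previous != null) {
        final int cmp = compareBytes(current, previous);
        if (cmp <= 0) {
          final String kind = cmp == 0 ? "repeated" : "out-of-order";
          System.err.println("Name " + kind + ": "
              + new String(current, StandardCharsets.UTF_8));
          errors++;
        }
      }
      previous = current;
    }
    return errors;
  }

  /** Convenience conversion from a String to UTF-8 bytes. */
  static byte[] name(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }
}
```

Because each entry is compared only with its immediate predecessor, one pass is enough to flag both duplicates and ordering violations in a list that is supposed to be strictly sorted. A duplicates-only check would miss the ordering violations.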