[ https://issues.apache.org/jira/browse/HDFS-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405475#comment-16405475 ]
Tsz Wo Nicholas Sze commented on HDFS-13314:
--------------------------------------------
Thanks [~arpitagarwal], some comments on the patch:
- Also print the fsimage file name in the log messages below.
{code:java}
+ LOG.error("Detected " + numErrors + " errors while saving FsImage.");
{code}
{code:java}
+ LOG.fatal("NameNode process will exit now... The saved FsImage is " +
+ "potentially corrupted.");
{code}
- Add numErrors in the log message below.
{code:java}
+ long numErrors = saveInternal(fout, compression, file.getAbsolutePath());
LOG.info("Image file {} of size {} bytes saved in {} seconds.", file,
file.length(), (monotonicNow() - startTime) / 1000);
+ return numErrors;
{code}
- Print the full path in the log message below.
{code:java}
+ FSImage.LOG.error("FSImageFormatPBSnapshot: Missing referred INodeId " +
+     ref.getId() + " for INodeReference index " + refIndex);
{code}
- Let's check all INodes, not only INodeReference. Also, let's use
compareTo so that out-of-order cases are detected as well.
{code:java}
INode previous = null;
for (INode d : deleted) {
  if (previous != null) {
    // Compare each entry with its predecessor; the list should be strictly increasing.
    final int cmp = d.compareTo(previous.getLocalNameBytes());
    if (cmp <= 0) {
      final String err = cmp == 0 ? "repeated" : "out-of-order";
      FSImage.LOG.error("Names " + err
          + " in the 'deleted' difflist of directory " ...);
      ++numImageErrors;
    }
  }
  previous = d;
}
{code}
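To try the idea outside of HDFS, here is a minimal standalone sketch of the same adjacent-pair check. It is only an illustration under stated assumptions: the class name DeletedListCheck and the methods countOrderErrors and compareBytes are hypothetical, and the unsigned lexicographic byte comparison is a stand-in for INode.compareTo, not the actual HDFS code.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of HDFS: counts entries in a name list that
// are repeated or out of order relative to the entry just before them.
final class DeletedListCheck {

  // Assumed stand-in for INode.compareTo: unsigned lexicographic comparison.
  static int compareBytes(byte[] a, byte[] b) {
    final int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      final int x = a[i] & 0xff;
      final int y = b[i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return a.length - b.length;
  }

  // Returns the number of entries that are not strictly greater than
  // their predecessor.
  static int countOrderErrors(List<byte[]> names) {
    int numErrors = 0;
    byte[] previous = null;
    for (byte[] name : names) {
      if (previous != null) {
        final int cmp = compareBytes(name, previous);
        if (cmp <= 0) {
          final String err = cmp == 0 ? "repeated" : "out-of-order";
          System.err.println("Name " + err + ": "
              + new String(name, StandardCharsets.UTF_8));
          ++numErrors;
        }
      }
      previous = name;
    }
    return numErrors;
  }
}
```

Since each entry is compared only with its predecessor, a single pass is enough to catch both duplicates (cmp == 0) and ordering violations (cmp < 0).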
> NameNode should optionally exit if it detects FsImage corruption
> ----------------------------------------------------------------
>
> Key: HDFS-13314
> URL: https://issues.apache.org/jira/browse/HDFS-13314
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Priority: Major
> Attachments: HDFS-13314.01.patch, HDFS-13314.02.patch
>
>
> The NameNode should optionally exit after writing an FsImage if it detects
> the following kinds of corruptions:
> # INodeReference pointing to non-existent INode
> # Duplicate entries in snapshot deleted diff list.
> This behavior is controlled via an undocumented configuration setting, and
> disabled by default.