[
https://issues.apache.org/jira/browse/HDFS-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065499#comment-15065499
]
Chris Nauroth commented on HDFS-9569:
-------------------------------------
Patch v002 causes a subtle change in the control flow that will cause a lot of
extraneous logging. For example, assume there are 3 fsimage directories, and
they're all working correctly. With the current code, the first attempt to
load finds reserved paths in the first directory and immediately exits by
throwing an unchecked {{IllegalArgumentException}}. After applying the patch,
this is converted to a checked {{IOException}}, which is not sufficient to
terminate the loop. The loop will therefore try all 3 fsimage directories, and
each iteration will log an error with a full stack trace. The same failure is
then logged a 4th time after the loop.
This extra logging is not useful for end users. It's a better user experience
to exit as soon as reserved names are encountered, because we already know at
that point that user action is required.
Maybe at this point we need a more specific exception type, like
{{IllegalReservedPathException}}. That one could be allowed to propagate out
of the loop and terminate early.
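A minimal, self-contained sketch of the two behaviors follows. It is not the
actual FSImage code: the class name, {{loadImage}}, {{loadFirstUsableImage}},
the {{failFast}} flag, and the in-memory list standing in for the logger are
all assumptions made for illustration, and {{IllegalReservedPathException}} is
only the name proposed above, not an existing class.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReservedPathLoopSketch {

    /** Hypothetical exception type, using the name proposed in this comment. */
    static class IllegalReservedPathException extends IOException {
        IllegalReservedPathException(String message) {
            super(message);
        }
    }

    /** Stand-in for the real logger so the number of error entries can be counted. */
    static final List<String> errorLog = new ArrayList<>();

    /** Stand-in for loading one image; every directory contains a reserved path. */
    static void loadImage(String imageDir) throws IOException {
        throw new IllegalReservedPathException("reserved path found in " + imageDir);
    }

    /**
     * Tries each image directory in turn. With failFast, the specific exception
     * propagates out of the loop on the first directory. Without it, the exception
     * is handled like any other IOException: logged, then the next directory is tried.
     */
    static void loadFirstUsableImage(List<String> imageDirs, boolean failFast)
            throws IOException {
        IOException last = null;
        for (String dir : imageDirs) {
            try {
                loadImage(dir);
                return; // loaded successfully, stop trying
            } catch (IOException e) {
                if (failFast && e instanceof IllegalReservedPathException) {
                    throw e; // user action is required, retrying cannot help
                }
                errorLog.add("Failed to load image from " + dir + ": " + e.getMessage());
                last = e;
            }
        }
        throw new IOException("Failed to load an image from any directory", last);
    }

    /** Simulates the caller, which logs the final failure once more after the loop. */
    static int countLoggedErrors(boolean failFast) {
        errorLog.clear();
        try {
            loadFirstUsableImage(Arrays.asList("dir1", "dir2", "dir3"), failFast);
        } catch (IOException e) {
            errorLog.add("Startup failed: " + e.getMessage());
        }
        return errorLog.size();
    }

    public static void main(String[] args) {
        System.out.println(countLoggedErrors(false)); // prints 4
        System.out.println(countLoggedErrors(true));  // prints 1
    }
}
```

In this sketch the generic handling produces 3 entries inside the loop plus 1
after it, while letting the specific exception propagate leaves only the single
entry logged by the caller.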
> Log the name of the fsimage being loaded for better supportability
> ------------------------------------------------------------------
>
> Key: HDFS-9569
> URL: https://issues.apache.org/jira/browse/HDFS-9569
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Priority: Trivial
> Labels: supportability
> Fix For: 2.7.3
>
> Attachments: HDFS-9569.001.patch, HDFS-9569.002.patch
>
>
> When NN starts to load fsimage, it does
> {code}
> void loadFSImageFile(FSNamesystem target, MetaRecoveryContext recovery,
>     FSImageFile imageFile, StartupOption startupOption) throws IOException {
>   LOG.debug("Planning to load image :\n" + imageFile);
>   ......
>   long txId = loader.getLoadedImageTxId();
>   LOG.info("Loaded image for txid " + txId + " from " + curFile);
> {code}
> A debug msg is issued at the beginning with the fsimage file name, then at
> the end an info msg is issued after loading.
> If the fsimage loading failed due to corrupted fsimage (see HDFS-9406), we
> don't see the first msg. It'd be helpful to always be able to see from NN
> logs what fsimage file it's loading.
> Two improvements:
> 1. Change the above debug to info
> 2. If exception happens when loading fsimage, be sure to report the fsimage
> name being loaded in the error message.