[ 
https://issues.apache.org/jira/browse/HDFS-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065499#comment-15065499
 ] 

Chris Nauroth commented on HDFS-9569:
-------------------------------------

Patch v002 causes a subtle change in the control flow that will cause a lot of 
extraneous logging.  For example, assume there are 3 fsimage directories, and 
they're all working correctly.  With the current code, the first attempt to 
load finds reserved paths in the first directory and immediately exits by 
throwing an unchecked {{IllegalArgumentException}}.  After applying the patch, 
this is converted to a checked {{IOException}}, which is not sufficient to 
terminate the loop.  Instead, the loop will try all 3 fsimage directories, 
and each iteration will log an error with a full stack trace.  The same error 
is then logged a 4th time after the loop.

This extra logging is not useful for end users.  It's a better user experience 
to exit as soon as reserved names are encountered, because we already know at 
that point that user action is required.

Maybe at this point we need a more specific exception type, like 
{{IllegalReservedPathException}}.  That one could be allowed to propagate out 
of the loop and terminate early.
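
A rough sketch of that idea follows.  It assumes the new exception would 
extend {{IOException}} so existing throws clauses keep compiling.  The class 
{{FsImageLoadSketch}}, the {{ImageLoader}} interface, the method 
{{loadFirstUsable}} and the list-based error log are hypothetical stand-ins 
for illustration only, not the actual FSImage code or the patch.

```java
import java.io.IOException;
import java.util.List;

class FsImageLoadSketch {

    /** Hypothetical exception type; assumed here to be a checked IOException. */
    static class IllegalReservedPathException extends IOException {
        IllegalReservedPathException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for whatever loads a single fsimage directory. */
    interface ImageLoader {
        void load(String imageDir) throws IOException;
    }

    /**
     * Tries each directory in order and returns the first one that loads.
     * Ordinary IOExceptions are recorded and the next directory is tried.
     * An IllegalReservedPathException is rethrown immediately, because the
     * other copies would fail the same way and user action is required.
     */
    static String loadFirstUsable(List<String> imageDirs, ImageLoader loader,
            List<String> errorLog) throws IOException {
        for (String dir : imageDirs) {
            try {
                loader.load(dir);
                return dir;
            } catch (IllegalReservedPathException e) {
                // Propagate out of the loop without logging per directory.
                throw e;
            } catch (IOException e) {
                // Possibly a corrupt copy: record it and try the next one.
                errorLog.add("Failed to load image from " + dir + ": "
                        + e.getMessage());
            }
        }
        throw new IOException(
                "Failed to load an fsimage from any of: " + imageDirs);
    }
}
```

With 3 directories that all contain reserved paths, this sketch attempts only 
the first one and records no per-directory errors, while a corrupt first copy 
still falls through to the next directory.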

> Log the name of the fsimage being loaded for better supportability
> ------------------------------------------------------------------
>
>                 Key: HDFS-9569
>                 URL: https://issues.apache.org/jira/browse/HDFS-9569
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>            Priority: Trivial
>              Labels: supportability
>             Fix For: 2.7.3
>
>         Attachments: HDFS-9569.001.patch, HDFS-9569.002.patch
>
>
> When NN starts to load fsimage, it does
> {code}
> void loadFSImageFile(FSNamesystem target, MetaRecoveryContext recovery,
>     FSImageFile imageFile, StartupOption startupOption) throws IOException {
>   LOG.debug("Planning to load image :\n" + imageFile);
>   ......
>   long txId = loader.getLoadedImageTxId();
>   LOG.info("Loaded image for txid " + txId + " from " + curFile);
> {code}
> A debug message with the fsimage file name is issued at the beginning, and 
> an info message is issued at the end, after loading.
> If fsimage loading fails because of a corrupted fsimage (see HDFS-9406), we 
> don't see the first message. It'd be helpful to always be able to see from 
> the NN logs which fsimage file it's loading.
> Two improvements:
> 1. Change the above debug to info.
> 2. If an exception happens when loading the fsimage, be sure to report the 
> name of the fsimage being loaded in the error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
