[ 
https://issues.apache.org/jira/browse/HDFS-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798912#action_12798912
 ] 

Steve Loughran commented on HDFS-890:
-------------------------------------

Looking at the code, I don't think we need to change the method at all. Every 
other reason for startup to fail (bad configuration arguments, ports in use) 
causes {{makeInstance()}} to fail with an exception, one that translates to a -1 
exit code. 

It's only the no-valid-data-dirs case that causes a silent "no exception, no 
error code" failure, one that 3 of 5 callers in the Hadoop codebase assume does 
not happen; those callers will NPE if it does. 

The simplest solution here would be to throw an exception in this situation: 
the main method exits with -1, and all the code that would have NPEd will now 
fail with meaningful errors. This wouldn't need any new/deprecated interfaces, 
just a note of the change: "hdfs datanode fails with -1 exit code if there is 
no valid data directory".
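
To make the idea concrete, here is a minimal sketch of the "throw instead of 
returning null" behaviour. It is not the actual DataNode code: the class 
DataNodeStartupSketch, the Node type, the run() helper and the simplified 
makeInstance(List<String>) signature are all hypothetical stand-ins, and the 
directory check (exists and is writable) is an assumption about what "valid" 
means here.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DataNodeStartupSketch {

    /** Hypothetical stand-in for the datanode; just holds the usable directories. */
    static final class Node {
        final List<File> dataDirs;

        Node(List<File> dataDirs) {
            this.dataDirs = dataDirs;
        }
    }

    /**
     * Hypothetical factory that never returns null: if no directory is usable
     * it throws, so callers get a meaningful error instead of a later NPE.
     */
    static Node makeInstance(List<String> dirNames) throws IOException {
        List<File> valid = new ArrayList<>();
        for (String name : dirNames) {
            File dir = new File(name);
            // Assumption: "valid" means an existing, writable directory.
            if (dir.isDirectory() && dir.canWrite()) {
                valid.add(dir);
            }
        }
        if (valid.isEmpty()) {
            throw new IOException("No valid data directories among " + dirNames);
        }
        return new Node(valid);
    }

    /** Translates a startup failure into the exit code a main method would use. */
    static int run(List<String> dirNames) {
        try {
            Node node = makeInstance(dirNames);
            System.out.println("Started with " + node.dataDirs.size() + " data dir(s)");
            return 0;
        } catch (IOException e) {
            System.err.println("Datanode startup failed: " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        // A real main would call System.exit(run(...)); this demo only prints the code.
        System.out.println("exit code would be " + run(Arrays.asList(args)));
    }
}
```

In-process callers would call makeInstance() directly and see the exception; 
only the command-line entry point turns it into an exit code.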

We could also add a new exception, NoDataDirsException extends IOException, 
that includes a list of the data dirs and their individual exceptions. Callers 
could use this to handle this specific problem, if they were really interested 
in handling it differently. 
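
A rough sketch of what such an exception might look like follows. Only the 
name NoDataDirsException and the IOException superclass come from the proposal 
above; the map-based constructor and the getFailures() accessor are 
assumptions made for illustration.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: raised when none of the configured data directories is usable. */
public class NoDataDirsException extends IOException {
    private static final long serialVersionUID = 1L;

    // Directory name -> the exception that made that directory unusable.
    private final Map<String, IOException> failures;

    public NoDataDirsException(Map<String, IOException> failures) {
        super("No valid data directories: " + failures.keySet());
        // Defensive copy that keeps the order in which directories were tried.
        this.failures = Collections.unmodifiableMap(new LinkedHashMap<>(failures));
    }

    /** Per-directory causes, for callers that want to handle this case specially. */
    public Map<String, IOException> getFailures() {
        return failures;
    }
}
```

Because it is still an IOException, existing callers that catch IOException 
keep working; a caller that cares can catch NoDataDirsException first and 
inspect getFailures().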

> Have a way of creating datanodes that throws an meaningful exception on 
> failure
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-890
>                 URL: https://issues.apache.org/jira/browse/HDFS-890
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 0.22.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> In HDFS-884, I proposed printing out more details on why things fail. This is 
> hard to test, because you need to subvert the log4j back end that your test 
> harness will itself have grabbed.
> There is a way to make it testable, and to make it easier for anyone creating 
> datanodes in process to recognise and handle failure: have a static 
> CreateDatanode() method that throws exceptions when directories cannot be 
> created or other problems arise. Right now some problems trigger failure, 
> others just return a null reference saying "something went wrong but we won't 
> tell you what -hope you know where the logs go". 
> The HDFS-884 patch would be replaced by something that threw an exception; 
> the existing methods would catch this, log it and return null. The new method 
> would pass it straight up. 
> This is easier to test, better for others. If people think this is good, I 
> will code it up and mark the old API as deprecated. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
