[
https://issues.apache.org/jira/browse/HDFS-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798912#action_12798912
]
Steve Loughran commented on HDFS-890:
-------------------------------------
Looking at the code, I don't think we need to change the method at all. Every
other reason for startup to fail (bad configuration arguments, ports in use)
causes {{makeInstance()}} to fail with an exception, one that translates to a -1
exit code.
It's only the no-valid-data-dirs case that causes a silent "no exception, no
error code" failure, one that 3 of 5 callers in the Hadoop codebase assume does
not happen, callers who will NPE if it does.
The simplest solution here would be to throw an exception in this situation:
the main method exits with -1, and all the code that would have NPEd will now
fail with meaningful errors. This wouldn't need any new/deprecated interfaces,
just a change note: "hdfs datanode fails with -1 exit code if there is no valid
data directory".
We could also add a new exception, NoDataDirsException extends IOException,
which includes a list of data dirs and their individual exceptions. This could
be used by callers to handle this specific problem, if they were really
interested in handling it differently.
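
As a rough illustration of the two ideas above, here is a minimal sketch, not the actual datanode code. Only the name NoDataDirsException and its IOException parent come from this comment; the DataDirCheck class, its methods, the writable-directory check and the per-directory map are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the proposed exception: carries each rejected data dir
// together with the reason it was rejected. Field layout is an assumption.
class NoDataDirsException extends IOException {
    private final Map<File, IOException> failures;

    NoDataDirsException(Map<File, IOException> failures) {
        super("No valid data directories: " + failures.keySet());
        this.failures = Collections.unmodifiableMap(new LinkedHashMap<>(failures));
    }

    Map<File, IOException> getFailures() {
        return failures;
    }
}

// Hypothetical helper class, standing in for the real startup path.
class DataDirCheck {
    // Returns the usable dirs, or throws if none of the candidates is usable.
    static List<File> requireValidDirs(List<File> candidates) throws NoDataDirsException {
        Map<File, IOException> failures = new LinkedHashMap<>();
        List<File> valid = new ArrayList<>();
        for (File dir : candidates) {
            if (dir.isDirectory() && dir.canWrite()) {
                valid.add(dir);
            } else {
                failures.put(dir, new IOException("Not a writable directory: " + dir));
            }
        }
        if (valid.isEmpty()) {
            throw new NoDataDirsException(failures);
        }
        return valid;
    }

    // Maps the outcome to an exit code; a real main() would pass this to System.exit().
    static int run(List<File> candidates) {
        try {
            requireValidDirs(candidates);
            return 0;
        } catch (IOException e) {
            System.err.println(e.getMessage());
            return -1;
        }
    }
}
```

A caller that only cares about success or failure can catch IOException as before; one that wants per-directory detail can catch NoDataDirsException and inspect getFailures().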
> Have a way of creating datanodes that throws a meaningful exception on
> failure
> -------------------------------------------------------------------------------
>
> Key: HDFS-890
> URL: https://issues.apache.org/jira/browse/HDFS-890
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Affects Versions: 0.22.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
>
> In HDFS-884, I proposed printing out more details on why things fail. This is
> hard to test, because you need to subvert the log4j back end that your test
> harness will itself have grabbed.
> There is a way to make it testable, and to make it easier for anyone creating
> datanodes in process to recognise and handle failure: have a static
> CreateDatanode() method that throws exceptions when directories cannot be
> created or other problems arise. Right now some problems trigger failure,
> others just return a null reference saying "something went wrong but we won't
> tell you what -hope you know where the logs go".
> The HDFS-884 patch would be replaced by something that threw an exception;
> the existing methods would catch this, log it and return null. The new method
> would pass it straight up.
> This is easier to test, better for others. If people think this is good, I
> will code it up and mark the old API as deprecated.
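
The arrangement described in the quoted issue text (a new method that passes the exception straight up, with the existing methods catching it, logging it and returning null) might look roughly like the sketch below. NodeSketch, createOrThrow and createOrNull are hypothetical names, the empty-string check is a placeholder for the real startup checks, and java.util.logging stands in for log4j only to keep the example self-contained.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Stand-in for the real datanode class; everything here is hypothetical.
class NodeSketch {
    private static final Logger LOG = Logger.getLogger(NodeSketch.class.getName());

    final String dataDir;

    private NodeSketch(String dataDir) {
        this.dataDir = dataDir;
    }

    // New-style factory: any startup problem reaches the caller as an exception.
    static NodeSketch createOrThrow(String dataDir) throws IOException {
        if (dataDir == null || dataDir.isEmpty()) {
            throw new IOException("No valid data directory configured");
        }
        return new NodeSketch(dataDir);
    }

    // Old-style factory: keeps the existing contract by logging and returning null.
    @Deprecated
    static NodeSketch createOrNull(String dataDir) {
        try {
            return createOrThrow(dataDir);
        } catch (IOException e) {
            LOG.log(Level.SEVERE, "Datanode startup failed", e);
            return null;
        }
    }
}
```

Tests can then call the throwing method and assert on the exception directly, without touching the logging back end.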