[ 
https://issues.apache.org/jira/browse/HADOOP-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681278#action_12681278
 ] 

Steve Loughran commented on HADOOP-5453:
----------------------------------------

K > Going through the logs is routine if you ask anybody dealing with 
clusters. And the logs in this case have all the information you need, I believe. 
Failing on startup in this case is the correct behavior, in my opinion.

+1 to failing on startup; I just want that failure to be visible without having 
to scan the logs for the word FATAL just before the NN went away. 

D > One option would be to ensure that all Namenode threads are shut down 
without killing the entire JVM. In that case, if an application is running the 
Namenode within its own JVM, that application can detect that the Namenode has 
exited and then take appropriate action.

The lifecycle work can do that; this FSEdit failure handling is something we 
could integrate in *after* the basic lifecycle is done, and with tests. Right 
now you can't test that MiniMR does a fatal exit on FSEdit problems, because 
the test runner goes down with it.
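
To make the testability point concrete, here is a minimal sketch of one way a 
fatal edit-log failure could be routed through a pluggable handler instead of 
a hard-coded System.exit(-1). Every type and method name below 
(FatalErrorHandler, ThrowingHandler, SketchEditLog and so on) is hypothetical 
and invented for illustration; none of it is existing Hadoop code, and the 
real FSEditLog may be structured quite differently.

```java
import java.io.IOException;

// All names here are hypothetical; nothing below is existing Hadoop API.

/** Unchecked exception standing in for a fatal edit-log failure. */
class FatalEditLogException extends RuntimeException {
    FatalEditLogException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Decides what a fatal failure does: exit the JVM or raise an exception. */
interface FatalErrorHandler {
    void fatal(String message, Throwable cause);
}

/** Mirrors the current behaviour: report the problem, then exit the JVM. */
class ExitingHandler implements FatalErrorHandler {
    @Override
    public void fatal(String message, Throwable cause) {
        System.err.println("FATAL " + message + ": " + cause);
        System.exit(-1);
    }
}

/** For tests and in-JVM deployments: raise an exception instead. */
class ThrowingHandler implements FatalErrorHandler {
    @Override
    public void fatal(String message, Throwable cause) {
        throw new FatalEditLogException(message, cause);
    }
}

/** Toy stand-in for an edit log that reports failures via the handler. */
class SketchEditLog {
    private final FatalErrorHandler handler;

    SketchEditLog(FatalErrorHandler handler) {
        this.handler = handler;
    }

    /** Pretends to sync; the flag simulates an I/O failure. */
    void logSync(boolean simulateIoFailure) {
        try {
            if (simulateIoFailure) {
                throw new IOException("simulated disk failure");
            }
        } catch (IOException e) {
            handler.fatal("Unable to sync edit log", e);
        }
    }
}

class EditLogFailureDemo {
    public static void main(String[] args) {
        SketchEditLog log = new SketchEditLog(new ThrowingHandler());
        try {
            log.logSync(true);
        } catch (FatalEditLogException e) {
            // The caller stays in control; the JVM is still running.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

A test could install the throwing handler and assert on the exception, while 
a standalone NN would keep the exiting one. The catch is the one already noted 
in the issue: an exception thrown on a background thread only unwinds that 
thread, so this on its own does not replace the "exit regardless of which 
thread" property.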

> Could FSEditLog report problems more elegantly than with System.exit(-1)
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-5453
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5453
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.21.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> When FSEdit encounters problems, it prints something and then exits.
> It would be better for any in-JVM deployments of FSEdit for these to be 
> raised in some other way (such as throwing an exception), rather than taking 
> down the whole JVM. That could be in JUnit tests, or it could be inside other 
> applications. Test runners and the like can intercept those System.exit() 
> calls with their own Security Manager, often turning the System.exit() 
> operation into an exception there and then. If FSEdit did that itself, it may 
> be easier to stay in control. 
> The current approach has some benefits: it can exit regardless of which 
> thread has encountered problems, but it is tricky to test.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
