[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281517#comment-15281517
 ] 

Junping Du commented on MAPREDUCE-6657:
---------------------------------------

bq. Do you think we should create a subclass of RetriableException for this 
instead?
It is up to you. IMO, it is not necessary to do so for a single special case; 
otherwise we could end up with too many sub-exceptions.

bq. The message is derived from an instance method this.nn.getRole(), and doing 
string matching is probably not the cleanest way.
You can add a static method that builds the string 
{{this.nn.getRole() + " still not started"}} from the daemon's name 
("NameNode" here) and make it accessible from both HDFS and MAPREDUCE (JHS). 
In JHS, just pass "NameNode" (or move NamenodeRole from HdfsServerConstants to 
HdfsConstants so that JHS can share it) to get the same string as HDFS. That 
would be much cleaner.
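
A minimal sketch of what such a shared helper could look like. The class name {{DaemonStartupMessages}} and both method names are hypothetical and not existing Hadoop APIs; the only thing taken from this issue is the message text "NameNode still not started".

```java
// Hypothetical shared helper; the class and method names are illustrative
// assumptions and do not exist in Hadoop.
final class DaemonStartupMessages {
    private static final String NOT_STARTED_SUFFIX = " still not started";

    private DaemonStartupMessages() {
        // Static utility class, not meant to be instantiated.
    }

    /** Builds the message the server side would put into its exception. */
    static String notStartedMessage(String daemonName) {
        return daemonName + NOT_STARTED_SUFFIX;
    }

    /** Client-side check that reuses the same builder instead of a string literal. */
    static boolean isNotStarted(Throwable t, String daemonName) {
        String msg = t.getMessage();
        return msg != null && msg.contains(notStartedMessage(daemonName));
    }
}
```

With something like this, HDFS would call {{notStartedMessage(...)}} when throwing and JHS would call {{isNotStarted(e, "NameNode")}} when catching, so the text is defined in only one place.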

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keep retrying if the NameNode is still 
> starting up. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.
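
The retry behaviour described in the issue could be sketched roughly as follows. This is not the actual JHS code: the class {{HistoryDirRetry}}, its methods and its parameters are hypothetical, and matching on the "SafeModeException" substring is an assumption about how the safe-mode case is detected.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch; names and structure are assumptions, not Hadoop code.
final class HistoryDirRetry {
    // Message text quoted in the issue description.
    static final String NN_NOT_STARTED = "NameNode still not started";

    private HistoryDirRetry() {
    }

    /** True if the failure looks transient: safe mode (assumed marker) or NN still starting. */
    static boolean isTransient(Exception e) {
        String text = e.toString();
        return text.contains("SafeModeException") || text.contains(NN_NOT_STARTED);
    }

    /** Retries createDir until it succeeds, fails non-transiently, or maxWaitMs would be exceeded. */
    static void createWithRetry(Callable<Void> createDir, long maxWaitMs, long intervalMs)
            throws Exception {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (true) {
            try {
                createDir.call();
                return;
            } catch (Exception e) {
                boolean outOfTime = System.currentTimeMillis() + intervalMs > deadline;
                if (!isTransient(e) || outOfTime) {
                    throw e; // Give up: not retriable, or the wait budget is exhausted.
                }
                Thread.sleep(intervalMs);
            }
        }
    }
}
```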



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
