[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280564#comment-15280564
 ] 

Junping Du commented on MAPREDUCE-6657:
---------------------------------------

Thanks for updating the patch, [~haibochen].
My comment above was trying to say that we should define a static string 
where the exception gets thrown. 
In this case, we should also change NameNodeRpcServer.java:
{noformat}
  private void checkNNStartup() throws IOException {
    if (!this.nn.isStarted()) {
      throw new RetriableException(this.nn.getRole() + " still not started");
    }
  }
{noformat}
If we define a static string in HDFS and use it on both sides (NameNodeRpcServer 
and HistoryFileManager), we can make sure we won't hit this issue again in the 
future if the exception string is updated.
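
A minimal sketch of that idea, not the actual patch: the class 
NameNodeStartupMessages, the constant STILL_NOT_STARTED, and the helper 
isBecauseNotStarted() are hypothetical names, and a plain IOException stands in 
for RetriableException so the example compiles without Hadoop on the classpath.

```java
import java.io.IOException;

// Hypothetical holder for the shared message fragment. No such class or
// constant is confirmed to exist in HDFS; it only illustrates the proposal.
final class NameNodeStartupMessages {
  static final String STILL_NOT_STARTED = " still not started";

  private NameNodeStartupMessages() {}
}

final class StartupCheckSketch {
  // Stand-in for the throwing side (checkNNStartup in NameNodeRpcServer).
  // Assumption: IOException replaces RetriableException for self-containment.
  static void checkNNStartup(boolean started, String role) throws IOException {
    if (!started) {
      throw new IOException(role + NameNodeStartupMessages.STILL_NOT_STARTED);
    }
  }

  // Stand-in for the checking side (HistoryFileManager). It matches on the
  // same constant, so changing the message text cannot break the check.
  static boolean isBecauseNotStarted(Exception ex) {
    String msg = ex.getMessage();
    return msg != null && msg.contains(NameNodeStartupMessages.STILL_NOT_STARTED);
  }
}
```

Because both methods read the same constant, rewording the message only 
requires touching one place.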

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keep retrying if the NameNode is in its 
> startup phase. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



