[ https://issues.apache.org/jira/browse/HBASE-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009850#comment-13009850 ]
Jonathan Gray commented on HBASE-3687:
--------------------------------------
I think it's fine for now. The real fix is for the RS not to check in with
the master until it is fully online (agreed, that is outside the scope of this JIRA).
> Bulk assign on startup should handle a ServerNotRunningException
> ----------------------------------------------------------------
>
> Key: HBASE-3687
> URL: https://issues.apache.org/jira/browse/HBASE-3687
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Fix For: 0.90.2
>
> Attachments: 3687.txt
>
>
> On startup, we do a bulk assign. At the moment, if any problem occurs during
> bulk assign, we consider startup failed, and the expectation is that you retry
> (we need to make this better, but that is not what this issue is about). One
> exception that we should handle is the case where an RS is slow coming up and
> its RPC server is not yet listening. In this case it throws
> ServerNotRunningException. We should retry on at least this one exception
> during bulk assign.
> We had this happen to us when starting up a prod cluster.
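
The retry asked for in the description could look roughly like the sketch below. It is not the attached 3687.txt patch. The ServerNotRunningException class here is a local stand-in so the example compiles on its own, and AssignCall, assignWithRetry, maxAttempts and sleepMillis are hypothetical names chosen for illustration. Only the "RPC not up yet" exception is retried; any other IOException still fails the bulk assign, as it does today.

```java
import java.io.IOException;

class BulkAssignRetry {
    /** Local stand-in for HBase's exception (assumption: it is an IOException). */
    public static class ServerNotRunningException extends IOException {
        public ServerNotRunningException(String msg) { super(msg); }
    }

    /** Hypothetical callback wrapping one bulk-assign RPC to a region server. */
    public interface AssignCall {
        void run() throws IOException;
    }

    /**
     * Runs the call, retrying only on ServerNotRunningException.
     * Returns the number of attempts used; rethrows once maxAttempts is reached.
     */
    public static int assignWithRetry(AssignCall call, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                call.run();
                return attempt;
            } catch (ServerNotRunningException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up: the server never came online
                }
                Thread.sleep(sleepMillis); // give the RS time to open its RPC port
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated RS whose RPC server is not listening for the first two calls.
        int attempts = assignWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new ServerNotRunningException("rpc not up yet");
            }
        }, 5, 10L);
        System.out.println("attempts=" + attempts);
    }
}
```

A bounded attempt count keeps a server that never comes up from hanging master startup forever; the actual limits and sleep interval would need to be tuned for a real cluster.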