[ 
https://issues.apache.org/jira/browse/HBASE-12467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12467:
---------------------------------
    Attachment: HBASE-12467.01.patch

> Master joins cluster but never completes initialization
> -------------------------------------------------------
>
>                 Key: HBASE-12467
>                 URL: https://issues.apache.org/jira/browse/HBASE-12467
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 2.0.0, 0.98.9, 0.99.2
>
>         Attachments: HBASE-12467.00.patch, HBASE-12467.00.patch, 
> HBASE-12467.01.patch, HBASE-12467.01.patch
>
>
> While diagnosing a rare failure in IntegrationTestLoadAndVerify, I discovered 
> this scenario. Master was restarted by CM. Upon rejoining the cluster it 
> successfully assumes responsibility as active master, but apparently the 
> finishInitialization method never completes. The last log line from that 
> thread is:
> {noformat}
> 2014-11-10 17:01:29,940 INFO  [master:ip-172-31-9-135:60000] master.HMaster: 
> hbase:meta with replicaId 0 assigned=0, rit=false, 
> location=ip-172-31-9-136.ec2.internal,60020,1415638551951
> {noformat}
> I see region states populated from existing znodes. AM inventoried the online 
> regions and acknowledged that this was a master failover. There it sits, 
> responding to RPCs with {{PleaseHoldException: Master is initializing}}.
> For the sake of resiliency, we should detect this scenario and at least 
> release control as active master.
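
One way to picture the detection proposed above is a watchdog that gives
initialization a fixed time budget and runs a "release" action if the budget
is exceeded. The sketch below is illustrative only and is not the content of
the attached patch: the class name InitWatchdog, the method
awaitInitialization, and the callbacks are hypothetical stand-ins, and none
of them are HBase APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from HBase.
public class InitWatchdog {

    /**
     * Runs init on its own thread and waits up to timeoutMs for it to return.
     * Returns true if init returned normally in time. Otherwise runs
     * onTimeout (for example, giving up the active-master role) and
     * returns false.
     */
    public static boolean awaitInitialization(Runnable init, long timeoutMs,
                                              Runnable onTimeout) {
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            init.run();
            // Reached only when init returns normally; a hang or an
            // exception leaves the latch closed and the watchdog fires.
            done.countDown();
        }, "master-init");
        worker.setDaemon(true);
        worker.start();

        boolean finished;
        try {
            finished = done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt and treat it like a timeout.
            Thread.currentThread().interrupt();
            finished = false;
        }
        if (!finished) {
            onTimeout.run();
            worker.interrupt(); // best effort; a truly stuck thread may ignore it
        }
        return finished;
    }
}
```

In this sketch the caller would pass the initialization routine as init and
whatever releases the active-master role as onTimeout. Choosing the timeout
is the hard part: it has to exceed the slowest legitimate failover, so a
real fix would presumably make it configurable.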



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
