Nick Dimiduk created HBASE-12467:
------------------------------------
Summary: Master joins cluster but never completes initialization
Key: HBASE-12467
URL: https://issues.apache.org/jira/browse/HBASE-12467
Project: HBase
Issue Type: Bug
Components: master
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Fix For: 2.0.0, 0.98.9, 0.99.2
While diagnosing a rare failure in IntegrationTestLoadAndVerify, I discovered
this scenario. The master was restarted by CM. Upon rejoining the cluster, it
successfully assumed responsibility as active master, but the
finishInitialization method apparently never completed. The last log line from
that thread is:
{noformat}
2014-11-10 17:01:29,940 INFO [master:ip-172-31-9-135:60000] master.HMaster: hbase:meta with replicaId 0 assigned=0, rit=false, location=ip-172-31-9-136.ec2.internal,60020,1415638551951
{noformat}
I see region states populated from existing znodes. AM inventoried the online
regions and acknowledged that this was a master failover. There it sits,
responding to RPCs with {{PleaseHoldException: Master is initializing}}.
For the sake of resiliency, we should detect this scenario and at least release
control as active master.
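One possible shape for such detection, as a rough sketch only: a watchdog that
is armed when the master becomes active and fires a callback if initialization
has not been marked complete within a timeout. The class name
InitializationWatchdog, its methods, and the callback are hypothetical and are
not existing HBase APIs; in a real master the callback would be whatever
releases the active-master role (for example, aborting the process so a backup
master can take over).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not HBase code: detects an initialization that
// never finishes and invokes a caller-supplied "release" action.
final class InitializationWatchdog {
    private final long timeoutMillis;
    private final Runnable onTimeout;
    // Flipped exactly once, by whichever happens first:
    // initialization completing or the timeout firing.
    private final AtomicBoolean settled = new AtomicBoolean(false);
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "init-watchdog");
            t.setDaemon(true); // do not keep the JVM alive for the watchdog
            return t;
        });

    InitializationWatchdog(long timeoutMillis, Runnable onTimeout) {
        this.timeoutMillis = timeoutMillis;
        this.onTimeout = onTimeout;
    }

    // Arm the watchdog; call once, right after becoming active master.
    void start() {
        scheduler.schedule(() -> {
            if (settled.compareAndSet(false, true)) {
                onTimeout.run(); // initialization is presumed stuck
            }
            scheduler.shutdown();
        }, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Call at the end of initialization. Returns false if the
    // timeout already fired and the release action has run.
    boolean markInitialized() {
        boolean won = settled.compareAndSet(false, true);
        scheduler.shutdownNow(); // cancel the pending timeout check
        return won;
    }
}
```

The compare-and-set on a single flag means the release action and a late
completion of initialization cannot both take effect. Choosing the timeout is
the hard part: it would need to be long enough to cover a legitimately slow
failover on a large cluster.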