[ https://issues.apache.org/jira/browse/HBASE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Busbey updated HBASE-21624:
--------------------------------
Issue Type: Improvement (was: Bug)
> master startup should not wait (or die) on assigning meta replicas
> ------------------------------------------------------------------
>
> Key: HBASE-21624
> URL: https://issues.apache.org/jira/browse/HBASE-21624
> Project: HBase
> Issue Type: Improvement
> Reporter: Sergey Shelukhin
> Priority: Major
>
> Due to some other bug, a meta replica is stuck in transition forever.
> The master is running fine without it, but the initializer thread has not
> finished initialization for ~19 hours and is stuck in the state below.
> It doesn't seem necessary to wait for meta replica assignment; it could just
> be fire-and-forget, and normal region handling should take care of the
> replicas after that.
> {noformat}
> Thread 118 (master/...:17000:becomeActiveMaster):
> State: TIMED_WAITING
> Blocked count: 281
> Waited count: 67059
> Stack:
> java.lang.Thread.sleep(Native Method)
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:209)
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:192)
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToComplete(ProcedureSyncWait.java:151)
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToCompleteIOE(ProcedureSyncWait.java:140)
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitAndWaitProcedure(ProcedureSyncWait.java:133)
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:569)
> org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84)
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146)
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2342)
> {noformat}
> Additionally, and semi-related: if the meta-hosting server dies during replica
> assignment, the master also dies immediately, which is unnecessary.
> {noformat}
> 2018-12-14 21:00:55,331 ERROR [master/...:17000:becomeActiveMaster] master.HMaster: Failed to become active master
> org.apache.hadoop.hbase.HBaseIOException: rit=OFFLINE, location=null, table=hbase:meta, region=534574363 is currently in transition
>   at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:545)
>   at org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:563)
>   at org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84)
>   at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146)
>   at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2342)
>   at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:591)
> {noformat}
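
For illustration only, the two requests in the description (do not block startup on replica assignment, and do not abort when an assignment fails) could look roughly like the sketch below. This is not HBase code: MetaReplicaStartupSketch, ReplicaAssigner, and assignReplicasAsync are hypothetical names invented for the example, and replica ids are assumed to start at 1 with id 0 standing for the primary. An actual fix would presumably go through the master's own procedure framework rather than a plain executor.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names are HBase APIs. */
class MetaReplicaStartupSketch {

    /** Stand-in for whatever assigns a single replica; it may block or throw. */
    interface ReplicaAssigner {
        void assign(int replicaId) throws Exception;
    }

    /** Replica ids whose assignment failed, left for later handling to retry. */
    final List<Integer> failed = new CopyOnWriteArrayList<>();

    /**
     * Submits one assignment per non-primary replica and returns immediately.
     * The returned future exists only for observation; startup never joins it.
     */
    CompletableFuture<Void> assignReplicasAsync(
            int replicaCount, ReplicaAssigner assigner, ExecutorService pool) {
        CompletableFuture<?>[] tasks =
                new CompletableFuture<?>[Math.max(0, replicaCount - 1)];
        for (int id = 1; id < replicaCount; id++) {
            final int replicaId = id;
            tasks[id - 1] = CompletableFuture.runAsync(() -> {
                try {
                    assigner.assign(replicaId);
                } catch (Exception e) {
                    // Record the failure and move on instead of aborting startup.
                    failed.add(replicaId);
                }
            }, pool);
        }
        return CompletableFuture.allOf(tasks);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        MetaReplicaStartupSketch sketch = new MetaReplicaStartupSketch();

        // Simulate replica 2 being rejected because it is already in transition.
        CompletableFuture<Void> pending = sketch.assignReplicasAsync(3, replicaId -> {
            if (replicaId == 2) {
                throw new IllegalStateException("region is currently in transition");
            }
        }, pool);

        System.out.println("startup continues without waiting");

        // Only this demo waits, so that it can print the recorded failures.
        pending.get(5, TimeUnit.SECONDS);
        System.out.println("failed replicas: " + sketch.failed);
        pool.shutdown();
    }
}
```

The point of the sketch is that the caller never waits on the returned future during startup, and a rejected assignment ends up in a list for later retry rather than propagating up the initializer thread.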