[ https://issues.apache.org/jira/browse/HBASE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HBASE-21624:
-------------------------------------
    Summary: master startup should not wait (or die) on assigning meta replicas  (was: master startup should not wait on assigning meta replicas)

> master startup should not wait (or die) on assigning meta replicas
> ------------------------------------------------------------------
>
>                 Key: HBASE-21624
>                 URL: https://issues.apache.org/jira/browse/HBASE-21624
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Major
>
> Due to some other bug, a meta replica is stuck in transition forever.
> Master is running fine without it; however, the initializer thread hasn't
> finished initialization for ~19 hours now and is stuck in the state below.
> It doesn't seem necessary to wait for the replicas; assignment could just be
> fire-and-forget, and normal region handling should take care of it after that.
> {noformat}
> Thread 118 (master/...:17000:becomeActiveMaster):
>   State: TIMED_WAITING
>   Blocked count: 281
>   Waited count: 67059
>   Stack:
>     java.lang.Thread.sleep(Native Method)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:209)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:192)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToComplete(ProcedureSyncWait.java:151)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToCompleteIOE(ProcedureSyncWait.java:140)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitAndWaitProcedure(ProcedureSyncWait.java:133)
>     org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:569)
>     org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84)
>     org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146)
>     org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2342)
> {noformat}
> Additionally, and semi-related: if the meta-hosting server dies during replica
> assignment, the master also dies immediately, which is unnecessary.
> {noformat}
> 2018-12-14 21:00:55,331 ERROR [master/...:17000:becomeActiveMaster] master.HMaster: Failed to become active master
> org.apache.hadoop.hbase.HBaseIOException: rit=OFFLINE, location=null, table=hbase:meta, region=534574363 is currently in transition
> 	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:545)
> 	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:563)
> 	at org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84)
> 	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146)
> 	at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2342)
> 	at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:591)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
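
For illustration only, the fire-and-forget behaviour proposed in the description could look roughly like the sketch below. This is not HBase code: `FireAndForgetStartup`, `submitAssign` and `finishInitialization` are hypothetical names, and a plain `CompletableFuture` stands in for the procedure framework. The stuck assignment is simulated by a latch that is never released during startup, and the "currently in transition" failure by an exception thrown at submit time. In both cases initialization is assumed to log the problem and carry on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

public class FireAndForgetStartup {

    // Hypothetical stand-in for submitting a replica assignment.
    // It either rejects the request up front (region already in transition)
    // or returns a future that stays incomplete until the latch is released.
    static CompletableFuture<Void> submitAssign(CountDownLatch done, boolean inTransition) {
        if (inTransition) {
            throw new IllegalStateException("region is currently in transition");
        }
        return CompletableFuture.runAsync(() -> {
            try {
                done.await(); // simulates an assignment that may never finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Submits the assignment without waiting for it. A rejected submit or a
    // later failure is logged instead of aborting initialization.
    static boolean finishInitialization(CountDownLatch done, boolean inTransition) {
        try {
            submitAssign(done, inTransition).whenComplete((ignored, err) -> {
                if (err != null) {
                    System.err.println("replica assignment failed: " + err);
                }
            });
        } catch (IllegalStateException e) {
            System.err.println("skipping replica assignment: " + e.getMessage());
        }
        return true; // initialization completes either way
    }

    public static void main(String[] args) {
        CountDownLatch done = new CountDownLatch(1);
        System.out.println("stuck assignment, initialized=" + finishInitialization(done, false));
        System.out.println("rejected submit, initialized=" + finishInitialization(done, true));
        done.countDown(); // release the simulated assignment so the JVM can exit
    }
}
```

The point of the sketch is only that the initializer thread never blocks on the returned future; whether a real fix should retry, or leave the region to normal region handling as the description suggests, is outside what it shows.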