[ https://issues.apache.org/jira/browse/HBASE-25255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227906#comment-17227906 ]
Duo Zhang commented on HBASE-25255:
-----------------------------------
OK, I think the problem here is a race between loading meta and creating the
rs group table.
In CreateTableProcedure.waitInitialized, we have this:
{code}
@Override
protected boolean waitInitialized(MasterProcedureEnv env) {
  if (getTableName().isSystemTable()) {
    // Creating system table is part of the initialization, so do not wait here.
    return false;
  }
  return super.waitInitialized(env);
}
{code}
This means that when creating the rs group table we do not wait for meta to be
loaded. So before meta loading finishes, the region state node for the rsgroup
table can already be added to AssignmentManager in OFFLINE state. We then call
processOfflineRegions to assign the offline regions, but at the same time the
CreateTableProcedure assigns the region too, and thus we get this assertion
error.
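To make the interleaving concrete, here is a minimal single-threaded toy model
in Java. It is only a sketch under assumptions: RsGroupRaceSketch and
ToyRegionStateNode are made-up names, not the real HBase classes, and the toy
throws IllegalStateException where the real RegionStateNode.setProcedure fails
with an AssertionError (per the trace), so that it behaves the same without -ea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replay of the ordering described above; not HBase code.
final class RsGroupRaceSketch {
  static String replay() {
    ToyRegionStateNode rsGroupRegion = new ToyRegionStateNode();
    List<ToyRegionStateNode> offlineRegions = new ArrayList<>();

    // CreateTableProcedure does not wait for meta loading: the new region is
    // registered as OFFLINE and the procedure starts assigning it.
    offlineRegions.add(rsGroupRegion);
    rsGroupRegion.setProcedure("CreateTableProcedure");

    // Master initialization then tries to assign every OFFLINE region it sees,
    // including the one that CreateTableProcedure already owns.
    try {
      for (ToyRegionStateNode node : offlineRegions) {
        node.setProcedure("processOfflineRegions");
      }
      return "ok";
    } catch (IllegalStateException e) {
      return "conflict: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(replay());
  }
}

// Toy stand-in for a region state node: at most one procedure may own it.
final class ToyRegionStateNode {
  private String procedure; // owner of this region, or null if none

  void setProcedure(String proc) {
    if (procedure != null) {
      throw new IllegalStateException(
          "already owned by " + procedure + ", cannot attach " + proc);
    }
    procedure = proc;
  }
}
```

The second setProcedure call is the toy equivalent of the frame at the top of
the stack trace: the region already has an owner when processOfflineRegions
tries to attach its own assign procedure.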
On the master branch, the meta table does not rely on any other system table,
so I think we could just change the above method to let CreateTableProcedure
always wait for meta to be loaded, since we do not use CreateTableProcedure to
create meta, and also carefully change the order in which processOfflineRegions
is called.
But on branch-2, we still need to create the namespace table... So we need to
find another solution for branch-2.
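A rough Java sketch of the master-branch idea, not an actual patch. ToyEnv and
WaitInitializedSketch are hypothetical stand-ins for MasterProcedureEnv and
CreateTableProcedure, and the isMetaLoaded / isMasterInitialized flags are
assumptions made for illustration. As in the method quoted above, returning
true is taken to mean "the procedure has to wait".

```java
// Hypothetical comparison of the current and the proposed wait rule.
final class WaitInitializedSketch {
  // Current rule as quoted above: system tables never wait.
  static boolean current(ToyEnv env, boolean isSystemTable) {
    if (isSystemTable) {
      return false;
    }
    return !env.isMasterInitialized();
  }

  // Assumed reading of the proposal: system tables still skip the wait for
  // full master initialization (they are part of it), but do wait for meta.
  static boolean proposed(ToyEnv env, boolean isSystemTable) {
    if (isSystemTable) {
      return !env.isMetaLoaded();
    }
    return !env.isMasterInitialized();
  }

  public static void main(String[] args) {
    ToyEnv metaStillLoading = new ToyEnv(false, false);
    System.out.println("current waits:  " + current(metaStillLoading, true));
    System.out.println("proposed waits: " + proposed(metaStillLoading, true));
  }
}

// Toy environment exposing only the two flags this sketch needs.
final class ToyEnv {
  private final boolean metaLoaded;
  private final boolean masterInitialized;

  ToyEnv(boolean metaLoaded, boolean masterInitialized) {
    this.metaLoaded = metaLoaded;
    this.masterInitialized = masterInitialized;
  }

  boolean isMetaLoaded() { return metaLoaded; }

  boolean isMasterInitialized() { return masterInitialized; }
}
```

With meta still loading, the current rule lets the system-table procedure run
immediately while the proposed rule holds it back. The ordering of
processOfflineRegions would still need separate care, as noted above.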
[~stack] FYI.
> Master fails to initialize when creating rs group table
> -------------------------------------------------------
>
> Key: HBASE-25255
> URL: https://issues.apache.org/jira/browse/HBASE-25255
> Project: HBase
> Issue Type: Bug
> Components: master, rsgroup
> Reporter: Duo Zhang
> Priority: Critical
> Attachments:
> TEST-org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS.xml
>
>
> Saw this when setting up TestRSGroupsKillRS
> {noformat}
> 2020-11-07 16:29:54,565 ERROR [master/e476f4f509a7:0:becomeActiveMaster] helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.AssertionError
> at org.apache.hadoop.hbase.master.assignment.RegionStateNode.setProcedure(RegionStateNode.java:198)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createAssignProcedure(AssignmentManager.java:647)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$null$6(AssignmentManager.java:878)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
> at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1699)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:546)
> at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
> at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:505)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createAssignProcedures(AssignmentManager.java:879)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createRoundRobinAssignProcedures(AssignmentManager.java:759)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createRoundRobinAssignProcedures(AssignmentManager.java:775)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.processOfflineRegions(AssignmentManager.java:1513)
> at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1012)
> at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2116)
> at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:515)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}