[ 
https://issues.apache.org/jira/browse/HBASE-25255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227906#comment-17227906
 ] 

Duo Zhang commented on HBASE-25255:
-----------------------------------

OK, I think the problem here is a race between loading meta and creating the 
rs group table.

In CreateTableProcedure.waitInitialized, we have this

{code}
  @Override
  protected boolean waitInitialized(MasterProcedureEnv env) {
    if (getTableName().isSystemTable()) {
      // Creating system table is part of the initialization, so do not wait here.
      return false;
    }
    return super.waitInitialized(env);
  }
{code}

This means that when creating the rs group table we do not wait for meta to be 
loaded, so it is possible that, before loading meta has finished, we add the 
region state node for the rsgroup table to AssignmentManager with OFFLINE 
state. We then call processOfflineRegions to assign the offline regions while, 
at the same time, the CreateTableProcedure assigns the region too, thus we get 
this assertion error.
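
To make the double assign concrete, here is a toy replay of one interleaving that matches the stack trace below. It is only a sketch: RegionNode, replayRace and the procedure names are hypothetical stand-ins, not the real HBase classes, and the real race is between threads rather than a fixed sequence of steps.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are hypothetical and do not mirror HBase's real classes.
class DoubleAssignSketch {

  /** Stand-in for a region state node that may carry at most one procedure. */
  static final class RegionNode {
    final String name;
    String state = "OFFLINE";
    String procedure; // null means no procedure is attached

    RegionNode(String name) {
      this.name = name;
    }

    /** Models the invariant behind the failing assertion: a second attach is rejected. */
    void setProcedure(String proc) {
      if (procedure != null) {
        throw new IllegalStateException(
            name + " already owned by " + procedure + ", rejected " + proc);
      }
      procedure = proc;
    }
  }

  /** Replays the interleaving and returns the conflict message, or null if there is none. */
  static String replayRace() {
    List<RegionNode> nodes = new ArrayList<>();

    // 1. The table creation did not wait, so its node is registered as OFFLINE early.
    RegionNode rsgroup = new RegionNode("rsgroup-region");
    nodes.add(rsgroup);

    // 2. The table creation attaches its own assign procedure to the node.
    rsgroup.setProcedure("assign-from-CreateTableProcedure");

    // 3. Master startup then tries to assign every node that still looks OFFLINE.
    try {
      for (RegionNode node : nodes) {
        if (node.state.equals("OFFLINE")) {
          node.setProcedure("assign-from-processOfflineRegions");
        }
      }
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(replayRace());
  }
}
```

In this model the second setProcedure call is the one that fails, which corresponds to the assertion being hit under processOfflineRegions in the trace.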

On master branch, the meta table does not rely on any other system tables, so I 
think we could just change the above method to let CreateTableProcedure always 
wait for meta to be loaded, as we do not use CreateTableProcedure to create 
meta, and also carefully change the point at which we call processOfflineRegions.
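
As a rough illustration of the difference between the current gate and the suggested one, the sketch below uses hypothetical stand-ins (Env, waitInitializedCurrent, waitInitializedProposed); none of them are HBase APIs. It assumes that returning true means the procedure has to wait, which is how the snippet above reads, and it collapses the readiness condition into a single metaLoaded flag.

```java
// Hypothetical stand-ins for illustration only; none of these are HBase classes.
class WaitGateSketch {

  /** Minimal environment: only tracks whether meta has been loaded. */
  static final class Env {
    final boolean metaLoaded;

    Env(boolean metaLoaded) {
      this.metaLoaded = metaLoaded;
    }
  }

  /** Current shape: system tables skip the wait. Returns true when the caller must wait. */
  static boolean waitInitializedCurrent(Env env, boolean systemTable) {
    if (systemTable) {
      return false;
    }
    return !env.metaLoaded;
  }

  /** Suggested shape for master branch: every table creation waits until meta is loaded. */
  static boolean waitInitializedProposed(Env env, boolean systemTable) {
    return !env.metaLoaded;
  }

  public static void main(String[] args) {
    Env beforeMetaLoad = new Env(false);
    // A system table such as the rs group table, requested before meta is loaded.
    System.out.println("current must wait:  " + waitInitializedCurrent(beforeMetaLoad, true));
    System.out.println("proposed must wait: " + waitInitializedProposed(beforeMetaLoad, true));
  }
}
```

With the proposed gate the system table creation is held back until meta is loaded, so it can no longer register an OFFLINE node ahead of the startup assignment pass. The sketch does not cover the reordering of processOfflineRegions.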

But on branch-2, we still need to create the namespace table, so we need to 
find another solution for branch-2.

[~stack] FYI.

> Master fails to initialize when creating rs group table
> -------------------------------------------------------
>
>                 Key: HBASE-25255
>                 URL: https://issues.apache.org/jira/browse/HBASE-25255
>             Project: HBase
>          Issue Type: Bug
>          Components: master, rsgroup
>            Reporter: Duo Zhang
>            Priority: Critical
>         Attachments: TEST-org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS.xml
>
>
> Saw this when setting up TestRSGroupsKillRS
> {noformat}
> 2020-11-07 16:29:54,565 ERROR [master/e476f4f509a7:0:becomeActiveMaster] helpers.MarkerIgnoringBase(159): Failed to become active master
> java.lang.AssertionError
>       at org.apache.hadoop.hbase.master.assignment.RegionStateNode.setProcedure(RegionStateNode.java:198)
>       at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createAssignProcedure(AssignmentManager.java:647)
>       at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$null$6(AssignmentManager.java:878)
>       at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>       at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>       at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>       at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>       at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>       at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>       at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
>       at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
>       at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1699)
>       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>       at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:546)
>       at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
>       at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:505)
>       at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createAssignProcedures(AssignmentManager.java:879)
>       at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createRoundRobinAssignProcedures(AssignmentManager.java:759)
>       at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createRoundRobinAssignProcedures(AssignmentManager.java:775)
>       at org.apache.hadoop.hbase.master.assignment.AssignmentManager.processOfflineRegions(AssignmentManager.java:1513)
>       at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1012)
>       at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2116)
>       at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:515)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
