Guanghao Zhang created HBASE-25225:
--------------------------------------
Summary: Create table very slowly if there are multi regions
Key: HBASE-25225
URL: https://issues.apache.org/jira/browse/HBASE-25225
Project: HBase
Issue Type: Bug
Affects Versions: 2.2.6
Reporter: Guanghao Zhang
Running the same UT, TestRegionReplicaFailover, on my local PC with
mvn clean test -Dtest=TestRegionReplicaFailover takes 8 mins on branch-2.2 but
only 2 mins on branch-2.3.
I found that the problem is related to procedure scheduling. See the log below:
2020-10-21 13:52:28,097 INFO [PEWorker-1] procedure2.ProcedureExecutor(1427):
Finished pid=296, ppid=45, state=SUCCESS;
org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure in 1.6250sec
2020-10-21 13:52:28,538 INFO [PEWorker-3] procedure2.ProcedureExecutor(1427):
Finished pid=45, ppid=20, state=SUCCESS; TransitRegionStateProcedure
table=testLotsOfRegionRepli2, region=50703895da3cb8c942d3197600d549bc, ASSIGN
in 59.4330sec
The actual OpenRegionProcedure took only 1.6 seconds, but the
TransitRegionStateProcedure took 59.4 seconds. The pid=45 procedure was
initialized at 2020-10-21 13:51:28,666 and added to the TableQueue at
2020-10-21 13:51:28,789, but it did not take the xlock and run until
2020-10-21 13:52:24,761. See the log below:
2020-10-21 13:51:28,789 DEBUG [PEWorker-4]
procedure.MasterProcedureScheduler(352): Add TableQueue(testLotsOfRegionRepli2,
xlock=true (20) sharedLock=0 size=25) to run queue because: pid=45, ppid=20,
state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE;
TransitRegionStateProcedure table=testLotsOfRegionRepli2,
region=50703895da3cb8c942d3197600d549bc, ASSIGN has the excusive lock access
2020-10-21 13:52:24,761 INFO [PEWorker-2]
procedure.MasterProcedureScheduler(737): Took xlock for pid=45, ppid=20,
state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE;
TransitRegionStateProcedure table=testLotsOfRegionRepli2,
region=50703895da3cb8c942d3197600d549bc, ASSIGN
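As a rough check of the timeline, the sketch below computes how long pid=45 sat in the queue before it took the xlock, using only the timestamps quoted in the logs above. It is not part of HBase; the helper names (parse_ts, elapsed_seconds) are hypothetical and exist only for this illustration.

```python
from datetime import datetime

# Timestamp layout used by the quoted log lines, e.g. "2020-10-21 13:51:28,789".
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(text):
    """Parse a log timestamp; the part after the comma is milliseconds."""
    return datetime.strptime(text, LOG_FORMAT)


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


# Values copied from the log lines in this report (pid=45).
queued = "2020-10-21 13:51:28,789"      # added to TableQueue
took_xlock = "2020-10-21 13:52:24,761"  # "Took xlock for pid=45"
finished = "2020-10-21 13:52:28,538"    # "Finished pid=45"

wait_for_xlock = elapsed_seconds(queued, took_xlock)
run_after_xlock = elapsed_seconds(took_xlock, finished)

print(f"waited for xlock: {wait_for_xlock:.3f}s")
print(f"ran after xlock:  {run_after_xlock:.3f}s")
```

By this arithmetic, about 56 seconds passed between queueing and taking the xlock, and under 4 seconds between taking the xlock and finishing.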
But when I ran this UT on another PC, it took only 2 mins, the same as
branch-2.3, which is strange.
Marked this as a blocker for release 2.2.7.