Increasing the number of procedure workers (hbase.master.procedure.threads
from 5 to 16) reduces the time from 8 mins to 6 mins. But branch-2.3 ran with
the same 5 procedure workers.
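For reference, a minimal sketch of how the worker count could be raised through configuration. The property name and the value 16 are the ones mentioned above; placing it in hbase-site.xml is an assumption, since the test may set this value in code instead.

```xml
<!-- Sketch only: raises the master procedure executor worker count. -->
<!-- Assumes the setting is read from hbase-site.xml rather than set by the test itself. -->
<property>
  <name>hbase.master.procedure.threads</name>
  <value>16</value>
</property>
```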
On Thu, Oct 22, 2020 at 2:22 PM, Guanghao Zhang wrote:
And I am sure that it is not a lock problem, because there is no "Waiting
on xlock for" message in the log.
LOG.info("Waiting on xlock for {} held by pid={}", procedure,
regionLocks[i].getExclusiveLockProcIdOwner());
On Thu, Oct 22, 2020 at 2:12 PM, Guanghao Zhang wrote:
I ran the same UT, TestRegionReplicaFailover, on my local PC (mvn clean test
-Dtest=TestRegionReplicaFailover): branch-2.2 takes 8 mins but branch-2.3
only needs 2 mins.
I found that the problem is related to procedure scheduling. See the log below:
2020-10-21 13:52:28,097 INFO [PEWorker-1]