[
https://issues.apache.org/jira/browse/HBASE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644395#comment-16644395
]
Duo Zhang commented on HBASE-21278:
-----------------------------------
I think the problem is that, we will rollback after the
MERGE_TABLE_REGIONS_CHECK_CLOSED_REGIONS state, but we will not wait for the
completion of the sub procedures, which are two TRSPs.
In the code of ProcedureExecutor, we will wait before the sub procedures are
finished, before executing rollback, and in the upload log we can see that the
TRSP with pid=43 was finished, but finally we still wanted to rollback it,
which is a bit strange. Need to dig more.
> TestMergeTableRegionsProcedure is flaky
> ---------------------------------------
>
> Key: HBASE-21278
> URL: https://issues.apache.org/jira/browse/HBASE-21278
> Project: HBase
> Issue Type: Bug
> Reporter: Duo Zhang
> Priority: Major
> Attachments:
> org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt
>
>
> https://builds.apache.org/job/HBase-Flaky-Tests/job/master/1235/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.assignment.TestMergeTableRegionsProcedure-output.txt/*view*/
> I think the problem is
> {noformat}
> 2018-10-08 03:44:30,315 INFO [PEWorker-1]
> procedure.MasterProcedureScheduler(689): pid=43, ppid=42, state=SUCCESS,
> hasLock=false; TransitRegionStateProcedure
> table=testRollbackAndDoubleExecution,
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN checking lock on
> 9bac7c539ac0cff6dc5706ed375a3bfb
> 2018-10-08 03:44:30,320 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159):
> CODE-BUG: Uncaught runtime exception for pid=43, ppid=42, state=SUCCESS,
> hasLock=true; TransitRegionStateProcedure
> table=testRollbackAndDoubleExecution,
> region=9bac7c539ac0cff6dc5706ed375a3bfb, UNASSIGN
> java.lang.UnsupportedOperationException
> at
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:458)
> at
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.rollbackState(TransitRegionStateProcedure.java:97)
> at
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:208)
> at
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:957)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1605)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1567)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1446)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
> {noformat}
> Typically there is no rollback for TRSP. Need to dig more.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)