stack created HBASE-20015:
-----------------------------

             Summary: TestMergeTableRegionsProcedure and 
TestRegionMergeTransactionOnCluster flakey
                 Key: HBASE-20015
                 URL: https://issues.apache.org/jira/browse/HBASE-20015
             Project: HBase
          Issue Type: Sub-task
          Components: flakey
            Reporter: stack
            Assignee: stack
             Fix For: 2.0.0-beta-2


MergeTableRegionsProcedure seems incomplete. The ProcedureExecutor framework 
can run in a test mode in which it kills the Procedure before it can persist 
state, and it does this repeatedly to shake out places where a Procedure may 
not be preserving all needed state at each step. The kill causes the Procedure 
to 'fail', and the framework then runs its rollback. MergeTableRegionsProcedure 
is not able to roll back the last few steps of a merge; it throws an 
UnsupportedOperationException (the hope was that the missing steps would be 
filled in, but they are hard to complete because they are themselves stepped).

So....

Well, it turns out that Split has a mechanism where it will not fail the 
Procedure if it gets to a stage from which it cannot roll back. Instead, it 
will just retry and keep retrying until it succeeds, eventually. Merge has this 
facility only half-implemented. The Merge tests are therefore flakey. They do 
stuff like this:


{code}
2018-02-17 04:04:02,999 WARN  [PEWorker-1] 
assignment.MergeTableRegionsProcedure(311): Failed rollback attempt step 
MERGE_TABLE_REGIONS_UPDATE_META for merging the regions 
[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c] in table 
testRollbackAndDoubleExecution
java.lang.UnsupportedOperationException: pid=44, 
state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
 abort requested; MergeTableRegionsProcedure 
table=testRollbackAndDoubleExecution, 
regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
        at 
org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
        at 
org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
        at 
org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
        at 
org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
2018-02-17 04:04:03,007 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159): 
CODE-BUG: Uncaught runtime exception for pid=44, 
state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
 abort requested; MergeTableRegionsProcedure 
table=testRollbackAndDoubleExecution, 
regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
forcibly=false
java.lang.UnsupportedOperationException: pid=44, 
state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, 
exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
 abort requested; MergeTableRegionsProcedure 
table=testRollbackAndDoubleExecution, 
regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c], 
forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
        at 
org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
        at 
org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
        at 
org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
        at 
org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
{code}

i.e. they throw up their hands, which makes for a CODE-BUG, a condition the 
framework cannot process. The test fails.
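
To make the Split-style behaviour concrete, here is a minimal, self-contained 
sketch of a stepped procedure that rolls back on failure in its early states 
but retries once it is past a point of no return. It is a toy, not the HBase 
code: the class, the Step interface, the execute method, the simplified state 
names and the choice of UPDATE_META as the point of no return are all 
assumptions made for illustration.

```java
// Toy model only; all names here are hypothetical, not HBase's API.
class MergeRollbackSketch {
    // Simplified stand-ins for the merge states; the real enum is longer.
    enum State { PREPARE, CLOSE_REGIONS, CREATE_MERGED_REGION, UPDATE_META, OPEN_MERGED_REGION }

    // One unit of work per state; may fail (e.g. an injected test kill).
    interface Step { void run(State s) throws Exception; }

    // Assumption: from UPDATE_META onward the merge cannot be undone.
    static boolean isRollbackSupported(State s) {
        return s.compareTo(State.UPDATE_META) < 0;
    }

    // Runs every state in order. A failure in a state that supports rollback
    // ends the run as rolled back; a failure past the point of no return is
    // retried (up to maxAttempts) instead of failing the procedure.
    static String execute(Step step, int maxAttempts) {
        for (State s : State.values()) {
            int attempts = 0;
            while (true) {
                try {
                    step.run(s);
                    break; // this state is done, move to the next one
                } catch (Exception e) {
                    attempts++;
                    if (isRollbackSupported(s)) {
                        return "ROLLED_BACK at " + s;
                    }
                    if (attempts >= maxAttempts) {
                        return "STUCK at " + s;
                    }
                    // Otherwise fall through and retry the same state.
                }
            }
        }
        return "SUCCESS";
    }

    public static void main(String[] args) {
        // Fail twice at UPDATE_META, then succeed: retried, not rolled back.
        int[] failuresLeft = {2};
        System.out.println(execute(s -> {
            if (s == State.UPDATE_META && failuresLeft[0]-- > 0) {
                throw new Exception("injected kill");
            }
        }, 5));
        // Fail at CLOSE_REGIONS: rollback is still possible there.
        System.out.println(execute(s -> {
            if (s == State.CLOSE_REGIONS) {
                throw new Exception("injected kill");
            }
        }, 5));
    }
}
```

The point of the sketch is that the failure path checks whether the current 
state can be rolled back before it marks the procedure as failed, so rollback 
is never asked to handle a state it does not support.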



