[ https://issues.apache.org/jira/browse/HBASE-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368718#comment-16368718 ]
stack commented on HBASE-20015:
-------------------------------
Hmm. These have dropped off the top list in the flakies dashboard but are
still in the bottom set, as though the patch were not in place. Looking.
> TestMergeTableRegionsProcedure and TestRegionMergeTransactionOnCluster flakey
> -----------------------------------------------------------------------------
>
> Key: HBASE-20015
> URL: https://issues.apache.org/jira/browse/HBASE-20015
> Project: HBase
> Issue Type: Sub-task
> Components: flakey
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-20015.branch-2.001.patch
>
>
> MergeRegionProcedure seems incomplete. The ProcedureExecutor framework can
> run in a test mode such that it kills the Procedure before it can persist
> state and it does this repeatedly to shake out areas where Procedures may not
> be preserving all needed state at each Procedural step. The kill will cause
> the Procedure to 'fail'. It'll then run the rollback procedure. The
> MergeRegionProcedure is not able to roll back the last few steps of Merge.
> It throws an UnsupportedOperationException (the hope was that the missing
> steps would be filled in, but they are hard to complete in that they
> themselves are stepped).
> So....
> Well, it turns out that Split has a mechanism where it will not fail the
> Procedure if it gets to a stage from which it cannot roll back. Instead, it
> will just retry and keep retrying till it succeeds, eventually. Merge has
> this facility half-implemented. Merge tests are therefore flakey. They do
> stuff like this:
> {code}
> 2018-02-17 04:04:02,999 WARN [PEWorker-1]
> assignment.MergeTableRegionsProcedure(311): Failed rollback attempt step
> MERGE_TABLE_REGIONS_UPDATE_META for merging the regions
> [485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c] in table
> testRollbackAndDoubleExecution
> java.lang.UnsupportedOperationException: pid=44,
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META,
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
> abort requested; MergeTableRegionsProcedure
> table=testRollbackAndDoubleExecution,
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c],
> forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
>     at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
>     at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> 2018-02-17 04:04:03,007 ERROR [PEWorker-1] helpers.MarkerIgnoringBase(159):
> CODE-BUG: Uncaught runtime exception for pid=44,
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META,
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
> abort requested; MergeTableRegionsProcedure
> table=testRollbackAndDoubleExecution,
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c],
> forcibly=false
> java.lang.UnsupportedOperationException: pid=44,
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META,
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via
> MergeTableRegionsProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
> abort requested; MergeTableRegionsProcedure
> table=testRollbackAndDoubleExecution,
> regions=[485dd0c2a5d14601d61fed791f793158, 8af34a614f064c162ab1d05eac7fca4c],
> forcibly=false unhandled state=MERGE_TABLE_REGIONS_UPDATE_META
>     at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:291)
>     at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.rollbackState(MergeTableRegionsProcedure.java:78)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:199)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:859)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1356)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1312)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1181)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> {code}
> i.e., they throw up their hands, which makes for a CODE-BUG, a condition
> the framework cannot process. The test fails.
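
For illustration only, the retry-instead-of-rollback behaviour described above can be sketched as a toy state machine. This is not the HBase code: the class `MergeRetrySketch`, the `Step` names, the `LAST_ROLLBACKABLE` cut-off and the `run` helper are all hypothetical, and the simulated "kill" is just a counter. The sketch assumes a kill before the cut-off triggers a rollback, while a kill after it leaves the step to be retried until it goes through.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeRetrySketch {
    // Hypothetical, simplified steps; not the real MergeTableRegionsProcedure states.
    enum Step { PREPARE, CLOSE_REGIONS, UPDATE_META, OPEN_MERGED_REGION }

    // Assumption for this sketch: steps after this one cannot be undone.
    static final Step LAST_ROLLBACKABLE = Step.CLOSE_REGIONS;

    static boolean isRollbackSupported(Step step) {
        return step.compareTo(LAST_ROLLBACKABLE) <= 0;
    }

    /**
     * Runs the steps in order. killsPerStep[i] is how many times a simulated
     * kill hits step i before it is allowed to complete.
     * Returns a log of what happened, ending in SUCCESS or FAILED.
     */
    static List<String> run(int[] killsPerStep) {
        List<String> log = new ArrayList<>();
        int[] remaining = killsPerStep.clone();
        Step[] steps = Step.values();
        int i = 0;
        while (i < steps.length) {
            Step step = steps[i];
            if (remaining[i] > 0) {
                remaining[i]--;
                if (isRollbackSupported(step)) {
                    // Still before the point of no return: undo in reverse order and fail.
                    for (int j = i; j >= 0; j--) {
                        log.add("rollback " + steps[j]);
                    }
                    log.add("FAILED");
                    return log;
                }
                // Past the point of no return: do not fail, retry the same step.
                log.add("retry " + step);
                continue;
            }
            log.add("execute " + step);
            i++;
        }
        log.add("SUCCESS");
        return log;
    }

    public static void main(String[] args) {
        // Two kills at UPDATE_META: retried, never rolled back.
        System.out.println(run(new int[] {0, 0, 2, 0}));
        // One kill at CLOSE_REGIONS: rolled back and failed.
        System.out.println(run(new int[] {0, 1, 0, 0}));
    }
}
```

The point of the cut-off check is that a rollback is never attempted for a step that cannot handle one, which is the situation that produces the UnsupportedOperationException in the log above.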