[
https://issues.apache.org/jira/browse/HBASE-20024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379077#comment-16379077
]
Umesh Agashe commented on HBASE-20024:
--------------------------------------
Here are my findings:
* We assume that step/state transitions are sequential, so we run a for loop
from 0 to lastStep. MergeTableRegionsProcedure and SplitTableRegionProcedure
don't use step 2 (they skip it).
* killAndToggleBeforeStoreUpdate is currently framework-wide. This causes all
procedures running in the framework (including sub-procedures) to execute one
step at a time, followed by a restart. I have created a Jira, HBASE-20099, to
add a per-procedure toggle.
* #1 and #2 above make asserting lastStep in a sequential for loop
non-deterministic.
I have created a patch that gets the procedure's stateId instead and compares
it with the last step. This change will affect all tests calling
MasterProcedureTestingUtility#testRecoveryAndDoubleExecution. Let's give it a go
and see if this works.
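
To illustrate the idea (this is only a toy sketch, not the patch and not HBase
code): the class and method names below are hypothetical. It models a procedure
whose state ids skip 2 and loops until the current state id equals the last
state, instead of counting iterations from 0 to lastStep.

```java
import java.util.List;

public class StateIdRecoverySketch {

    /** Hypothetical toy procedure: walks a fixed list of state ids, one per execution. */
    static final class ToyProcedure {
        private final List<Integer> stateIds;
        private int index = 0;

        ToyProcedure(List<Integer> stateIds) {
            this.stateIds = stateIds;
        }

        int currentStateId() {
            return stateIds.get(index);
        }

        /** Advances one state; stands in for "execute one step, then restart". */
        void executeOneStep() {
            if (index < stateIds.size() - 1) {
                index++;
            }
        }
    }

    /**
     * Drives the procedure until its current state id equals lastStateId and
     * returns how many step-plus-restart cycles that took. The loop condition
     * is the state id itself, so gaps in the id sequence do not matter.
     */
    static int runUntilState(ToyProcedure proc, int lastStateId, int maxRestarts) {
        int restarts = 0;
        while (proc.currentStateId() != lastStateId) {
            if (restarts >= maxRestarts) {
                throw new IllegalStateException("never reached state " + lastStateId);
            }
            proc.executeOneStep();
            restarts++;
        }
        return restarts;
    }

    public static void main(String[] args) {
        // State ids skip 2, as described for the merge and split procedures.
        ToyProcedure proc = new ToyProcedure(List.of(1, 3, 4, 5));
        int restarts = runUntilState(proc, 5, 10);
        // A loop counting from 0 to lastStep (5) would expect more cycles than
        // the 3 this procedure actually needs.
        System.out.println("restarts=" + restarts + " finalStateId=" + proc.currentStateId());
    }
}
```

In this toy model the number of restarts depends on which state ids the
procedure actually uses, so only the final state id is a stable thing to
assert on.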
> TestMergeTableRegionsProcedure is STILL flakey
> ----------------------------------------------
>
> Key: HBASE-20024
> URL: https://issues.apache.org/jira/browse/HBASE-20024
> Project: HBase
> Issue Type: Sub-task
> Reporter: stack
> Assignee: Umesh Agashe
> Priority: Major
> Attachments: HBASE-20024.branch-2.001.patch
>
>
> This is a follow-on from HBASE-20015. Root issue is that merge does not
> support rollback once it has hit the point-of-no-return; it can only
> roll-forward at this point.
> HBASE-18018 added abort to all procedures. HBASE-18016 added ignoring abort
> to the truncate procedure to get around flakiness. HBASE-20022 is a new
> issue to figure out what to do with unabortable procedures.
> In the meantime, merge and split have a PONR, and the procedure executor test
> harness aborts regardless, making these tests flaky. Adding an ignore of the
> abort once into the PONR makes sense (Always? HBASE-20022 is for those as-yet
> unknown cases where it does not). Let me add the ignore to merge and split in
> this issue to address the flakiness.