[ 
https://issues.apache.org/jira/browse/HBASE-20024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379077#comment-16379077
 ] 

Umesh Agashe commented on HBASE-20024:
--------------------------------------

Here are my findings:
 * We assume that step/state transitions are sequential, so we run a for loop 
from 0 to lastStep. MergeTableRegionsProcedure and SplitTableRegionProcedure 
don't use (skip) step 2.
 * killAndToggleBeforeStoreUpdate is currently framework wide. This causes all 
procedures running in the framework (including sub-procedures) to execute one 
step at a time followed by a restart. I have created a Jira, HBASE-20099, to 
add a per-procedure toggle.
 * #1 and #2 above make asserting lastStep in a sequential for loop 
non-deterministic.

 

I have created a patch that gets the procedure's stateId instead and compares 
it with the last step. This change will affect all tests calling 
MasterProcedureTestingUtility#testRecoveryAndDoubleExecution. Let's give it a 
go and see if this works.
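
To illustrate finding #1, here is a minimal, self-contained sketch. It is not 
HBase code and not the attached patch: ToyProcedure, stopByCounter and 
stopByStateId are hypothetical names, and the state ids (0, 1, 3, 4, 5, with 2 
skipped) are made up for the example. It only shows why counting loop 
iterations can overshoot the intended last step when a state id is skipped, 
while comparing against the current state id does not. The sub-procedure 
interleaving from finding #2 is not modeled.

```java
/** Toy stand-in for a state-machine procedure (hypothetical, not an HBase class). */
class ToyProcedure {
    private final int[] stateIds; // state ids in execution order
    private int index = 0;        // position of the next state to execute

    ToyProcedure(int... stateIds) {
        this.stateIds = stateIds;
    }

    /** Id of the state that would execute next, or -1 once finished. */
    int currentStateId() {
        return index < stateIds.length ? stateIds[index] : -1;
    }

    boolean isFinished() {
        return index >= stateIds.length;
    }

    /** Executes exactly one state, as if the executor restarted after it. */
    void executeOneStep() {
        if (!isFinished()) {
            index++;
        }
    }
}

class StateIdLoopSketch {
    /** Counter-based loop: assumes iteration i corresponds to state id i. */
    static int stopByCounter(ToyProcedure proc, int lastStep) {
        for (int i = 0; i < lastStep; i++) {
            proc.executeOneStep();
        }
        return proc.currentStateId();
    }

    /** State-id-based loop: stops once the current state id reaches lastStep. */
    static int stopByStateId(ToyProcedure proc, int lastStep) {
        while (!proc.isFinished() && proc.currentStateId() < lastStep) {
            proc.executeOneStep();
        }
        return proc.currentStateId();
    }

    public static void main(String[] args) {
        int lastStep = 4;
        // State id 2 is skipped, mirroring the merge/split observation above.
        int byCounter = stopByCounter(new ToyProcedure(0, 1, 3, 4, 5), lastStep);
        int byStateId = stopByStateId(new ToyProcedure(0, 1, 3, 4, 5), lastStep);
        System.out.println("counter loop stops at state id " + byCounter);   // 5
        System.out.println("state-id loop stops at state id " + byStateId);  // 4
    }
}
```

With lastStep = 4, the counter loop executes four states (0, 1, 3, 4) and 
leaves the toy procedure at state id 5, one state past the intended stop. The 
state-id loop stops at state id 4 regardless of which ids were skipped.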

> TestMergeTableRegionsProcedure is STILL flakey
> ----------------------------------------------
>
>                 Key: HBASE-20024
>                 URL: https://issues.apache.org/jira/browse/HBASE-20024
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: stack
>            Assignee: Umesh Agashe
>            Priority: Major
>         Attachments: HBASE-20024.branch-2.001.patch
>
>
> This is a follow-on from HBASE-20015. Root issue is that merge does not 
> support rollback once it has hit the point-of-no-return; it can only 
> roll-forward at this point.
> HBASE-18018 added abort to all procedures. HBASE-18016 added ignoring abort 
> to the truncate procedure to get around flakeyness. HBASE-20022 is a new 
> issue to figure what to do w/ unabortable procedures.
> Meantime, merge and split have a PONR, and the procedure executor test 
> harness aborts regardless, making these tests flakey. Adding an ignore of 
> the abort once past the PONR makes sense (Always? HBASE-20022 is for those 
> as yet unknown cases where it does not). Let me add the ignore to merge and 
> split in this issue to address flakeyness.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
