[ 
https://issues.apache.org/jira/browse/HBASE-24015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136392#comment-17136392
 ] 

Viraj Jasani commented on HBASE-24015:
--------------------------------------

[~stack] the patch provided by Sandeep is committed to master and branch-2. I 
don't think these tests are flaky; I also ran them in a loop yesterday. Should 
we wait for the nightly build to stay green for the next couple of days before 
we commit this to branch-2.3?

Thanks [~sandeep.pal] for this nice coverage.

> Coverage for Assign and Unassign of Regions on RegionServer on failure
> ----------------------------------------------------------------------
>
>                 Key: HBASE-24015
>                 URL: https://issues.apache.org/jira/browse/HBASE-24015
>             Project: HBase
>          Issue Type: Test
>          Components: amv2
>            Reporter: Michael Stack
>            Assignee: Sandeep Pal
>            Priority: Major
>
> Looking at 'HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in 
> teardown', and at UnassignRegionHandler, AssignRegionHandler, 
> CloseRegionHandler, and the work that is done inline w/ the request vs that 
> which is done to the side in executors, we need more coverage and 
> specification of what happens around the edges. This coverage would be more 
> to find holes in our current handling in a unit test context before we see 
> them out on clusters.
> HBASE-23984 addresses holes where UnassignRegionHandler and 
> AssignRegionHandler could skip out w/o clearing Regions from the 
> RegionServer#regionsInTransitionInRS Map of Regions In Transition if an open 
> or close failed because the RegionServer is aborting.
> Other holes seem to be lurking. On exception, we were leaving entries in the 
> RegionServer#submittedRegionProcedure map added by HBASE-2204; not the end 
> of the world but they should be cleared on error? HBASE-23984 adds clearing 
> from submittedRegionProcedure but then procedures, even if failed, get added 
> to the cache of procedures... so if we try to run the procedure again against 
> this server it won't be scheduled.
> Interesting stuff.
> This issue is about adding tests that fail assign/unassign/close on the 
> RegionServer side, making sure RS state is left in good condition on failure.
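
To illustrate the kind of check the description asks for, here is a minimal, 
self-contained sketch. It is not HBase code: the class, the map, and the 
RegionOpener interface are hypothetical stand-ins for a handler and the 
regions-in-transition bookkeeping, and the try/finally cleanup is only one 
possible way to guarantee that an entry is removed when the open fails.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch (not HBase code): a handler records a region as
 * "in transition" and must clear that entry even when the open fails.
 */
class RegionStateCleanupSketch {

  /** Stand-in for the RegionServer's map of regions in transition. */
  static final Map<String, Boolean> regionsInTransition = new ConcurrentHashMap<>();

  /** Stand-in for the work of opening a region; it may throw. */
  interface RegionOpener {
    void open(String encodedRegionName) throws Exception;
  }

  /** Returns true if the open succeeded; always clears the in-transition entry. */
  static boolean assign(String encodedRegionName, RegionOpener opener) {
    regionsInTransition.put(encodedRegionName, Boolean.TRUE);
    try {
      opener.open(encodedRegionName);
      return true;
    } catch (Exception e) {
      // Failure path, e.g. the server is aborting while the region opens.
      return false;
    } finally {
      // Runs on both success and failure, so no stale entry is left behind.
      regionsInTransition.remove(encodedRegionName);
    }
  }

  public static void main(String[] args) {
    // Inject a failure, then check that no state was leaked.
    boolean ok = assign("abc123", name -> {
      throw new IllegalStateException("server aborting");
    });
    System.out.println("assigned=" + ok + ", leftover=" + regionsInTransition.size());
  }
}
```

A test in this spirit injects a failure into the open step and then asserts 
that the bookkeeping map is empty afterwards, which is the "RS state is left 
in a good condition" property described above.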



--
This message was sent by Atlassian Jira
(v8.3.4#803005)