[
https://issues.apache.org/jira/browse/HBASE-24015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136392#comment-17136392
]
Viraj Jasani commented on HBASE-24015:
--------------------------------------
[~stack] the patch provided by Sandeep is committed to master and branch-2. I
don't think these tests are flaky; I also tried running them in a loop
yesterday. Should we wait for the nightly build to stay green for the next
couple of days before we commit this to branch-2.3?
Thanks [~sandeep.pal] for this nice coverage.
> Coverage for Assign and Unassign of Regions on RegionServer on failure
> ----------------------------------------------------------------------
>
> Key: HBASE-24015
> URL: https://issues.apache.org/jira/browse/HBASE-24015
> Project: HBase
> Issue Type: Test
> Components: amv2
> Reporter: Michael Stack
> Assignee: Sandeep Pal
> Priority: Major
>
> Looking at 'HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in
> teardown', and at UnassignRegionHandler, AssignRegionHandler,
> CloseRegionHandler, and the work that is done inline w/ the request vs that
> which is done to the side in executors, we need more coverage and
> specification of what happens around the edges. This coverage would be more
> to find holes in our current handling in a unit test context before we see
> them out on clusters.
> HBASE-23984 addresses holes where UnassignRegionHandler and
> AssignRegionHandler could skip out w/o clearing Regions from the
> RegionServer#regionsInTransitionInRS Map of Regions In Transition when an
> open or close failed because the RegionServer is aborting.
> Other holes seem to be lurking. On exception, we were leaving entries in the
> RegionServer#submittedRegionProcedure map added by HBASE-2204; not the end
> of the world, but they should be cleared on error? HBASE-23984 adds clearing
> from submittedRegionProcedure, but then procedures, even if failed, get added
> to the cache of procedures... so if we try to run the procedure again
> against this server it won't be scheduled.
> Interesting stuff.
> This issue is about adding tests that fail assign/unassign/close on the
> RegionServer side, making sure RS state is left in good condition on failure.
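
The cleanup concern in the description above can be illustrated with a
minimal Java sketch. This is not HBase code: RegionStateSketch, its fields,
and assign() are hypothetical stand-ins for the RegionServer bookkeeping
(the in-transition map, the submitted-procedure map, and the cache of
procedures). It only shows one way to guarantee that a failed open leaves no
stale entries and that a failed procedure id can still be scheduled again.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the bookkeeping discussed above; all names are hypothetical. */
class RegionStateSketch {
    // Stand-in for regionsInTransitionInRS: region name -> TRUE while opening.
    final Map<String, Boolean> regionsInTransition = new ConcurrentHashMap<>();
    // Stand-in for submittedRegionProcedure: procedure id -> region name.
    final Map<Long, String> submittedProcedures = new ConcurrentHashMap<>();
    // Stand-in for the procedure cache: ids that completed successfully only.
    final Set<Long> finishedProcedures = ConcurrentHashMap.newKeySet();

    /** Runs the open; returns false if the request was a duplicate and skipped. */
    boolean assign(long procId, String region, Runnable openRegion) {
        if (finishedProcedures.contains(procId)
                || submittedProcedures.putIfAbsent(procId, region) != null) {
            return false; // already finished or already in flight
        }
        regionsInTransition.put(region, Boolean.TRUE);
        boolean success = false;
        try {
            openRegion.run();
            success = true;
        } finally {
            // Always clear in-flight state, whether the open succeeded or threw.
            regionsInTransition.remove(region);
            submittedProcedures.remove(procId);
            // Cache only successes so a failed procedure can be retried here.
            if (success) {
                finishedProcedures.add(procId);
            }
        }
        return true;
    }

    /** True when no in-flight bookkeeping is left behind. */
    boolean isClean() {
        return regionsInTransition.isEmpty() && submittedProcedures.isEmpty();
    }

    public static void main(String[] args) {
        RegionStateSketch rs = new RegionStateSketch();
        try {
            rs.assign(1L, "region-a", () -> {
                throw new IllegalStateException("server aborting");
            });
        } catch (IllegalStateException expected) {
            System.out.println("clean after failure: " + rs.isClean());
        }
        System.out.println("retry scheduled: " + rs.assign(1L, "region-a", () -> { }));
    }
}
```

A test in the spirit of this issue would inject a failing open, as main()
does, and then assert that both maps are empty and that the same procedure
id is accepted on retry.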