Michael Stack created HBASE-24015:
-------------------------------------

             Summary: Coverage for Assign and Unassign of Regions on 
RegionServer on failure
                 Key: HBASE-24015
                 URL: https://issues.apache.org/jira/browse/HBASE-24015
             Project: HBase
          Issue Type: Bug
          Components: amv2
            Reporter: Michael Stack


Looking at 'HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in 
teardown', and at UnassignRegionHandler, AssignRegionHandler, 
CloseRegionHandler, and the work that is done inline w/ request vs that which 
to the side in executors, we need more coverage and specification of what 
happens around the edges. This coverage would be more to see if holes in our 
handling currently in a unit test case context before we see it out on clusters.

HBASE-23984  addresses holes where UnassignRegionHandler and 
AssignRegionHandler could skip out w/o clearing Regions from the 
RegionServer#regionsInTransitionInRS Map of Regions In Transition if failed 
open or close because the RegionServer is aborting.

Other holes seem lurking. On exception, we were leaving entries in the 
RegionServer# submittedRegionProcedure map added by HBASE-2204; not the end of 
the world but they should be cleared on error? HBASE-23984 adds clearning from 
submittedRegionProcedure but then procedures even if failed get added to the 
cache of procedures... so if we try to run the procedure again against this 
server it won't be scheduled.

interesting stuff.

This issue is about adding tests that fail assign/unassign/close on the 
RegionServer side making sure RS state is left in a good condition on fail.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to