Apache9 commented on a change in pull request #1311: HBASE-23984 [Flakey Tests]
TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395407484
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/UnassignRegionHandler.java
##########
@@ -94,40 +93,42 @@ public void process() throws IOException {
}
return;
}
- HRegion region = rs.getRegion(encodedName);
- if (region == null) {
- LOG.debug(
- "Received CLOSE for a region {} which is not online, and we're not
opening/closing.",
- encodedName);
- rs.getRegionsInTransitionInRS().remove(encodedNameBytes, Boolean.FALSE);
- return;
- }
- String regionName = region.getRegionInfo().getEncodedName();
- LOG.info("Close {}", regionName);
- if (region.getCoprocessorHost() != null) {
- // XXX: The behavior is a bit broken. At master side there is no
FAILED_CLOSE state, so if
- // there are exception thrown from the CP, we can not report the error
to master, and if here
- // we just return without calling reportRegionStateTransition, the TRSP
at master side will
- // hang there for ever. So here if the CP throws an exception out, the
only way is to abort
- // the RS...
- region.getCoprocessorHost().preClose(abort);
- }
- if (region.close(abort) == null) {
- // XXX: Is this still possible? The old comment says about split, but
now split is done at
- // master side, so...
- LOG.warn("Can't close region {}, was already closed during close()",
regionName);
+ try {
+ HRegion region = rs.getRegion(encodedName);
+ if (region == null) {
+ LOG.debug(
+ "Received CLOSE for a region {} which is not online, and we're not
opening/closing.",
+ encodedName);
+ return;
+ }
+ String regionName = region.getRegionInfo().getEncodedName();
+ LOG.info("Close {}", regionName);
+ if (region.getCoprocessorHost() != null) {
+ // XXX: The behavior is a bit broken. At master side there is no
FAILED_CLOSE state, so if
+ // there are exception thrown from the CP, we can not report the error
to master, and if
+ // here we just return without calling reportRegionStateTransition,
the TRSP at master side
+ // will hang there for ever. So here if the CP throws an exception
out, the only way is to
+ // abort the RS...
+ region.getCoprocessorHost().preClose(abort);
+ }
+ if (region.close(abort) == null) {
+ // XXX: Is this still possible? The old comment says about split, but
now split is done at
+ // master side, so...
+ LOG.warn("Can't close region {}, was already closed during close()",
regionName);
+ return;
+ }
+ rs.removeRegion(region, destination);
+ if (!rs.reportRegionStateTransition(
+ new RegionStateTransitionContext(TransitionCode.CLOSED,
HConstants.NO_SEQNUM, closeProcId,
+ -1, region.getRegionInfo()))) {
+ throw new IOException("Failed to report close to master: " +
regionName);
+ }
+ // Cache the close region procedure id after report region transition
succeed.
+ rs.finishRegionProcedure(closeProcId);
+ LOG.info("Closed {}", regionName);
+ } finally {
Review comment:
So this is the actual fix here?
If you really want to do this to let the test pass, I suggest you add the
removal in the handleException method, and add a FIXME or TODO comment to say
that this is just for making test pass, should be addressed later.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services