Inconsistency between the "regions" map and the "servers" map in
AssignmentManager
----------------------------------------------------------------------------------
Key: HBASE-5829
URL: https://issues.apache.org/jira/browse/HBASE-5829
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.1, 0.90.6
Reporter: Maryann Xue
There are places in AssignmentManager (AM) where this.servers is not kept
consistent with this.regions. As a result, the balancer might try to offline a
region from a region server (RS) that already returned
NotServingRegionException on a previous offline attempt.
In AssignmentManager.unassign(HRegionInfo, boolean):

  try {
    // TODO: We should consider making this look more like it does for the
    // region open where we catch all throwables and never abort
    if (serverManager.sendRegionClose(server, state.getRegion(),
        versionOfClosingNode)) {
      LOG.debug("Sent CLOSE to " + server + " for region " +
          region.getRegionNameAsString());
      return;
    }
    // This never happens. Currently regionserver close always return true.
    LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
        region.getRegionNameAsString());
  } catch (NotServingRegionException nsre) {
    LOG.info("Server " + server + " returned " + nsre + " for " +
        region.getRegionNameAsString());
    // Presume that master has stale data. Presume remote side just split.
    // Presume that the split message when it comes in will fix up the
    // master's in memory cluster state.
  } catch (Throwable t) {
    if (t instanceof RemoteException) {
      t = ((RemoteException)t).unwrapRemoteException();
      if (t instanceof NotServingRegionException) {
        if (checkIfRegionBelongsToDisabling(region)) {
          // Remove from the regionsinTransition map
          LOG.info("While trying to recover the table "
              + region.getTableNameAsString()
              + " to DISABLED state the region " + region
              + " was offlined but the table was in DISABLING state");
          synchronized (this.regionsInTransition) {
            this.regionsInTransition.remove(region.getEncodedName());
          }
          // Remove from the regionsMap
          synchronized (this.regions) {
            this.regions.remove(region);
          }
          deleteClosingOrClosedNode(region);
        }
      }
      // RS is already processing this region, only need to update the
      // timestamp
      if (t instanceof RegionAlreadyInTransitionException) {
        LOG.debug("update " + state + " the timestamp.");
        state.update(state.getState());
      }
    }

In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):

  synchronized (this.regions) {
    this.regions.put(plan.getRegionInfo(), plan.getDestination());
  }
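
Both excerpts above touch this.regions without a matching update to
this.servers. One way to rule out that kind of drift is to route every update
through a single helper that changes both maps under one lock. The following is
only a minimal sketch of that idea, not the AssignmentManager code or a proposed
patch: the class RegionServerMaps and its methods are hypothetical, and plain
strings stand in for HRegionInfo and the server name type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the two AssignmentManager maps (illustrative names). */
class RegionServerMaps {
    // region -> hosting server (plays the role of this.regions)
    private final Map<String, String> regions = new HashMap<>();
    // server -> regions it hosts (plays the role of this.servers)
    private final Map<String, Set<String>> servers = new HashMap<>();

    /** Records an assignment in both maps while holding one lock. */
    synchronized void put(String region, String server) {
        String previous = regions.put(region, server);
        if (previous != null && !previous.equals(server)) {
            // The region moved: drop it from the old server's set.
            removeFromServer(previous, region);
        }
        servers.computeIfAbsent(server, s -> new TreeSet<>()).add(region);
    }

    /** Removes a region from both maps while holding one lock. */
    synchronized void remove(String region) {
        String previous = regions.remove(region);
        if (previous != null) {
            removeFromServer(previous, region);
        }
    }

    // Caller must hold the lock on this object.
    private void removeFromServer(String server, String region) {
        Set<String> hosted = servers.get(server);
        if (hosted != null) {
            hosted.remove(region);
        }
    }

    /** Returns a copy of the regions recorded for a server (empty if none). */
    synchronized Set<String> regionsOn(String server) {
        Set<String> hosted = servers.get(server);
        return hosted == null ? new TreeSet<>() : new TreeSet<>(hosted);
    }

    /** Returns the server recorded for a region, or null if unassigned. */
    synchronized String serverOf(String region) {
        return regions.get(region);
    }
}
```

With a helper like this, a caller that handles NotServingRegionException would
call remove(region) once, and a balancer-style scan over regionsOn(server) would
no longer see the region listed under that server.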