[ https://issues.apache.org/jira/browse/HBASE-12131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell resolved HBASE-12131.
-----------------------------------------
Assignee: (was: Esteban Gutierrez)
Resolution: Implemented
Implemented by subtask
> [hbck] undeployRegions should handle gracefully network partitions and other
> exceptions to avoid the same region deployed multiple times
> ----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-12131
> URL: https://issues.apache.org/jira/browse/HBASE-12131
> Project: HBase
> Issue Type: Bug
> Components: hbck
> Affects Versions: 0.94.23
> Reporter: Esteban Gutierrez
> Priority: Critical
>
> If we get an IOException (which we currently ignore) while hbck is undeploying
> regions, we should make sure that we don't re-assign that region in the master
> before we know that the RS was marked as dead, and optionally let the user
> confirm that action. Otherwise we will end up in a split-brain situation, with
> clients talking to different RSs serving the same region.
> The offending part is here in HBaseFsck.undeployRegions():
> {code}
> private void undeployRegions(HbckInfo hi) throws IOException,
>     InterruptedException {
>   for (OnlineEntry rse : hi.deployedEntries) {
>     LOG.debug("Undeploy region " + rse.hri + " from " + rse.hsa);
>     try {
>       HBaseFsckRepair.closeRegionSilentlyAndWait(admin, rse.hsa, rse.hri);
>       offline(rse.hri.getRegionName());
>     } catch (IOException ioe) {
>       LOG.warn("Got exception when attempting to offline region "
>           + Bytes.toString(rse.hri.getRegionName()), ioe);
>     }
>   }
> }
> {code}
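
The behavior the description asks for can be sketched roughly as follows. This is a minimal, self-contained illustration and not the fix that was committed in the subtask: Entry, Closer, Liveness and undeploy are hypothetical stand-ins and are not HBase APIs. The idea is that a failed close is not swallowed. Instead, the region is reported as unsafe to re-assign unless the hosting server is confirmed dead.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SafeUndeploy {
    /** Hypothetical stand-in for hbck's OnlineEntry: a region and the server hosting it. */
    static final class Entry {
        final String region;
        final String server;
        Entry(String region, String server) {
            this.region = region;
            this.server = server;
        }
    }

    /** Hypothetical hook that closes a region on a server; may fail on a network partition. */
    interface Closer {
        void close(Entry e) throws IOException;
    }

    /** Hypothetical hook that reports whether a server has been marked dead. */
    interface Liveness {
        boolean isConfirmedDead(String server);
    }

    /**
     * Tries to close every deployment. Returns the entries whose close failed
     * and whose server is not known to be dead. The caller should not
     * re-assign those regions (or should ask the user to confirm first).
     */
    static List<Entry> undeploy(List<Entry> deployed, Closer closer, Liveness liveness) {
        List<Entry> unsafe = new ArrayList<>();
        for (Entry e : deployed) {
            try {
                closer.close(e);
            } catch (IOException ioe) {
                // The close may or may not have taken effect. Only a server that
                // is confirmed dead can no longer be serving the region.
                if (!liveness.isConfirmedDead(e.server)) {
                    unsafe.add(e);
                }
            }
        }
        return unsafe;
    }

    public static void main(String[] args) {
        List<Entry> deployed = Arrays.asList(
            new Entry("r1", "rs-a"), new Entry("r1", "rs-b"), new Entry("r2", "rs-c"));
        // Assumed scenario: only rs-a is reachable, and only rs-c is marked dead.
        Closer closer = e -> {
            if (!e.server.equals("rs-a")) {
                throw new IOException("unreachable: " + e.server);
            }
        };
        Liveness liveness = server -> server.equals("rs-c");
        for (Entry e : undeploy(deployed, closer, liveness)) {
            System.out.println("Do not re-assign " + e.region + " (still possibly on " + e.server + ")");
        }
    }
}
```

In this sketch the caller decides what to do with the returned entries, for example aborting the repair or prompting the user, which keeps the "optionally let the user confirm" step outside the undeploy loop.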