[jira] [Created] (HBASE-12131) [hbck] undeployRegions should handle gracefully network partitions and other exceptions to avoid the same region deployed multiple times

Esteban Gutierrez (JIRA) Tue, 30 Sep 2014 18:12:12 -0700

Esteban Gutierrez created HBASE-12131:
-----------------------------------------


             Summary: [hbck] undeployRegions should handle gracefully network 
partitions and other exceptions to avoid the same region deployed multiple times
                 Key: HBASE-12131
                 URL: https://issues.apache.org/jira/browse/HBASE-12131
             Project: HBase
          Issue Type: Bug
          Components: hbck
    Affects Versions: 0.94.23
            Reporter: Esteban Gutierrez
            Assignee: Esteban Gutierrez
            Priority: Critical


If we get an IOException (which we currently ignore) while hbck is undeploying 
regions, we should make sure the master does not re-assign that region before 
we know that the RS was marked as dead, and optionally let the user confirm 
that action. Otherwise we will end up in a split-brain situation, with clients 
talking to different RSs serving the same region.

The offending part is here in HBaseFsck.undeployRegions():

{code}
  private void undeployRegions(HbckInfo hi) throws IOException,
      InterruptedException {
    for (OnlineEntry rse : hi.deployedEntries) {
      LOG.debug("Undeploy region "  + rse.hri + " from " + rse.hsa);
      try {
        HBaseFsckRepair.closeRegionSilentlyAndWait(admin, rse.hsa, rse.hri);
        offline(rse.hri.getRegionName());
      } catch (IOException ioe) {
        LOG.warn("Got exception when attempting to offline region "
            + Bytes.toString(rse.hri.getRegionName()), ioe);
      }
    }
  }
{code}
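
One possible direction, as a minimal sketch only and not a proposed patch: 
instead of logging the exception and moving on, remember which entries failed 
to close on a server that is not yet known to be dead, so the caller can hold 
back re-assignment (or ask the user to confirm). Every name below 
(UndeploySketch, Deployment, RegionCloser, DeadServerCheck) is hypothetical 
and stands in for the real hbck and master calls; none of them exist in HBase.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types exist in HBase.
final class UndeploySketch {

  /** Stand-in for OnlineEntry: a region and the server it is deployed on. */
  static final class Deployment {
    final String region;
    final String server;

    Deployment(String region, String server) {
      this.region = region;
      this.server = server;
    }
  }

  /** Stand-in for the close-and-offline calls that hbck makes. */
  interface RegionCloser {
    void closeAndOffline(Deployment d) throws IOException;
  }

  /** Stand-in for asking whether a server has been marked as dead. */
  interface DeadServerCheck {
    boolean isDead(String server);
  }

  /**
   * Tries to undeploy every entry. Returns the entries whose close failed
   * on a server that is not known to be dead; re-assigning those regions
   * could leave the same region deployed twice.
   */
  static List<Deployment> undeploy(List<Deployment> deployed,
                                   RegionCloser closer,
                                   DeadServerCheck check) {
    List<Deployment> unsafe = new ArrayList<>();
    for (Deployment d : deployed) {
      try {
        closer.closeAndOffline(d);
      } catch (IOException ioe) {
        // The old server may still be serving the region, so only treat
        // the failure as harmless once that server is confirmed dead.
        if (!check.isDead(d.server)) {
          unsafe.add(d);
        }
      }
    }
    return unsafe;
  }
}
```

A caller would skip re-assignment for every entry in the returned list, or 
prompt the user before forcing it.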





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
