[ 
https://issues.apache.org/jira/browse/HBASE-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray resolved HBASE-3057.
----------------------------------

    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Committed to trunk

> Race condition when closing regions that causes flakiness in 
> TestRestartCluster
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-3057
>                 URL: https://issues.apache.org/jira/browse/HBASE-3057
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3057-v1.patch
>
>
> In {{TestRestartCluster.testClusterRestart()}} we spin up a cluster, create 
> three tables, shut it down, start it back up, and ensure we still have three 
> regions.
> A subtle race condition during the first shutdown prevents the flush of 
> META from finishing, so when we start back up there are no user regions.
> I'm not sure if there are reasons the ordering is as such, but this is the 
> section of code in CloseRegionHandler around line 118:
> {noformat}
>       this.rsServices.removeFromOnlineRegions(regionInfo.getEncodedName());
>       region.close(abort);
> {noformat}
> We remove the region from the online region map before actually closing it.  
> But the main run() loop in the RS decides it can shut down once the online 
> region map is empty, so it can proceed while a close is still in progress.
> {noformat}
>   private void waitOnAllRegionsToClose() {
>     // Wait till all regions are closed before going out.
>     int lastCount = -1;
>     while (!this.onlineRegions.isEmpty()) {
> {noformat}
> Any reason not to swap these two and do the close before removing from online 
> regions?
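
The ordering problem described above can be illustrated with a minimal, single-threaded sketch. It is not the HBase code or the attached patch: `FakeRegion`, `shutdownWindowExists`, and the `"region-1"` key are hypothetical names invented for illustration, and the "shutdown check" is simulated by inspecting the map between the two steps rather than by a real region server thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CloseOrderingSketch {

    /** Hypothetical stand-in for a region; a real close would flush data. */
    static final class FakeRegion {
        private volatile boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    /**
     * Runs the two close steps in the requested order and reports whether a
     * shutdown check made between them would see an empty online map while
     * the region is still open.
     */
    static boolean shutdownWindowExists(boolean removeBeforeClose) {
        // Hypothetical stand-in for the region server's online region map.
        Map<String, FakeRegion> onlineRegions = new ConcurrentHashMap<>();
        FakeRegion region = new FakeRegion();
        onlineRegions.put("region-1", region);

        // First step of the close sequence.
        if (removeBeforeClose) {
            onlineRegions.remove("region-1"); // ordering quoted in the issue
        } else {
            region.close();                   // ordering proposed in the issue
        }

        // What a loop like waitOnAllRegionsToClose() could observe right now.
        boolean window = onlineRegions.isEmpty() && !region.isClosed();

        // Second step of the close sequence.
        if (removeBeforeClose) {
            region.close();
        } else {
            onlineRegions.remove("region-1");
        }
        return window;
    }

    public static void main(String[] args) {
        System.out.println("remove then close: " + shutdownWindowExists(true));
        System.out.println("close then remove: " + shutdownWindowExists(false));
    }
}
```

Under these assumptions, removing first leaves a moment where the map is empty but the region has not been closed, while closing first does not. The sketch only models the ordering; it says nothing about other reasons the original order might have been chosen.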

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
