Race condition when closing regions that causes flakiness in TestRestartCluster
-------------------------------------------------------------------------------
Key: HBASE-3057
URL: https://issues.apache.org/jira/browse/HBASE-3057
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Fix For: 0.90.0
In {{TestRestartCluster.testClusterRestart()}} we spin up a cluster, create three
tables, shut it down, start it back up, and ensure we still have three regions.
A subtle race condition during the first shutdown can keep the flush of META
from finishing, so when we start back up there are no user regions.
I'm not sure if there are reasons the ordering is as such, but this is the
section of code in CloseRegionHandler around line 118:
{noformat}
this.rsServices.removeFromOnlineRegions(regionInfo.getEncodedName());
region.close(abort);
{noformat}
We remove the region from the online region map before actually closing it. But
the main run() loop in the RS decides it can shut down once the online region
map is empty, so it can stop waiting while the close is still in progress.
{noformat}
private void waitOnAllRegionsToClose() {
// Wait till all regions are closed before going out.
int lastCount = -1;
while (!this.onlineRegions.isEmpty()) {
{noformat}
Any reason not to swap these two and do the close before removing from online
regions?
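The two orderings can be compared in a small self-contained Java model. This is only a sketch, not HBase code: {{Region}}, {{unsafeToExit}}, {{removeThenClose}} and {{closeThenRemove}} are hypothetical names, and {{close()}} just sets a flag standing in for the flush done on close. Each method reports whether there is a point at which the online map is empty (the condition the shutdown loop polls) while the region is not yet closed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CloseOrderingSketch {

    /** Hypothetical stand-in for a region; close() models the flush done on close. */
    public static final class Region {
        private volatile boolean closed = false;
        public void close() { closed = true; }
        public boolean isClosed() { return closed; }
    }

    /** True if the shutdown loop would see an empty map while the region is still open. */
    public static boolean unsafeToExit(Map<String, Region> online, Region region) {
        return online.isEmpty() && !region.isClosed();
    }

    /** Current ordering: remove from the online map, then close. Assumes name is present. */
    public static boolean removeThenClose(Map<String, Region> online, String name) {
        Region region = online.remove(name);
        // Observation point between the two steps: the map may already be empty here.
        boolean raceWindow = unsafeToExit(online, region);
        region.close();
        return raceWindow;
    }

    /** Proposed ordering: close, then remove from the online map. Assumes name is present. */
    public static boolean closeThenRemove(Map<String, Region> online, String name) {
        Region region = online.get(name);
        region.close();
        // Observation point between the two steps: the region is closed and still in the map.
        boolean raceWindow = unsafeToExit(online, region);
        online.remove(name);
        return raceWindow;
    }

    public static void main(String[] args) {
        Map<String, Region> first = new ConcurrentHashMap<>();
        first.put("meta", new Region());
        System.out.println(removeThenClose(first, "meta")); // prints "true"

        Map<String, Region> second = new ConcurrentHashMap<>();
        second.put("meta", new Region());
        System.out.println(closeThenRemove(second, "meta")); // prints "false"
    }
}
```

With a single region left in the map, the first ordering exposes a window in which shutdown could proceed before the close finishes, and the second does not. This assumes nothing else relies on a region being absent from the online map while it is closing.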