[ https://issues.apache.org/jira/browse/HBASE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-4306:
-------------------------
Priority: Minor (was: Blocker)
Fix Version/s: (was: 0.90.5)
               (was: 0.92.0)
> Race between CatalogJanitor and LoadBalancer
> --------------------------------------------
>
> Key: HBASE-4306
> URL: https://issues.apache.org/jira/browse/HBASE-4306
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.4
> Reporter: Jean-Daniel Cryans
> Priority: Minor
>
> It is possible for the LoadBalancer to try to assign an offline/split region
> while that region is still waiting to be cleaned up by the CatalogJanitor. It
> goes like this:
> {quote}
> 2011-08-25 00:32:07,137 INFO org.apache.hadoop.hbase.master.ServerManager:
> Received REGION_SPLIT: parent: Daughters; d1, d2 from
> sv4r22s16,60020,1314211225331
> ...
> (cleaning never happens or whatever)
> ...
> 2011-08-29 13:45:14,561 INFO org.apache.hadoop.hbase.master.HMaster: balance
> hri=parent, src=sv4r22s16,60020,1314211225331,
> dest=sv4r19s17,60020,1314218170402
> 2011-08-29 13:45:14,561 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of
> region parent (offlining)
> 2011-08-29 13:45:14,588 INFO
> org.apache.hadoop.hbase.master.AssignmentManager: Server
> serverName=sv4r22s16,60020,1314211225331, load=(requests=0, regions=0,
> usedHeap=0, maxHeap=0) returned
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: Received close for parent
> but we are not serving it for parent
> {quote}
> In this case it took 4 days of balancing before the balancer finally picked
> the parent (which was never deleted because of HBASE-4238), but the same thing
> can happen if the balancer decides to balance the parent just before it is
> cleaned. The end effect is that the balancer stays disabled _forever_ until
> the problem is fixed.
> The culprit is that the master keeps the region "online" until the
> CatalogJanitor (CJ) calls AssignmentManager.regionOffline, which means the
> region is still treated like any other region even though it is offline.
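
One way to picture the problem described above is as a missing filter: the balancer works from the master's list of "online" regions, and a split parent stays on that list until the CatalogJanitor removes it. The sketch below is only an illustration under stated assumptions, not the fix applied for this issue. `RegionSketch` and `BalancerFilterSketch` are hypothetical names that do not exist in HBase, and the `offline` and `split` fields merely stand in for the region state described in the report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a region descriptor; this is NOT the HBase API.
final class RegionSketch {
    final String name;
    final boolean offline; // region has been taken offline
    final boolean split;   // region is a split parent awaiting cleanup

    RegionSketch(String name, boolean offline, boolean split) {
        this.name = name;
        this.offline = offline;
        this.split = split;
    }
}

// Hypothetical helper: drops regions the balancer should never try to move.
final class BalancerFilterSketch {
    static List<RegionSketch> balanceable(List<RegionSketch> onlineAccordingToMaster) {
        List<RegionSketch> result = new ArrayList<>();
        for (RegionSketch region : onlineAccordingToMaster) {
            // A split parent is still listed as "online" by the master until
            // it is cleaned up, so it has to be skipped explicitly here.
            if (region.offline || region.split) {
                continue;
            }
            result.add(region);
        }
        return result;
    }
}
```

With a list holding the offline split parent and its two daughters, only the daughters would be returned as candidates, so the parent would never receive a close request it cannot honor.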