[ 
https://issues.apache.org/jira/browse/HBASE-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658891#action_12658891
 ] 

Jim Kellerman commented on HBASE-543:
-------------------------------------

@Andrew

Well, the good news is that this problem prevented an inconsistent state in 
the master, as ProcessRegionOpen would have updated the meta with the original 
server when, in fact, it was being told to close that region.

The bad news, of course, is that the region rebalancing did not work properly. 
unassignSomeRegions should not choose regions that are unassigned, assigned or 
pending.

@Stack
Yes, the lock on RegionManager is broad; however, it was the only way I could 
see to guard multiple operations that affect both the regionsInTransition map 
and the onlineMetaRegions map, which happen in a couple of places. Separate 
locks for regionsInTransition and onlineMetaRegions would be more deadlock 
prone, I thought. With this approach, every method that performs multiple 
operations on either map either grabs the RegionManager's monitor or waits 
while the current owner of the monitor does its thing and gets out. I don't 
think I hold RegionManager's monitor over any long-running operation, but I 
will reverify that.
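
A minimal sketch of the single-monitor approach, for illustration only. The 
class and method names here are hypothetical and are not taken from the patch; 
only the two map names come from the discussion above. The point is that any 
method touching both maps is synchronized on the same object, so no caller can 
observe a region halfway between the two maps.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for RegionManager; not the code in the patch. */
class CoarseLockSketch {
    // Both maps are guarded by this object's monitor.
    private final Map<String, String> regionsInTransition = new HashMap<String, String>();
    private final Map<String, String> onlineMetaRegions = new HashMap<String, String>();

    /** Moves a meta region from "in transition" to "online" as one atomic step. */
    synchronized void metaRegionOpened(String regionName, String server) {
        regionsInTransition.remove(regionName);
        onlineMetaRegions.put(regionName, server);
    }

    /** Takes a meta region offline and marks it as in transition, atomically. */
    synchronized void metaRegionOffline(String regionName) {
        onlineMetaRegions.remove(regionName);
        regionsInTransition.put(regionName, "UNASSIGNED"); // assumed state label
    }

    synchronized boolean isOnline(String regionName) {
        return onlineMetaRegions.containsKey(regionName);
    }

    synchronized boolean isInTransition(String regionName) {
        return regionsInTransition.containsKey(regionName);
    }
}
```

With one lock per map instead, a method that needs both would have to take 
them in a fixed order everywhere, which is where the deadlock risk comes from.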

> A region's state is kept in several places in the master opening the 
> possibility for race conditions
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-543
>                 URL: https://issues.apache.org/jira/browse/HBASE-543
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.1.0, 0.1.1, 0.2.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: 543.patch, 543.patch, 543.patch, 543.patch-4, 
> 543.patch-5, apurtell-HMaster-20081223-1.log.zip, recent-changes.patch, 
> regionstate.txt
>
>
> A region's state exists in multiple maps in the RegionManager: 
> unassignedRegions, pendingRegions, regionsToClose, closingRegions, 
> regionsToDelete, etc.
> One of these race conditions was found in HBASE-534.
> For HBase-0.1.x, we should just patch the holes we find.
> The ultimate solution (which requires a lot of changes in HMaster) should be 
> applied to HBase trunk.
> Proposed solution:
> Create a class that encapsulates a region's state and provide synchronized 
> access to the class that validates state changes.
> There should be a single structure that holds regions in these transitional 
> states and it should be a synchronized collection of some kind.
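
One way the proposed solution might look, as a rough sketch rather than the 
actual design. The state names loosely follow the maps listed in the 
description; the table of legal transitions and all identifiers are 
assumptions made up for this example.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-region state holder with validated, synchronized changes. */
class RegionStateSketch {
    enum State { UNASSIGNED, PENDING, OPEN, TO_CLOSE, CLOSING, CLOSED }

    // Assumed legal transitions; the real set would come from the master's logic.
    private static final Map<State, Set<State>> LEGAL =
        new EnumMap<State, Set<State>>(State.class);
    static {
        LEGAL.put(State.UNASSIGNED, EnumSet.of(State.PENDING));
        LEGAL.put(State.PENDING, EnumSet.of(State.OPEN, State.UNASSIGNED));
        LEGAL.put(State.OPEN, EnumSet.of(State.TO_CLOSE));
        LEGAL.put(State.TO_CLOSE, EnumSet.of(State.CLOSING));
        LEGAL.put(State.CLOSING, EnumSet.of(State.CLOSED));
        LEGAL.put(State.CLOSED, EnumSet.of(State.UNASSIGNED));
    }

    // The single structure holding every region in a transitional state.
    static final Map<String, RegionStateSketch> REGIONS_IN_TRANSITION =
        Collections.synchronizedMap(new HashMap<String, RegionStateSketch>());

    private final String regionName;
    private State state = State.UNASSIGNED;

    RegionStateSketch(String regionName) {
        this.regionName = regionName;
    }

    synchronized State getState() {
        return state;
    }

    /** Applies the change only if it is a legal move from the current state. */
    synchronized void transitionTo(State next) {
        if (!LEGAL.get(state).contains(next)) {
            throw new IllegalStateException(
                regionName + ": illegal transition " + state + " -> " + next);
        }
        state = next;
    }
}
```

Because a region has exactly one state object, it cannot appear in two of the 
old maps at once, and an unexpected state change fails loudly instead of 
silently corrupting the master's view.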

