[ 
https://issues.apache.org/jira/browse/HBASE-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658359#action_12658359
 ] 

stack commented on HBASE-543:
-----------------------------

How do I know this fixes the issues?  There are no tests nor description of how 
this addresses issues seen in referenced issues.  For example, it would be 
helpful if you explained how you replicated said issues so could test this 
patch actually addresses them.

There is a bunch of redoing of state processing but no explanation as to why?  
For example:

{code}
+   * Remove a region from the region state map.
+   * 
+   * @param info
+   */
+  public void removeRegion(HRegionInfo info) {
{code}

Under what circumstance would I remove a region from state map?

The region state map itself has no explanation:

{code}
+  // Needs to be SortedMap so we can specify a comparator
+  private final SortedMap<byte[], RegionState> regionState =
+    Collections.synchronizedSortedMap(
+        new TreeMap<byte[], RegionState>(Bytes.BYTES_COMPARATOR));
{code}

The maps it would replace tried to explain what they were about.

Nor does the new RegionState map.

With above said, looks like this could be an improvement in that state is all 
in one place.

Should RegionState be looking for illegal states?  It doesn't seem to do any 
checking.  This would be a good place to check we're doing transitions properly.

Should resetting of connection root region be done inside unsetRootRegion in 
below so the two actions are tied together:

{code}
+      master.connection.setRootRegionLocation(null);
+      master.regionManager.unsetRootRegion();
{code}

Does unsetRootRegion set root region to null in regionManger?  Maybe connection 
and regionManager both need an unsetRootRegion method (or both a 
setRootRegionLocation that takes null) so same action in two places uses 
similarily named methods (This stuff preexisted your patch).

At first I thought all these things unsafe:

{code}
+      for (RegionState s: regionsToAssign) {
{code}

.. but now I see your comment that there is a lock held higher up on 
regionsToAssign.... good.

Enough for now.

Good stuff.

> A region's state is kept in several places in the master opening the 
> possibility for race conditions
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-543
>                 URL: https://issues.apache.org/jira/browse/HBASE-543
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.1.0, 0.1.1, 0.2.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: 543.patch, 543.patch, recent-changes.patch, 
> regionstate.txt
>
>
> A region's state exists in multiple maps in the RegionManager: 
> unassignedRegions, pendingRegions, regionsToClose, closingRegions, 
> regionsToDelete, etc.
> One of these race conditions was found in HBASE-534.
> For HBase-0.1.x, we should just patch the holes we find.
> The ultimate solution (which requires a lot of changes in HMaster) should be 
> applied to HBase trunk.
> Proposed solution:
> Create a class that encapsulates a region's state and provide synchronized 
> access to the class that validates state changes.
> There should be a single structure that holds regions in these transitional 
> states and it should be a synchronized collection of some kind.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to