[ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100386#comment-13100386
 ] 

ramkrishna.s.vasudevan commented on HBASE-4153:
-----------------------------------------------

Please find below the analysis of the following state transitions.

This is how I tried to simulate the scenarios:
Create some 7 or 8 regions.
Using HBaseAdmin, call unassign(regionname, false) and assign(regionname, false)
in parallel.
See what happens when both operations run in parallel.

Correct me if I am wrong.  Please provide your suggestions.
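
The simulation described above could be driven roughly as follows. This is only
a sketch: RegionAdmin is a hypothetical stand-in for the two HBaseAdmin calls
named above, and fireInParallel, the latch and the 30 second wait are
assumptions made for illustration, not part of the actual test.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ParallelAssignUnassignSketch {

    /** Hypothetical stand-in for the two admin calls used in the simulation. */
    interface RegionAdmin {
        void assign(byte[] regionName, boolean force) throws Exception;
        void unassign(byte[] regionName, boolean force) throws Exception;
    }

    /** Releases unassign and assign for the same region at the same moment. */
    static void fireInParallel(final RegionAdmin admin, final byte[] regionName)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        final CountDownLatch start = new CountDownLatch(1);
        try {
            Future<?> unassign = pool.submit(() -> {
                start.await();                      // wait for the common start signal
                admin.unassign(regionName, false);
                return null;
            });
            Future<?> assign = pool.submit(() -> {
                start.await();
                admin.assign(regionName, false);
                return null;
            });
            start.countDown();                      // let both calls race
            unassign.get(30, TimeUnit.SECONDS);     // rethrows any failure of the call
            assign.get(30, TimeUnit.SECONDS);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Logging fake; a real run would delegate to the cluster's admin client.
        RegionAdmin logging = new RegionAdmin() {
            public void assign(byte[] r, boolean force) {
                System.out.println("assign " + new String(r));
            }
            public void unassign(byte[] r, boolean force) {
                System.out.println("unassign " + new String(r));
            }
        };
        fireInParallel(logging, "region-1".getBytes());
    }
}
```

The order in which the two calls reach the master is not deterministic, which
is the point: repeating the run over several regions exercises the different
interleavings listed below.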

1) Close        Close -> No problem
2) Close        Open 
Here we depend on the timeout.
 Assume the close is partially complete.
 -> After setting the node to CLOSED state
        Here the close completes successfully, but for the open we need to
        wait for the timeout monitor to detect that the region is in RIT,
        because the in-memory state is set to OFFLINE once a
        RegionAlreadyInTransitionException happens.
 -> Before setting the node to CLOSED state
        Here the close does not complete properly and the open also fails,
        putting the in-memory state to OFFLINE.
        The close itself fails because the attempt to assign the region
        forcefully moves the znode to OFFLINE, so the close is not able to
        move the znode from CLOSING to CLOSED.
Maybe, if we get a RegionAlreadyInTransitionException, we should just not
update the in-memory state to OFFLINE.
Either the previous open succeeds or, even if it fails, the PENDING_OPEN
timeout transition will happen anyway.

3) Open         Open
This is causing a problem.
Assume one open of the region is in progress.
The next open of the same region just fails and sets the in-memory state to
OFFLINE.
Now the first open completes and moves the znode to OPENED.
In the handling of the OPENED state
{code}
          if (regionState == null ||
              (!regionState.isPendingOpen() && !regionState.isOpening())) {
            LOG.warn("Received OPENED for region " +
                prettyPrintedRegionName +
                " from server " + data.getOrigin() + " but region was in " +
                " the state " + regionState + " and not " +
                "in expected PENDING_OPEN or OPENING states");
            return;
          }
{code}
we have the above code.  Hence the region can never be added to the master's
online list.
This scenario is what the HBASE-4015 patch handles: a race between forcing the
node to OFFLINE and the node having already moved to OPENING.
{code}
+      // If we are reassigning the node do not force in-memory state to OFFLINE.
+      // Based on the znode state we will decide if to change
+      // in-memory state to OFFLINE or not. It will
+      // be done before setting the znode to OFFLINE state.
+      if (!hijackAndPreempt) {
+        LOG.debug("Forcing OFFLINE; was=" + state);
+        state.update(RegionState.State.OFFLINE);
+      }
{code}
4) Open         Close
This was not a separate case in my testing, because once we call unassign() on
a region it will call assign anyway once the close succeeds.  Hence it ends up
as one of the three cases above.
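
To make the Open/Open race in case 3 concrete, here is a toy model of it. It
is a sketch under assumptions, not AssignmentManager code: the State enum, the
Region class and the method names are hypothetical, and only the guard in
onOpened mirrors the OPENED handling quoted above. It compares forcing the
in-memory state to OFFLINE on RegionAlreadyInTransitionException with leaving
the state untouched, as suggested under case 2.

```java
class OpenOpenRaceSketch {

    enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN }

    /** Toy stand-in for the master's in-memory view of a single region. */
    static final class Region {
        State state = State.PENDING_OPEN;  // the first open is already in flight
        boolean online = false;
    }

    /** The second open fails with RegionAlreadyInTransitionException. */
    static void onAlreadyInTransition(Region region, boolean forceOffline) {
        if (forceOffline) {
            region.state = State.OFFLINE;  // behaviour described in case 3
        }
        // Otherwise the in-memory state is left as it was.
    }

    /** OPENED is accepted only from PENDING_OPEN or OPENING, as in the guard. */
    static void onOpened(Region region) {
        if (region.state != State.PENDING_OPEN && region.state != State.OPENING) {
            return;                        // event dropped, region never goes online
        }
        region.state = State.OPEN;
        region.online = true;
    }

    /** Replays: first open pending, second open rejected, first open completes. */
    static boolean simulate(boolean forceOffline) {
        Region region = new Region();
        onAlreadyInTransition(region, forceOffline);
        onOpened(region);
        return region.online;
    }

    public static void main(String[] args) {
        System.out.println("force OFFLINE -> online=" + simulate(true));   // false
        System.out.println("keep state    -> online=" + simulate(false));  // true
    }
}
```

In this model the region only reaches the online list when the failed second
open leaves the in-memory state alone, which matches the suggestion above.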


> Handle RegionAlreadyInTransitionException in AssignmentManager
> --------------------------------------------------------------
>
>                 Key: HBASE-4153
>                 URL: https://issues.apache.org/jira/browse/HBASE-4153
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>
> Comment from Stack over in HBASE-3741:
> {quote}
> Question: Looking at this patch again, if we throw a 
> RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
> though RegionAlreadyInTransitionException in at least one case here is saying 
> that the region is already open on this regionserver?
> {quote}
> Indeed looking at the code it's going to be handled the same way other 
> exceptions are. Need to add special cases for assign and unassign.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira