Bing Jiang created HBASE-7101:
---------------------------------

             Summary: HBase stuck in Region SPLIT 
                 Key: HBASE-7101
                 URL: https://issues.apache.org/jira/browse/HBASE-7101
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.92.1
            Reporter: Bing Jiang


I found this issue through a znode that has existed for a long time under the 
unassigned parent node, while HMaster keeps logging more and more warnings. 
The repeating log entries are shown below. 

WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 
1a1c950ad45812d7b4b9b90ebf268468 not found on server 
sev0040,60020,1350378314041; failed processing
WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for 
region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 
but it doesn't exist anymore, probably already processed its split
WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 
1a1c950ad45812d7b4b9b90ebf268468 not found on server 
gs-dpo-sev0040,60020,1350378314041; failed processing
WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for 
region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 
but it doesn't exist anymore, probably already processed its split

We use HBase 0.92.1, and I traced the problem back to the source code. The 
HMaster AssignmentManager has already deleted the SPLIT region from its 
in-memory structure, but the HRegionServer SplitTransaction still finds the 
unassigned/parent znode in a transient state. More precisely, SplitTransaction 
executes tickleNodeSplit to write a new version of the znode slightly later 
than AssignmentManager deletes the unassigned/parent znode. Updating the 
version of the znode triggers the handleRegion operation again; however, 
AssignmentManager finds that the RegionState has already been deleted from 
memory, and the transaction goes into a retry loop.
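
The sequence described above can be illustrated with a toy model. This is 
not HBase code: the class SplitRaceSketch, its two sets, and the method 
warningsAfter are hypothetical stand-ins, and only the names handleRegion and 
tickleNodeSplit are borrowed from the description. The model assumes that the 
master leaves the znode in place whenever it finds no in-memory state for the 
region, which is what makes every further tickle produce another warning.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SplitRaceSketch {
    // Hypothetical stand-in for the master's in-memory region state.
    final Set<String> regionsInTransition = new HashSet<>();
    // Hypothetical stand-in for the znodes under the unassigned parent.
    final Set<String> unassignedZnodes = new HashSet<>();
    final List<String> warnings = new ArrayList<>();

    // Master side: invoked whenever the znode for a region changes.
    void handleRegion(String region) {
        if (!regionsInTransition.contains(region)) {
            // Assumption: the master only warns and leaves the znode alone.
            warnings.add("Received SPLIT for region " + region
                    + " but it doesn't exist anymore");
            return;
        }
        regionsInTransition.remove(region);
        unassignedZnodes.remove(region);
    }

    // Region server side: returns true once the znode is gone.
    boolean tickleNodeSplit(String region) {
        if (!unassignedZnodes.contains(region)) {
            return true;
        }
        handleRegion(region); // the version bump notifies the master
        return !unassignedZnodes.contains(region);
    }

    // Replays the stuck case: znode present, in-memory state already gone.
    static int warningsAfter(int tickles) {
        SplitRaceSketch s = new SplitRaceSketch();
        String region = "1a1c950ad45812d7b4b9b90ebf268468";
        s.unassignedZnodes.add(region);
        for (int i = 0; i < tickles; i++) {
            if (s.tickleNodeSplit(region)) {
                break;
            }
        }
        return s.warnings.size();
    }
}
```

In this model every tickle adds one warning and the znode never goes away, so 
SplitRaceSketch.warningsAfter(3) returns 3. When the region is present in both 
sets, the first tickle succeeds and no warning is recorded.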

In SplitTransaction, transitionZKNode retries tickleNodeSplit after sleeping 
100 ms. In my opinion, if the sleep were much longer than 100 ms, all of the 
AssignmentManager operations would have time to finish completely.
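
The suggestion above amounts to making the pause between retries a parameter. 
A minimal sketch, assuming a hypothetical helper retryUntilDone (the class, 
the method, and the maxAttempts bound are illustrative and are not part of 
SplitTransaction):

```java
import java.util.function.BooleanSupplier;

public class RetrySketch {
    /**
     * Calls the given action until it reports success, sleeping sleepMs
     * between attempts. Returns the number of attempts used, or -1 if the
     * action never succeeded or the thread was interrupted.
     */
    static int retryUntilDone(BooleanSupplier action, long sleepMs, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (action.getAsBoolean()) {
                return attempt;
            }
            try {
                Thread.sleep(sleepMs); // the pause under discussion (100 ms in the report)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return -1;
            }
        }
        return -1;
    }
}
```

A caller would pass the tickle operation as the action and a sleepMs larger 
than 100. Bounding the number of attempts is an extra assumption here, added 
so that a condition that never clears does not retry forever.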


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
