[ 
https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526592#comment-13526592
 ] 

Dave Latham commented on HBASE-7101:
------------------------------------

Yes, I'm using 0.92.  I should note that, until I intervened, the region 
server in question had its split thread stuck waiting for the master to 
finish processing the split, so it stopped processing other splits, which 
eventually led to some huge regions and other problems.  Relevant 
regionserver log output looks like:
{noformat}
2012-12-06 00:00:03,346 DEBUG 
org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
master to process the split for 374f57cc18a7f8ee54b322350c009169
2012-12-06 00:00:03,449 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x13ae81c303ef0c2-0x13ae81c303ef0c2-0x13ae81c303ef0c2 
Attempting to transition node 374f57cc18a7f8ee54b322350c009169 from 
RS_ZK_REGION_SPLIT to RS_ZK_REGION_SPLIT
2012-12-06 00:00:03,451 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x13ae81c303ef0c2-0x13ae81c303ef0c2-0x13ae81c303ef0c2 
Successfully transitioned node 374f57cc18a7f8ee54b322350c009169 from 
RS_ZK_REGION_SPLIT to RS_ZK_REGION_SPLIT
2012-12-06 00:00:03,553 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x13ae81c303ef0c2-0x13ae81c303ef0c2-0x13ae81c303ef0c2 
Attempting to transition node 374f57cc18a7f8ee54b322350c009169 from 
RS_ZK_REGION_SPLIT to RS_ZK_REGION_SPLIT
2012-12-06 00:00:03,554 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x13ae81c303ef0c2-0x13ae81c303ef0c2-0x13ae81c303ef0c2 
Successfully transitioned node 374f57cc18a7f8ee54b322350c009169 from 
RS_ZK_REGION_SPLIT to RS_ZK_REGION_SPLIT
{noformat}
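To illustrate why a loop like the one in this log can pin the split thread, 
here is a minimal sketch of a bounded version of the wait.  It is not the 
actual SplitTransaction code or the committed fix; the class, the method 
waitForMaster, and its parameters are all hypothetical names, and the znode 
check and the "tickle" are passed in as plain callbacks rather than real 
ZooKeeper calls.

```java
import java.util.function.BooleanSupplier;

public class SplitWaitSketch {
    /** Outcome of waiting for the master to acknowledge a split. */
    public enum Result { ACKNOWLEDGED, GAVE_UP }

    /**
     * Hypothetical stand-in for the regionserver's wait loop: check whether
     * the znode is still there, "tickle" it, sleep, and repeat. Unlike an
     * unbounded loop, it stops after maxAttempts so the split thread is
     * released and can move on to other splits.
     */
    public static Result waitForMaster(BooleanSupplier znodeStillExists,
                                       Runnable tickle,
                                       int maxAttempts,
                                       long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (!znodeStillExists.getAsBoolean()) {
                return Result.ACKNOWLEDGED; // master deleted the znode
            }
            tickle.run();                   // rewrite the znode to nudge the master
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return Result.GAVE_UP;
            }
        }
        return Result.GAVE_UP;
    }

    public static void main(String[] args) {
        // Simulate a master that never deletes the znode, as in the log above.
        int[] tickles = {0};
        Result r = waitForMaster(() -> true, () -> tickles[0]++, 5, 1);
        System.out.println(r + " after " + tickles[0] + " tickles");
    }
}
```

What the caller should do on GAVE_UP (alert, roll back, or retry later) is 
left open here; the point is only that a bound keeps one stuck split from 
blocking every later split.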
                
> HBase stuck in Region SPLIT 
> ----------------------------
>
>                 Key: HBASE-7101
>                 URL: https://issues.apache.org/jira/browse/HBASE-7101
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: Bing Jiang
>             Fix For: 0.96.0, 0.94.4
>
>
> I found this issue from a znode that had existed for a long time under the 
> unassigned parent. HMaster also keeps reporting a growing number of warnings. 
> The looping log is below. 
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 
> 1a1c950ad45812d7b4b9b90ebf268468 not found on server 
> sev0040,60020,1350378314041; failed processing
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for 
> region 1a1c950ad45812d7b4b9b90ebf268468 from server 
> sev0040,60020,1350378314041 but it doesn't exist anymore, probably already 
> processed its split
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 
> 1a1c950ad45812d7b4b9b90ebf268468 not found on server 
> gs-dpo-sev0040,60020,1350378314041; failed processing
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for 
> region 1a1c950ad45812d7b4b9b90ebf268468 from server 
> sev0040,60020,1350378314041 but it doesn't exist anymore, probably already 
> processed its split
> We use HBase 0.92.1, and I traced this back to the source code. The HMaster 
> AssignmentManager has already deleted the SPLIT region from its in-memory 
> structure, but the HRegionServer SplitTransaction still finds the 
> unassigned/parent znode existing in a transient state. Precisely, 
> SplitTransaction executes tickleNodeSplit to write a new version a little 
> later than AssignmentManager deletes the unassigned/parent znode. Updating 
> the version of the znode triggers the handleRegion operation 
> again; however, AssignmentManager finds that the RegionState in memory has 
> already been deleted, and the transaction goes into a retry loop.
> In SplitTransaction, transitionZKNode retries tickleNodeSplit after 
> sleeping 100ms. In my opinion, if the sleep were much longer than 100ms, all 
> the operations from AssignmentManager would have time to finish completely.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
