[jira] [Resolved] (HBASE-7101) HBase stuck in Region SPLIT
[ https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-7101. -- Resolution: Fixed Fix Version/s: (was: 0.94.5) (was: 0.96.0) I think this is a dup. HBase stuck in Region SPLIT Key: HBASE-7101 URL: https://issues.apache.org/jira/browse/HBASE-7101 Project: HBase Issue Type: Bug Affects Versions: 0.92.1 Reporter: Bing Jiang I found this issue from a zknode which has existed for a long time in the unassigned parent.And HMaster report warnning log increasingly.The loop log is at below. WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server sev0040,60020,1350378314041; failed processing WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server gs-dpo-sev0040,60020,1350378314041; failed processing WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split we use Hbase-0.92.1, and I trace back to the source code. HMaster AssignmentManager have already deleted the SPLIT_Region in its memory structure,but HRegionServer SplitTransaction has found the unassigned/parent-node existed in a transient state, precisely SplitTransaction executes tickleNodeSplit to update a new version a little later than AssignmentManager deleting unassigned/parent-znode. After updating a version of the znode, it will intrigue the handleRegion operation again, however, AssignmentManager assert that the RegionState in Memory has been deleted, and transaction goes into a retry loop. In the SplitTransaction, transitionZKNode will retry tickleNodeSplit after sleeping 100ms. In my opinion, if the time is much longger than 100ms, all the operation from AssignmentManagement will finish off completely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-7101) HBase stuck in Region SPLIT
[ https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-7101. -- Resolution: Duplicate HBase stuck in Region SPLIT Key: HBASE-7101 URL: https://issues.apache.org/jira/browse/HBASE-7101 Project: HBase Issue Type: Bug Affects Versions: 0.92.1 Reporter: Bing Jiang I found this issue from a zknode which has existed for a long time in the unassigned parent.And HMaster report warnning log increasingly.The loop log is at below. WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server sev0040,60020,1350378314041; failed processing WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server gs-dpo-sev0040,60020,1350378314041; failed processing WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split we use Hbase-0.92.1, and I trace back to the source code. HMaster AssignmentManager have already deleted the SPLIT_Region in its memory structure,but HRegionServer SplitTransaction has found the unassigned/parent-node existed in a transient state, precisely SplitTransaction executes tickleNodeSplit to update a new version a little later than AssignmentManager deleting unassigned/parent-znode. After updating a version of the znode, it will intrigue the handleRegion operation again, however, AssignmentManager assert that the RegionState in Memory has been deleted, and transaction goes into a retry loop. In the SplitTransaction, transitionZKNode will retry tickleNodeSplit after sleeping 100ms. In my opinion, if the time is much longger than 100ms, all the operation from AssignmentManagement will finish off completely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira