[jira] [Resolved] (HBASE-7101) HBase stuck in Region SPLIT

2013-01-27 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-7101.
--

   Resolution: Fixed
Fix Version/s: (was: 0.94.5)
               (was: 0.96.0)

I think this is a dup.

 HBase stuck in Region SPLIT 
 

 Key: HBASE-7101
 URL: https://issues.apache.org/jira/browse/HBASE-7101
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
Reporter: Bing Jiang

 I found this issue through a znode that has existed for a long time under the
 unassigned parent, while HMaster keeps logging a growing number of warnings.
 The repeating log is below.
 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server sev0040,60020,1350378314041; failed processing
 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split
 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server gs-dpo-sev0040,60020,1350378314041; failed processing
 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split
 We use HBase 0.92.1, and I traced the problem back to the source code. The
 HMaster AssignmentManager has already deleted the SPLIT_Region from its
 in-memory structure, but the HRegionServer SplitTransaction finds that the
 unassigned/parent-znode still exists in a transient state. More precisely,
 SplitTransaction executes tickleNodeSplit to write a new version slightly
 later than AssignmentManager deletes the unassigned/parent-znode. Updating
 the version of the znode triggers the handleRegion operation again; however,
 AssignmentManager finds that the in-memory RegionState has already been
 deleted, and the transaction goes into a retry loop.
 In SplitTransaction, transitionZKNode retries tickleNodeSplit after sleeping
 100 ms. In my opinion, if the sleep were much longer than 100 ms, all of the
 AssignmentManager operations would have time to complete.
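
 The retry behaviour described above can be illustrated with a small,
 self-contained sketch. This is not HBase code: the class TickleRetrySketch,
 the method retryUntilGone, and the maxAttempts cap are hypothetical names
 introduced for illustration, and the convention that the tickle returns -1
 once the znode is gone is an assumption of the sketch. It only shows how a
 configurable sleep and an upper bound on attempts might keep such a loop from
 spinning indefinitely.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch; none of these names are HBase classes or methods.
class TickleRetrySketch {

    /**
     * Sleeps, then calls tickle, repeating until tickle returns -1
     * (assumed here to mean "znode no longer exists") or maxAttempts
     * calls have been made. Returns the number of tickle calls.
     */
    public static int retryUntilGone(IntSupplier tickle, long sleepMs, int maxAttempts) {
        int attempts = 0;
        int version;
        do {
            try {
                // Wait first, so the other side has time to finish its cleanup.
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return attempts;
            }
            version = tickle.getAsInt();
            attempts++;
        } while (version != -1 && attempts < maxAttempts);
        return attempts;
    }

    public static void main(String[] args) {
        // Stand-in for a znode that disappears before the third tickle.
        int[] calls = {0};
        IntSupplier tickle = () -> ++calls[0] < 3 ? calls[0] : -1;
        int attempts = retryUntilGone(tickle, 10L, 50);
        System.out.println("tickle calls: " + attempts);
    }
}
```

 With a longer sleepMs the loop touches the znode less often, which is the
 effect the report suggests; the cap on attempts is an extra safeguard that
 the report does not mention.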

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-7101) HBase stuck in Region SPLIT

2013-01-27 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-7101.
--

Resolution: Duplicate

