[jira] [Commented] (HBASE-4265) zookeeper.KeeperException$NodeExistsException if HMaster restarts while table is being disabled

[email protected] (JIRA) Thu, 01 Sep 2011 23:22:02 -0700

    [ https://issues.apache.org/jira/browse/HBASE-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095794#comment-13095794 ]

[email protected] commented on HBASE-4265:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1685/#review1732
-----------------------------------------------------------

Patch looks fine to me. 

- ramkrishna

On 2011-08-31 01:37:46, Ming Ma wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1685/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-31 01:37:46)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu and ramkrishna vasudevan.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  The problem is that disableTable tries to unassign regions that are still 
in transition. disableTable already has code to bypass such regions, but at 
startup recoverTableInDisablingState is called before 
processRegionsInTransition (which updates the regions-in-transition list). 
The regions-in-transition list therefore has not been updated yet when 
recoverTableInDisablingState runs.
bq.  
bq.  The fix is to postpone recoverTableInDisablingState until after 
processRegionsInTransition has been called.
bq.  
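The ordering problem described in the summary can be illustrated with a toy model. This is a minimal sketch, not the actual AssignmentManager code: the class StartupOrderSketch, its fields, and the use of IllegalStateException in place of KeeperException$NodeExistsException are all assumptions made for illustration. Only the two method names are taken from the summary above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the startup ordering issue; hypothetical, not HBase code.
public class StartupOrderSketch {
    // Simulated znodes under /hbase/unassigned that survived the master restart.
    private final Set<String> unassignedNodesInZk;
    // In-memory regions-in-transition list; empty until processRegionsInTransition() runs.
    private final Set<String> regionsInTransition = new HashSet<>();
    private final List<String> tableRegions;

    public StartupOrderSketch(List<String> tableRegions, Set<String> zkNodes) {
        this.tableRegions = new ArrayList<>(tableRegions);
        this.unassignedNodesInZk = new HashSet<>(zkNodes);
    }

    // Loads the regions that already have a znode into the in-memory list.
    public void processRegionsInTransition() {
        regionsInTransition.addAll(unassignedNodesInZk);
    }

    // Unassigns every region of the disabling table, bypassing regions in transition.
    // IllegalStateException stands in for the NodeExistsException from ZooKeeper.
    public List<String> recoverTableInDisablingState() {
        List<String> unassigned = new ArrayList<>();
        for (String region : tableRegions) {
            if (regionsInTransition.contains(region)) {
                continue; // the bypass only works if the list is up to date
            }
            if (!unassignedNodesInZk.add(region)) {
                throw new IllegalStateException(
                    "NodeExists for /hbase/unassigned/" + region);
            }
            unassigned.add(region);
        }
        return unassigned;
    }

    // Runs a simulated startup in either order and reports whether it completed.
    public static boolean startupSucceeds(boolean recoverFirst) {
        StartupOrderSketch master = new StartupOrderSketch(
            Arrays.asList("r1", "r2"),
            new HashSet<>(Arrays.asList("r2"))); // r2 still has a znode
        try {
            if (recoverFirst) {
                master.recoverTableInDisablingState();
                master.processRegionsInTransition();
            } else {
                master.processRegionsInTransition();
                master.recoverTableInDisablingState();
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("recover first:  " + startupSucceeds(true));
        System.out.println("process first:  " + startupSucceeds(false));
    }
}
```

In this model, running the recovery first fails on r2 because the bypass check consults an empty list, while running it after processRegionsInTransition skips r2 and completes.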
bq.  
bq.  This addresses bug hbase-4265.
bq.      https://issues.apache.org/jira/browse/hbase-4265
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1163346 
bq.  
bq.  Diff: https://reviews.apache.org/r/1685/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  On a small cluster, stop HMaster while disableTable is in progress. Make 
sure there are some regions in transition in zk when the HMaster shutdown 
occurs. Without the fix, we get the NodeExistsException described in the bug. 
With the fix, HMaster continues the disabling process after restart and the 
table reaches the disabled state.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ming
bq.  
bq.

> zookeeper.KeeperException$NodeExistsException if HMaster restarts while table 
> is being disabled
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4265
>                 URL: https://issues.apache.org/jira/browse/HBASE-4265
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.92.0
>
>
> There seems to be more than one issue in the following scenario. 
> I will provide a fix later just for this exception.
> 1. A table is being disabled.
> 2. HMaster is restarted.
> 3. At startup, HMaster tries to move the table from the disabling to the 
> disabled state and gets the following exception.
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/419b902243c836c285108ba555b712fa
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>       at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>       at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:475)
>       at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:457)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:742)
>       at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:461)
>       at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1440)
>       at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1406)
>       at org.apache.hadoop.hbase.master.handler.DisableTableHandler$BulkDisabler$1.run(DisableTableHandler.java:141)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> The issue is that this specific region is in a special state before HMaster 
> restarts: it has been closed properly by the RS, so its zk state is 
> RS_ZK_REGION_CLOSED. However, HMaster has not had a chance to run 
> ClosedRegionHandler yet, so the node remains in zk. After HMaster restarts, 
> the region is first added to the online region list in 
> AssignmentManager.rebuildUserRegions, and HMaster later tries to unassign it.
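
The leftover-node situation described above can be modelled with a map that has create-only semantics. This is a hedged sketch under stated assumptions: StaleNodeSketch and its methods are hypothetical names, a ConcurrentHashMap stands in for the znodes under /hbase/unassigned, and the state string M_ZK_REGION_CLOSING is assumed to be what createNodeClosing writes. It does not use the real ZooKeeper or HBase APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the /hbase/unassigned znodes; not HBase code.
public class StaleNodeSketch {
    // Region name -> last state recorded in the simulated znode.
    private final Map<String, String> unassigned = new ConcurrentHashMap<>();

    // Create-only semantics: returns false if the node already exists,
    // which is where the real code sees NodeExistsException.
    public boolean createNodeClosing(String region) {
        return unassigned.putIfAbsent(region, "M_ZK_REGION_CLOSING") == null;
    }

    // The region server finishes the close and updates the existing node.
    public void regionServerClosed(String region) {
        unassigned.put(region, "RS_ZK_REGION_CLOSED");
    }

    // The master's handler for the closed event removes the node.
    public void closedRegionHandler(String region) {
        unassigned.remove(region);
    }

    // Replays the scenario and reports whether the restarted master's
    // second unassign attempt can create its node.
    public static boolean secondUnassignSucceeds(boolean masterHandledClose) {
        StaleNodeSketch zk = new StaleNodeSketch();
        String region = "419b902243c836c285108ba555b712fa";
        zk.createNodeClosing(region);      // first master starts the unassign
        zk.regionServerClosed(region);     // RS closes the region
        if (masterHandledClose) {
            zk.closedRegionHandler(region); // skipped if the master died first
        }
        return zk.createNodeClosing(region); // restarted master tries again
    }

    public static void main(String[] args) {
        System.out.println("handler ran:     " + secondUnassignSucceeds(true));
        System.out.println("handler skipped: " + secondUnassignSucceeds(false));
    }
}
```

The model only shows why the create call collides with the stale node; in the real system a region whose close was fully handled would not be unassigned a second time.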

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
