[ 
https://issues.apache.org/jira/browse/HBASE-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092555#comment-13092555
 ] 

Ming Ma commented on HBASE-4265:
--------------------------------

Perhaps regions that are in the RS_ZK_REGION_CLOSED or RS_ZK_REGION_CLOSING 
state shouldn't be added to AssignmentManager.regions in the first place in 
AssignmentManager.rebuildUserRegions? As it stands, regions that have already 
been closed by the RS are asked to be closed again by AssignmentManager, so 
the state of those regions remains CLOSING after the restart.
 
11/08/28 14:31:23 INFO master.AssignmentManager: Region has been CLOSING for too long, this should eventually complete or the server will expire, doing nothing
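
To make the suggestion above concrete, here is a minimal sketch of the filtering 
idea. It is not the HBase code: RebuildRegionsSketch, regionsToTrackAsOnline and 
the two input collections are hypothetical stand-ins, and the enum only borrows 
the state names for illustration. It assumes the master can see each region's 
unassigned-znode state while rebuilding its region list.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the rebuild step; not the real AssignmentManager.
final class RebuildRegionsSketch {

    // Illustrative subset of znode states, named after the ones in this thread.
    enum ZkState { RS_ZK_REGION_CLOSING, RS_ZK_REGION_CLOSED, RS_ZK_REGION_OPENED }

    // States for which a region should not be recorded as online.
    private static final Set<ZkState> SKIP =
        EnumSet.of(ZkState.RS_ZK_REGION_CLOSING, ZkState.RS_ZK_REGION_CLOSED);

    // regionsFromMeta: region names found while scanning the catalog (assumed input).
    // zkStates: state of each region's unassigned znode; no entry means no znode.
    static List<String> regionsToTrackAsOnline(List<String> regionsFromMeta,
                                               Map<String, ZkState> zkStates) {
        List<String> online = new ArrayList<>();
        for (String region : regionsFromMeta) {
            ZkState state = zkStates.get(region);
            if (state != null && SKIP.contains(state)) {
                // Already closing or closed by the RS: leave it out so that no
                // second close is requested for it after the restart.
                continue;
            }
            online.add(region);
        }
        return online;
    }
}
```

Regions left out this way would still need their pending CLOSED or CLOSING node 
processed by whatever handles regions in transition at startup; the sketch only 
covers the "don't add them" half of the idea.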

> zookeeper.KeeperException$NodeExistsException if HMaster restarts while table 
> is being disabled
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4265
>                 URL: https://issues.apache.org/jira/browse/HBASE-4265
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.92.0
>
>
> There seems to be more than one issue in the following scenario. 
> I will provide a fix later for this exception only.
> 1. A table is being disabled.
> 2. HMaster is restarted.
> 3. At startup, HMaster tries to transition the table from the disabling to 
> the disabled state and gets the following exception:
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/419b902243c836c285108ba555b712fa
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>       at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>       at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:475)
>       at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:457)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:742)
>       at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:461)
>       at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1440)
>       at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1406)
>       at org.apache.hadoop.hbase.master.handler.DisableTableHandler$BulkDisabler$1.run(DisableTableHandler.java:141)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> The issue is that this specific region is in a special state before HMaster 
> restarts: it has been closed properly by the RS, so its zk state is 
> RS_ZK_REGION_CLOSED. However, HMaster hasn't had a chance to run 
> ClosedRegionHandler yet, so the node remains in zk. After RS restarts, 
> the region is first added to the online region list in 
> AssignmentManager.rebuildUserRegions, and an unassign is attempted on it later.
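
The failure mode described above can be sketched with an in-memory map standing 
in for /hbase/unassigned. This is only an illustration of one way to tolerate a 
leftover node, not necessarily the fix that was committed for this issue. 
UnassignSketch, Outcome and seed are hypothetical names, and the state strings 
are used purely as labels.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the unassign path; it does not talk to ZooKeeper.
final class UnassignSketch {

    enum Outcome { CLOSE_REQUESTED, ALREADY_CLOSED, ALREADY_IN_TRANSITION }

    // Stand-in for the children of /hbase/unassigned: region name -> state label.
    private final Map<String, String> unassigned = new ConcurrentHashMap<>();

    // Simulates a node left behind from before the master restart.
    void seed(String region, String state) {
        unassigned.put(region, state);
    }

    Outcome unassign(String region) {
        // putIfAbsent plays the role of a non-sequential create: it does not
        // overwrite an existing node, it returns what is already there.
        String existing = unassigned.putIfAbsent(region, "M_ZK_REGION_CLOSING");
        if (existing == null) {
            return Outcome.CLOSE_REQUESTED;      // node created, ask the RS to close
        }
        if ("RS_ZK_REGION_CLOSED".equals(existing)) {
            return Outcome.ALREADY_CLOSED;       // RS finished earlier; only cleanup is left
        }
        return Outcome.ALREADY_IN_TRANSITION;    // some other transition is pending
    }
}
```

In this model, seeding a region with RS_ZK_REGION_CLOSED and then calling 
unassign returns ALREADY_CLOSED instead of failing on the existing node, which 
is the case the stack trace above hits.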


        
