[ 
https://issues.apache.org/jira/browse/HBASE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006614#comment-13006614
 ] 

Todd Lipcon commented on HBASE-3637:
------------------------------------

2011-03-11 06:42:58,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002 Retrieved 65 byte(s) of data from znode /hbase/unassigned/1028785192 and set watcher; region=.META.,,1, server=trek08.sf.cloudera.com,60020,1299853933073, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:58,301 INFO org.apache.hadoop.hbase.master.AssignmentManager: Processing region .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
2011-03-11 06:42:58,302 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region in transition 1028785192 references a server no longer up trek08.sf.cloudera.com,60020,1299853933073; letting RIT timeout so will be assigned elsewhere
2011-03-11 06:42:58,304 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22ea55e0f670002 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/70236052
2011-03-11 06:42:58,305 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002 Retrieved 65 byte(s) of data from znode /hbase/unassigned/70236052 and set watcher; region=-ROOT-,,0, server=trek10.sf.cloudera.com,60020,1299854562169, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:58,305 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=trek10.sf.cloudera.com,60020,1299854562169, region=70236052/-ROOT-
2011-03-11 06:42:58,307 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for 70236052; deleting unassigned node
2011-03-11 06:42:58,308 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x22ea55e0f670002 Deleting existing unassigned node for 70236052 that is in expected state RS_ZK_REGION_OPENED
2011-03-11 06:42:58,313 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002 Retrieved 65 byte(s) of data from znode /hbase/unassigned/70236052; data=region=-ROOT-,,0, server=trek10.sf.cloudera.com,60020,1299854562169, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:58,315 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22ea55e0f670002 Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/unassigned/70236052
2011-03-11 06:42:58,315 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x22ea55e0f670002 Successfully deleted unassigned node for region 70236052 in expected state RS_ZK_REGION_OPENED
2011-03-11 06:42:58,316 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region -ROOT-,,0.70236052 on trek10.sf.cloudera.com,60020,1299854562169
2011-03-11 06:42:59,097 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  .META.,,1.1028785192 state=OPENING, ts=1299854016886
2011-03-11 06:42:59,097 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPENING for too long, reassigning region=.META.,,1.1028785192
2011-03-11 06:42:59,098 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002 Retrieved 65 byte(s) of data from znode /hbase/unassigned/1028785192; data=region=.META.,,1, server=trek08.sf.cloudera.com,60020,1299853933073, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:59,099 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Region has transitioned to OPENED, allowing watched event handlers to process


> Region stuck in OPENED state
> ----------------------------
>
>                 Key: HBASE-3637
>                 URL: https://issues.apache.org/jira/browse/HBASE-3637
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> I don't 100% understand how this happened, but the following was observed:
> - META is in OPENED state in ZK, for a server which no longer exists
> - Handler sees that server is dead, and figures that the RIT timeout will 
> handle it
> - RIT timeout sees that it's already in OPENED state, and assumes that the 
> OPENED handler will handle it
> - loops in timeout state forever, never actually getting reassigned
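
To make the mutual deferral in the description concrete, here is a toy Java model of it. This is only a sketch under assumptions: the class, method, and field names (RitDeadlockSketch, processOpened, onTimeout, stillInTransition, resolved) are hypothetical and are not HBase code, and the real AssignmentManager logic is certainly more involved. It only shows why neither path ever acts when the znode says OPENED but the hosting server is dead.

```java
import java.util.Collections;
import java.util.Set;

/** Toy model of the reported loop; none of these names are HBase classes. */
public class RitDeadlockSketch {
    enum ZkState { OPENING, OPENED }

    /** Hypothetical stand-in for a region in transition. */
    static final class RegionInTransition {
        final String server;
        final ZkState zkState;
        boolean resolved = false;

        RegionInTransition(String server, ZkState zkState) {
            this.server = server;
            this.zkState = zkState;
        }
    }

    /** Assumed shape of OPENED processing: defer to the timeout if the server is gone. */
    static void processOpened(RegionInTransition rit, Set<String> liveServers) {
        if (!liveServers.contains(rit.server)) {
            return; // cf. "letting RIT timeout so will be assigned elsewhere"
        }
        rit.resolved = true; // server is alive, so the open completes normally
    }

    /** Assumed shape of the timeout check: defer to the OPENED handler if ZK says OPENED. */
    static void onTimeout(RegionInTransition rit) {
        if (rit.zkState == ZkState.OPENED) {
            return; // cf. "allowing watched event handlers to process"
        }
        rit.resolved = true; // otherwise the region would be reassigned
    }

    /** Runs the OPENED path once, then the timeout path up to timeoutTicks times. */
    static boolean stillInTransition(Set<String> liveServers, int timeoutTicks) {
        RegionInTransition meta = new RegionInTransition("trek08", ZkState.OPENED);
        processOpened(meta, liveServers);
        for (int i = 0; i < timeoutTicks && !meta.resolved; i++) {
            onTimeout(meta);
        }
        return !meta.resolved;
    }

    public static void main(String[] args) {
        // Dead server: both paths defer to each other on every pass.
        System.out.println(stillInTransition(Collections.emptySet(), 1000));
        // Live server: the OPENED path resolves the region immediately.
        System.out.println(stillInTransition(Collections.singleton("trek08"), 1000));
    }
}
```

In this model, stillInTransition returns true for any number of timeout ticks when the server is not in the live set, because each path returns early and expects the other to do the work. Breaking the cycle requires one of the two paths to take ownership of the dead-server case.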

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
