[ 
https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857871#comment-13857871
 ] 

Lars Hofhansl commented on HBASE-8912:
--------------------------------------

In my case I found that the AssignmentManager receives two notifications for 
the same region in quick succession:
{code}
2013-12-26 07:55:11,398 DEBUG [pool-1-thread-1-EventThread] 
zookeeper.ZooKeeperWatcher(294): master:36597-0x1432de5ff150000 Received 
ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
path=/hbase/unassigned/50d8f699ee870d7af05aa4f4b6824e8c
2013-12-26 07:55:11,399 DEBUG [pool-1-thread-1-EventThread] 
zookeeper.ZKUtil(1595): master:36597-0x1432de5ff150000 Retrieved 116 byte(s) of 
data from znode /hbase/unassigned/50d8f699ee870d7af05aa4f4b6824e8c and set 
watcher; 
region=testRetrying,ttt,1388044498231.50d8f699ee870d7af05aa4f4b6824e8c., 
origin=janus.apache.org,50758,1388044485760, state=RS_ZK_REGION_FAILED_OPEN
2013-12-26 07:55:11,399 DEBUG [pool-1-thread-1-EventThread] 
master.AssignmentManager(743): Handling transition=RS_ZK_REGION_FAILED_OPEN, 
{code}
and then 
{code}
2013-12-26 07:55:11,401 DEBUG [pool-1-thread-1-EventThread] 
zookeeper.ZooKeeperWatcher(294): master:36597-0x1432de5ff150000 Received 
ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, 
path=/hbase/unassigned/50d8f699ee870d7af05aa4f4b6824e8c
2013-12-26 07:55:11,401 DEBUG 
[MASTER_CLOSE_REGION-janus.apache.org,36597,1388044485155-0] 
handler.ClosedRegionHandler(92): Handling CLOSED event for 
50d8f699ee870d7af05aa4f4b6824e8c
2013-12-26 07:55:11,401 DEBUG 
[MASTER_CLOSE_REGION-janus.apache.org,36597,1388044485155-0] 
master.AssignmentManager(1665): Forcing OFFLINE; 
was=testRetrying,ttt,1388044498231.50d8f699ee870d7af05aa4f4b6824e8c. 
state=CLOSED, ts=1388044511364, server=janus.apache.org,50758,1388044485760
2013-12-26 07:55:11,401 DEBUG 
[MASTER_CLOSE_REGION-janus.apache.org,36597,1388044485155-0] 
zookeeper.ZKAssign(264): master:36597-0x1432de5ff150000 Creating (or updating) 
unassigned node for 50d8f699ee870d7af05aa4f4b6824e8c with OFFLINE state
2013-12-26 07:55:11,402 DEBUG [pool-1-thread-1-EventThread] 
zookeeper.ZKUtil(1595): master:36597-0x1432de5ff150000 Retrieved 116 byte(s) of 
data from znode /hbase/unassigned/50d8f699ee870d7af05aa4f4b6824e8c and set 
watcher; 
region=testRetrying,ttt,1388044498231.50d8f699ee870d7af05aa4f4b6824e8c., 
origin=janus.apache.org,50758,1388044485760, state=RS_ZK_REGION_FAILED_OPEN
2013-12-26 07:55:11,402 DEBUG [pool-1-thread-1-EventThread] 
master.AssignmentManager(743): Handling transition=RS_ZK_REGION_FAILED_OPEN, 
server=janus.apache.org,50758,1388044485760, 
region=50d8f699ee870d7af05aa4f4b6824e8c
{code}

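Note that in the second set the AM had already started reacting to the first 
event (the ClosedRegionHandler is forcing the region OFFLINE) when the second 
NodeDataChanged notification is handled.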
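
To make the suspected failure mode concrete, here is a toy model of how a duplicate notification could lead to the IllegalStateException in the issue description. This is only a sketch under assumptions: the class, the enum, and the methods forceOffline, handleClosed and replay are all hypothetical and are not HBase code, and the interleaving shown (both notifications processed before either handler runs) is one possible ordering, not something the logs above establish.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these names are HBase classes or methods. */
public class DuplicateEventSketch {
    enum State { PENDING_OPEN, CLOSED, OFFLINE }

    static final class Region {
        State state = State.PENDING_OPEN;
    }

    // Assumed guard: only a CLOSED or OFFLINE region may be forced OFFLINE.
    static void forceOffline(Region r) {
        if (r.state != State.CLOSED && r.state != State.OFFLINE) {
            throw new IllegalStateException(
                "Unexpected state : " + r.state + " .. Cannot transit it to OFFLINE.");
        }
        r.state = State.OFFLINE;
    }

    // Assumed handler: force the region OFFLINE, then start a new assignment.
    static void handleClosed(Region r) {
        forceOffline(r);
        r.state = State.PENDING_OPEN;
    }

    // Replays n FAILED_OPEN notifications for one region. Each notification
    // marks the region CLOSED and queues a handler; the handlers run afterwards.
    public static String replay(int notifications) {
        Region r = new Region();
        List<Runnable> handlers = new ArrayList<>();
        for (int i = 0; i < notifications; i++) {
            r.state = State.CLOSED;
            handlers.add(() -> handleClosed(r));
        }
        try {
            for (Runnable h : handlers) {
                h.run();
            }
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 notification:  " + replay(1));
        System.out.println("2 notifications: " + replay(2));
    }
}
```

With one notification the single handler succeeds. With two, the first handler leaves the region in PENDING_OPEN, so the second handler's attempt to force it OFFLINE is rejected, which is the same shape as the exception quoted below.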

> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to 
> OFFLINE
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-8912
>                 URL: https://issues.apache.org/jira/browse/HBASE-8912
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>             Fix For: 0.94.16
>
>         Attachments: HBase-0.94 #1036 test - testRetrying [Jenkins].html, 
> log.txt
>
>
> AM throws this exception which subsequently causes the master to abort: 
> {code}
> java.lang.IllegalStateException: Unexpected state : 
> testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. 
> state=PENDING_OPEN, ts=1372891751912, 
> server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>       at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>       at 
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>       at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>       at java.lang.Thread.run(Thread.java:662)
> {code}
> This stack trace is from the test TestMetaReaderEditor, which fails fairly 
> frequently. Looking at the test code, though, I think this is not a 
> test-only issue; it affects the main code path. 
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
