[ 
https://issues.apache.org/jira/browse/HBASE-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115249#comment-13115249
 ] 

Ted Yu edited comment on HBASE-4492 at 9/27/11 5:26 PM:
--------------------------------------------------------

Found the following in output for the above timeout case:
{code}
2011-09-27 05:28:59,047 DEBUG [RegionServer:3;us.ciq.com,57539,1317101335695-EventThread] zookeeper.ZooKeeperWatcher(233): regionserver:57539-0x132a95afa18000d Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/root-region-server
2011-09-27 05:28:59,047 DEBUG [RegionServer:3;us.ciq.com,58748,1317101327777-EventThread] zookeeper.ZKUtil(226): regionserver:58748-0x132a95afa18000a /hbase/root-region-server does not exist. Watcher is set.
2011-09-27 05:28:59,047 DEBUG [Master:0;us.ciq.com,56327,1317101304726-EventThread] zookeeper.ZKUtil(226): hconnection-0x132a95afa180005 /hbase/root-region-server does not exist. Watcher is set.
2011-09-27 05:28:59,048 DEBUG [Thread-1-EventThread] zookeeper.ZooKeeperWatcher(233): master:51567-0x132a95afa180008 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/unassigned
2011-09-27 05:28:59,048 DEBUG [RegionServer:3;us.ciq.com,57539,1317101335695-EventThread] zookeeper.ZKUtil(226): regionserver:57539-0x132a95afa18000d /hbase/root-region-server does not exist. Watcher is set.
2011-09-27 05:28:59,049 INFO  [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.AssignmentManager(1485): No previous transition plan was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052 so generated a random one; hri=-ROOT-,,0.70236052, src=, dest=us.ciq.com,57500,1317101330748; 3 (online=3, exclude=null) available servers
2011-09-27 05:28:59,049 INFO  [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.AssignmentManager(1485): Assigning region -ROOT-,,0.70236052 to us.ciq.com,57500,1317101330748
2011-09-27 05:28:59,049 DEBUG [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.ServerManager(448): New connection to us.ciq.com,57500,1317101330748
2011-09-27 05:28:59,049 DEBUG [Thread-1-EventThread] zookeeper.ZKUtil(224): master:51567-0x132a95afa180008 Set watcher on existing znode /hbase/unassigned/70236052
2011-09-27 05:28:59,049 FATAL [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.HMaster(1181): Master server abort: loaded coprocessors are: []
2011-09-27 05:28:59,050 FATAL [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.HMaster(1186): Unexpected state trying to OFFLINE; -ROOT-,,0.70236052 state=PENDING_OPEN, ts=1317101339049, server=us.ciq.com,57500,1317101330748
java.lang.IllegalStateException
        at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1517)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1392)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1169)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1144)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1139)
        at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1816)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRoot(ServerShutdownHandler.java:105)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRootWithRetries(ServerShutdownHandler.java:123)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:186)
{code}
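
The FATAL line reports that -ROOT- was already in PENDING_OPEN when the assign path tried to force it OFFLINE. As a rough illustration of that kind of guard, here is a minimal sketch. It is not the HBase source: the class name, the enum, and the rule that only CLOSED or OFFLINE regions may be forced OFFLINE are all assumptions made for this example, and unlike the master in the log it throws the exception rather than aborting.

```java
import java.util.EnumSet;
import java.util.Set;

public class OfflineGuardSketch {
    // Hypothetical, simplified region states; only PENDING_OPEN and OFFLINE
    // are names taken from the log output above.
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN, PENDING_CLOSE, CLOSING, CLOSED }

    // Assumption for this sketch: only these states may be forced to OFFLINE.
    private static final Set<State> OFFLINE_ALLOWED =
            EnumSet.of(State.CLOSED, State.OFFLINE);

    // Returns the new state, or throws if the region is in any other state.
    static State setOffline(String region, State current) {
        if (!OFFLINE_ALLOWED.contains(current)) {
            throw new IllegalStateException(
                    "Unexpected state trying to OFFLINE; " + region + " state=" + current);
        }
        return State.OFFLINE;
    }

    public static void main(String[] args) {
        String root = "-ROOT-,,0.70236052";
        // A closed region can be moved to OFFLINE.
        System.out.println(setOffline(root, State.CLOSED));
        // A region that is already PENDING_OPEN is rejected, as in the log.
        try {
            setOffline(root, State.PENDING_OPEN);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions, a second assign attempt on a region that is already PENDING_OPEN fails at the guard, which would be consistent with the state shown in the FATAL line.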
                
> TestRollingRestart fails intermittently
> ---------------------------------------
>
>                 Key: HBASE-4492
>                 URL: https://issues.apache.org/jira/browse/HBASE-4492
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Jonathan Gray
>         Attachments: 4492.txt
>
>
> I got the following when running the test suite on TRUNK:
> {code}
> testBasicRollingRestart(org.apache.hadoop.hbase.master.TestRollingRestart)  Time elapsed: 300.28 sec  <<< ERROR!
> java.lang.Exception: test timed out after 300000 milliseconds
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hbase.master.TestRollingRestart.waitForRSShutdownToStartAndFinish(TestRollingRestart.java:313)
>         at org.apache.hadoop.hbase.master.TestRollingRestart.testBasicRollingRestart(TestRollingRestart.java:210)
> {code}
> I ran TestRollingRestart#testBasicRollingRestart manually afterwards, which
> wiped out the test output file for the failed test.
> A similar failure can be found on Jenkins:
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/19/testReport/junit/org.apache.hadoop.hbase.master/TestRollingRestart/testBasicRollingRestart/
