[ https://issues.apache.org/jira/browse/HBASE-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115249#comment-13115249 ]
Ted Yu edited comment on HBASE-4492 at 9/27/11 5:26 PM:
--------------------------------------------------------
Found the following in the test output for the above timeout case:
{code}
2011-09-27 05:28:59,047 DEBUG [RegionServer:3;us.ciq.com,57539,1317101335695-EventThread] zookeeper.ZooKeeperWatcher(233): regionserver:57539-0x132a95afa18000d Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/root-region-server
2011-09-27 05:28:59,047 DEBUG [RegionServer:3;us.ciq.com,58748,1317101327777-EventThread] zookeeper.ZKUtil(226): regionserver:58748-0x132a95afa18000a /hbase/root-region-server does not exist. Watcher is set.
2011-09-27 05:28:59,047 DEBUG [Master:0;us.ciq.com,56327,1317101304726-EventThread] zookeeper.ZKUtil(226): hconnection-0x132a95afa180005 /hbase/root-region-server does not exist. Watcher is set.
2011-09-27 05:28:59,048 DEBUG [Thread-1-EventThread] zookeeper.ZooKeeperWatcher(233): master:51567-0x132a95afa180008 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/unassigned
2011-09-27 05:28:59,048 DEBUG [RegionServer:3;us.ciq.com,57539,1317101335695-EventThread] zookeeper.ZKUtil(226): regionserver:57539-0x132a95afa18000d /hbase/root-region-server does not exist. Watcher is set.
2011-09-27 05:28:59,049 INFO [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.AssignmentManager(1485): No previous transition plan was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052 so generated a random one; hri=-ROOT-,,0.70236052, src=, dest=us.ciq.com,57500,1317101330748; 3 (online=3, exclude=null) available servers
2011-09-27 05:28:59,049 INFO [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.AssignmentManager(1485): Assigning region -ROOT-,,0.70236052 to us.ciq.com,57500,1317101330748
2011-09-27 05:28:59,049 DEBUG [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.ServerManager(448): New connection to us.ciq.com,57500,1317101330748
2011-09-27 05:28:59,049 DEBUG [Thread-1-EventThread] zookeeper.ZKUtil(224): master:51567-0x132a95afa180008 Set watcher on existing znode /hbase/unassigned/70236052
2011-09-27 05:28:59,049 FATAL [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.HMaster(1181): Master server abort: loaded coprocessors are: []
2011-09-27 05:28:59,050 FATAL [MASTER_META_SERVER_OPERATIONS-us.ciq.com,51567,1317101319637-3] master.HMaster(1186): Unexpected state trying to OFFLINE; -ROOT-,,0.70236052 state=PENDING_OPEN, ts=1317101339049, server=us.ciq.com,57500,1317101330748
java.lang.IllegalStateException
at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1517)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1392)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1169)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1144)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1139)
at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1816)
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRoot(ServerShutdownHandler.java:105)
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRootWithRetries(ServerShutdownHandler.java:123)
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:186)
{code}
> TestRollingRestart fails intermittently
> ---------------------------------------
>
> Key: HBASE-4492
> URL: https://issues.apache.org/jira/browse/HBASE-4492
> Project: HBase
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Jonathan Gray
> Attachments: 4492.txt
>
>
> I got the following when running test suite on TRUNK:
> {code}
> testBasicRollingRestart(org.apache.hadoop.hbase.master.TestRollingRestart) Time elapsed: 300.28 sec <<< ERROR!
> java.lang.Exception: test timed out after 300000 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.hbase.master.TestRollingRestart.waitForRSShutdownToStartAndFinish(TestRollingRestart.java:313)
> at org.apache.hadoop.hbase.master.TestRollingRestart.testBasicRollingRestart(TestRollingRestart.java:210)
> {code}
> I ran TestRollingRestart#testBasicRollingRestart manually afterwards, which
> wiped out the test output file for the failed test.
> A similar failure can be found on Jenkins:
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/19/testReport/junit/org.apache.hadoop.hbase.master/TestRollingRestart/testBasicRollingRestart/