[ https://issues.apache.org/jira/browse/HBASE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116966#comment-13116966 ]
Ted Yu commented on HBASE-4212:
-------------------------------
@Stack:
Can you take a look at Jinchao's response of 09/Sep/11 13:57?
@Jinchao:
Have you recently run the test suites for 0.90 and TRUNK with your patches applied?
Thanks
> TestMasterFailover fails occasionally
> -------------------------------------
>
> Key: HBASE-4212
> URL: https://issues.apache.org/jira/browse/HBASE-4212
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.4
> Reporter: gaojinchao
> Assignee: gaojinchao
> Fix For: 0.90.5
>
> Attachments: HBASE-4212_TrunkV1.patch, HBASE-4212_branch90V1.patch
>
>
> This looks like a bug: ROOT is stuck in RIT (regions in transition) and cannot be moved.
> During failover, the master forces ROOT online but does not clean up its ZK node,
> so the test waits forever.
> void processFailover() throws KeeperException, IOException,
>     InterruptedException {
>   // we enforce on-line root.
>   HServerInfo hsi = this.serverManager.getHServerInfo(
>       this.catalogTracker.getMetaLocation());
>   regionOnline(HRegionInfo.FIRST_META_REGIONINFO, hsi);
>   hsi = this.serverManager.getHServerInfo(
>       this.catalogTracker.getRootLocation());
>   regionOnline(HRegionInfo.ROOT_REGIONINFO, hsi);
> It seems that we should wait for the ROOT transition to finish, as is done for the META region:
> int assignRootAndMeta()
>     throws InterruptedException, IOException, KeeperException {
>   int assigned = 0;
>   long timeout = this.conf.getLong("hbase.catalog.verification.timeout", 1000);
>   // Work on ROOT region. Is it in zk in transition?
>   boolean rit = this.assignmentManager.
>       processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);
>   if (!catalogTracker.verifyRootRegionLocation(timeout)) {
>     this.assignmentManager.assignRoot();
>     this.catalogTracker.waitForRoot();
>     // We need to add this call to guarantee that the transition has completed.
>     this.assignmentManager.waitForAssignment(HRegionInfo.ROOT_REGIONINFO);
>     assigned++;
>   }
> logs:
> 2011-08-16 07:45:40,715 DEBUG
> [RegionServer:0;C4S2.site,47710,1313495126115-EventThread]
> zookeeper.ZooKeeperWatcher(252): regionserver:47710-0x131d2690f780004
> Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected,
> path=/hbase/unassigned/70236052
> 2011-08-16 07:45:40,715 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0]
> zookeeper.ZKAssign(712): regionserver:47710-0x131d2690f780004 Successfully
> transitioned node 70236052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
> 2011-08-16 07:45:40,715 DEBUG [Thread-760-EventThread]
> zookeeper.ZooKeeperWatcher(252): master:60701-0x131d2690f780009 Received
> ZooKeeper Event, type=NodeDataChanged, state=SyncConnected,
> path=/hbase/unassigned/70236052
> 2011-08-16 07:45:40,716 INFO [PostOpenDeployTasks:70236052]
> catalog.RootLocationEditor(62): Setting ROOT region location in ZooKeeper as
> C4S2.site:47710
> 2011-08-16 07:45:40,716 DEBUG [Thread-760-EventThread]
> zookeeper.ZKUtil(1109): master:60701-0x131d2690f780009 Retrieved 52 byte(s)
> of data from znode /hbase/unassigned/70236052 and set watcher;
> region=-ROOT-,,0, server=C4S2.site,47710,1313495126115,
> state=RS_ZK_REGION_OPENING
> 2011-08-16 07:45:40,717 DEBUG [Thread-760-EventThread]
> master.AssignmentManager(477): Handling transition=RS_ZK_REGION_OPENING,
> server=C4S2.site,47710,1313495126115, region=70236052/-ROOT-
> 2011-08-16 07:45:40,725 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0]
> zookeeper.ZKAssign(661): regionserver:47710-0x131d2690f780004 Attempting to
> transition node 70236052/-ROOT- from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_OPENED
> 2011-08-16 07:45:40,727 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0]
> zookeeper.ZKUtil(1109): regionserver:47710-0x131d2690f780004 Retrieved 52
> byte(s) of data from znode /hbase/unassigned/70236052; data=region=-ROOT-,,0,
> server=C4S2.site,47710,1313495126115, state=RS_ZK_REGION_OPENING
> 2011-08-16 07:45:40,740 DEBUG
> [RegionServer:0;C4S2.site,47710,1313495126115-EventThread]
> zookeeper.ZooKeeperWatcher(252): regionserver:47710-0x131d2690f780004
> Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected,
> path=/hbase/unassigned/70236052
> 2011-08-16 07:45:40,740 DEBUG [Thread-760-EventThread]
> zookeeper.ZooKeeperWatcher(252): master:60701-0x131d2690f780009 Received
> ZooKeeper Event, type=NodeDataChanged, state=SyncConnected,
> path=/hbase/unassigned/70236052
> 2011-08-16 07:45:40,740 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0]
> zookeeper.ZKAssign(712): regionserver:47710-0x131d2690f780004 Successfully
> transitioned node 70236052 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
> 2011-08-16 07:45:40,741 DEBUG [RS_OPEN_ROOT-C4S2.site,47710,1313495126115-0]
> handler.OpenRegionHandler(121): Opened -ROOT-,,0.70236052
> 2011-08-16 07:45:40,741 DEBUG [Thread-760-EventThread]
> zookeeper.ZKUtil(1109): master:60701-0x131d2690f780009 Retrieved 52 byte(s)
> of data from znode /hbase/unassigned/70236052 and set watcher;
> region=-ROOT-,,0, server=C4S2.site,47710,1313495126115,
> state=RS_ZK_REGION_OPENED
> 2011-08-16 07:45:40,741 DEBUG [Thread-760-EventThread]
> master.AssignmentManager(477): Handling transition=RS_ZK_REGION_OPENED,
> server=C4S2.site,47710,1313495126115, region=70236052/-ROOT-
> // The warning below shows that the ZK node cannot be cleaned up because we
> // have already forced ROOT online. The test then waits forever.
> 2011-08-16 07:45:40,741 WARN [Thread-760-EventThread]
> master.AssignmentManager(540): Received OPENED for region 70236052/-ROOT-
> from server C4S2.site,47710,1313495126115 but region was in the state null
> and not in expected PENDING_OPEN or OPENING states
> 2011-08-16 07:45:41,018 DEBUG [Master:0;C4S2.site:60701]
> zookeeper.ZKUtil(1109): master:60701-0x131d2690f780009 Retrieved 52 byte(s)
> of data from znode /hbase/unassigned/70236052 and set watcher;
> region=-ROOT-,,0, server=C4S2.site,47710,1313495126115,
> state=RS_ZK_REGION_OPENED
> 2011-08-16 07:45:41,233 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:41,337 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:41,439 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:41,543 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:41,645 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:41,748 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:41,900 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:42,002 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:42,105 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:42,206 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:42,308 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
> 2011-08-16 07:45:42,410 DEBUG [Thread-760] zookeeper.ZKAssign(807): ZK RIT ->
> 70236052
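
For illustration only, the wait proposed in the description (blocking until the ROOT transition has completed) might be sketched as follows. This is a toy model, not HBase code: the class RitWaitSketch and its methods startTransition, finishTransition and the simplified waitForAssignment are hypothetical names, and a plain in-memory set stands in for the znodes under /hbase/unassigned.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a regions-in-transition (RIT) set. Hypothetical; not HBase code. */
public class RitWaitSketch {
    // Stand-in for the znodes under /hbase/unassigned.
    private final Set<String> regionsInTransition = new HashSet<>();

    /** Marks a region as being in transition. */
    public synchronized void startTransition(String region) {
        regionsInTransition.add(region);
    }

    /** Called once the OPENED event is handled: clear the entry and wake waiters. */
    public synchronized void finishTransition(String region) {
        regionsInTransition.remove(region);
        notifyAll();
    }

    /**
     * Blocks until the region has left RIT or the timeout elapses.
     * Returns true if the region left RIT, false on timeout or interruption.
     */
    public synchronized boolean waitForAssignment(String region, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (regionsInTransition.contains(region)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false; // the entry was never cleaned up
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        RitWaitSketch rit = new RitWaitSketch();
        rit.startTransition("-ROOT-");
        // Simulates the region server finishing the open a little later.
        Thread opener = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            rit.finishTransition("-ROOT-");
        });
        opener.start();
        boolean done = rit.waitForAssignment("-ROOT-", 5000);
        opener.join();
        System.out.println("ROOT transition completed: " + done);
    }
}
```

If finishTransition is never called for a region, which is roughly the situation the logs above suggest for the ROOT znode, the wait in this sketch ends only when the timeout expires.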