Gao: Check out our hudson build. Look there for TestRollingRestart failures. Usually it passes. See here https://builds.apache.org/hudson/job/HBase-TRUNK/ for TRUNK. And here for branch: https://builds.apache.org/hudson/view/G-L/view/HBase/job/hbase-0.90/.
Can you figure what its waiting on in the below (TestRollingRestart is a nice ugly test!) St.Ack On Tue, May 24, 2011 at 2:31 AM, Gaojinchao <[email protected]> wrote: > hbase.master.assignment.timeoutmonitor.timeout should be set higher in > TestRollingRestart case. > > It is killed sometimes when we run all case. > > This is my analysis,. > Is there anyone who encounter? > > Logs: > // reassigned root and meta when regionsever has hutdowed > 2011-05-24 09:09:32,989 DEBUG > [MASTER_META_SERVER_OPERATIONS-C4C2.site:57883-4] > master.AssignmentManager(926): Assigning region -ROOT-,,0.70236052 to > C4C2.site,57417,1306199361353 > 2011-05-24 09:09:32,989 DEBUG [main-EventThread] > zookeeper.ZooKeeperWatcher(252): master:57883-0x1301f8b88270014 Received > ZooKeeper Event, type=NodeDeleted, state=SyncConnected, > path=/hbase/root-region-server > 2011-05-24 09:09:32,989 DEBUG > [MASTER_META_SERVER_OPERATIONS-C4C2.site:57883-4] master.ServerManager(617): > New connection to C4C2.site,57417,1306199361353 > 2011-05-24 09:09:32,992 INFO [PRI IPC Server handler 1 on 57417] > regionserver.HRegionServer(2073): Received request to open region: > -ROOT-,,0.7023 > > // reassigned root because of timeout. > 2011-05-24 09:09:39,628 INFO [C4C2.site:57883.timeoutMonitor] > master.AssignmentManager$TimeoutMonitor(1676): Regions in transition timed > out: -ROOT-,,0.70236052 state=OPENING, ts=1306199374115 > 2011-05-24 09:09:43,389 INFO [C4C2.site:57883.timeoutMonitor] > master.AssignmentManager$TimeoutMonitor(1676): Regions in transition timed > out: .META.,,1.1028785192 state=OPENING, ts=1306199375803 > 2011-05-24 09:09:47,632 INFO [C4C2.site:57883.timeoutMonitor] > master.AssignmentManager$TimeoutMonitor(1676): Regions in transition timed > out: -ROOT-,,0.70236052 state=OPENING, ts=1306199374115 > > // some exception happened > 2011-05-24 09:09:49,292 WARN > [RegionServer:3;C4C2.site,56262,1306199352333-EventThread] > zookeeper.MetaNodeTracker(66): Tried to reset META server location after > seeing the completion of a new META assignment but got an IOE > org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Timed out > (2147483647ms) > at > org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:332) > at > org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364) > at > org.apache.hadoop.hbase.zookeeper.MetaNodeTracker.nodeDeleted(MetaNodeTracker.java:64) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:276) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) > 2011-05-24 09:09:49,314 INFO [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] > regionserver.Store(494): Renaming flushed file at > hdfs://localhost:48674/user/root/-ROOT-/70236052/.tmp/4785152515400750744 to > hdfs://localhost:48674/user/root/-ROOT-/70236052/info/8393650380237115731 > 2011-05-24 09:09:49,330 INFO > [RegionServer:3;C4C2.site,57417,1306199361353-EventThread] > catalog.CatalogTracker(419): Passed metaserver is null > 2011-05-24 09:09:49,334 DEBUG [main] master.TestRollingRestart(322): > TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting > > 2011-05-24 09:09:49,349 INFO [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] > regionserver.Store(504): Added > hdfs://localhost:48674/user/root/-ROOT-/70236052/info/8393650380237115731, > entries=4, sequenceid=86, memsize=928.0, filesize=612.0 > 2011-05-24 09:09:49,350 INFO [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] > regionserver.HRegion(1022): Finished memstore flush of ~928.0 for region > -ROOT-,,0.70236052 in 128ms, sequenceid=86, compaction requested=false > 2011-05-24 09:09:49,351 DEBUG [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] > regionserver.Store(418): closed info > 2011-05-24 09:09:49,352 INFO [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] > regionserver.HRegion(550): Closed -ROOT-,,0.70236052 > 2011-05-24 09:09:49,363 INFO > [RegionServer:3;C4C2.site,56262,1306199352333.compactor] > regionserver.Store(737): Completed major compaction of 4 file(s), new > file=hdfs://localhost:48674/user/root/-ROOT-/70236052/info/3975431265129056699, > size=1.0k; total size for store is 1.0k > 2011-05-24 09:09:49,363 INFO > [RegionServer:3;C4C2.site,56262,1306199352333.compactor] > regionserver.HRegion(782): completed compaction on region -ROOT-,,0.70236052 > after 0sec > 2011-05-24 09:09:49,434 DEBUG [main] master.TestRollingRestart(322): > > > > // waited forever ........ > TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting > > 2011-05-24 09:19:43,104 DEBUG [main] master.TestRollingRestart(322): > > TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting > > 2011-05-24 09:19:43,204 DEBUG [main] master.TestRollingRestart(322): > > TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting >
