I just put a new HBase version on our test cluster and have been testing it. So far, if I shut down an RS, the master does not reassign its regions, and we remain inconsistent forever. Likewise, when a new RS comes up, it does not get regions assigned to it. This is the master log:
2011-05-24 15:30:57,724 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/rs/img645.prod.imageshack.com,60020,1306276075768
2011-05-24 15:30:57,724 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [img645.prod.imageshack.com,60020,1306276075768]
2011-05-24 15:30:57,724 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo found for img645.prod.imageshack.com,60020,1306276075768
2011-05-24 15:30:57,726 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
2011-05-24 15:31:03,330 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
2011-05-24 15:31:03,338 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x1302094818900a4-0x1302094818900a4 Retrieved 32 byte(s) of data from znode /hbase/rs/img645.prod.imageshack.com,60020,1306276262774 and set watcher; img645.prod.imageshack.com:60020
2011-05-24 15:31:03,350 INFO org.apache.hadoop.hbase.master.ServerManager: Server start rejected; we already have img645.imageshack.us:60020 registered; existingServer=serverName=img645.imageshack.us,60020,1306276075768, load=(requests=0, regions=0, usedHeap=40, maxHeap=3995), newServer=serverName=img645.imageshack.us,60020,1306276262774, load=(requests=0, regions=0, usedHeap=23, maxHeap=3995)
2011-05-24 15:31:03,350 INFO org.apache.hadoop.hbase.master.ServerManager: Triggering server recovery; existingServer img645.imageshack.us,60020,1306276075768 looks stale
2011-05-24 15:31:03,353 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=img645.imageshack.us,60020,1306276075768 to dead servers, submitted shutdown handler to be executed, root=false, meta=false
2011-05-24 15:31:03,353 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for img645.imageshack.us,60020,1306276075768
2011-05-24 15:31:04,348 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Reassigning 0 region(s) that img645.imageshack.us,60020,1306276075768 was carrying (skipping 0 regions(s) that are already in transition)
2011-05-24 15:31:04,348 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of img645.imageshack.us,60020,1306276075768
2011-05-24 15:31:06,333 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server img645.imageshack.us,60020,1306276262774 came back up, removed it from the dead servers list
2011-05-24 15:31:06,333 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=img645.imageshack.us,60020,1306276262774, regionCount=0, userLoad=false
2011-05-24 15:31:49,890 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection opening connection to ZooKeeper with ensemble (img648:2181)
2011-05-24 15:31:49,890 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=img648:2181 sessionTimeout=180000 watcher=hconnection
2011-05-24 15:31:49,891 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server img648/38.99.76.205:2181
2011-05-24 15:31:49,892 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to img648/38.99.76.205:2181, initiating session
2011-05-24 15:31:49,893 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server img648/38.99.76.205:2181, sessionid = 0x13024216e690004, negotiated timeout = 180000
2011-05-24 15:31:49,894 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection Received ZooKeeper Event, type=None, state=SyncConnected, path=null
2011-05-24 15:31:49,895 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x13024216e690004 connected
2011-05-24 15:31:49,896 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Set watcher on existing znode /hbase/master
2011-05-24 15:31:49,896 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Retrieved 32 byte(s) of data from znode /hbase/master and set watcher; img648.prod.imageshack.com:60000
2011-05-24 15:31:49,897 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Set watcher on existing znode /hbase/root-region-server
2011-05-24 15:31:49,897 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Retrieved 26 byte(s) of data from znode /hbase/root-region-server and set watcher; img731.imageshack.us:60020
2011-05-24 15:31:49,900 DEBUG org.apache.hadoop.hbase.client.MetaScanner: Scanning .META. starting at row= for max=2147483647 rows
2011-05-24 15:31:49,900 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@26f50154; hsa=img731.imageshack.us:60020
2011-05-24 15:31:49,913 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is img654.imageshack.us:60020
2011-05-24 15:31:50,061 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13024216e690004
2011-05-24 15:31:50,063 INFO org.apache.zookeeper.ZooKeeper: Session: 0x13024216e690004 closed
2011-05-24 15:31:50,063 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down

Please help :)
-Jack
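
One detail in the log above that may be worth checking: the znode under /hbase/rs names the server img645.prod.imageshack.com, while ServerManager refers to the same port and start code as img645.imageshack.us. Whether this is related to the missing reassignment is only a guess; the log does not confirm it. A minimal Python sketch for spotting such cases in a master log follows. The function names are hypothetical helpers, not part of HBase, and the sample holds two entries copied from the log.

```python
import re
from collections import defaultdict

# Matches HBase server names of the form host,port,startcode
SERVER_NAME = re.compile(r"([A-Za-z0-9.-]+),(\d+),(\d{10,})")

def hostnames_by_instance(log_text):
    """Map (port, startcode) -> set of hostnames seen for that instance."""
    seen = defaultdict(set)
    for host, port, startcode in SERVER_NAME.findall(log_text):
        seen[(port, startcode)].add(host)
    return dict(seen)

def mismatched_instances(log_text):
    """Return only the instances that appear under more than one hostname."""
    return {
        key: sorted(hosts)
        for key, hosts in hostnames_by_instance(log_text).items()
        if len(hosts) > 1
    }

# Two entries copied from the master log above.
sample = (
    "2011-05-24 15:30:57,724 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: "
    "RegionServer ephemeral node deleted, processing expiration "
    "[img645.prod.imageshack.com,60020,1306276075768]\n"
    "2011-05-24 15:31:03,350 INFO org.apache.hadoop.hbase.master.ServerManager: "
    "Triggering server recovery; existingServer "
    "img645.imageshack.us,60020,1306276075768 looks stale\n"
)

if __name__ == "__main__":
    print(mismatched_instances(sample))
```

Running it over the full master log instead of the sample would list every region server instance that shows up under two different hostnames.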
