Thanks for the reply. I think the exception that you're asking about is this one:

2011-08-05 10:57:34,529 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Received exception accessing META during server shutdown of hadoop-3.ionamerica.priv,60020,1312306642172, retrying META read
2011-08-05 10:57:37,538 WARN org.apache.hadoop.hbase.zookeeper.MetaNodeTracker: Tried to reset META server location after seeing the completion of a new META assignment but got an IOE
java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
        at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy6.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRegionLocation(CatalogTracker.java:424)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:272)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:331)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
        at org.apache.hadoop.hbase.zookeeper.MetaNodeTracker.nodeDeleted(MetaNodeTracker.java:64)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:276)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)

If it helps at all, I put a copy of the master log up at
https://s3-us-west-1.amazonaws.com/brent-be-public/hbase-hbase-master-hadoop-master.log.2011-08-05-partial
which contains the time frame from
when the master first noticed the region server was dead until it started
spitting out "Received exception accessing META during server shutdown..."
over and over again.

Thanks,
Brent

On Mon, Aug 8, 2011 at 4:14 PM, Stack <[email protected]> wrote:

> On Fri, Aug 5, 2011 at 2:13 PM, Brent Miller <[email protected]> wrote:
> > I was under the assumption that if a regionserver failed, the clients would
> > automatically switch over to a good regionserver. Also, if I pull up the
> > master's web UI, it no longer shows the failed regionserver in the "Region
> > Servers" section. Is this a bug or does the client have to somehow check if
> > a regionserver is valid?
> >
> > We're using Cloudera's HBase 0.90.3-cdh3u1 on Ubuntu 10.04
> >
>
> What usually happens is that when a regionserver dies, the master will
> notice its absence and then it will deploy the regions the dead server
> was carrying elsewhere. The process that does this is named
> ServerShutdownHandler. In your case above, it seems that this handler
> is having an issue processing the dead server -- so the regions did
> not get reassigned. What is the exception that is being thrown when
> we try to contact .META. region?
>
> St.Ack
>
