The below looks like HBASE-3660, 'HMaster will exit when starting with stale data in cached locations such as -ROOT- or .META.', which is included in the 0.90.2 RC.
St.Ack

On Fri, Apr 1, 2011 at 8:48 AM, Brush,Ryan <[email protected]> wrote:
> This happens in similar conditions but is distinct from HBASE-3617. When the
> region server hosting ROOT isn't available during restart, the
> NoRouteToHostException propagates all the way up the call stack and causes
> the master to abort. It looks like this can be addressed by handling
> NoRouteToHostException at some point and considering that node/region server
> offline.
>
> I applied the patch from HBASE-3617 and it didn't fix the problem I'm seeing,
> which I expected given the stack trace below. Assuming this reasoning is
> correct, does this merit a separate JIRA? It does seem critical in that the
> failure of a single node is preventing us from bringing up our cluster.
>
> 2011-04-01 10:15:19,472 INFO org.apache.hadoop.hbase.master.ServerManager: Exiting wait on regionserver(s) to checkin; count=2, stopped=false, count of regions out on cluster=0
> 2011-04-01 10:15:19,486 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://iphadoop01:9000/hbase/.logs/iphadoop03.northamerica.cerner.net,60020,1301665635981 belongs to an existing region server
> 2011-04-01 10:15:19,486 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://iphadoop01:9000/hbase/.logs/iphadoop05.northamerica.cerner.net,60020,1301665659785 belongs to an existing region server
> 2011-04-01 10:15:22,508 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.net.NoRouteToHostException: No route to host
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>     at $Proxy6.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:385)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:211)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:458)
>     at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:425)
>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:383)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)
> 2011-04-01 10:15:22,510 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2011-04-01 10:15:22,510 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
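
The approach suggested in the quoted message (catch NoRouteToHostException and treat that server as offline instead of letting the exception abort the master) could look roughly like the sketch below. This is only an illustration, not the HBASE-3660 patch: CachedLocationCheck, ConnectionAttempt and isServerReachable are made-up names, and the lambda stands in for the real RPC connect made under CatalogTracker.getCachedConnection in the trace. Which exception types should count as "offline" is also an assumption here.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

public class CachedLocationCheck {

    /** Hypothetical stand-in for the RPC connect to a cached server location. */
    @FunctionalInterface
    public interface ConnectionAttempt {
        void connect() throws IOException;
    }

    /**
     * Returns true if the connect succeeded and false if the server looks
     * unreachable. Connectivity failures are swallowed so the caller can
     * treat the cached location as stale and reassign, rather than abort.
     * Any other IOException still propagates to the caller.
     */
    public static boolean isServerReachable(ConnectionAttempt attempt) throws IOException {
        try {
            attempt.connect();
            return true;
        } catch (NoRouteToHostException | ConnectException
                 | SocketTimeoutException | UnknownHostException e) {
            // Assumption: these all mean "server is offline" for our purposes.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate the failure from the stack trace above.
        boolean up = isServerReachable(() -> {
            throw new NoRouteToHostException("No route to host");
        });
        System.out.println("reachable=" + up); // prints "reachable=false"
    }
}
```

With a check like this, the caller would get `false` back for the dead node and could go on to reassign the catalog region, which is the behavior the quoted message is asking for.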
