ServerNotRunningException coming out of assignRootAndMeta kills the Master
--------------------------------------------------------------------------
Key: HBASE-4470
URL: https://issues.apache.org/jira/browse/HBASE-4470
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Priority: Blocker
Fix For: 0.90.5
I'm surprised we still have issues like that and I didn't get a hit while
googling so forgive me if there's already a jira about it.
When the master starts it verifies the locations of root and meta before
assigning them, if the server is started but not running you'll get this:
{quote}
2011-09-23 04:47:44,859 WARN
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
RemoteException connecting to RS
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
at $Proxy6.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969)
at
org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388)
at
org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287)
at
org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484)
at
org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441)
at
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)
{quote}
I hit that 3-4 times this week while debugging something else. The worst is
that when you restart the master it sees that as a failover, but none of the
regions are assigned so it takes an eternity to get back fully online.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira