[
https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gregory Chanan updated HBASE-4470:
----------------------------------
Attachment: HBASE-4470-90.patch
Here's a patch against 90. It adds a test (which I'll forward-port to
92/94/96 if this gets +1'ed) and small fixups that are 90-specific. With the
fixups the test passes; it failed previously.
I originally had a more complex test that actually threw the exception out of
getHConnection, but that was invasive (I had to restructure the code for
Mockito purposes). Throwing it out of get may also be more resilient: we'll
probably always call "get" when obtaining the meta location, but may not
always call getHConnection in the future.
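For readers without the patch at hand, the idea of failing in "get" can be
sketched roughly as below. This is only an illustration, not the code in the
attachment: VerifyLocationSketch, RegionServer, and verifyRegionLocation are
hypothetical stand-ins, the nested ServerNotRunningException is a local
substitute for the HBase class, and the sketch ignores the RemoteException
wrapping seen in the stack trace. It assumes the desired behavior is that a
server that is not running yet counts as "location not verified" instead of
letting the exception propagate.

```java
import java.io.IOException;

public class VerifyLocationSketch {
    /** Hypothetical local stand-in for HBase's ServerNotRunningException. */
    static class ServerNotRunningException extends IOException {
        ServerNotRunningException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for the region server interface; only "get" is modeled. */
    interface RegionServer {
        String get(String row) throws IOException;
    }

    /**
     * Returns true only if the server answered the "get".
     * A server that is started but not yet running is reported as unverified
     * (false) rather than allowing the exception to escape to the caller.
     */
    static boolean verifyRegionLocation(RegionServer server) throws IOException {
        try {
            server.get("meta-row"); // placeholder row key, an assumption
            return true;
        } catch (ServerNotRunningException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stub that fails in "get", playing the role a mock would play in the test.
        RegionServer notRunning = row -> {
            throw new ServerNotRunningException("Server is not running yet");
        };
        RegionServer healthy = row -> "ok";

        System.out.println(verifyRegionLocation(notRunning));
        System.out.println(verifyRegionLocation(healthy));
    }
}
```

Stubbing the failure at "get" keeps the test independent of how the
connection is obtained, which is the resilience argument made above.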
> ServerNotRunningException coming out of assignRootAndMeta kills the Master
> --------------------------------------------------------------------------
>
> Key: HBASE-4470
> URL: https://issues.apache.org/jira/browse/HBASE-4470
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.4
> Reporter: Jean-Daniel Cryans
> Assignee: Gregory Chanan
> Priority: Critical
> Fix For: 0.90.7
>
> Attachments: HBASE-4470-90.patch
>
>
> I'm surprised we still have issues like that and I didn't get a hit while
> googling so forgive me if there's already a jira about it.
> When the master starts it verifies the locations of root and meta before
> assigning them. If the server is started but not running, you'll get this:
> {quote}
> 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>     at $Proxy6.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484)
>     at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441)
>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)
> {quote}
> I hit that 3-4 times this week while debugging something else. The worst is
> that when you restart the master it sees that as a failover, but none of the
> regions are assigned so it takes an eternity to get back fully online.