[
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171765#comment-13171765
]
Jonathan Hsieh commented on HBASE-5063:
---------------------------------------
Here's the exception -- unfortunately it doesn't say which master it is unable
to connect to.
{code}
11/12/17 18:50:24 WARN regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1024)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:876)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
    at $Proxy8.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1616)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:787)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:674)
    at java.lang.Thread.run(Thread.java:619)
{code}
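To illustrate what a more useful warning could look like, here is a minimal Java sketch that puts the target address into the message. It is not taken from HRegionServer: the class name MasterConnectLog, the method describeFailure, and the example host and port are all hypothetical.

```java
import java.net.InetSocketAddress;

// Hypothetical helper, not part of HBase: formats a retry warning that
// names the master address the connection attempt was made against.
final class MasterConnectLog {

    // Returns a warning string that includes host:port of the master,
    // followed by the exception's own toString() output.
    static String describeFailure(InetSocketAddress master, Exception cause) {
        return "Unable to connect to master at "
            + master.getHostString() + ":" + master.getPort()
            + ". Retrying. Error was: " + cause;
    }
}
```

Called with an address such as hm1.example.com:60000 and a ConnectException, the resulting line names the host and port, so the log shows which master was being tried.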
> RegionServers fail to report to backup HMaster after primary goes down.
> -----------------------------------------------------------------------
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Priority: Critical
> Attachments: HBASE-5063.patch
>
>
> # Set up a cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active. Tables may show up but RegionServers never
> report on web page. Existing connections are fine. New connections cannot
> find regionservers.
> Note:
> * If we bring up a new HM1 in the same place and kill HM2, the cluster
> functions normally again after recovery. This seems to indicate that
> regionservers are stuck trying to talk to the old HM1.
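The suspected behavior in the note above (a regionserver that keeps retrying a stale master address) can be contrasted with a retry loop that looks the address up again before every attempt. The following Java sketch only illustrates that idea under stated assumptions; it is not the attached patch and not HBase code. MasterLookup, Connector, and connectWithRefresh are hypothetical names, and the Supplier stands in for whatever source holds the current active master.

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;

// Hypothetical sketch, not HBase code: re-reads the active master address
// on every retry instead of reusing the address seen on the first attempt.
final class MasterLookup {

    // Stand-in for the real connection attempt; returns true on success.
    interface Connector {
        boolean tryConnect(InetSocketAddress addr);
    }

    // Returns the address that accepted the connection, or null if all
    // attempts failed. currentMaster is queried once per attempt.
    static InetSocketAddress connectWithRefresh(
            Supplier<InetSocketAddress> currentMaster,
            Connector connector,
            int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            InetSocketAddress addr = currentMaster.get(); // fresh lookup each time
            if (connector.tryConnect(addr)) {
                return addr;
            }
        }
        return null;
    }
}
```

If the supplier reports HM1 on the first call and HM2 afterwards, the loop reaches HM2 on its second attempt. A supplier that always returns the old HM1 address never succeeds, which matches the symptom described in the note.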