Heng Chen created HBASE-14182:
---------------------------------
Summary: My regionserver change ip. But hmaster still connect to
old ip after the rs restart
Key: HBASE-14182
URL: https://issues.apache.org/jira/browse/HBASE-14182
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.98.6
Reporter: Heng Chen
I use Docker to deploy my HBase cluster, and the IP address of one regionserver
(RS) changed. After I restarted this RS, the HMaster web UI showed that it had
connected to the HMaster, but its region count was still zero after a long time.
I checked the HMaster log and found that the master still uses the old IP to
connect to this RS.
The HMaster log is below.
PS: 10.11.21.140 is the old IP of RS dx-ape-regionserver1-online.
{code}
2015-08-04 17:24:00,081 INFO [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072
2015-08-04 17:24:06,800 WARN [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10
java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2015-08-04 17:24:06,801 WARN [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10
java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out
at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
... 16 more
{code}
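I have not confirmed the root cause. One possible explanation is that the master resolves the RS hostname once and keeps reusing the resolved address. The sketch below only illustrates that general failure mode; it is not HBase code. The class StaleAddressSketch, its AddressCache, the map-based stand-in for DNS, and the new address 192.0.2.10 (a documentation-only IP) are all hypothetical; only the old IP and the hostname come from the log above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class StaleAddressSketch {

    /** Hypothetical cache keyed by "host:port"; NOT the actual HBase implementation. */
    static final class AddressCache {
        private final Map<String, String> resolved = new HashMap<>();
        private final Function<String, String> dns; // stand-in for a real DNS lookup

        AddressCache(Function<String, String> dns) {
            this.dns = dns;
        }

        /** Resolves the host on first use, then keeps returning the cached IP. */
        String lookup(String host, int port) {
            return resolved.computeIfAbsent(host + ":" + port, key -> dns.apply(host));
        }

        /** Drops the cached entry so that the next lookup resolves the host again. */
        void invalidate(String host, int port) {
            resolved.remove(host + ":" + port);
        }
    }

    public static void main(String[] args) {
        String host = "dx-ape-regionserver1-online";
        int port = 60020;

        // Simulated DNS table: the host initially maps to the old IP from the log.
        Map<String, String> dnsTable = new HashMap<>();
        dnsTable.put(host, "10.11.21.140");
        AddressCache cache = new AddressCache(dnsTable::get);

        System.out.println(cache.lookup(host, port)); // prints 10.11.21.140

        // The container restarts with a new IP (hypothetical value).
        dnsTable.put(host, "192.0.2.10");
        System.out.println(cache.lookup(host, port)); // still prints 10.11.21.140 (stale)

        // Only after the cached entry is dropped does the new IP become visible.
        cache.invalidate(host, port);
        System.out.println(cache.lookup(host, port)); // prints 192.0.2.10
    }
}
```

If something like this is happening in the master, the stale entry would explain why every openRegion call to this RS times out against 10.11.21.140 even though the RS has re-registered.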