Heng Chen created HBASE-14182:
---------------------------------

             Summary: My regionserver change ip. But hmaster still connect to old ip after the rs restart
                 Key: HBASE-14182
                 URL: https://issues.apache.org/jira/browse/HBASE-14182
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.98.6
            Reporter: Heng Chen

I deploy my HBase cluster with Docker, and the IP of one regionserver (RS) changed. After I restarted this RS, the HMaster web UI showed that it had connected to the master, but its region count was still zero after a long time.

I checked the HMaster log and found that the master still uses the old IP to connect to this RS.

The HMaster log is below.
PS: 10.11.21.140 is the old IP of RS dx-ape-regionserver1-online.

{code}
2015-08-04 17:24:00,081 INFO [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072
2015-08-04 17:24:06,800 WARN [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10
java.net.ConnectException: Connection timed out
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
	at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
	at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
	at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
	at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
	at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
	at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2015-08-04 17:24:06,801 WARN [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10
java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out
	at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483)
	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461)
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
	at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
	at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
	at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
	at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
	at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection timed out
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
	at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
	... 16 more
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
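
For anyone reproducing the report above, the following is a minimal Java sketch of a check that could be run on the master host. It splits a server name of the form host,port,startcode (as printed in the log) and tests whether the hostname currently resolves to a given IP. The class and method names (StaleAddressCheck, ServerNameParts, parse, resolvesTo) are hypothetical helpers written for illustration, not HBase APIs. The result reflects only the resolver of the JVM that runs the check, not whatever address the running master process holds in memory.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.time.Instant;

public class StaleAddressCheck {

    /** Hypothetical holder for the three fields of a "host,port,startcode" server name. */
    static final class ServerNameParts {
        final String host;
        final int port;
        final long startCode;

        ServerNameParts(String host, int port, long startCode) {
            this.host = host;
            this.port = port;
            this.startCode = startCode;
        }
    }

    /** Splits a server name such as "dx-ape-regionserver1-online,60020,1438679950072". */
    static ServerNameParts parse(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode: " + serverName);
        }
        return new ServerNameParts(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    /**
     * Returns true if any address the local resolver gives for host equals ip.
     * Returns false if the host cannot be resolved at all from this machine.
     */
    static boolean resolvesTo(String host, String ip) {
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                if (address.getHostAddress().equals(ip)) {
                    return true;
                }
            }
        } catch (UnknownHostException e) {
            // Unresolvable from here; treated as "does not resolve to ip".
        }
        return false;
    }

    public static void main(String[] args) {
        // Values copied from the log in this report.
        ServerNameParts sn = parse("dx-ape-regionserver1-online,60020,1438679950072");
        String oldIp = "10.11.21.140";

        // The start code is an epoch timestamp in milliseconds.
        System.out.println("RS start code as UTC time: " + Instant.ofEpochMilli(sn.startCode));
        System.out.println(sn.host + " resolves to old IP " + oldIp + ": "
                + resolvesTo(sn.host, oldIp));
    }
}
```

If the check prints false on the master host while the master log still shows 10.11.21.140, the stale address would be coming from state held by the master process rather than from name resolution on that host. That is only a way to narrow the problem down, not a confirmed diagnosis.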