[ 
https://issues.apache.org/jira/browse/HBASE-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653730#comment-14653730
 ] 

Heng Chen commented on HBASE-14182:
-----------------------------------

I think I found the answer!

RpcClient uses Java's InetAddress class, and InetAddress keeps a cache of 
<host, ip> pairs.
getAllByName0 is called whenever the IP for a host is requested. The source 
code in JDK 1.8 is below:

{code}
private static InetAddress[] getAllByName0(String host, InetAddress reqAddr, boolean check)
        throws UnknownHostException {

        /* If it gets here it is presumed to be a hostname */
        /* Cache.get can return: null, unknownAddress, or InetAddress[] */

        /* make sure the connection to the host is allowed, before we
         * give out a hostname
         */
        if (check) {
            SecurityManager security = System.getSecurityManager();
            if (security != null) {
                security.checkConnect(host, -1);
            }
        }

        InetAddress[] addresses = getCachedAddresses(host);

        /* If no entry in cache, then do the host lookup */
        if (addresses == null) {
            addresses = getAddressesFromNameService(host, reqAddr);
        }

        if (addresses == unknown_array)
            throw new UnknownHostException(host);

        return addresses.clone();
    }
{code}

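It checks the cache first, and only asks the name service when the host has 
no cached entry.

So the master keeps using the cached IP, and we can't change the RS IP 
without restarting HMaster.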
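For reference, the lifetime of entries in this cache is governed by the networkaddress.cache.ttl security property. According to the JDK documentation, entries are cached forever when a security manager is installed and for an implementation-specific period otherwise. Below is a minimal sketch of reading and setting that property. The class name DnsCacheTtl and the value of 30 seconds are only illustrative, and I have not verified that tuning this alone fixes the stale address in HBase 0.98.6.

```java
import java.security.Security;

public class DnsCacheTtl {
    // Name of the JDK security property that controls how long
    // successful lookups stay in the InetAddress cache.
    static final String TTL_PROPERTY = "networkaddress.cache.ttl";

    /** Returns the configured TTL as a string, or null if it is not set. */
    static String currentTtl() {
        return Security.getProperty(TTL_PROPERTY);
    }

    /**
     * Sets the TTL in seconds. The JDK reads this property when its cache
     * policy is initialised, so call this before the first name lookup.
     */
    static void setTtlSeconds(int seconds) {
        Security.setProperty(TTL_PROPERTY, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        setTtlSeconds(30); // illustrative value, not a recommendation
        System.out.println(TTL_PROPERTY + " = " + currentTtl());
    }
}
```

The same property can also be set in the JRE's java.security file instead of in code.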

One solution is to store the IP in ZK and pass it in explicitly when creating 
the InetAddress instance (for example with InetAddress.getByAddress, since the 
class has no public constructor), so that the cached lookup is bypassed. That 
would solve the problem. 
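A rough sketch of that idea, assuming the current IP has already been read from ZK as a dotted-quad string (the ZK read itself is not shown). ExplicitAddress, parseIpv4 and resolve are hypothetical names, and 192.0.2.10 is a documentation-range placeholder, not the real new IP. InetAddress.getByAddress(host, bytes) builds the address from the raw bytes without consulting the name service, so the cached <host, ip> pair is never used.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class ExplicitAddress {
    /** Parses a dotted-quad IPv4 string into its 4 raw bytes (hypothetical helper). */
    static byte[] parseIpv4(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        byte[] raw = new byte[4];
        for (int i = 0; i < 4; i++) {
            int octet = Integer.parseInt(parts[i]);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + ip);
            }
            raw[i] = (byte) octet;
        }
        return raw;
    }

    /** Builds a socket address from a known IP, skipping hostname resolution. */
    static InetSocketAddress resolve(String host, String ip, int port) {
        try {
            // No name service lookup happens here; the bytes are used as given.
            InetAddress addr = InetAddress.getByAddress(host, parseIpv4(ip));
            return new InetSocketAddress(addr, port);
        } catch (UnknownHostException e) {
            // Only thrown when the byte array has an illegal length.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Host and port are from the log in this issue; the IP is a placeholder.
        InetSocketAddress rs = resolve("dx-ape-regionserver1-online", "192.0.2.10", 60020);
        System.out.println(rs.getHostName() + " -> " + rs.getAddress().getHostAddress());
    }
}
```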



> My regionserver change ip. But hmaster still connect to old ip after the rs 
> restart
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-14182
>                 URL: https://issues.apache.org/jira/browse/HBASE-14182
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.98.6
>            Reporter: Heng Chen
>
> I use Docker to deploy my HBase cluster, and the RS IP changed. After 
> restarting this RS, the HMaster web UI shows it connected to HMaster, but its 
> region count is still zero after a long time. I checked the HMaster log and 
> found that the master still uses the old IP to connect to this RS.
> The HMaster log is below.
> PS: 10.11.21.140 is the old IP of RS dx-ape-regionserver1-online
> {code}
> 2015-08-04 17:24:00,081 INFO  [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072
> 2015-08-04 17:24:06,800 WARN  [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10
> java.net.ConnectException: Connection timed out
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
>         at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
>         at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
>         at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
>         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
>         at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
>         at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
>         at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-08-04 17:24:06,801 WARN  [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10
> java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out
>         at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461)
>         at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
>         at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
>         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
>         at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
>         at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
>         at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection timed out
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
>         at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
>         ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
