[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810755#comment-13810755 ]

Hudson commented on HBASE-8667:
-------------------------------

SUCCESS: Integrated in HBase-0.94-security #326 (See [https://builds.apache.org/job/HBase-0.94-security/326/])
HBASE-9842 Backport HBASE-9593 and HBASE-8667 to 0.94 (rajeshbabu) (larsh: rev 1536592)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenInitializing.java

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
------------------------------------------------------------------------------------------------------------------

Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Priority: Critical
Fix For: 0.98.0, 0.95.2
Attachments: HBASE-8667_Trunk-V2.patch, HBASE-8667_Trunk.patch, HBASE-8667_trunk.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch

While testing the HBASE-8640 fix, I found that a master and regionserver bound to different interfaces do not communicate properly.
My machine has two interfaces, 1) lo and 2) eth0, and the default hostname interface is lo.
I configured the master ipc address to the IP of the eth0 interface and started the master and regionserver on the same machine.
1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo.
2) Since the rpc client does not bind to any IP address, when the RS reports its startup it gets registered with the eth0 IP address (it should actually register as localhost).

Here are the RS logs:
{code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
{code}

Here are the master logs:
{code}
2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
{code}

Since the master has the wrong rpc server address for the RS, META is not getting assigned.
{code}
2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
- org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
    at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
    at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
    at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
    at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
    at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
    at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
    at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
    at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826)
    at
{code}
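The server identifiers in the logs above have the form host,port,startcode (for example 192.168.0.100,60020,1369960502544). As a rough illustration of why the assignment fails, here is a minimal sketch that splits such an identifier and compares the registered host with the host the RS rpc server is actually bound to. ServerNameSketch, parse and hostDiffersFrom are hypothetical names made up for this example; this is not HBase's own ServerName class, and the comparison is purely textual.

```java
/** Hypothetical helper for reading the host,port,startcode triples seen in the logs. */
final class ServerNameSketch {
    final String host;
    final int port;
    final long startCode;

    private ServerNameSketch(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    /** Splits "host,port,startcode" into its three fields. */
    static ServerNameSketch parse(String text) {
        String[] parts = text.trim().split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode but got: " + text);
        }
        return new ServerNameSketch(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    /**
     * True when the registered host string differs from the host the rpc server
     * bound to. Textual comparison only; no name resolution is attempted.
     */
    boolean hostDiffersFrom(String boundHost) {
        return !host.equalsIgnoreCase(boundHost);
    }

    public static void main(String[] args) {
        // Value taken from the master log line "Registering server=...".
        ServerNameSketch registered = parse("192.168.0.100,60020,1369960502544");
        // The RS log says the rpc server hostname was "localhost" before the master overrode it.
        System.out.println("registered host = " + registered.host
                + ", port = " + registered.port
                + ", mismatch with localhost = " + registered.hostDiffersFrom("localhost"));
    }
}
```

In the scenario reported here the master registered the RS under 192.168.0.100 while the RS rpc server listens on lo, which would explain the "Connection refused" when the master tries to open .META. on 192.168.0.100:60020.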
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807775#comment-13807775 ]

Hudson commented on HBASE-8667:
-------------------------------

FAILURE: Integrated in HBase-0.94 #1188 (See [https://builds.apache.org/job/HBase-0.94/1188/])
HBASE-9842 Backport HBASE-9593 and HBASE-8667 to 0.94 (rajeshbabu) (larsh: rev 1536592)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenInitializing.java
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741590#comment-13741590 ]

Hudson commented on HBASE-8667:
-------------------------------

SUCCESS: Integrated in hbase-0.95-on-hadoop2 #244 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/244/])
HBASE-8667 Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine (stack: rev 1514426)
* /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741652#comment-13741652 ]

Hudson commented on HBASE-8667:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK #4398 (See [https://builds.apache.org/job/HBase-TRUNK/4398/])
HBASE-8667 Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine (stack: rev 1514427)
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741731#comment-13741731 ]

Hudson commented on HBASE-8667:
-------------------------------

FAILURE: Integrated in hbase-0.95 #455 (See [https://builds.apache.org/job/hbase-0.95/455/])
HBASE-8667 Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine (stack: rev 1514426)
* /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741760#comment-13741760 ]

Hudson commented on HBASE-8667:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #678 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/678/])
HBASE-8667 Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine (stack: rev 1514427)
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711873#comment-13711873 ]

Lars Hofhansl commented on HBASE-8667:
--------------------------------------

[~stack], did you get a chance to test this? Patch looks good to me.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
------------------------------------------------------------------------------------------------------------------

Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.10
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698655#comment-13698655 ]

stack commented on HBASE-8667:

[~rajesh23] So you are just doing what NetUtils was doing internally?

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698660#comment-13698660 ]

rajeshbabu commented on HBASE-8667:

bq. So you are just doing what NetUtils was doing internally?

Yes, Stack.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698665#comment-13698665 ]

stack commented on HBASE-8667:

Patch looks great. Let me try it on a cluster w/ broken reverse DNS to make sure we don't regress, but I like that this patch already appears to have removed the special-casing of the Ubuntu install. Good on you, Rajesh.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698683#comment-13698683 ]

Viral Bajaria commented on HBASE-8667:

I just tried this patch today on another Ubuntu VM (12.04) and it worked fine for me. I was using hadoop 1.0.4, if that helps.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691721#comment-13691721 ]

rajeshbabu commented on HBASE-8667:

Thanks [~viralbajaria] for testing the patch.

[~anoop.hbase]
bq. Seems NetUtils#connect(Socket socket, SocketAddress endpoint, SocketAddress localAddr, int timeout) not available with hadoop1.

Yes, Anoop. It's not present in hadoop 1.0.4; the latest patch avoids it.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
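One way to stay compatible with hadoop 1.0.4, where the four-argument NetUtils#connect quoted above is missing, is to bind the socket with the JDK before connecting. The sketch below is an assumption about the general approach, not the contents of the attached patch. BoundConnect and its methods are hypothetical names, the parameter list simply mirrors the signature quoted above, and only java.net calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketAddress;

// Hypothetical helper; not part of HBase or Hadoop.
class BoundConnect {

    /**
     * Binds the socket to localAddr (when given) and then connects to endpoint.
     * A null localAddr keeps the default behaviour: the OS picks the source address.
     */
    static void connect(Socket socket, SocketAddress endpoint,
                        SocketAddress localAddr, int timeoutMs) {
        try {
            if (localAddr != null) {
                socket.bind(localAddr); // fixes the source address the peer will see
            }
            socket.connect(endpoint, timeoutMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Connects to a throwaway loopback server and reports the client's local address. */
    static InetAddress demoLocalAddress() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            connect(client,
                    new InetSocketAddress(loopback, server.getLocalPort()),
                    new InetSocketAddress(loopback, 0), // port 0 = any free port
                    2000);
            return client.getLocalAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("client bound to: " + demoLocalAddress());
    }
}
```

In the scenario from this issue, the regionserver side would pass the address its own RPC server is bound to as localAddr, so that the master registers an address it can actually connect back to.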
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691777#comment-13691777 ]

Hadoop QA commented on HBASE-8667:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12589373/HBASE-8667_trunk_v6.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6115//console

This message is automatically generated.
Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691531#comment-13691531 ]

stack commented on HBASE-8667:

[~rajesh23] I was sitting w/ Viral and yeah, he had the classic Ubuntu issue and the patch just fixed it.

I see this when I tried it on hadoopqa precommit: https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/6110/artifact/trunk/patchprocess/trunk1.0JavacWarnings.txt

Seems like a hadoop1 incompatibility?

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691700#comment-13691700 ]

Anoop Sam John commented on HBASE-8667:
---
It seems NetUtils#connect(Socket socket, SocketAddress endpoint, SocketAddress localAddr, int timeout) is not available with hadoop1.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13690172#comment-13690172 ]

Anoop Sam John commented on HBASE-8667:
---
Looks good.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13690381#comment-13690381 ]

stack commented on HBASE-8667:
--
So, to be clear here... the master will use the hostname the RS gives it. It will not try to resolve the name the RS gives it and use the resolved name instead. Let me try this patch on a cluster where reverse DNS is broken to make sure we don't see the doubled-RS issue.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13690753#comment-13690753 ]

Viral Bajaria commented on HBASE-8667:
--
+1 to the patch. I applied it on the current HBase 0.95 branch and was able to get hbase to work in standalone mode on Ubuntu 12.04.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688933#comment-13688933 ]

rajeshbabu commented on HBASE-8667:
---
[~stack]
https://issues.apache.org/jira/secure/attachment/12587780/HBASE-8667_trunk.patch is the latest patch I have tested. I think you are reviewing https://issues.apache.org/jira/secure/attachment/12587092/HBASE-8667_Trunk-V2.patch.
Sorry for the patch name, Stack; it should have been something like HBASE-8667_trunk_v3.patch.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688953#comment-13688953 ]

stack commented on HBASE-8667:
--
Whoops. My fault.
Why not just pass this.isa rather than wrap it in a new InetSocketAddress (which will do a new resolve -- could do http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#createUnresolved(java.lang.String, int) I suppose)?
{code}
+rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress(this.isa.getHostName(), 0));
{code}

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
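Editor's sketch for stack's comment above on resolving versus not resolving: the snippet below uses only the JDK, not HBase code, and contrasts the two-argument InetSocketAddress constructor (which attempts a name lookup when the object is created) with InetSocketAddress.createUnresolved (which does not). The class name UnresolvedAddressSketch and the helper unresolved() are hypothetical names chosen for illustration; "localhost" stands in for this.isa.getHostName().

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressSketch {
    // Hypothetical helper: builds an address without triggering a name lookup.
    static InetSocketAddress unresolved(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        // The (String, int) constructor tries to resolve the host name right away.
        InetSocketAddress eager = new InetSocketAddress("localhost", 0);
        // createUnresolved keeps only the host name and port; no lookup happens.
        InetSocketAddress lazy = unresolved("localhost", 0);

        System.out.println("constructor      isUnresolved=" + eager.isUnresolved());
        System.out.println("createUnresolved isUnresolved=" + lazy.isUnresolved());
        // For an unresolved address, getAddress() returns null.
        System.out.println("createUnresolved getAddress=" + lazy.getAddress());
    }
}
```

Whether an unresolved address is acceptable at the point where the patch uses it depends on how RpcClient consumes it, which this sketch does not cover.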
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688964#comment-13688964 ]

rajeshbabu commented on HBASE-8667:
---
[~ram_krish]
bq. So after this patch the RPC server and the rpc client on the RS connects using the same host?
Yes, Ram. If we don't pass a bind address in the connect call, at present it passes null internally.
{code}
// connection time out is 20s
NetUtils.connect(this.socket, remoteId.getAddress(), getSocketTimeout(conf));
{code}
{code}
public static void connect(Socket socket, SocketAddress address, int timeout) throws IOException {
  connect(socket, address, null, timeout);
}
{code}

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
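Editor's sketch for rajeshbabu's comment above on passing a bind address to connect: the snippet below uses plain java.net sockets rather than Hadoop's NetUtils, and shows that when a client socket is bound to a local address before connecting, the accepting side observes that address as the peer. The class name BoundClientSketch and the method peerSeenByServer are hypothetical; the loopback address and ephemeral ports are assumptions chosen so the example runs on a single machine. The 20000 ms timeout mirrors the "20s" comment quoted above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BoundClientSketch {
    // Connects from a socket explicitly bound to localAddr and returns the
    // client address as observed by the accepting (server) side.
    static InetAddress peerSeenByServer(InetAddress localAddr) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {
            // Bind first: port 0 lets the OS pick an ephemeral local port.
            client.bind(new InetSocketAddress(localAddr, 0));
            client.connect(
                new InetSocketAddress(server.getInetAddress(), server.getLocalPort()), 20000);
            try (Socket accepted = server.accept()) {
                return accepted.getInetAddress();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress lo = InetAddress.getLoopbackAddress();
        System.out.println("server saw client as " + peerSeenByServer(lo));
    }
}
```

Without the bind call, the operating system chooses the source address, which is the situation the issue describes for the RS reporting to a master bound to eth0.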
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688967#comment-13688967 ]

rajeshbabu commented on HBASE-8667:
---
[~stack]
bq. could do http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#createUnresolved(java.lang.String, int) I suppose)?

This is good. I will change and update the patch. Thanks.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
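For readers unfamiliar with the `InetSocketAddress.createUnresolved(String, int)` call suggested in the comment above, the following is a minimal sketch of what that JDK method does. It is not taken from any HBASE-8667 patch; the class name `UnresolvedAddressSketch`, the helper `unresolved`, and the host name `regionserver.example.invalid` are made up for illustration.

```java
import java.net.InetSocketAddress;

// Hypothetical illustration class; not part of HBase.
class UnresolvedAddressSketch {

    // Builds a socket address from a host name and port without attempting
    // a name lookup. The host name here is a placeholder.
    static InetSocketAddress unresolved(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress isa = unresolved("regionserver.example.invalid", 60020);
        // An unresolved address keeps the host string but carries no InetAddress.
        System.out.println("unresolved: " + isa.isUnresolved());
        System.out.println("address:    " + isa.getAddress());
        System.out.println("host:       " + isa.getHostString());
        System.out.println("port:       " + isa.getPort());
    }
}
```

Because no lookup happens, the resulting object is only a (host, port) pair; whoever later opens a connection with it has to resolve the name first.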
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689011#comment-13689011 ]

Hadoop QA commented on HBASE-8667:
--
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12588784/HBASE-8667_trunk_v4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6083//console

This message is automatically generated.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689306#comment-13689306 ]

stack commented on HBASE-8667:
--
[~rajesh23] v5 has this still:

-rpcClient = new RpcClient(conf, clusterId);
+rpcClient = new RpcClient(conf, clusterId, new InetSocketAddress(
+this.isa.getAddress(), 0));

You cannot do?

rpcClient = new RpcClient(conf, clusterId, this.isa);

Thanks for doing this fixup.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689329#comment-13689329 ]

rajeshbabu commented on HBASE-8667:
---
If we use this.isa directly, we will get a BindException because the RPC server is already bound to that port (60010).

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
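The BindException explanation in the comment above can be reproduced with plain JDK sockets. The sketch below is only an illustration under assumptions: it uses the loopback address and an OS-chosen port rather than the addresses and ports from the thread, and `ClientBindSketch`, `canBind`, and `demo` are hypothetical names, not HBase code. It tries to bind a client socket first to the exact address and port a server is already listening on, then to the same IP with port 0.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration class; not part of HBase.
class ClientBindSketch {

    // Returns true if a fresh client socket can bind to the given local
    // address, false if the bind is rejected with a BindException.
    static boolean canBind(InetSocketAddress local) {
        try (Socket client = new Socket()) {
            client.bind(local);
            return true;
        } catch (BindException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Element 0: bind to the server's own address and port.
    // Element 1: bind to the server's IP with port 0 (ephemeral port).
    static boolean[] demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket()) {
            // Stand-in for the RPC server; port 0 lets the OS pick a free port.
            server.bind(new InetSocketAddress(loopback, 0));
            InetSocketAddress serverIsa =
                new InetSocketAddress(loopback, server.getLocalPort());
            boolean samePort = canBind(serverIsa);
            boolean ephemeral = canBind(new InetSocketAddress(serverIsa.getAddress(), 0));
            return new boolean[] { samePort, ephemeral };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("bind to server address and port: " + result[0]);
        System.out.println("bind to server IP, port 0:       " + result[1]);
    }
}
```

On typical platforms the first bind should be refused and the second should succeed, which matches the reasoning for passing `new InetSocketAddress(this.isa.getAddress(), 0)` instead of `this.isa`: the client keeps the server's IP but lets the OS choose the port.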
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689418#comment-13689418 ]

stack commented on HBASE-8667:
--
Ok. Makes sense. I am up for trying it. Thanks [~rajesh23]. Anyone else want to take a look?

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688173#comment-13688173 ]

rajeshbabu commented on HBASE-8667:
---
bq. When we did NOT supply where to bind, what was client using? Some default?

If we don't configure a bind address, the primary hostname is used as the bind address.

Here are the tests I ran, including some permutations. In all cases the master and RS communicate properly and the cluster is fine.
{code}
Test 1:
===
Master bind address: 10.18.40.29
RS bind address : 127.0.0.1
RS RPC client bind address : Taken same ip of RS bind address (127.0.0.1:49503)
tcp 0 0 10.18.40.29:6 :::* LISTEN 19113/java
tcp 0 0 127.0.0.1:60020 :::* LISTEN 19558/java
tcp 0 0 10.18.40.29:6 127.0.0.1:49503 ESTABLISHED 19113/java

Test 2:
===
Master bind address: 192.168.1.111
RS bind address : 10.18.40.29
RS RPC client bind address : Taken same ip of RS bind address (10.18.40.29:61297)
tcp 0 0 10.18.40.29:60020 :::* LISTEN 22408/java
tcp 0 0 192.168.1.111:6 :::* LISTEN 22277/java
tcp 0 0 192.168.1.111:6 10.18.40.29:61297 ESTABLISHED 22277/java

Test 3:
===
Master bind address: Didn't specify in configuration (it will take primary hostname - in my case ip of primary host name is 10.18.40.29)
RS bind address : Didn't specify (primary host name is 10.18.40.29)
RS RPC client bind address : Taken same ip of RS bind address (10.18.40.29:20302)
tcp 0 0 10.18.40.29:60020 :::* LISTEN 23952/java
tcp 0 0 10.18.40.29:6 :::* LISTEN 23823/java
tcp 0 0 10.18.40.29:6 10.18.40.29:20302 ESTABLISHED 23823/java
{code}
Thanks

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
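The ESTABLISHED lines in the tests above show the master seeing the RS client on the RS bind address. A rough sketch of that effect with plain JDK sockets follows: a client that explicitly binds its local address before connecting is seen by the accepting side under that address. Everything here is an assumption for illustration; it runs entirely on loopback, and `PeerAddressSketch` and `observedClientAddress` are hypothetical names, not the HBase RpcClient API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration class; not part of HBase.
class PeerAddressSketch {

    // Binds a client to localAddr (ephemeral port), connects it to a
    // loopback server, and returns the client IP as the server observes it.
    static InetAddress observedClientAddress(InetAddress localAddr) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress(loopback, 0));
            server.setSoTimeout(2000); // do not block forever in accept()
            try (Socket client = new Socket()) {
                // Explicit local bind; port 0 avoids clashing with any listener.
                client.bind(new InetSocketAddress(localAddr, 0));
                client.connect(
                    new InetSocketAddress(loopback, server.getLocalPort()), 2000);
                try (Socket accepted = server.accept()) {
                    // The address the server would register for this peer.
                    return accepted.getInetAddress();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress bound = InetAddress.getLoopbackAddress();
        System.out.println("client bound to:  " + bound);
        System.out.println("server observed:  " + observedClientAddress(bound));
    }
}
```

On a host with several interfaces, passing a different local address to the helper would change what the server observes, which is the behaviour the three tests above check with netstat.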
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688636#comment-13688636 ]

stack commented on HBASE-8667:
--
So, this 'fixes' the original issue then?

What about this blog: http://blog.devving.com/why-does-hbase-care-about-etchosts/ Does it fix the scenario described therein w/ ubuntu's 127.0.1.1 mess? It seems like it would.

Good one [~rajesh23]

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688857#comment-13688857 ]

rajeshbabu commented on HBASE-8667:
---
bq. What about this blog: http://blog.devving.com/why-does-hbase-care-about-etchosts/ Does it fix the scenario described therein w/ ubuntu's 127.0.1.1 mess?

It will fix that scenario as well.
{code}
tcp 0 0 127.0.0.1:60020 :::* LISTEN 4636/java
tcp 0 0 127.0.1.1:6 :::* LISTEN 4499/java
tcp 0 0 127.0.1.1:6 127.0.0.1:16430 ESTABLISHED 4499/java
{code}

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688875#comment-13688875 ]

ramkrishna.s.vasudevan commented on HBASE-8667:
---
So after this patch the RPC server and the RPC client on the RS connect using the same host? But the port for the RPC client is different. Any reason why the
{code}
connect(Socket socket, SocketAddress endpoint, SocketAddress localAddr, int timeout)
{code}
was not used previously?

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
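The four-argument connect(socket, endpoint, localAddr, timeout) form quoted in the comment above takes a local address in addition to the remote endpoint. A minimal sketch of that idea using only java.net follows; it is not the code from the attached patches. The class and method names (BoundClientSketch, peerSeenByServer) are hypothetical, and the loopback listener merely stands in for the master's RPC server. The point it illustrates is that the source address the server observes is the one the client bound to before connecting.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not HBase or Hadoop code.
final class BoundClientSketch {

  /**
   * Starts a throwaway listener on the loopback address, connects to it from a
   * client socket explicitly bound to localAddr, and returns the source address
   * the listener observed for that connection.
   */
  static InetAddress peerSeenByServer(InetAddress localAddr, int timeoutMs) {
    try (ServerSocket server = new ServerSocket()) {
      // Port 0 lets the OS pick a free port for the stand-in "master".
      server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
      try (Socket client = new Socket()) {
        // Bind first, then connect: this pins the client's source address
        // instead of leaving the choice to the OS.
        client.bind(new InetSocketAddress(localAddr, 0));
        client.connect(server.getLocalSocketAddress(), timeoutMs);
        try (Socket accepted = server.accept()) {
          return accepted.getInetAddress();
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    InetAddress lo = InetAddress.getLoopbackAddress();
    System.out.println("server saw client at " + peerSeenByServer(lo, 2000));
  }
}
```

Without the bind call the OS chooses the source address from its routing decision for the destination, which is how an RS listening on lo could show up at the master with the eth0 address.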
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688877#comment-13688877 ]

stack commented on HBASE-8667:
--
You remove this:
{code}
- NameStringPair.Builder entry = NameStringPair.newBuilder()
-   .setName(HConstants.KEY_FOR_HOSTNAME_SEEN_BY_MASTER)
-   .setValue(rs.getHostname());
{code}
Implication is that the master and regionserver will never disagree on the RS name? Is that so? Master just takes the name the RS proffers? I do not see any resolve going on in here (no InetAddress construction) so it may be possible that there is no DNS in here to mess us up.

This should be ok:
{code}
+ this.serverNameFromMasterPOV = new ServerName(this.isa.getHostName(), this.isa.getPort(),
+   this.startcode);
{code}
This is over on the RS. And it is telling the master what name to use, the one it found when it did a resolve. We should change the name of this variable then: serverNameFromMasterPOV

This is a change in how we do server naming but it looks safe and solves a few issues we have had for a while. Anyone else want to take a look here?

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
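The added line discussed in the comment above builds the server name on the RS side from the address its own RPC server is bound to, rather than from whatever the master saw on the incoming connection. A rough sketch of that derivation is below, assuming the host,port,startcode shape visible in the logs (for example 192.168.0.100,60020,1369960502544). ServerNameSketch and serverNameOf are hypothetical stand-ins, not the real HBase ServerName class.

```java
import java.net.InetSocketAddress;

// Hypothetical illustration only; the real class is HBase's ServerName.
final class ServerNameSketch {

  /** Joins host, port and startcode in the comma-separated shape seen in the logs. */
  static String serverNameOf(InetSocketAddress isa, long startcode) {
    return isa.getHostName() + "," + isa.getPort() + "," + startcode;
  }

  public static void main(String[] args) {
    // createUnresolved keeps the given host string as-is, so no DNS lookup happens here.
    InetSocketAddress isa = InetSocketAddress.createUnresolved("localhost", 60020);
    System.out.println(serverNameOf(isa, 1369960502544L));
  }
}
```

Because the name comes from the bound address, the master would be told localhost,60020,... in the reported setup, which is where the RS is actually listening.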
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687143#comment-13687143 ]

stack commented on HBASE-8667:
--
This is interesting [~rajesh23]

When we did NOT supply where to bind, what was client using? Some default? Tell us more about the kind of tests you did. Thanks.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13680316#comment-13680316 ]

rajeshbabu commented on HBASE-8667:
---
[~stack]
bq. Our workaround was having the regionserver take the name the master proffered after checkin. This seemed to get rid of an all-too-common problem seen in hbase deploys

Then we need to initialize the RPC server in the RS with the hostname received from the master after checkin, right? Otherwise we will have this issue.

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13680733#comment-13680733 ]

stack commented on HBASE-8667:
--
bq. Then we need to initialize rpc server in RS with the hostname received from master after checkin right? Otherwise we will have this issue.

The regionserver just takes the name and uses it in subsequent communication w/ the master -- it does not change where it is bound based on the name the master gave it.

Are you suggesting that the regionserver only set up an rpcserver after it has gotten its name from the master? What if this disagrees w/ what the operator told us to use in the configuration?

Isn't what we have here a setup problem; we have the regionserver on localhost and the master on an ip? Can you have the regionserver bind to the same ip?

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13680921#comment-13680921 ]

Anoop Sam John commented on HBASE-8667:
---
bq. Are you suggesting that regionserver only set up an rpcserver after it has gotten name from master? What if this disagrees w/ what the operator told us use in the configuration?

Correct. I don't think this is good.

Here the issue was that the RS RPC server was bound to one IP. When the RS reported to the master, the client socket was bound to another network interface, so when the master checked the hostname of the RS it saw a different name. From then on the master uses that name to communicate with the RS, but on the RS side there is no RPC server bound to this hostname/IP. So this RS is effectively not in the cluster at all.

When the master and RS are on separate nodes, the RS node has two network interfaces, and the operator wants to bind the RS to a specific network interface, could this issue also come up?

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680925#comment-13680925 ]

stack commented on HBASE-8667:

bq. client socket was getting bound with another n/w interface

Can we fix this so client has same home as the RS rpcserver?

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
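The question above, whether the rpc client can share a "home" with the RS rpc server, comes down to binding the client socket to a chosen local address before connecting, so that the remote end sees that address as the source. A minimal sketch using only `java.net`, not the HBase patch itself: the class and method names here (`BoundClientSketch`, `observedSource`) are made up for illustration, and the throwaway server on the loopback interface stands in for the master.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative only: not HBase code. Shows the effect of binding a client
// socket to a specific local address before connecting.
final class BoundClientSketch {

    /**
     * Connects to a temporary loopback server from a client socket bound to
     * localAddr and returns the source address the server side observes.
     */
    static InetAddress observedSource(InetAddress localAddr) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            server.setSoTimeout(2000);
            // Without this bind, the OS picks the source address by routing,
            // which is how the wrong interface can end up being reported.
            client.bind(new InetSocketAddress(localAddr, 0));
            client.connect(new InetSocketAddress(loopback, server.getLocalPort()), 2000);
            try (Socket accepted = server.accept()) {
                return accepted.getInetAddress();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress lo = InetAddress.getLoopbackAddress();
        System.out.println("server saw client as " + observedSource(lo));
    }
}
```

In the scenario from the report, the address passed to `bind` would be the one the RS rpc server is bound to, so the master would register that address rather than the eth0 one.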
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680931#comment-13680931 ]

Anoop Sam John commented on HBASE-8667:

bq. Can we fix this so client has same home as the RS rpcserver?

Sounds good Stack.. I was about to do this..

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679484#comment-13679484 ]

Hadoop QA commented on HBASE-8667:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12587054/HBASE-8667_Trunk.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestMasterNoCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5987//console

This message is automatically generated.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_Trunk.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679559#comment-13679559 ]

Ted Yu commented on HBASE-8667:

{code}
-// The hostname the master sees us as.
-if (key.equals(HConstants.KEY_FOR_HOSTNAME_SEEN_BY_MASTER)) {
{code}
We don't need the above anymore? Even for a single network interface setup?

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_Trunk.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679605#comment-13679605 ]

Anoop Sam John commented on HBASE-8667:

The Master is no longer returning any hostname as seen by the Master. In fact, this hostname is now passed by the RS. That is why I have removed this code from the RS. One test failure is related to this. I will change the test.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_Trunk.patch
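The change described in this comment, where the RS passes its own hostname instead of adopting the one the master saw, can be pictured roughly as below. This is a hypothetical sketch, not code from the patch: `RegistrationNameSketch` and `registeredHost` are invented names, and the fallback to the peer address when nothing is reported is an assumption made for the example.

```java
import java.util.Optional;

// Hypothetical illustration of the registration decision discussed above;
// none of these names come from HBase.
final class RegistrationNameSketch {

    /**
     * Picks the hostname to record for a region server: the name the RS
     * reported for itself if it sent one, otherwise the peer address the
     * master observed on the incoming connection.
     */
    static String registeredHost(Optional<String> reportedByRs, String peerAddressSeenByMaster) {
        return reportedByRs.filter(h -> !h.isEmpty()).orElse(peerAddressSeenByMaster);
    }

    public static void main(String[] args) {
        // RS bound to lo reports "localhost"; master saw the eth0 address.
        System.out.println(registeredHost(Optional.of("localhost"), "192.168.0.100"));
        // No name reported: fall back to what the master observed.
        System.out.println(registeredHost(Optional.empty(), "192.168.0.100"));
    }
}
```

With the values from the logs in this issue, the first call yields the address the RS rpc server is actually listening on, which is what the master needs in order to assign META.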
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679621#comment-13679621 ]

Jean-Daniel Cryans commented on HBASE-8667:

How does this patch behave if the region servers are reporting in as localhost?

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679669#comment-13679669 ]

Hadoop QA commented on HBASE-8667:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12587092/HBASE-8667_Trunk-V2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5988//console

This message is automatically generated.

Key: HBASE-8667 URL: https://issues.apache.org/jira/browse/HBASE-8667 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679670#comment-13679670 ] stack commented on HBASE-8667: --

bq. So can we pass the hostname which is actually bound with the RS server to Master when it is reporting?

No.

<background>Clusters rarely have DNS set up so that reverse DNS matches forward lookup. When DNS finds disagreement, it falls back to the lowest common denominator, the IP. In the past, a master often showed twice the actual count of servers: once under the name the regionservers reported, and then the same server again under an IP. Our workaround was having the regionserver take the name the master proffered after check-in. This seemed to get rid of an all-too-common problem seen in hbase deploys.</background>

This blog seems related: http://blog.devving.com/why-does-hbase-care-about-etchosts/

Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
--
Key: HBASE-8667
URL: https://issues.apache.org/jira/browse/HBASE-8667
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Reporter: rajeshbabu
Fix For: 0.98.0, 0.95.2, 0.94.9
Attachments: HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch

While testing HBASE-8640 fix found that master and regionserver running on different interfaces are not communicating properly. I have two interfaces 1) lo 2) eth0 in my machine and default hostname interface is lo. I have configured master ipc address to ip of eth0 interface. Started master and regionserver on the same machine.
1) master rpc server bound to eth0 and RS rpc server bound to lo
2) Since rpc client is not binding to any ip address, when RS is reporting RS startup its getting registered with eth0 ip address(but actually it should register localhost)

Here are RS logs:
{code}
2013-05-31 06:05:28,608 WARN [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,6,1369960497008 2013-05-31 06:05:31,609 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,6,1369960497008 that we are up with port=60020, startcode=1369960502544 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851 2013-05-31 06:05:31,618 INFO [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100 {code} Here are master logs: {code} 2013-05-31 06:05:31,615 INFO [IPC Server handler 9 on 6] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544 {code} Since master has wrong rpc server address of RS, META is not getting assigned. 
{code} 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,6,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false - org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13672227#comment-13672227 ] rajeshbabu commented on HBASE-8667: ---

[~anoop.hbase] Instead of restricting the client by binding it to a specific IP address, it's better to pass the RS's RPC server address to the master for registration. Please submit your patch, Anoop. Thanks.
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13671639#comment-13671639 ] Anoop Sam John commented on HBASE-8667: ---

So here server @RS was bound to a different hostname/ip but the master sees RS identity as another. Master looks at connection's remote host name to decide what is the hostname of the RS.

In HRegionServer (HRS):
{code}
// The master reports back the hostname it saw for this regionserver.
if (key.equals(HConstants.KEY_FOR_HOSTNAME_SEEN_BY_MASTER)) {
  String hostnameFromMasterPOV = e.getValue();
  this.serverNameFromMasterPOV = new ServerName(hostnameFromMasterPOV,
      this.isa.getPort(), this.startcode);
  if (!this.serverNameFromMasterPOV.equals(this.isa.getHostName())) {
    LOG.info("Master passed us a different hostname to use; was=" +
        this.isa.getHostName() + ", but now=" +
        this.serverNameFromMasterPOV.getHostname());
  }
  continue;
}
{code}
When the master has taken some other hostname for this RS, we just log that on the RS side and continue. So can we pass the hostname which is actually bound with the RS server to Master when it is reporting? And have the master use that to communicate with the RS from then on? With this change I am able to start the cluster successfully in Rajesh's scenario. If this change sounds fine, I can submit the patch.
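To make the proposal in this comment concrete, here is a minimal sketch of the choice it describes: the master registers the regionserver under the host the regionserver reports, and falls back to the connection's remote address only when nothing was reported. The class name, method name and parameters are hypothetical and are not taken from the HBase code base or from any attached patch; the thread also records an objection to this approach. The "host,port,startcode" layout simply mirrors the server names visible in the logs.

```java
class RegistrationSketch {

    /**
     * Hypothetical helper: builds a "host,port,startcode" server name.
     * Prefers the host the regionserver says its RPC server is bound to,
     * and falls back to the remote address seen on the connection.
     */
    static String serverName(String reportedHost, String remoteHost,
                             int port, long startcode) {
        boolean hasReported = reportedHost != null && !reportedHost.trim().isEmpty();
        String host = hasReported ? reportedHost.trim() : remoteHost;
        return host + "," + port + "," + startcode;
    }

    public static void main(String[] args) {
        // RS bound to lo reports "localhost"; the connection arrived via eth0.
        System.out.println(serverName("localhost", "192.168.0.100", 60020, 1369960502544L));
        // Nothing reported: behaves like the logs in this issue.
        System.out.println(serverName(null, "192.168.0.100", 60020, 1369960502544L));
    }
}
```

With a reported host the master would try to reach the regionserver at the address its RPC server is actually listening on; without one, the result is the same name that appears in the master log above.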
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13671750#comment-13671750 ] rajeshbabu commented on HBASE-8667: ---

bq. So here server @RS was bound to a different hostname/ip but the master sees RS identity as another. Master looks at connection's remote host name to decide what is the hostname of the RS.

Yes Anoop, this is exactly what is happening. At present no address is bound to the RPC client socket, so the connection's remote hostname is decided by the interface over which the communication happens. What about binding the master and RS ipc addresses (with port 0) to the RPC client sockets in the master and RS, to avoid problems like this one?
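The idea of binding the client socket before connecting can be sketched with plain java.net sockets. This is only an illustration under stated assumptions, not the HBase RpcClient code: the class and method names are made up, and the demo uses 127.0.0.1 for both ends so that it runs on any machine. Port 0 in the bind call asks the OS for an ephemeral port, which is what "port is 0" refers to above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BoundClientSketch {

    /**
     * Starts a throwaway server on 127.0.0.1, connects a client whose local
     * end is explicitly bound to bindHost (port 0 = ephemeral port), and
     * returns the client address as the server side observes it.
     */
    static String addressSeenByServer(String bindHost) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"));
             Socket client = new Socket()) {
            server.setSoTimeout(2000);
            // Bind first: this fixes the source address instead of letting
            // the OS pick one based on the route to the destination.
            client.bind(new InetSocketAddress(InetAddress.getByName(bindHost), 0));
            client.connect(new InetSocketAddress("127.0.0.1", server.getLocalPort()), 2000);
            try (Socket accepted = server.accept()) {
                return accepted.getInetAddress().getHostAddress();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("server saw client as " + addressSeenByServer("127.0.0.1"));
    }
}
```

In the scenario from this issue, the regionserver's client socket was left unbound, so the master saw the eth0 address as the remote end. Binding the client to the regionserver's own ipc address would make the remote address match the address the regionserver's RPC server listens on, provided that address can route to the master.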
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13671053#comment-13671053 ] rajeshbabu commented on HBASE-8667: ---

Here is the netstat report
{code}
tcp0 0 :::192.168.0.100:6 :::* LISTEN 3745/java
tcp0 0 :::127.0.0.1:60020 :::* LISTEN 3837/java
tcp0 0 :::192.168.0.100:55248 :::192.168.0.100:6 ESTABLISHED 3837/java
tcp0 0 :::192.168.0.100:6 :::192.168.0.100:55248 ESTABLISHED 3745/java
{code}
192.168.0.100:6 - master rpc server
127.0.0.1:60020 - region server rpc server

But rs registered with 192.168.0.100 address.
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13671089#comment-13671089 ] ramkrishna.s.vasudevan commented on HBASE-8667: ---

[~rajesh23] Is this a bug? Or should it be dealt with during setup?
[jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
[ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13671147#comment-13671147 ] rajeshbabu commented on HBASE-8667: ---

It is a bug, Ram. We have an option for configuring the network interface or bind address for the master as well as the region server. If the master and regionserver interfaces (bind addresses) are different on the same machine, this problem occurs.