I've made a patch that works for me. I'm not sure whether I should open a
JIRA issue for it. The hack is attached.
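
To show what the hack changes without pulling in Netty, here is a minimal
standalone sketch. It uses plain java.net.ServerSocket, not the Netty
ServerBootstrap that NettyServer actually uses, and the class and method
names are made up for illustration. It binds once to the loopback address
and once to the wildcard address 0.0.0.0, then prints what each socket is
listening on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Illustrative only: BindSketch and bindAndReport are hypothetical names,
// not part of Giraph or Netty.
class BindSketch {
  /** Binds a throwaway server socket and returns the address it listens on. */
  static InetSocketAddress bindAndReport(InetSocketAddress requested) {
    try (ServerSocket server = new ServerSocket()) {
      server.bind(requested);
      // Read the bound address before the socket is closed.
      return (InetSocketAddress) server.getLocalSocketAddress();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Port 0 asks the OS for any free port, so the sketch never collides
    // with a running service.
    InetSocketAddress loopback =
        bindAndReport(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
    InetSocketAddress wildcard =
        bindAndReport(new InetSocketAddress("0.0.0.0", 0));

    // A loopback-bound socket only accepts connections from the same machine.
    System.out.println("loopback bind: " + loopback
        + " loopbackOnly=" + loopback.getAddress().isLoopbackAddress());
    // A wildcard-bound socket accepts connections on every interface.
    System.out.println("wildcard bind: " + wildcard
        + " allInterfaces=" + wildcard.getAddress().isAnyLocalAddress());
  }
}
```

The patch applies the same idea: myAddress keeps the hostname-based
address, while the bind itself goes to the wildcard address.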
On Fri, Oct 17, 2014 at 5:52 PM, Bojan Babic <[email protected]> wrote:
> I'm using Giraph 1.1.0-SNAPSHOT for Hadoop 1.2.1
>
> On Fri, Oct 17, 2014 at 4:01 PM, Bojan Babic <[email protected]> wrote:
>
>> Hi guys,
>>
>> I may be reporting an issue that has already been raised, but I'll take
>> the risk of being ridiculed :)
>>
>> I have a small Hadoop cluster on Digital Ocean (1 master, 4 nodes). I was
>> able to set up the cluster and run the word count example as well as the
>> single-node sample from the Quick Start.
>>
>> As I bring more nodes into play, I run into an issue where the
>> TaskTracker spawns Child processes
>>
>>> hduser@hdnode-2:~# jps
>>> 13839 TaskTracker
>>> 13697 DataNode
>>> 14067 Jps
>>> 13962 Child
>>> 13961 Child
>>
>>
>> that listen on the loopback interface
>>
>>> Proto Recv-Q Send-Q Local Address    Foreign Address State  User   Inode    PID/Program name
>>> tcp   0      0      127.0.0.1:1337   0.0.0.0:*       LISTEN root   21544925 29912/python
>>> tcp   0      0      0.0.0.0:50010    0.0.0.0:*       LISTEN hduser 21691552 13697/java
>>> tcp   0      0      127.0.0.1:30011  0.0.0.0:*       LISTEN hduser 21693578 13962/java
>>> tcp   0      0      0.0.0.0:50075    0.0.0.0:*       LISTEN hduser 21691554 13697/java
>>> tcp   0      0      0.0.0.0:50020    0.0.0.0:*       LISTEN hduser 21691557 13697/java
>>> tcp   0      0      127.0.0.1:50118  0.0.0.0:*       LISTEN hduser 21691870 13839/java
>>> tcp   0      0      0.0.0.0:41640    0.0.0.0:*       LISTEN hduser 21691296 13697/java
>>> tcp   0      0      127.0.0.1:31337  0.0.0.0:*       LISTEN root   20432660 1514/python
>>> tcp   0      0      0.0.0.0:50060    0.0.0.0:*       LISTEN hduser 21692144 13839/java
>>> tcp   0      0      0.0.0.0:http-alt 0.0.0.0:*       LISTEN root   20431897 1421/python
>>> tcp   0      0      127.0.0.1:30001  0.0.0.0:*       LISTEN hduser 21370004 7856/ssh
>>> tcp   0      0      127.0.0.1:30003  0.0.0.0:*       LISTEN hduser 21693562 13961/java
>>> tcp   0      0      127.0.0.1:58741  0.0.0.0:*       LISTEN hduser 21370000 7856/ssh
>>> tcp   0      0      127.0.0.1:58742  0.0.0.0:*       LISTEN hduser 21369982 7845/autossh
>>> tcp   0      0      0.0.0.0:ssh      0.0.0.0:*       LISTEN root   9130     834/sshd
>>> tcp6  0      0      ::1:30001        :::*            LISTEN hduser 21370003 7856/ssh
>>> tcp6  0      0      ::1:58741        :::*            LISTEN hduser 21369999 7856/ssh
>>> tcp6  0      0      :::ssh           :::*            LISTEN root   9165     834/sshd
>>
>>
>> instead of all interfaces (0.0.0.0)
>>
>> This results in the node being unreachable from other nodes, e.g. hdnode-2:
>>
>>>
>>> 2014-10-17 14:10:31,146 WARN org.apache.giraph.comm.netty.NettyClient:
>>> 2014-10-17 14:10:31,159 WARN org.apache.giraph.comm.netty.NettyClient:
>>> connectAllAddresses: Future failed to connect with
>>> hdnode-2/XXX.XXX.XXX.XXX:30003 with 1 failures because of
>>> java.net.ConnectException: Connection refused:
>>> *hdnode-2/XXX.XXX.XXX.XXX:30003*
>>> 2014-10-17 14:10:31,159 INFO org.apache.giraph.comm.netty.NettyClient:
>>> connectAllAddresses: Successfully added 1 connections, (1 total connected)
>>> 2 failed, 2 failures total.
>>
>>
>> If I stop all processes and start nc on port 30003, I can telnet to
>> hdnode-2.
>>
>> My question is whether there is any setting that makes the Child process
>> listen on 0.0.0.0 instead of the loopback interface.
>>
>> Thanks in advance
>>
>>
>
>
> --
> --------------------------------
> Bojan Babic, M.Sc.E.E
> Software developer
> twitter: @bojanbabic
> mobile: +1312 8602944
>
>
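
Since the patch context shows NettyServer building its address from
localHostname, it may be worth checking what each worker's hostname
resolves to. The sketch below uses plain java.net, and its class and method
names are hypothetical. One possible cause (an assumption, not confirmed in
this thread) is a hosts entry that maps the node's own hostname to a
127.x.x.x address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative only: HostnameCheck and isLocalOnly are hypothetical names.
class HostnameCheck {
  /** True when a server bound to this address is only reachable locally. */
  static boolean isLocalOnly(InetAddress address) {
    return address.isLoopbackAddress();
  }

  public static void main(String[] args) throws UnknownHostException {
    // Check the hostname given on the command line, or this machine's own.
    String hostname =
        args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
    for (InetAddress address : InetAddress.getAllByName(hostname)) {
      String note = isLocalOnly(address)
          ? "  <- loopback, not reachable from other nodes" : "";
      System.out.println(hostname + " -> " + address.getHostAddress() + note);
    }
  }
}
```

If the worker's hostname resolves to a loopback address here, a bind to
that hostname would explain the 127.0.0.1:30003 line in the netstat output.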
diff --git a/giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyServer.java b/giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyServer.java
index 454232a..6910d90 100644
--- a/giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyServer.java
+++ b/giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyServer.java
@@ -58,6 +58,7 @@ import io.netty.channel.AdaptiveRecvByteBufAllocator;
 import java.net.InetSocketAddress;
 import java.net.UnknownHostException;
 
+import static org.apache.giraph.conf.GiraphConstants.ALL_INTERFACE_ADDRESS;
 import static org.apache.giraph.conf.GiraphConstants.MAX_IPC_PORT_BIND_ATTEMPTS;
 
 /**
@@ -87,6 +88,8 @@ public class NettyServer {
   private final String localHostname;
   /** Address of the server */
   private InetSocketAddress myAddress;
+  /** Wildcard (all interfaces) address the server binds to */
+  private InetSocketAddress bindAddress;
   /** Current task info */
   private TaskInfo myTaskInfo;
   /** Maximum number of threads */
@@ -343,6 +346,7 @@ public class NettyServer {
     // it as a constant to increase the port number with.
     while (bindAttempts < maxIpcPortBindAttempts) {
       this.myAddress = new InetSocketAddress(localHostname, bindPort);
+      bindAddress = new InetSocketAddress(ALL_INTERFACE_ADDRESS, bindPort);
       if (failFirstPortBindingAttempt && bindAttempts == 0) {
         if (LOG.isInfoEnabled()) {
           LOG.info("start: Intentionally fail first " +
@@ -355,7 +359,7 @@ public class NettyServer {
       }
 
       try {
-        ChannelFuture f = bootstrap.bind(myAddress).sync();
+        ChannelFuture f = bootstrap.bind(bindAddress).sync();
         accepted.add(f.channel());
         break;
       } catch (InterruptedException e) {
diff --git a/giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java b/giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
index e78eb42..5be1987 100644
--- a/giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
+++ b/giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
@@ -541,6 +541,9 @@ public interface GiraphConstants {
   /** Local ZooKeeper directory to use */
   String ZOOKEEPER_DIR = "giraph.zkDir";
 
+  /** Wildcard address used to bind to all interfaces */
+  String ALL_INTERFACE_ADDRESS = "0.0.0.0";
+
   /** Max attempts for handling ZooKeeper connection loss */
   IntConfOption ZOOKEEPER_OPS_MAX_ATTEMPTS =
       new IntConfOption("giraph.zkOpsMaxAttempts", 3,