jiang licht wrote:
What are the exact packets and steps used to establish a namenode/datanode
connection and jobtracker/tasktracker connection?
I am asking this due to a weird problem related to starting datanodes and tasktrackers.
In my case, the namenode box has 2 ethernet interfaces bonded together as
interface bond0 with IP address IP_A, and there is an IP alias IP_B on the
local loopback interface as lo:1. All slave boxes sit on the same network
segment as IP_B.
The network is configured such that no slave box can reach the namenode box at IP_A, but the namenode box can reach
the slave boxes (clearly this can only be routed via bond0). So, slave boxes always use "hdfs://IP_B:50001" as
"fs.default.name" in "core-site.xml" and use "IP_B:50002" for the job tracker in
"mapred-site.xml" to reach the namenode box.
There are the following 2 cases for how the namenode (or jobtracker) is configured on
the namenode box.
Case #1: If I set "fs.default.name" to "hdfs://IP_B:50001", no slave boxes can join the cluster as data nodes
because the request to IP_B:50001 failed. "telnet IP_B 50001" on slave boxes resulted in connection refused. So, on
namenode box, I fired "tcpdump -i bond0 tcp port 50001" and then from a slave box did a "telnet IP_B 50001"
and watched for incoming and outgoing packets on namenode box.
Case #2: If I set "fs.default.name" to "hdfs://IP_A:50001", slave boxes can
join the cluster as data nodes. I again used tcpdump and telnet to watch the
traffic. I compared the two cases and found some differences in the traffic. So, I want to know if
there is a hand-shaking stage for the namenode and datanode to establish a connection, and what
packets are exchanged for this purpose, so that I can figure out whether the packets exchanged in
case #1 are correct or not, which may reveal why the connection request from data node to name node fails.
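Before comparing packet traces, it can help to classify what the slave box actually sees when it connects. The sketch below is not part of Hadoop; `probe` is a hypothetical helper that does roughly what "telnet IP_B 50001" does and reports whether the TCP handshake completed, was refused (nothing is listening on that address and port), or timed out (packets dropped or not routed back).

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try a plain TCP connect and classify the outcome (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"      # TCP three-way handshake completed
    except ConnectionRefusedError:
        return "refused"       # peer answered with RST: no listener on that address/port
    except socket.timeout:
        return "timeout"       # no answer: packets dropped or reply not routed
    except OSError as exc:
        # Anything else, e.g. no route to host or name resolution failure.
        return "error: " + errno.errorcode.get(exc.errno, str(exc))


if __name__ == "__main__":
    # Self-contained demo on loopback. On a slave box, call probe() with the
    # namenode address and port from fs.default.name instead.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))    # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print("while listening:", probe("127.0.0.1", port))
    listener.close()
    print("after close:    ", probe("127.0.0.1", port))
```

A "refused" result, as in case #1, means the packets reached the box but no process was listening on that particular address and port, which points at the bind address rather than at the Hadoop-level handshake.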
Also in Case #2, although all slave boxes can join the cluster as datanodes, no slave box
can start as a tasktracker, because at startup the tasktracker box uses IP_A:50001 to
request a connection to the namenode and, as mentioned above (slaves are not allowed to
reach the namenode at IP_A but the reverse direction is ok), this cannot be done. But my
confusion here is that on all slave boxes "fs.default.name" is set to use IP_B:50001,
so how come it ended up contacting the namenode at IP_A:50001?
A bit complicated. But any thoughts?
The NN listens on the card given by the IP address of its hostname; it
does not like people connecting to it using a different hostname than
the one it is on (irritating, something to fix).
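To illustrate why a listener tied to one address refuses connections made to another address on the same box, here is a small sketch using plain sockets. It is not Hadoop code; `listen_on` and `can_connect` are hypothetical helpers, and 127.0.0.2 stands in for "the other address" (it is a second loopback address on Linux and may not exist on other systems).

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def listen_on(address):
    """Open a listening TCP socket bound to one address, on an OS-chosen port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((address, 0))
    server.listen(5)
    return server


if __name__ == "__main__":
    specific = listen_on("127.0.0.1")   # bound to a single address
    wildcard = listen_on("0.0.0.0")     # bound to every local IPv4 address
    for name, server in (("specific", specific), ("wildcard", wildcard)):
        port = server.getsockname()[1]
        # Try the bound address and a different local address on the same port.
        print(name, can_connect("127.0.0.1", port), can_connect("127.0.0.2", port))
        server.close()
```

On Linux one would expect the specifically bound listener to accept only on 127.0.0.1, while the wildcard listener accepts on both addresses.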
It sounds like you have DNS problems. You should have a consistent
mapping from hostname<-->IP address across the entire cluster, but the
issues you describe indicate this may not be the case.
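One way to check the mapping is to run the same forward and reverse lookup on every node and compare the results. The following is a minimal sketch, assuming the standard resolver (DNS or /etc/hosts) is what the nodes use; `check_mapping` is a hypothetical helper, not a Hadoop tool.

```python
import socket


def check_mapping(hostname, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname -> IP -> names and report whether the round trip agrees."""
    ip = resolve(hostname)
    name, aliases, _addresses = reverse(ip)
    known_names = {name.lower(), *(alias.lower() for alias in aliases)}
    return {
        "hostname": hostname,
        "ip": ip,
        "reverse_names": sorted(known_names),
        "consistent": hostname.lower() in known_names,
    }


if __name__ == "__main__":
    # Run this on every node; also pass the namenode's hostname explicitly
    # and compare the reported IP across the cluster.
    local = socket.gethostname()
    try:
        print(check_mapping(local))
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        print("lookup failed for", local, "->", exc)
```

If different nodes report different IPs for the namenode's hostname, or the reverse lookup returns a name the nodes do not use, that would be consistent with a slave being told to contact IP_A even though its own config names IP_B.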