[ 
https://issues.apache.org/jira/browse/HADOOP-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470790
 ] 

Marco Nicosia commented on HADOOP-685:
--------------------------------------

I don't think a dataNode should ever try to determine its own hostname.

In situations where dataNodes might have virtual IP addresses configured, or 
have multiple interfaces on different subnets, there is no single "correct" 
hostname to determine. You can do some work to find the 
"administrative hostname" (i.e., the name of the host, not necessarily of any 
particular interface), but that's only useful for identification purposes, and 
requires DNS to get the FQDN.

I know it's not trivial, but I'd prefer that the nameNode record the IP address 
of each incoming dataNode connection. That way there's no DNS involved at any 
level in the transaction, and we know exactly which interface/IP address is 
being used. Additionally, there's no need to worry about /etc/hosts, DHCP, or 
the like. It works for as long as the dataNode is up and making network 
connections.
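
To illustrate what I mean by recording the IP address of a connection, here is 
a minimal sketch using plain java.net sockets rather than Hadoop's actual RPC 
layer. The class and method names (PeerAddressSketch, peerIp, demo) are made up 
for this example, and the loopback setup only stands in for a real 
dataNode-to-nameNode connection. The point is that the accepting side can read 
the peer's numeric address straight from the socket, with no name lookup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch; not part of Hadoop.
public class PeerAddressSketch {

    /** Returns the numeric IP of the remote end of an accepted connection. */
    public static String peerIp(Socket accepted) {
        // getInetAddress() is the remote address of the connection;
        // getHostAddress() formats it as text without resolving a hostname.
        return accepted.getInetAddress().getHostAddress();
    }

    /** Opens a loopback connection and returns the IP the accepting side saw. */
    public static String demo() {
        try {
            // Build 127.0.0.1 from raw bytes so that no resolver is consulted.
            InetAddress loopback = InetAddress.getByAddress(new byte[] {127, 0, 0, 1});
            try (ServerSocket server = new ServerSocket(0, 1, loopback);
                 Socket client = new Socket(loopback, server.getLocalPort());
                 Socket accepted = server.accept()) {
                return peerIp(accepted);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

On a machine with several interfaces, the address returned this way is 
whichever one the connection actually came from, which is the property I'm 
after.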

In order to support multiple dataNodes per machine, dataNodes need to report 
their listening port, but I think that's required regardless of how we solve 
this problem?
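
As a rough sketch of how that could look, the nameNode could key each dataNode 
on the IP address it observed plus the port the dataNode reported. Everything 
here (DataNodeRegistrySketch, the "ip:port" key format, the labels, the port 
numbers) is hypothetical and only meant to show that two dataNodes on the same 
address stay distinct. It also assumes IPv4 text addresses, since IPv6 literals 
contain colons and would need a different key format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not part of Hadoop.
public class DataNodeRegistrySketch {

    // Maps "ip:port" to a free-form label for the registered dataNode.
    private final Map<String, String> nodes = new LinkedHashMap<>();

    /** Builds the identity key from the observed IP and the reported port. */
    public static String key(String observedIp, int reportedPort) {
        if (reportedPort < 1 || reportedPort > 65535) {
            throw new IllegalArgumentException("port out of range: " + reportedPort);
        }
        return observedIp + ":" + reportedPort;
    }

    /** Registers a dataNode; returns false if that ip:port is already taken. */
    public boolean register(String observedIp, int reportedPort, String label) {
        return nodes.putIfAbsent(key(observedIp, reportedPort), label) == null;
    }

    public int size() {
        return nodes.size();
    }

    public static void main(String[] args) {
        DataNodeRegistrySketch registry = new DataNodeRegistrySketch();
        // Two dataNodes on the same loopback address, told apart by port.
        System.out.println(registry.register("127.0.0.1", 50010, "first"));
        System.out.println(registry.register("127.0.0.1", 50011, "second"));
        // Re-registering an existing ip:port pair is rejected.
        System.out.println(registry.register("127.0.0.1", 50010, "duplicate"));
        System.out.println(registry.size());
    }
}
```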


> DataNode appears to require DNS name resolution as opposed to direct ip 
> mapping
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-685
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>         Environment: osx, ubuntu 6.10b
>            Reporter: James Todd
>         Assigned To: Raghu Angadi
>            Priority: Minor
>
> DataNode appears to require DNS resolution of nodes via the class 
> org.apache.hadoop.net.DNS, as opposed to being able to use a specified IP.
> As an example, I was not able to set up more than one instance of DFS 
> datanodes on one box using loopback with varying ports, since DataNode
> resolved the IP of 127.0.0.1 to be "foo.bar", which was then mapped to the 
> DHCP-allocated IP of 192.168.0.***, which was not addressable by the
> rest of the DFS cluster (namely the namenode).
> While this example is trivial, one should be able to use the very same process 
> yet change only the IPs of the nodes and have things work as
> expected.
> It would be nice to not always require DNS resolution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
