I'm trying to set up what I think would be a common Hadoop configuration. I have 4 data nodes on an internal 10.x network, and each of them has access only to that network. The name node has both an internal 10.x interface and an external interface. I want the HDFS filesystem and the job tracker to be available on the external network, but communication within the cluster to stay on the 10.x network. Is this possible? By changing the fs.default.name configuration parameter, I can switch the filesystem from listening on the internal interface to listening on the external one; however, the data nodes then can't communicate with the name node. I also tried setting the IP address in fs.default.name to 0.0.0.0 to see if it would bind to all interfaces, but that didn't seem to work.
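
For reference, the setting in question looks roughly like the sketch below. The address 10.0.0.1 and port 9000 are placeholders chosen for illustration, not values from my cluster, and the file is core-site.xml on recent releases (hadoop-site.xml on older ones). Swapping the host part for the external address or for 0.0.0.0 is the change described above.

```xml
<!-- Sketch only: host and port are hypothetical placeholders -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- internal 10.x address of the name node (assumed value) -->
    <value>hdfs://10.0.0.1:9000</value>
  </property>
</configuration>
```
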
Is it possible to configure Hadoop so that the data nodes communicate on an internal network, but access to HDFS and the job tracker goes through an external interface? Any help would be much appreciated.

Thank you,
Andy
