Yes, I think so (I don't have a JobTracker at hand to check, though). The link might be a bit old (http://archive.cloudera.com/cdh/3/hadoop/cluster_setup.html#Hadoop+Rack+Awareness), but it says:

"The NameNode and the JobTracker obtains the rack id of the slaves in the cluster by invoking an API resolve<http://archive.cloudera.com/cdh/3/hadoop/api/org/apache/hadoop/net/DNSToSwitchMapping.html#resolve(java.util.List)> in an administrator configured module. The API resolves the slave's DNS name (also IP address) to a rack id. What module to use can be configured using the configuration item topology.node.switch.mapping.impl. The default implementation of the same runs a script/command configured using topology.script.file.name..."

2013/11/29 Siddharth Tiwari <[email protected]>

> Hi Team / Adam
>
> Thanks for the response.
> When you say to have the rack awareness script on the JobTracker, do you
> mean I must have the script on the JobTracker machines as well? I am using
> JT HA, where I am using MR2 binaries for NameNode HA and MR1 for JT HA. But
> do I need to put the script on all four machines or just the NN?
>
> Sent from my iPhone
>
> On Nov 29, 2013, at 6:58 AM, "Adam Kawa" <[email protected]> wrote:
>
>> I have 2 NICs on my DataNodes. Is it possible to use one dedicated
>> for replication and the other for all other communication, i.e. with the
>> JT and NameNodes?
>
> Please correct me if I am wrong, but I have never seen support for that in
> Hadoop. DataNodes use a limited number of threads for balancing (afaik, 5
> maximally + dfs.datanode.balance.bandwidthPerSec), so this way the resources
> used for balancing can be limited.
>
>> Also, even though I am using a rack awareness script and dfsreport shows
>> the racks, the JobTracker shows all TaskTrackers in the default rack. How
>> do I correct that?
>
> We switched to YARN recently, so it is difficult for me to check it now.
> Do you have the rack-awareness script deployed on the JobTracker machine?

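For illustration, here is a minimal sketch of the kind of script that topology.script.file.name can point to. Hadoop calls the script with one or more host names or IP addresses as arguments and reads one rack id per argument from standard output. The host-to-rack table and the rack names below are made up for the example and are not taken from this thread; only the /default-rack fallback matches the name Hadoop itself uses for unresolved hosts.

```python
#!/usr/bin/env python3
"""Hypothetical rack-topology script; the host-to-rack table is illustrative only."""
import sys

# Rack id that Hadoop assigns to hosts it cannot resolve.
DEFAULT_RACK = "/default-rack"

# Assumed mapping (not from the thread): replace with the real cluster layout.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
    "dn21.example.com": "/dc1/rack2",
}


def resolve(hosts):
    """Return one rack id per host, in the same order as the input."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts as command-line arguments and expects
    # whitespace-separated rack ids on stdout.
    print(" ".join(resolve(sys.argv[1:])))
```

If the quoted documentation applies to your setup, the JobTracker runs this lookup itself, so the same executable script and the same topology.script.file.name setting would need to be present on every JobTracker host (both of them with JT HA), not only on the NameNodes.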