Hi Navaz,

You have to configure the following two properties on the NameNode (and restart the NameNode afterwards).

<property>
  <name>topology.node.switch.mapping.impl</name>
  <value>org.apache.hadoop.net.ScriptBasedMapping</value>
  <description>
    The default implementation of the DNSToSwitchMapping. It invokes a script
    specified in topology.script.file.name to resolve node names. If the value
    for topology.script.file.name is not set, the default value of DEFAULT_RACK
    is returned for all node names.
  </description>
</property>

<property>
  <name>topology.script.file.name</name>
  <value>/path/to/topo.sh</value>
  <description>
    The script name that should be invoked to resolve DNS names to
    NetworkTopology names. Example: the script would take host.foo.bar as an
    argument, and return /rack1 as the output.
  </description>
</property>

Example script files:

topo.sh
=======
    #!/bin/bash
    # Pass every node name/IP given by Hadoop on to the Python script.
    python <TOPOLOGY_SCRIPT_HOME>/topology.py "$@"

topology.py
===========
    import sys

    # Rack returned for any node that is not listed in RACK_MAP.
    DEFAULT_RACK = '/default/rack0'

    # Node IP address -> rack path.
    RACK_MAP = {
        '208.94.2.10': '/datacenter1/rack0',
        '1.2.3.4': '/datacenter1/rack1',
        '1.2.3.5': '/datacenter1/rack1',
        '1.2.3.6': '/datacenter1/rack1',
        '10.2.3.4': '/datacenter1/rack2',
    }

    if len(sys.argv) == 1:
        print(DEFAULT_RACK)
    else:
        # One rack per argument, space-separated, in the same order.
        print(" ".join(RACK_MAP.get(i, DEFAULT_RACK) for i in sys.argv[1:]))

Please check the following link for more details:
https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf

Thanks & Regards
Brahma Reddy Battula
HUAWEI TECHNOLOGIES INDIA PVT.LTD.
Ground, 1&2 floors, Solitaire, 139/26, Amarjyoti Layout, Intermediate Ring Road, Domlur
Bangalore - 560 071, India
Tel : +91-80-3980 9600 Ext No: 4905
Fax : +91-80-41118578

________________________________
From: Abdul Navaz [navaz....@gmail.com]
Sent: Monday, November 17, 2014 4:48 AM
To: user@hadoop.apache.org
Subject: Configure Rack Numbers

Hello,

I have a Hadoop cluster with 9 nodes. All of them belong to the /default rack. But I want a setup similar to this.
(All are in the same subnet.)

Rack 0: DataNode1, DataNode2, DataNode3 and top-of-rack switch1.
Rack 1: DataNode4, DataNode5, DataNode6 and top-of-rack switch2.
Rack 3: DataNode7, DataNode8, DataNode9 and top-of-rack switch3.

I am trying to check Hadoop rack awareness and how it places a block of data in one rack and its replicas in another rack. I want to analyse network performance from this.

So how can we separate these DNs by rack number? Where can we configure the rack numbers and say which DN belongs to which rack?

Thanks & Regards,
Abdul Navaz
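
For the nine-DataNode, three-rack layout asked about above, a topology script might look roughly like the sketch below. It is only an illustration under stated assumptions: the IP addresses 10.0.0.1 to 10.0.0.9 are hypothetical placeholders for DataNode1 to DataNode9, the rack names /rack0, /rack1 and /rack2 are made up for the example, and the keys have to match whatever form (IP address or hostname) the NameNode actually passes to the script.

```python
import sys

# Rack returned for any node that is not listed below (assumed name).
DEFAULT_RACK = "/default-rack"

# Hypothetical addresses standing in for DataNode1..DataNode9;
# replace them with the real IPs or hostnames of the cluster.
RACKS = {
    "/rack0": ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    "/rack1": ["10.0.0.4", "10.0.0.5", "10.0.0.6"],
    "/rack2": ["10.0.0.7", "10.0.0.8", "10.0.0.9"],
}

# Invert rack -> nodes into node -> rack for direct lookup.
RACK_MAP = {node: rack for rack, nodes in RACKS.items() for node in nodes}


def resolve(nodes):
    """Return one rack path per node, space-separated, in input order."""
    return " ".join(RACK_MAP.get(n, DEFAULT_RACK) for n in nodes)


if __name__ == "__main__":
    # With no arguments, fall back to the default rack.
    print(resolve(sys.argv[1:]) or DEFAULT_RACK)
```

Keeping all rack paths at the same depth (one level here) avoids mixing topologies of different depths. After restarting the NameNode, running hdfs dfsadmin -printTopology should show which rack each DataNode was assigned to.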