[ https://issues.apache.org/jira/browse/HDFS-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Horrocks updated HDFS-6952:
---------------------------------
Description:
The Rack Aware documentation references a rack-topology.sh script that has two
small flaws:
1) From 2.x.x onward the default config dir is ..etc/hadoop, not ..etc/hadoop/conf.
2) When mapping DataNodes (DN) to rack IDs in the rack_topology.data file, if
hostnames are used the rack-topology.sh script returns the prefixed rack ID, but
the balancer and the fsck report omit the rack ID and show only a single rack
(IP addresses in the data file work fine).
(e.g. when using hostnames:
rack_topology.data
------------------------
datanode0 01
..
grep NetworkTopology logs/hadoop-hduser-namenode-NameNode0.log
--------------------------------------------------------------------------------------------
2014-08-27 10:29:52,518 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /LAB/*rack*/192.168.0.12:50010
hdfs fsck /
-------------
Number of data-nodes: 3
Number of racks: 1)
(e.g. when using IP addresses:
rack_topology.data
-----------------------
192.168.0.10 01
..
grep NetworkTopology logs/hadoop-hduser-namenode-NameNode0.log
-----------------------
2014-08-27 11:14:22,796 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /LAB/*rack_01*/192.168.0.10:50010
hdfs fsck /
-------------
Number of data-nodes: 3
Number of racks: 2)
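
A possible direction for the script side of the fix, offered as a rough sketch
rather than the documented script. It assumes (not confirmed in this report,
though consistent with the log lines above) that the NameNode calls the
topology script with IP addresses, so a hostname key in rack_topology.data
never matches and the script falls back to the bare "rack" path. The sketch
accepts either form by resolving hostname keys before comparing. RACK_PREFIX,
HADOOP_CONF and the 2.5.0 path come from this issue; resolve_host, rack_for and
the HADOOP_CONF_DIR fallback are illustrative choices, and getent is assumed to
be available on the NameNode host.

```shell
#!/usr/bin/env bash
# Sketch only: a topology script that tolerates hostnames or IPs as keys
# in rack_topology.data. Helper names below are hypothetical.

RACK_PREFIX=${RACK_PREFIX:-LAB}
# Assumption: prefer HADOOP_CONF_DIR when set, else the path quoted in this issue.
HADOOP_CONF=${HADOOP_CONF:-${HADOOP_CONF_DIR:-/usr/local/hadoop/hadoop-2.5.0/etc/hadoop}}
CTL_FILE=${CTL_FILE:-rack_topology.data}

# Resolve a hostname to its first address; prints nothing if the lookup fails.
resolve_host() {
  getent hosts "$1" | awk '{ print $1; exit }'
}

# Print the rack path for one node argument (hostname or IP address).
rack_for() {
  local node=$1 data_file=$2 key rack_id rest addr result=""
  if [ -f "$data_file" ]; then
    while read -r key rack_id rest || [ -n "$key" ]; do
      [ -z "$key" ] && continue
      # Exact match covers IP-to-IP and hostname-to-hostname.
      if [ "$key" = "$node" ]; then
        result=$rack_id
        break
      fi
      # Key is not a dotted-decimal address: treat it as a hostname and
      # compare its resolved address with the argument.
      case $key in
        *[!0-9.]*)
          addr=$(resolve_host "$key")
          if [ -n "$addr" ] && [ "$addr" = "$node" ]; then
            result=$rack_id
            break
          fi
          ;;
      esac
    done < "$data_file"
  fi
  if [ -n "$result" ]; then
    printf '/%s/rack_%s' "$RACK_PREFIX" "$result"
  else
    printf '/%s/rack' "$RACK_PREFIX"
  fi
}

# One rack path per argument, space separated.
for node in "$@"; do
  rack_for "$node" "$HADOOP_CONF/$CTL_FILE"
  printf ' '
done
```

Running the script by hand with both a hostname and its IP address as the
argument is a quick way to check that the two forms map to the same rack before
restarting the NameNode. Resolving every hostname key on each call is slow for
large data files, so this is meant to illustrate the matching logic only.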
was:
The Rack Aware documentation references a rack-topology.sh script that has two
small flaws:
1) From 2.x.x onward the default config dir is ..etc/hadoop, not ..etc/hadoop/conf.
2) When mapping DataNodes (DN) to rack IDs in the rack_topology.data file, if
hostnames are used the rack-topology.sh script returns the prefixed rack ID, but
the balancer and the fsck report omit the rack ID and show only a single rack
(IP addresses in the data file work fine).
(e.g. when using hostnames:
rack-topology.sh
-----------------------
RACK_PREFIX=LAB
..
HADOOP_CONF=${HADOOP_CONF:-"/usr/local/hadoop/hadoop-2.5.0/etc/hadoop"}
rack_topology.data
------------------------
datanode0 01
..
grep NetworkTopology logs/hadoop-hduser-namenode-NameNode0.log
--------------------------------------------------------------------------------------------
2014-08-27 10:29:52,518 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /LAB/*rack*/192.168.0.12:50010
hdfs fsck /
-------------
Number of data-nodes: 3
Number of racks: 1)
(e.g. when using IP addresses:
rack-topology.sh
-----------------------
RACK_PREFIX=LAB
..
HADOOP_CONF=${HADOOP_CONF:-"/usr/local/hadoop/hadoop-2.5.0/etc/hadoop"}
rack_topology.data
-----------------------
192.168.0.10 01
..
grep NetworkTopology logs/hadoop-hduser-namenode-NameNode0.log
-----------------------
2014-08-27 11:14:22,796 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /LAB/*rack_01*/192.168.0.10:50010
hdfs fsck /
-------------
Number of data-nodes: 3
Number of racks: 2)
> Update Rack Aware documentation and/or script
> ---------------------------------------------
>
> Key: HDFS-6952
> URL: https://issues.apache.org/jira/browse/HDFS-6952
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Affects Versions: 2.5.0, 2.6.0
> Reporter: Chris Horrocks
> Priority: Minor
> Labels: balancer, rack
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)