Hi all,

I'm new to the Hadoop platform, and I am trying to set up a network of 7 computers to form a cluster.
When I accessed the namenode web GUI at http://ss1:50070, I found that the displayed configured capacity is much larger than the sum of the actual hard disk capacities of the 7 computers in the cluster. For instance, the machine *ss2* has a 141 GB hard disk. The following compares what the Hadoop web GUI shows for *ss2* with the output of the "df -h" command on Ubuntu.

Hadoop web GUI:

    Configured    : 281.36 GB
    Non DFS Usage :  21.82 GB
    Remaining     : 259.53 GB

> df -h

    Filesystem  Size  Used  Avail  Use%  Mounted on
    /dev/sda1   141G  3.8G  130G   3%    /

The summary table of configured capacity is shown below:

Live Datanodes : 6

    Node  Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks
    ss2   1             In Service   281.36                    0          21.82              259.53          0         92.24          1
    ss3   1             In Service   281.36                    0          20.62              260.73          0         92.67          1
    ss4   1             In Service   226.55                    0          17.99              208.56          0         92.06          0
    ss5   0             In Service   140.9                     0          13.46              127.45          0         90.45          0
    ss7   2             In Service   351.86                    0          25.92              325.94          0         92.63          0
    ss8   2             In Service   152.77                    0          14.03              138.73          0         90.81          0

======================================================

The actual hard disk capacities (output of "df") are:

    Node   Filesystem  1K-blocks  Used     Available  Use%  Mounted on
    *ss2*  /dev/sda1   147511784  3947892  136070672  3%    /
    *ss3*  /dev/sda1   147511784  3318740  136699824  3%    /
    *ss4*  /dev/sda1   118779164  3398316  109347184  4%    /
    *ss5*  /dev/sda1   73874656   3302088  66819916   5%    /
    *ss7*  /dev/sda5   184476740  4219176  170886684  3%    /
    *ss8*  /dev/sda1   80094048   3289760  72735732   5%    /

======================================================

It can be clearly seen that both the configured capacity and the remaining capacity are *twice* the actual hard disk capacity on every node.

All machines run Ubuntu. The Hadoop version is 0.20. I set *dfs.datanode.du.reserved* to 0 and *dfs.replication* to 1.

Does anyone know how this can be explained? Your help is much appreciated.

Regards,
Andy
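P.S. To make the factor of two easy to verify, here is a minimal Python sketch that divides each node's configured capacity by its disk size. The numbers are copied from the tables in this post. The names `DF_1K_BLOCKS`, `CONFIGURED_GB`, `kib_to_gib` and `capacity_ratios` are made up for illustration, and the script assumes the "GB" figures in the web GUI are binary gigabytes (1 GB = 1024 * 1024 KB), which is my guess rather than something I have confirmed.

```python
# "1K-blocks" column of df for each datanode, copied from the post.
DF_1K_BLOCKS = {
    "ss2": 147511784,
    "ss3": 147511784,
    "ss4": 118779164,
    "ss5": 73874656,
    "ss7": 184476740,
    "ss8": 80094048,
}

# "Configured Capacity (GB)" column of the namenode web GUI, copied from the post.
CONFIGURED_GB = {
    "ss2": 281.36,
    "ss3": 281.36,
    "ss4": 226.55,
    "ss5": 140.9,
    "ss7": 351.86,
    "ss8": 152.77,
}


def kib_to_gib(kib):
    """Convert 1K-blocks to binary gigabytes (assumption: GUI 'GB' means GiB)."""
    return kib / (1024 * 1024)


def capacity_ratios():
    """Return configured capacity divided by actual disk size, per node."""
    return {
        node: CONFIGURED_GB[node] / kib_to_gib(blocks)
        for node, blocks in DF_1K_BLOCKS.items()
    }


if __name__ == "__main__":
    for node, ratio in capacity_ratios().items():
        disk_gib = kib_to_gib(DF_1K_BLOCKS[node])
        print(f"{node}: disk {disk_gib:.2f} GB, "
              f"configured {CONFIGURED_GB[node]:.2f} GB, ratio {ratio:.3f}")
```

If the binary-gigabyte assumption holds, every ratio should come out very close to 2, so the doubling is the same on all six datanodes and not specific to one machine.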
