HDFS Configured Capacity differs from Hard Disk volume

Andy XUE Mon, 20 Jun 2011 01:22:46 -0700

Hi All:

I'm new to the Hadoop platform, and I am trying to set up a cluster of 7
computers.


When I accessed the namenode web GUI at
http://ss1:50070<http://mediaminer:50070/dfsnodelist.jsp?whatNodes=LIVE>,
I found that the displayed configured capacity is much larger than the sum
of the actual hard disk volumes of the 7 computers in the cluster.


For instance, the machine *ss2* has a 141 GB hard disk. Below is a
comparison between what the Hadoop web GUI shows and the output of the
"df -h" command on Ubuntu.

Configured    : 281.36 GB
Non DFS Usage :  21.82 GB
Remaining     : 259.53 GB


>  df -h
Filesystem            Size  Used   Avail   Use%   Mounted on
/dev/sda1             141G  3.8G    130G     3%   /



The configured capacity summary table from the web GUI is shown below:

Live Datanodes : 6

Node  Last     Admin       Configured     Used  Non DFS    Remaining  Used  Remaining  Blocks
      Contact  State       Capacity (GB)  (GB)  Used (GB)  (GB)       (%)   (%)
ss2   1        In Service  281.36         0     21.82      259.53     0     92.24      1
ss3   1        In Service  281.36         0     20.62      260.73     0     92.67      1
ss4   1        In Service  226.55         0     17.99      208.56     0     92.06      0
ss5   0        In Service  140.9          0     13.46      127.45     0     90.45      0
ss7   2        In Service  351.86         0     25.92      325.94     0     92.63      0
ss8   2        In Service  152.77         0     14.03      138.73     0     90.81      0
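
As a sanity check on how the datanode table above is read, here is a minimal Python sketch. It assumes the column order Configured, Used, Non DFS Used, Remaining, Remaining (%), which is my reading of the table rather than something the GUI output states, and it allows 0.02 for rounding. The names `rows` and `check` are made up for illustration.

```python
# Values copied from the datanode table, in the assumed column order:
# (configured GB, DFS used GB, non-DFS used GB, remaining GB, remaining %)
rows = {
    "ss2": (281.36, 0, 21.82, 259.53, 92.24),
    "ss3": (281.36, 0, 20.62, 260.73, 92.67),
    "ss4": (226.55, 0, 17.99, 208.56, 92.06),
    "ss5": (140.9, 0, 13.46, 127.45, 90.45),
    "ss7": (351.86, 0, 25.92, 325.94, 92.63),
    "ss8": (152.77, 0, 14.03, 138.73, 90.81),
}


def check(configured, used, non_dfs, remaining, remaining_pct, tol=0.02):
    """True if the row adds up and its percentage matches, within rounding."""
    sums_ok = abs(configured - (used + non_dfs + remaining)) <= tol
    pct_ok = abs(100 * remaining / configured - remaining_pct) <= tol
    return sums_ok and pct_ok


consistent = {name: check(*row) for name, row in rows.items()}
for name, ok in consistent.items():
    print(name, "consistent" if ok else "inconsistent")
```

If every row comes out consistent, the table is at least internally coherent, so the discrepancy lies between the GUI figures and the disks, not inside the table itself.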

======================================================
The actual hard disk volumes are:
*ss2*
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1            147511784   3947892 136070672   3% /

*ss3*
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1            147511784   3318740 136699824   3% /

*ss4*
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1            118779164   3398316 109347184   4% /

*ss5*
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             73874656   3302088  66819916   5% /

*ss7*
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda5            184476740   4219176 170886684   3% /

*ss8*
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             80094048   3289760  72735732   5% /
======================================================


It can be clearly seen that both the configured capacity and the remaining
capacity are *twice* the actual hard disk volume.
All machines run Ubuntu, and the Hadoop version is 0.20. I set
*dfs.datanode.du.reserved* to 0 and *dfs.replication* to 1.
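
The factor of two can be checked directly from the numbers above. The following is a minimal Python sketch, assuming the web GUI's "GB" means 2^30 bytes (the GUI does not say so; this is my assumption) and that df reports 1 KiB blocks. The names `nodes` and `ratios` are only for illustration.

```python
# Figures copied from this post:
# (df 1K-blocks, df Available, GUI Configured Capacity GB, GUI Remaining GB)
KIB_PER_GIB = 1024 * 1024  # assumption: the GUI's "GB" is 2**30 bytes

nodes = {
    "ss2": (147511784, 136070672, 281.36, 259.53),
    "ss3": (147511784, 136699824, 281.36, 260.73),
    "ss4": (118779164, 109347184, 226.55, 208.56),
    "ss5": (73874656, 66819916, 140.9, 127.45),
    "ss7": (184476740, 170886684, 351.86, 325.94),
    "ss8": (80094048, 72735732, 152.77, 138.73),
}


def ratios(total_kib, avail_kib, configured_gb, remaining_gb):
    """Return (configured / disk size, remaining / disk available)."""
    disk_gib = total_kib / KIB_PER_GIB
    avail_gib = avail_kib / KIB_PER_GIB
    return configured_gb / disk_gib, remaining_gb / avail_gib


results = {name: ratios(*row) for name, row in nodes.items()}

for name, (cap_ratio, rem_ratio) in results.items():
    print(f"{name}: configured/disk = {cap_ratio:.3f}, "
          f"remaining/available = {rem_ratio:.3f}")
```

Under that unit assumption, both ratios come out within 0.01 of 2 on every node, so the doubling is uniform across the cluster and not specific to one machine.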

Does anyone know how this can be explained? Your help is much appreciated.


Regards
Andy
