I run a 512-node Hadoop cluster. Yesterday I moved 30 GB of compressed data from an NFS-mounted partition by running the following on the namenode:
hadoop fs -copyFromLocal /mnt/data/data1 /mnt/data/data2 /mnt/data/data3 hdfs:/data

When the job completed, the local disk on the namenode was 40% full (most of it used by the dfs directories), while the other nodes were at about 1% disk utilization. Just to see if there was an issue, I deleted the hdfs:/data directory and restarted the copy from a datanode. Once again, the disk on that datanode was substantially over-utilized.

I would have assumed that disk space would be consumed more or less uniformly across all the datanodes. Is there a reason why one disk would be over-utilized? Do I have to run the balancer every time I copy data in? Am I missing something?

Raj
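P.S. For what it's worth, my "more or less uniform" expectation came from this rough arithmetic (just a back-of-the-envelope sketch; the replication factor of 3 is an assumption, since I haven't overridden dfs.replication on this cluster):

```python
# Rough expected per-node usage if blocks were placed uniformly.
# Assumptions: default replication factor of 3, 512 datanodes, 30 GB input.
DATA_GB = 30
REPLICATION = 3   # assumed default dfs.replication
NODES = 512

total_stored_gb = DATA_GB * REPLICATION   # 90 GB stored cluster-wide
per_node_gb = total_stored_gb / NODES     # under 0.2 GB per node

print(f"expected per-node share: {per_node_gb:.2f} GB")
```

So one node absorbing a large share of the 30 GB is far from what I expected.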
