I had a similar question recently.
Please check out the HDFS balancer:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
It rebalances the data across the DataNodes.
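
As a rough illustration of what the balancer treats as "balanced", here is a minimal Python sketch of the threshold rule described in the user guide: a DataNode counts as balanced when its utilization (used / capacity) is within the threshold, in percentage points, of the overall cluster utilization. The function name and the sample figures are made up for illustration; this is not the balancer's actual code.

```python
def classify_nodes(nodes, threshold_pct=10.0):
    """Label each node relative to the cluster-wide utilization.

    nodes: dict mapping a node name to (used, capacity) in the same unit.
    threshold_pct: allowed deviation in percentage points (10 is the
    documented default for the balancer's -threshold option).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    cluster_util = 100.0 * total_used / total_capacity

    labels = {}
    for name, (used, capacity) in nodes.items():
        node_util = 100.0 * used / capacity
        if node_util > cluster_util + threshold_pct:
            labels[name] = "over-utilized"
        elif node_util < cluster_util - threshold_pct:
            labels[name] = "under-utilized"
        else:
            labels[name] = "within threshold"
    return cluster_util, labels


# Hypothetical figures in GB: one newer node (11 x 1.1 TB) and one older
# node (5 x 900 GB), each currently holding 4000 GB.
cluster_util, labels = classify_nodes({"new": (4000, 12100), "old": (4000, 4500)})
print(round(cluster_util, 2), labels)
```

On a real cluster the balancer is started with something like `hdfs balancer -threshold 10`; it then moves blocks from over-utilized to under-utilized nodes until every node is within the threshold.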

- Manoj

From: Chen Song <[email protected]>
Reply-To: [email protected]
Date: Wednesday, February 11, 2015 at 7:44 AM
To: [email protected]
Subject: hadoop cluster with non-uniform disk spec

We have a Hadoop cluster consisting of 500 nodes, but the nodes are not uniform 
in terms of disk space. Half of the racks are newer, with 11 volumes of 1.1 TB on 
each node, while the other half have 5 volumes of 900 GB on each node.

dfs.datanode.fsdataset.volume.choosing.policy is set to 
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.

As a result, half of the nodes are full while the other half are 
underutilized. I am wondering if there is a known solution for this problem.
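
The imbalance follows from the per-node capacities alone. The volume-choosing policy only picks among the disks inside one DataNode, so it does not influence how much data each node receives. Below is a small back-of-the-envelope sketch in Python. It assumes an even 250/250 split of nodes, 1 TB = 1000 GB, and that every node receives roughly the same amount of data; none of these are stated exactly in the message, so treat the numbers as indicative only.

```python
# Per-node raw capacity in GB, taken from the figures in the message.
NEW_NODE_GB = 11 * 1100   # 11 volumes of 1.1 TB
OLD_NODE_GB = 5 * 900     # 5 volumes of 900 GB
NODES_PER_KIND = 250      # assumption: the 500 nodes split evenly


def utilization_when_old_nodes_fill():
    """Utilization at the moment the older nodes become full, assuming
    every node has received the same amount of data."""
    total_gb = NODES_PER_KIND * (NEW_NODE_GB + OLD_NODE_GB)
    # Every node holds as much as one full older node.
    stored_gb = 2 * NODES_PER_KIND * OLD_NODE_GB
    return {
        "new_node_util_pct": round(100 * OLD_NODE_GB / NEW_NODE_GB, 1),
        "cluster_util_pct": round(100 * stored_gb / total_gb, 1),
    }


print(NEW_NODE_GB, OLD_NODE_GB)           # 12100 4500
print(utilization_when_old_nodes_fill())
```

Under these assumptions the older nodes are full when the newer ones are only a little over a third used, which matches the half-full, half-underutilized state described above.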

Thank you for any suggestions.

--
Chen Song

