Hello daemeon reiydelle,

Is the policy set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?

>> Yes, you need to set this policy, which will balance among the disks.

@Chen Song: the following settings control what percentage of new block allocations will be sent to volumes with more available disk space than others:

dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f

Did you set these when starting up the cluster?

Thanks & Regards
Brahma Reddy Battula

________________________________
From: daemeon reiydelle [[email protected]]
Sent: Thursday, February 12, 2015 12:02 PM
To: [email protected]
Cc: Ravi Prakash
Subject: Re: hadoop cluster with non-uniform disk spec

What have you set dfs.datanode.fsdataset.volume.choosing.policy to (assuming you are on a current version of Hadoop)? Is the policy set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?

Daemeon C.M. Reiydelle

On Wed, Feb 11, 2015 at 2:23 PM, Chen Song <[email protected]> wrote:

Hey Ravi,

Here are my settings:

dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f

Chen

On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <[email protected]> wrote:

Hi Chen!

Are you running the balancer? What are you setting

dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction

to?

On Wednesday, February 11, 2015 7:44 AM, Chen Song <[email protected]> wrote:

We have a Hadoop cluster consisting of 500 nodes, but the nodes are not uniform in terms of disk space. Half of the racks are newer, with 11 volumes of 1.1T on each node, while the other half have 5 volumes of 900GB on each node.

dfs.datanode.fsdataset.volume.choosing.policy is set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.

The cluster ends up with half of the nodes full while the other half are underutilized. I am wondering if there is a known solution for this problem.

Thank you for any suggestions.

--
Chen Song
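
For readers following the thread, here is a rough sketch of how the two settings discussed above could interact. It is not the Hadoop source: the function names are hypothetical, and the rules it encodes (a volume counts as "low" when its free space is within the threshold of the least-free volume, and the preference fraction is scaled by the size of each group) are my understanding of the policy and should be checked against your Hadoop version. It only models the choice among the volumes of a single DataNode.

```python
GIB = 1024 ** 3  # bytes per GiB; 20 * GIB == 21474836480, the threshold in the thread


def split_volumes(available, threshold):
    """Split volume indices into (low, high) groups by free space.

    Assumption: a volume is "low" when its free space is within
    `threshold` bytes of the least-free volume; the rest are "high".
    """
    least = min(available)
    low = [i for i, free in enumerate(available) if free <= least + threshold]
    high = [i for i, free in enumerate(available) if free > least + threshold]
    return low, high


def high_volume_probability(n_high, n_low, fraction):
    """Chance that the next block lands in the "high" group.

    Assumption: the preference fraction is weighted by how many
    volumes sit in each group, so a small "high" group is not flooded.
    """
    if n_high == 0:
        # All volumes are within the threshold of each other:
        # nothing to prefer, so every volume is treated the same.
        return 0.0
    scaler = n_high * fraction + n_low * (1 - fraction)
    return (n_high * fraction) / scaler


# Hypothetical free space on four volumes of one DataNode.
free_space = [50 * GIB, 60 * GIB, 400 * GIB, 500 * GIB]
low, high = split_volumes(free_space, 20 * GIB)
p_high = high_volume_probability(len(high), len(low), 0.85)
print(low, high, p_high)
```

With Chen's values (20G threshold, 0.85 fraction) and equally sized groups, the sketch sends roughly 85% of new blocks to the emptier volumes; with unequal groups the share shifts toward whichever group has more volumes.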
