Just curious as to why this would happen. Other posts suggest that the DataNode is responsible for enforcing a round-robin write strategy among the various disks specified using the "dfs.data.dir" property:
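For reference, multiple data directories are given as a comma-separated list in hdfs-site.xml; a minimal sketch (the exact paths are assumptions based on the /mnt - /mnt3 layout described below):

```xml
<!-- hdfs-site.xml: example only; the data-directory paths under each
     mount point are assumed, not taken from the actual cluster config -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/hdfs/data,/mnt1/hdfs/data,/mnt2/hdfs/data,/mnt3/hdfs/data</value>
</property>
```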
http://www.quora.com/Can-Hadoop-deal-with-dfs.data.dir-devices-of-different-sizes
http://hadoop.apache.org/common/docs/r0.20.1/hdfs-default.html

A couple of reasons I can think of:
- The other mount points were added later on, although per the balancing logic the DataNode should eventually ensure that all disks are balanced.
- The mount points were unavailable at some point. The "dfs.data.dir" doc says "Directories that do not exist are ignored". I'm unsure whether unavailable disks are rechecked only on DataNode start, or on every write.
- An older version of Hadoop?

On 11/7/10 2:19 AM, "[email protected]" <[email protected]> wrote:

> From: Shavit Netzer <[email protected]>
> Date: Fri, 5 Nov 2010 10:12:16 -0700
> To: "[email protected]" <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: Hadoop partitions Problem
>
> Yes
>
> Sent from my mobile
>
> On 05/11/2010, at 19:09, "Harsh J" <[email protected]> wrote:
>
>> Hi,
>>
>> On Fri, Nov 5, 2010 at 9:03 PM, Shavit Netzer <[email protected]> wrote:
>>> Hi,
>>>
>>> I have hadoop cluster with 24 nodes.
>>>
>>> Each node have 4 mount disks mnt - mnt3.
>>
>> Just to confirm -- You've configured all DataNodes to utilize ALL
>> these mount points via the dfs.data.dir property, yes?
>>
>> --
>> Harsh J
>> www.harshj.com
