_____
From: Vaibhav J [mailto:[email protected]]
Sent: Monday, March 16, 2009 5:46 PM
To: '[email protected]'; '[email protected]'
Subject: Problem : data distribution is non uniform between two different
disks on datanode.
We have 27 datanodes with a replication factor of 1 (data size is ~6.75 TB).
On each datanode we have specified two different disks for the DFS data
directory, using the
property dfs.data.dir in the hadoop-site.xml file of the conf directory
(value of property dfs.data.dir: /mnt/hadoop-dfs/data,
/mnt2/hadoop-dfs/data).
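
For reference, the setting described above would look roughly like this in
hadoop-site.xml. This is a sketch reconstructed from the two paths quoted
above, not a copy of our actual file, so the exact whitespace and any other
properties in the file are not shown.

```xml
<!-- conf/hadoop-site.xml (fragment; reconstructed for illustration) -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <!-- Comma-separated list: one data directory per physical disk -->
    <value>/mnt/hadoop-dfs/data,/mnt2/hadoop-dfs/data</value>
  </property>
</configuration>
```
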
When we set the replication factor to 2, the data distribution is biased
toward the first disk:
more data is copied to /mnt/hadoop-dfs/data, and after some data has been
copied the first disk becomes full
and reports no available space, even though we still have enough space on the
second disk (/mnt2/hadoop-dfs/data).
So it is difficult to achieve replication factor 2.
Data is also being written to the second disk (/mnt2/hadoop-dfs/data), but it
looks like
more data is copied to the first disk (/mnt/hadoop-dfs/data).
What should we do to get a uniform data distribution between the two
disks on
each datanode, so that we can achieve replication factor 2?
Regards,
Vaibhav J.