Multiple dfs.data.dir vs RAID0

Jean-Marc Spaggiari Sun, 10 Feb 2013 17:58:08 -0800

Hi,

I have a quick question regarding RAID0 performance vs multiple
dfs.data.dir entries.


Let's say I have 2 x 2TB drives.

I can configure them as 2 separate drives mounted on 2 folders and
assign both to Hadoop using dfs.data.dir. Or I can combine the 2 drives
with RAID0 and assign them as a single folder to dfs.data.dir.
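
To make the two options concrete, here is roughly what I have in mind
for hdfs-site.xml. The mount points (/mnt/disk1, /mnt/disk2, /mnt/raid0)
and the hdfs/data subfolders are only placeholders for this example,
not paths from a real cluster.

```xml
<!-- Option 1: one dfs.data.dir entry per physical drive (comma-separated). -->
<!-- /mnt/disk1 and /mnt/disk2 are placeholder mount points. -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data</value>
</property>

<!-- Option 2: both drives striped as RAID0 and mounted on one folder. -->
<!-- Shown commented out, since only one definition should be active. -->
<!--
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/raid0/hdfs/data</value>
</property>
-->
```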

With RAID0, the reads and writes are going to be spread over the 2
disks, which significantly increases the speed. But if I put 2
entries in dfs.data.dir, Hadoop is going to spread the data over those 2
directories too, so in the end the results should be the same, no?

Any experience/advice/results to share?

Thanks,

JM
