Utilizing multiple hard disks for Hadoop HDFS?

2011-12-01 Thread praveenesh kumar
Hi everyone,

I have a blade server with 4x500 GB hard disks, and I want to use all of them for Hadoop HDFS.
How can I achieve this?

Suppose I install Hadoop on one hard disk and mount the other disks as normal
partitions, e.g.:

/dev/sda1, -- HDD 1 -- Primary partition -- Linux + Hadoop installed on it
/dev/sda2, -- HDD 2 -- Mounted partition -- /mnt/dev/sda2
/dev/sda3, -- HDD3  -- Mounted partition -- /mnt/dev/sda3
/dev/sda4, -- HDD4  -- Mounted partition -- /mnt/dev/sda4

Then, if I create a hadoop.tmp.dir on each partition, say
/tmp/hadoop-datastore/hadoop-hadoop,

and configure core-site.xml like this:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-datastore/hadoop-hadoop,/mnt/dev/sda2/tmp/hadoop-datastore/hadoop-hadoop,/mnt/dev/sda3/tmp/hadoop-datastore/hadoop-hadoop,/mnt/dev/sda4/tmp/hadoop-datastore/hadoop-hadoop</value>
  <description>A base for other temporary directories.</description>
</property>

Will that work?

Can I set the same comma-separated value for dfs.data.dir as well?

Thanks,
Praveenesh


Re: Utilizing multiple hard disks for Hadoop HDFS?

2011-12-01 Thread Harsh J
You need to apply comma-separated lists only to dfs.data.dir (HDFS) and
mapred.local.dir (MapReduce) directly. Make sure the subdirectories are different
for each, or you may accidentally wipe away your data when you restart the MR
services.

The hadoop.tmp.dir property does not accept multiple paths, and you should avoid
using it in production; it's more of a utility property that acts as a default
base path for other properties.
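As a sketch of what that could look like with the mount points from the original mail (the exact subdirectory names here are made up for illustration; the key point is that the HDFS and MapReduce subdirectories differ):

```xml
<!-- hdfs-site.xml: one DataNode data directory per physical disk -->
<property>
  <name>dfs.data.dir</name>
  <value>/data/hadoop/hdfs,/mnt/dev/sda2/hadoop/hdfs,/mnt/dev/sda3/hadoop/hdfs,/mnt/dev/sda4/hadoop/hdfs</value>
</property>

<!-- mapred-site.xml: separate subdirectories for MapReduce local scratch space,
     so restarting MR services cannot clobber HDFS block data -->
<property>
  <name>mapred.local.dir</name>
  <value>/data/hadoop/mapred,/mnt/dev/sda2/hadoop/mapred,/mnt/dev/sda3/hadoop/mapred,/mnt/dev/sda4/hadoop/mapred</value>
</property>
```

HDFS round-robins new blocks across the listed directories, so all four disks get used for storage and I/O.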
