Hi,
Our group is trying to set up a prototype for what will eventually
become a cluster of ~50 nodes.
Does anyone have experience with a stateless Hadoop cluster set up on
CentOS using the method described below? Are there any caveats to a
read-only root file system approach? It would save us from having to
keep a root volume on every system (whether installed on a USB thumb
drive or on a RAID 1 of bootable / partitions).
http://citethisbook.net/Red_Hat_Introduction_to_Stateless_Linux.html
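Roughly, this is the kind of read-only-root setup we have in mind
(just a sketch, assuming the readonly-root support that ships in the
stock CentOS initscripts package; the rwtab entries are only a guess
at what our nodes would still need writable):

    # /etc/sysconfig/readonly-root -- enable the stock read-only root support
    READONLY=yes

    # /etc/rwtab.d/hadoop -- paths kept writable on tmpfs
    # ("empty" mounts an empty tmpfs there, "dirs" recreates the
    #  directory tree without contents, "files" copies contents in)
    dirs    /var/run
    dirs    /var/lock
    empty   /tmp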
We would like to keep the OS root file system separate from the
Hadoop filesystem(s) for maintenance reasons, so we can hot-swap the
data disks while the system is running.
We were also considering installing the root filesystem on USB flash
drives, making it persistent yet still separate. However, given the
limited write cycles of USB flash drives, we would identify and turn
off anything that causes excess writes to the root filesystem,
keeping I/O writes to it to a minimum. The main step would be storing
the Hadoop logs on the same filesystem/drive as the directories we
specify in dfs.data.dir/dfs.name.dir.
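Concretely, something like this is what we mean (only a sketch; the
/mnt/d* paths match the layout further down, and we are still on the
old-style dfs.name.dir/dfs.data.dir property names):

    # hadoop-env.sh -- keep logs off the USB root, on the first data disk
    export HADOOP_LOG_DIR=/mnt/d0/hadoop/logs

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.name.dir</name>
      <value>/mnt/d0/dfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/mnt/d0/dfs/data,/mnt/d1/dfs/data,/mnt/d2/dfs/data</value>
    </property>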
In the end we would have something like this:
USB (MS-DOS partition table + ext2/3/4 partitions)
/dev/sda
/dev/sda1 mounted as / (possibly read-only)
/dev/sda2 mounted as /var (read-write)
/dev/sda3 mounted as /tmp (read-write)
Hadoop disks (either no partition table or GPT, since an MS-DOS
partition table cannot address these 3TB disks)
/dev/sdb /mnt/d0
/dev/sdc /mnt/d1
/dev/sdd /mnt/d2
/mnt/d0 would contain all Hadoop logs.
Hadoop configuration files would still reside on /.
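And a rough /etc/fstab sketch of how we would mount it all (ext4, the
mount options, and the nofail flag on the data disks are just our
assumptions for limiting writes and for letting a node boot with a
disk pulled):

    # /etc/fstab (sketch for one node)
    # USB root: read-only, noatime to avoid access-time writes
    /dev/sda1  /        ext4  ro,noatime               1 1
    /dev/sda2  /var     ext4  rw,noatime               1 2
    /dev/sda3  /tmp     ext4  rw,noatime               1 2
    # Hadoop data disks: whole-disk filesystems, no boot-time fsck,
    # nofail so a missing/hot-swapped disk does not block boot
    /dev/sdb   /mnt/d0  ext4  defaults,noatime,nofail  0 0
    /dev/sdc   /mnt/d1  ext4  defaults,noatime,nofail  0 0
    /dev/sdd   /mnt/d2  ext4  defaults,noatime,nofail  0 0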
Any issues with such a setup? Are there better ways of achieving this?