We've been pretty successful in using Whirr to spin up clusters of AWS cc1.4xlarge (cluster compute) instances. However, I'm now trying to have it spin up clusters of cc2.8xlarge instances, and it's failing. I'd love to find a way to work around this.

Background:

We've rolled our own custom AMI (built off of the stock Ubuntu Precise AMIs at http://cloud.ubuntu.com/ami/), which bundles in some of our own software and data. We're then able to spin up clusters of cc1.4xlarge machines using this AMI.

When I was initially building this AMI, one of the issues I ran into was that Ubuntu names the ephemeral drives differently than Whirr expects. Whirr looks for drives /dev/sdb, /dev/sdc, etc., while under Ubuntu/Xen they're actually named /dev/xvdb, /dev/xvdc, etc. However, this was easy to work around: I just set up symlinks so that /dev/sdb points to /dev/xvdb, and so on.
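For reference, the symlink workaround can be sketched roughly as below. This is a minimal illustration, not the exact commands baked into the AMI: the function name link_xvd_to_sd and its optional directory argument are made up for this example, and it assumes the ephemeral devices show up as xvdb through xvdz.

```shell
# Hypothetical helper (not part of Whirr): for every xvdX device found in a
# directory, create a matching sdX symlink next to it.
# dev_dir defaults to /dev; pass another directory to try the logic safely.
link_xvd_to_sd() {
    dev_dir="${1:-/dev}"
    for src in "$dev_dir"/xvd[b-z]; do
        [ -e "$src" ] || continue                   # glob matched nothing
        name="${src##*/}"                           # e.g. xvdb
        ln -sfn "$name" "$dev_dir/sd${name#xvd}"    # sdb -> xvdb (relative link)
    done
}
```

Run as root with no argument, it would operate on /dev itself. The links are relative, which matches the `sdb -> xvdb` form in the listings further down.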


However, for some reason this solution isn't working on the cc2.8xlarge instances. I've created an AMI with four ephemeral disks, and have set up symlinks for each of them (sdb -> xvdb, sdc -> xvdc, sdd -> xvdd, sde -> xvde). But Whirr isn't seeing the last two disks. Instead, it's trying to initialize /dev/sdb and /dev/sdc twice, which leaves the Hadoop daemons unable to start on these machines.

The ultimate culprit seems to be this line in one of the node initialization scripts:

prepare_all_disks '/data0,/dev/sdb;/data1,/dev/sdc;/data2,/dev/sdb;/data3,/dev/sdc'
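A quick way to spot the repeated devices in an argument like the one above is sketched here. It only assumes the 'mount,device;mount,device' layout visible in that line; find_duplicate_devices is a throwaway name for this example, not something Whirr provides.

```shell
# Hypothetical helper: print each device that appears more than once in a
# prepare_all_disks-style argument of the form 'mount,device;mount,device;...'.
find_duplicate_devices() {
    printf '%s\n' "$1" |
        tr ';' '\n' |      # one mount,device pair per line
        cut -d, -f2 |      # keep only the device column
        sort | uniq -d     # report devices listed more than once
}

find_duplicate_devices '/data0,/dev/sdb;/data1,/dev/sdc;/data2,/dev/sdb;/data3,/dev/sdc'
```

For the argument shown, this prints /dev/sdb and /dev/sdc, confirming that /data2 and /data3 are mapped onto the same two devices as /data0 and /data1.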

But I have to admit, I have no idea why it's trying to initialize the disks that way. I very definitely have 4 distinct disks (and symlinks) on these data node machines:

# ls -l /dev/xvd*
brw-rw---- 1 root disk 202, 16 Oct  2 20:53 /dev/xvdb
brw-rw---- 1 root disk 202, 32 Oct  2 20:53 /dev/xvdc
brw-rw---- 1 root disk 202, 48 Oct  2 20:53 /dev/xvdd
brw-rw---- 1 root disk 202, 64 Oct  2 20:53 /dev/xvde

# ls -l /dev/sd*
brw-rw---- 1 root disk 8, 0 Oct  2 20:53 /dev/sda
brw-rw---- 1 root disk 8, 1 Oct  2 20:53 /dev/sda1
lrwxrwxrwx 1 root root    4 Oct  2 20:53 /dev/sdb -> xvdb
lrwxrwxrwx 1 root root    4 Oct  2 20:53 /dev/sdc -> xvdc
lrwxrwxrwx 1 root root    4 Oct  2 20:53 /dev/sdd -> xvdd
lrwxrwxrwx 1 root root    4 Oct  2 20:53 /dev/sde -> xvde
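To rule out the symlinks themselves as the cause, one can check that they resolve to distinct devices. The sketch below assumes GNU readlink (as shipped on Ubuntu); count_distinct_targets is a name invented for this example.

```shell
# Hypothetical helper: resolve every given path and print how many distinct
# targets they point at. Four symlinks onto four disks should print 4.
count_distinct_targets() {
    for path in "$@"; do
        readlink -f "$path"    # follow symlinks to the real device node
    done | sort -u | wc -l | tr -d ' '
}
```

Calling it as `count_distinct_targets /dev/sdb /dev/sdc /dev/sdd /dev/sde` on a data node should, given the listing above, report four distinct targets. That would point the finger at how the disk list is generated rather than at the links.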


Does anyone have any idea what's happening here and/or how to fix it?

Thanks,

DR
