We've been pretty successful in using Whirr to spin up clusters of AWS
cc1.4xlarge (cluster compute) instances. However, I'm now trying to
have it spin up clusters of cc2.8xlarge instances, and it's failing.
I'd love to find a way to work around this.
Background:
We've rolled our own custom AMI (built off of the stock Ubuntu Precise
AMIs at http://cloud.ubuntu.com/ami/), which bundles in some of our own
software and data. We're then able to spin up clusters of cc1.4xlarge
machines using this AMI.
When I was initially building this AMI, one of the issues I ran into is
that Ubuntu names the ephemeral drives differently than Whirr
expects. Whirr is looking for drives /dev/sdb, sdc, etc., while under
Ubuntu/Xen they're actually named /dev/xvdb, xvdc, etc. However, this
was easy to work around: I just set up symlinks from /dev/xvdb to sdb
and so on.
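For reference, the symlink workaround amounts to something like the
sketch below. This is a rough reconstruction rather than the exact
script baked into our AMI: the function name link_ephemeral_disks and
its optional directory argument are made up for illustration, and the
links would need to be recreated at boot to persist.

```sh
#!/bin/sh
# Sketch: give each Xen-style xvdX device an sdX alias.
# The directory argument is hypothetical; it defaults to /dev and exists
# only so the function can be tried against a scratch directory.
link_ephemeral_disks() {
    dev_dir="${1:-/dev}"
    for xvd in "$dev_dir"/xvd[b-z]; do
        [ -e "$xvd" ] || continue      # glob matched nothing
        letter="${xvd##*xvd}"          # "b" from /dev/xvdb
        # Relative link, giving e.g. sdb -> xvdb
        ln -sfn "xvd$letter" "$dev_dir/sd$letter"
    done
}

# Usage (as root): link_ephemeral_disks
```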
However, for some reason this solution isn't working on the cc2.8xlarge
instances. I've created an AMI with four ephemeral disks, and have set
up symlinks for each of them (sdb -> xvdb, sdc -> xvdc, sdd -> xvdd, sde
-> xvde). But Whirr isn't seeing the last two disks. Instead, it's
trying to initialize /dev/sdb and sdc twice, resulting in the Hadoop
daemons being unable to start successfully on these machines.
The ultimate culprit seems to be this line in one of the node
initialization scripts:
prepare_all_disks '/data0,/dev/sdb;/data1,/dev/sdc;/data2,/dev/sdb;/data3,/dev/sdc'
But I have to admit, I have no idea why it's trying to initialize the
disks that way. I very definitely have 4 distinct disks (and symlinks)
on these data node machines:
# ls -l /dev/xvd*
brw-rw---- 1 root disk 202, 16 Oct 2 20:53 /dev/xvdb
brw-rw---- 1 root disk 202, 32 Oct 2 20:53 /dev/xvdc
brw-rw---- 1 root disk 202, 48 Oct 2 20:53 /dev/xvdd
brw-rw---- 1 root disk 202, 64 Oct 2 20:53 /dev/xvde
# ls -l /dev/sd*
brw-rw---- 1 root disk 8, 0 Oct 2 20:53 /dev/sda
brw-rw---- 1 root disk 8, 1 Oct 2 20:53 /dev/sda1
lrwxrwxrwx 1 root root 4 Oct 2 20:53 /dev/sdb -> xvdb
lrwxrwxrwx 1 root root 4 Oct 2 20:53 /dev/sdc -> xvdc
lrwxrwxrwx 1 root root 4 Oct 2 20:53 /dev/sdd -> xvdd
lrwxrwxrwx 1 root root 4 Oct 2 20:53 /dev/sde -> xvde
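To make the repetition in the prepare_all_disks argument easy to spot,
here is a small sketch that lists any device appearing more than once.
It assumes the argument is a ';'-separated list of 'mountpoint,device'
pairs, which is only inferred from the line quoted above; the helper
name find_duplicate_devices is made up and is not part of Whirr.

```sh
#!/bin/sh
# Sketch: print each device named more than once in a
# 'mount,device;mount,device;...' string (format assumed, see above).
find_duplicate_devices() {
    printf '%s\n' "$1" |
        tr ';' '\n' |      # one mount,device pair per line
        cut -d, -f2 |      # keep only the device field
        sort |
        uniq -d            # print only the repeated devices
}

find_duplicate_devices '/data0,/dev/sdb;/data1,/dev/sdc;/data2,/dev/sdb;/data3,/dev/sdc'
```

On the string from the init script this reports /dev/sdb and /dev/sdc,
so the third and fourth mount points are being paired with the first two
devices rather than with sdd and sde.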
Does anyone have any idea what's happening here and/or how to fix it?
Thanks,
DR