Hi Grant,

[snip]

All of these include a start script that is downloaded from what is called a "user-data file". This can be up to 16K in length. I used that script to customize my instances with additional software like Hadoop, Java, and our own software, as well as reconfiguring the instance as necessary, mounting elastic block volumes, and tweaking the DHCP configuration to add an override to avoid a few gotchas. Total boot time was still typically under 40 s, and I hear that it has gotten faster since then.

Can you share the script, obviously with the part for your proprietary software removed?

FWIW, you can look at what we use for Bixo, with our EC2 AMI - it's at:

http://github.com/bixo/bixo/blob/master/bin/ec2/hadoop-aws/etc/hadoop-ec2-init-remote.sh

Though the version currently on GitHub is missing one important correction: you want to call ulimit -n 20000 right before running the hadoop-daemon script to start the tasktracker on the slave, as in:

ulimit -n 20000
"$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker
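
If you want the init script to tell you when the limit could not be raised (for example, when it is not running as root and the hard limit is lower), something like the following might work. This is only a rough sketch and not what is in the Bixo script: TARGET_NOFILE and raise_nofile_limit are names I made up for illustration, and the check for hadoop-daemon.sh is just there so the snippet does nothing on a machine without Hadoop.

```shell
#!/bin/sh
# Sketch: raise the open-file limit, report the result, then start the tasktracker.
# TARGET_NOFILE and raise_nofile_limit are illustrative names (assumptions).

TARGET_NOFILE="${TARGET_NOFILE:-20000}"

# Try to set the open-file limit to $1; warn on stderr and return 1 on failure.
raise_nofile_limit() {
    if ! ulimit -n "$1" 2>/dev/null; then
        echo "warning: could not set open-file limit to $1 (current: $(ulimit -n))" >&2
        return 1
    fi
    return 0
}

# Keep going even if the limit could not be raised; the warning is already logged.
raise_nofile_limit "$TARGET_NOFILE" || true
echo "open-file limit: $(ulimit -n)"

# The daemon is a child process of this shell, so it inherits the limit set above.
if [ -n "${HADOOP_HOME:-}" ] && [ -x "$HADOOP_HOME/bin/hadoop-daemon.sh" ]; then
    "$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker
fi
```

The limit has to be set in the same shell that launches the daemon, which is why the call sits directly before the start command.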

-- Ken

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g



