Hey Stuart

I did that for a client using Cascading events and SQS.

When a job completed, it dropped a message on an SQS queue; a listener picked up the message and either kicked off new jobs or decided to kill off the cluster. The currently shipping EC2 scripts support running multiple simultaneous clusters, which suits this purpose.
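The listener's decision step can be sketched roughly like this. This is a minimal sketch, not the client's actual code: the JSON message shape, field names, and the `next_action` helper are all assumptions, and the real wiring would read message bodies off SQS (e.g. via boto) rather than from canned strings.

```python
import json

# Hypothetical message shape -- the thread doesn't describe the real payload.
# Assume each completed job posts JSON like:
#   {"job": "step-1", "status": "completed", "remaining": ["step-2"]}
def next_action(message_body):
    """Decide what the listener should do with a job-completion message."""
    msg = json.loads(message_body)
    if msg.get("status") != "completed":
        return ("kill-cluster", None)     # a failed job: tear the cluster down
    remaining = msg.get("remaining", [])
    if remaining:
        return ("run-job", remaining[0])  # launch the next job in the chain
    return ("kill-cluster", None)         # nothing left to do: shut down

# In production the body would arrive via an SQS receive loop; here we just
# feed the function a canned message.
action, job = next_action(
    '{"job": "step-1", "status": "completed", "remaining": ["step-2"]}')
print(action, job)
```

The point is only that the queue message carries enough state for the listener to choose between "run the next job" and "terminate the cluster" without any human in the loop.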

Cascading has always supported raw file access on S3, and now Hadoop does too (thanks Tom), so this workflow is quite natural. It is the best approach, since data is pulled directly into the Mapper instead of being copied onto HDFS first and then read into the Mapper from HDFS.
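Concretely, the direct approach just means pointing the job's input and output at S3 URIs. A sketch below, with made-up bucket names, jar path, and mapper/reducer scripts (it only builds and prints the command line, assuming Hadoop streaming and the s3n filesystem):

```python
# Sketch only: bucket names, jar, and scripts are hypothetical.
# With s3n:// URIs the Mappers read straight from S3, skipping the
# "copy to HDFS, then read from HDFS" round trip.
input_uri = "s3n://my-bucket/logs/2008-10-23/"
output_uri = "s3n://my-bucket/results/2008-10-23/"

hadoop_cmd = [
    "hadoop", "jar", "hadoop-streaming.jar",
    "-input", input_uri,     # Mappers pull records directly from S3
    "-output", output_uri,   # results land back on S3, surviving cluster teardown
    "-mapper", "map.py",
    "-reducer", "reduce.py",
]

# The two-step alternative this avoids would be roughly:
#   hadoop distcp s3n://my-bucket/logs/... hdfs:///logs/...
#   hadoop jar ... -input hdfs:///logs/... -output hdfs:///results/...
print(" ".join(hadoop_cmd))
```

A nice side effect: because results are written back to S3, the cluster holds no state worth keeping and can be terminated the moment the last job finishes.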

YMMV

chris

On Oct 23, 2008, at 7:47 AM, Stuart Sierra wrote:

Hi folks,
Anybody tried scripting Hadoop on EC2 to...
1. Launch a cluster
2. Pull data from S3
3. Run a job
4. Copy results to S3
5. Terminate the cluster
... without any user interaction?

-Stuart

--
Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/
