From: Andrei Savu <savu.and...@gmail.com>
Reply-To: "user@whirr.apache.org" <user@whirr.apache.org>
Date: Thu, 12 Jan 2012 05:53:57 -0800
To: "user@whirr.apache.org" <user@whirr.apache.org>
Subject: Re: [newbie] Unable to setup hadoop cluster on ec2

Yes, we're going to be running jobs on a continuous basis.

I understand. Managing long-running Hadoop clusters in Amazon is tricky due to namenode availability issues and inconsistent network & disk performance. Have you looked at this from a cost perspective? Maybe it's cheaper to buy a bunch of servers for this cluster that needs to be on all the time.

We have a running cluster with a provider, but adding nodes to the cluster takes a long time (provisioning, contracts, etc.). Hence the move.

Also, how can I specify EBS volumes for these machines?

Unfortunately there is no easy way to do this with the current implementation. Do you want to take the lead on this? See https://issues.apache.org/jira/browse/WHIRR-290

I may not have the bandwidth to ramp up, but I would appreciate it if you could send me some pointers on getting started!

We have a wiki page that describes how to build Whirr and contribute changes: https://cwiki.apache.org/confluence/display/WHIRR/How+To+Contribute

-- Andrei

Will take a look.

Thanks,
Madhu