Hi Keith,

> On Wed, Oct 14, 2009 at 11:58 AM, Keith Thomas wrote:
>> Am I correct in understanding that a farm of EC2 instances with Hadoop and
>> HBase installed and configured individually by myself is the quickest and
>> most effective way to progress with this effort?
On Wed, Oct 14, 2009 at 12:31 AM, Tatsuya Kawano wrote:
> Well, you're not wrong. To run HBase on Amazon Web Services, you
> should use EC2 instances and configure them yourself. Make sure you
> pick Extra Large instances from EC2 (see:
> http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A8), and you may
> also want EBS volumes as the storage devices rather than S3. (S3 is
> good for archiving data.)

I checked this again and realized the easiest way to set up your EC2
instances may be to use Cloudera's pre-built disk images and client
scripts.

HBase added to Cloudera distribution (hbase-user announcement by Stack):
http://bit.ly/1dDutR

Getting Started -- Cloudera Distribution AMI for Hadoop:
http://bit.ly/4ma9Tv

As I said in the previous post, use EBS (not S3) as the underlying
storage for HDFS. (There is a rough sketch of launching an instance and
attaching an EBS volume below my signature.)

And a correction to my last post:

On Thu, Oct 15, 2009 at 2:25 AM, Tatsuya Kawano wrote:
> Note that an Extra Large instance of Amazon EC2 is equipped with 4
> virtual CPU cores and 7.9GB RAM.

-- A High-CPU Extra Large (c1.xlarge) instance is equipped with 8
virtual CPU cores and 7 GB RAM.
-- An Extra Large (m1.xlarge) instance is equipped with 4 virtual CPU
cores and 15 GB RAM.

Thanks,

--
Tatsuya Kawano (Mr.)
Tokyo, Japan
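
P.S. Here is a rough, untested sketch of what launching an Extra Large
instance and attaching an EBS volume for HDFS might look like if you
script it with the Python boto library (instead of using Cloudera's
client scripts). The AMI ID, key pair name, security group,
availability zone, volume size, and device name below are all
placeholders -- substitute your own values, for example the Cloudera
AMI from the Getting Started page above.

    # Launch one Extra Large instance and attach an EBS volume for HDFS.
    # Untested sketch; AMI ID, key pair, security group, zone, and
    # device name are placeholders.
    import time
    import boto

    conn = boto.connect_ec2()  # reads AWS credentials from the environment

    # m1.xlarge: 4 virtual cores / 15 GB RAM.
    # Use 'c1.xlarge' instead for 8 virtual cores / 7 GB RAM.
    reservation = conn.run_instances(
        'ami-00000000',                # placeholder AMI ID
        instance_type='m1.xlarge',
        key_name='my-keypair',         # placeholder key pair
        security_groups=['hbase'],     # placeholder security group
        placement='us-east-1a')        # availability zone
    instance = reservation.instances[0]

    # Wait until the instance is running before attaching the volume.
    while instance.state != 'running':
        time.sleep(15)
        instance.update()

    # Create a 100 GB EBS volume in the same zone and attach it.
    volume = conn.create_volume(100, 'us-east-1a')
    conn.attach_volume(volume.id, instance.id, '/dev/sdf')

Once the volume is attached, format and mount it on the instance, then
point dfs.data.dir at the mount point (with hbase.rootdir set to that
HDFS), so the actual data lives on EBS instead of S3.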