Hi Keith,

> On Wed, Oct 14, 2009 at 11:58 AM, Keith Thomas wrote:
>> Am I correct in understanding that a farm of EC2 instances with Hadoop and
>> HBase installed and configured individually by myself is the quickest and
>> most effective way to progress with this effort?
On Wed, Oct 14, 2009 at 12:31 AM, Tatsuya Kawano wrote:
> Well, you're not wrong. To run HBase on Amazon Web Services, you
> should use EC2 instances and configure them yourself. Make sure you
> pick Extra Large instances from EC2 (see:
> http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A8), and you may
> also want EBS volumes as the storage devices rather than S3. (S3 is
> good for archiving data.)

I checked this again and realized the easiest way to set up your EC2
instances may be to use Cloudera's pre-built disk images and client
scripts.

HBase added to Cloudera distribution (hbase-user announcement by Stack):
http://bit.ly/1dDutR

Getting Started -- Cloudera Distribution AMI for Hadoop:
http://bit.ly/4ma9Tv

As I said in the previous post, use EBS (not S3) as the underlying
storage for HDFS. (There is a rough sketch of launching an instance and
attaching an EBS volume below my signature.)

And a correction to my last post:

On Thu, Oct 15, 2009 at 2:25 AM, Tatsuya Kawano wrote:
> Note that an Extra Large instance of Amazon EC2 is equipped with 4
> virtual CPU cores and 7.9GB RAM.

-- A High-CPU Extra Large (c1.xlarge) instance is equipped with 8
virtual CPU cores and 7 GB RAM.
-- An Extra Large (m1.xlarge) instance is equipped with 4 virtual CPU
cores and 15 GB RAM.

Thanks,

--
Tatsuya Kawano (Mr.)
Tokyo, Japan
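
P.S. Here is a rough, untested sketch of what launching an Extra Large
instance and attaching an EBS volume for HDFS might look like if you
script it with the Python boto library (instead of using Cloudera's
client scripts). The AMI ID, key pair name, security group,
availability zone, volume size, and device name below are all
placeholders -- substitute your own values, for example the Cloudera
AMI from the Getting Started page above.

    # Launch one Extra Large instance and attach an EBS volume for HDFS.
    # Untested sketch; AMI ID, key pair, security group, zone, and
    # device name are placeholders.
    import time
    import boto

    conn = boto.connect_ec2()  # reads AWS credentials from the environment

    # m1.xlarge: 4 virtual cores / 15 GB RAM.
    # Use 'c1.xlarge' instead for 8 virtual cores / 7 GB RAM.
    reservation = conn.run_instances(
        'ami-00000000',                # placeholder AMI ID
        instance_type='m1.xlarge',
        key_name='my-keypair',         # placeholder key pair
        security_groups=['hbase'],     # placeholder security group
        placement='us-east-1a')        # availability zone
    instance = reservation.instances[0]

    # Wait until the instance is running before attaching the volume.
    while instance.state != 'running':
        time.sleep(15)
        instance.update()

    # Create a 100 GB EBS volume in the same zone and attach it.
    volume = conn.create_volume(100, 'us-east-1a')
    conn.attach_volume(volume.id, instance.id, '/dev/sdf')

Once the volume is attached, format and mount it on the instance, then
point dfs.data.dir at the mount point (with hbase.rootdir set to that
HDFS), so the actual data lives on EBS instead of S3.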