Hi Weishung,

See the EC2 instance pricing details here:
http://aws.amazon.com/ec2/#pricing

Try to price it out against quotes for hardware.

You'll need to run at _least_ m1.large or c1.xlarge instances for HBase.
There was a recent discussion thread covering EC2 performance.  You can
look it up at search-hadoop.com.

If you don't need the cluster running 24x7, maybe you can make the EC2
pricing work out.  Just be aware that you'll be taking a hit in raw IO
performance per node, so you may need to balance that out with more nodes
than you would need with your own hardware.  If you need to persist
data between cluster restarts, you'll also need either EBS or S3 storage, so
be sure to factor that in.  Also factor in bandwidth costs if you need to
transfer a lot of data in/out of AWS.

My own impression is that EC2 is great and very cost effective for
short-lived, on-demand computing resources.  We use it a great deal for
functional testing.  For 24x7 services, it seems like you pay a premium long
term over owning your own hardware.  The advantages are no large up-front
acquisition cost and easy elasticity to expand to meet demand; the cost is
reduced performance per node due to virtualization.

The best advice I can give is to do some benchmarking to see how many nodes
you need to satisfy your processing requirements in EC2 vs. on raw hardware,
and then price the two options out side by side.
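
To make the comparison concrete, here is a rough sketch of the arithmetic in
Python.  Every rate and price in it is a made-up placeholder, not a real AWS
or vendor figure, and the function and parameter names are my own invention.
Plug in the numbers from the pricing page, your hardware quotes, and the node
counts from your benchmarks.

```python
def ec2_monthly_cost(nodes, hours_per_day, hourly_rate,
                     storage_gb=0.0, storage_rate_gb_month=0.0,
                     transfer_gb=0.0, transfer_rate_gb=0.0, days=30):
    """Estimated monthly EC2 bill: instance hours plus storage plus transfer."""
    compute = nodes * hours_per_day * days * hourly_rate
    storage = storage_gb * storage_rate_gb_month    # EBS or S3, if data must persist
    transfer = transfer_gb * transfer_rate_gb       # data moved in/out of AWS
    return compute + storage + transfer


def owned_monthly_cost(nodes, price_per_node, amortization_months,
                       hosting_per_node_month):
    """Estimated monthly cost of owned hardware, amortized over its lifetime."""
    per_node = price_per_node / amortization_months + hosting_per_node_month
    return nodes * per_node


if __name__ == "__main__":
    # Placeholder inputs only.  EC2 gets more nodes here to reflect the
    # lower raw IO per node; your benchmarks decide the real ratio.
    ec2 = ec2_monthly_cost(nodes=8, hours_per_day=12, hourly_rate=0.50,
                           storage_gb=500, storage_rate_gb_month=0.10,
                           transfer_gb=200, transfer_rate_gb=0.10)
    owned = owned_monthly_cost(nodes=6, price_per_node=3600,
                               amortization_months=36,
                               hosting_per_node_month=50)
    print("EC2 per month:   %.2f" % ec2)
    print("Owned per month: %.2f" % owned)
```

The owned-hardware side assumes simple straight-line amortization and a flat
per-node hosting fee (power, rack space, admin time), which is a
simplification.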

--gh

On Thu, Mar 10, 2011 at 9:12 AM, Weishung Chung <weish...@gmail.com> wrote:

> I am trying to estimate the cost of hosting own HBase cluster vs using EC2.
> Could anyone give me some guidance?
> Cluster size ~ 6 to 8 nodes
> Usage ~ at least 12 hours/day with lot of read/write operations. (I know I
> need to have more concrete usage number here)
>
> Thank you so much :)
>
