Nice presentation Andy! Sean, I am experimenting with a small cluster on EC2 right now. Here is my experience.
1) It is a 5-node cluster (1 master + 4 slaves). All c1.xlarge instances.
2) I initially tried m1.large, but ran into some stability issues, so I moved to c1.xlarge. The cluster is more stable now. Follow the instructions here carefully: http://hadoop.apache.org/hbase/docs/r0.20.3/api/overview-summary.html
3) I made a custom AMI from Ubuntu v9.10 with all the required software on it, and I spin up the instances from this AMI. If I did it again, I'd go with the Cloudera AMIs.
4) I set up the cluster manually; it was manageable b/c it is a small cluster. If you need more machines, look at the EC2 scripts in Andy's presentation.
5) I run the whole cluster on EBS volumes. The max size of an EBS volume is 1TB. You can always string a few EBS disks together in a RAID-X configuration.
6) EBS throughput is NOT really that impressive. Understandably. But it is good enough.
6.5) Format your volumes as EXT4 or XFS. Both of these are better than EXT3.
7) Turn on compression (I use LZO). Not only does it save space on HDFS, it also speeds up reads/writes.
8) c1.xlarge has 8G mem. My HBase heap is 3G.
9) I have about 600 million rows in HBase (1 TB compressed). I run MR jobs on them. I am fighting some stability problems now, probably because the cluster is too small to handle the load (2000 regions / regionserver). I am investigating and will post results here.
10) Part of the experimentation is to figure out the cost of running a cluster as well. A c1.xlarge costs about $500 / month; reserved instances cost about $150 / month (1 yr term). So be sure to factor this in as well.
11) HBase comes with a simple load tester (HBASE/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation). It is useful for testing the cluster and its stability.

hope this helps.
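
For anyone sizing a similar setup, here is a rough sketch of the arithmetic behind points 9 and 10. The input figures are the approximate ones quoted above. Two things are assumptions for illustration only: that region servers run on the 4 slaves and not on the master, and that 1 TB means 10^12 bytes. EBS storage and I/O charges are not included.

```python
# Back-of-the-envelope numbers for the cluster described above.
# Input figures are the approximate ones from the post.
# ASSUMPTIONS (not from the post): region servers run only on the
# 4 slaves, and 1 TB is taken as 10**12 bytes.

NODES = 5                    # 1 master + 4 slaves
SLAVES = 4                   # assumed to be the only region servers
ON_DEMAND_PER_MONTH = 500    # USD per c1.xlarge, approximate
RESERVED_PER_MONTH = 150     # USD per c1.xlarge, 1 yr term, approximate

ROWS = 600_000_000           # rows stored in HBase
REGIONS_PER_SERVER = 2000    # regions per region server
DATA_BYTES = 10**12          # 1 TB compressed (assumed decimal TB)


def monthly_cost(nodes, per_node):
    """Total monthly instance cost in USD for the whole cluster."""
    return nodes * per_node


def per_region(total, servers, regions_per_server):
    """Average share of `total` held by a single region."""
    return total / (servers * regions_per_server)


if __name__ == "__main__":
    print("on-demand USD/month:", monthly_cost(NODES, ON_DEMAND_PER_MONTH))
    print("reserved USD/month: ", monthly_cost(NODES, RESERVED_PER_MONTH))
    print("rows per region:    ", per_region(ROWS, SLAVES, REGIONS_PER_SERVER))
    print("bytes per region:   ", per_region(DATA_BYTES, SLAVES, REGIONS_PER_SERVER))
```

Under these assumptions the cluster holds 8000 regions in total, each averaging about 75,000 rows and roughly 125 MB of compressed data, and the instances alone cost about $2500 / month on demand versus about $750 / month reserved.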
thanks
Sujee
http://sujee.net
hbase-map-reduce tutorial here: http://sujee.net/tech/articles/hbase-map-reduce-freq-counter/

On Wed, Apr 21, 2010 at 12:40 AM, Andrew Purtell <apurt...@apache.org> wrote:
> More info:
>
> http://hbase.s3.amazonaws.com/hbase/HBase-EC2-HUG9.odp
> http://hbase.s3.amazonaws.com/hbase/HBase-EC2-HUG9.pdf
>
> - Andy