Hi Keith,

On Wed, Oct 14, 2009 at 11:58 AM, Keith Thomas <keith.tho...@gmail.com> wrote:
> Am I correct in understanding that a farm of EC2 instances with Hadoop and
> HBase installed and configured individually by myself are the quickest and
> most effective way to progress with this effort?

Well, you're not wrong. To run HBase on Amazon Web Services, you should use EC2 instances and configure them yourself. Make sure you pick Extra Large instances from EC2 (see: http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A8), and you may also want to use EBS volumes as the storage devices rather than S3. (S3 is good for archiving data.)

But... are you really sure you want to use HBase for your Grails-based web application on the cloud? I would definitely recommend MySQL, which should be more suitable for both web applications and the Amazon Web Services environment. HBase is not a cloud database and is currently more suitable for batch processing with billions of records.

If you use HBase for this purpose, you will

-- lose the Object Relational Mapping support from Grails.
-- have to take care of database transactions and secondary indices by yourself.
-- likely suffer from data retrieval latency, unless you use memcached.
-- need more server resources than MySQL. MySQL can run on 1 EC2 instance, while HBase requires about 12 EC2 instances (2 for masters and DFS namenodes, 5 for region servers and DFS datanodes, 5 for ZooKeeper).

Is there any special reason to use HBase for your web application?

Thanks,

--
Tatsuya Kawano (Mr.)
Tokyo, Japan