> From: Matthew LeMieux
> I'm starting to find that EC2 is not reliable enough to support
> HBase.
[...]
> (I've been using m1.large and m2.xlarge running CDH3)

I personally use EC2 only for on-demand, ad hoc testing, but I do know of
successful deployments there.

However, I have at least been consistent in my advice to use c1.xlarge
instances. Note, **c**1.xlarge. This instance type is what has worked
reasonably well for me. Lesser or cheaper types, in terms of virtual compute
units, have not.

> What would it take to make HBase resilient enough to take
> advantage of those environments?  Based on my experience
> and comments on this list, it seems "HBase
> in the cloud" is still a rather painful proposition.  

This is a good question and a valid point.

There is tension between
  - tuning down ZooKeeper timeouts etc. to identify failed nodes quickly and so
    trigger rapid redeployment of their regions, minimizing unavailability
  - tuning up ZooKeeper timeouts etc. to ride over stop-the-world GC pauses or
    foibles of virtualized environments
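
To make that tension concrete, here is a toy sketch in Python. It is not HBase
or ZooKeeper code. The function name, the pause lengths, and the candidate
timeouts are all made up for illustration, and the model is deliberately crude:
it assumes a pause longer than the session timeout always expires the session,
and that a truly dead node is noticed after roughly one full timeout.

```python
# Toy model only: every name and number here is hypothetical.

def evaluate_timeout(session_timeout_ms, observed_pauses_ms):
    """Summarize what one session timeout costs in each direction."""
    # Pauses (GC or hypervisor stalls) that outlast the timeout would get a
    # healthy node declared dead under this simplified model.
    spurious = [p for p in observed_pauses_ms if p > session_timeout_ms]
    return {
        "spurious_expirations": len(spurious),
        # A genuinely failed node goes unnoticed for about one timeout.
        "max_detection_delay_ms": session_timeout_ms,
    }

# Hypothetical pause lengths in milliseconds.
pauses_ms = [200, 1500, 8000, 45000]

for timeout_ms in (10000, 30000, 60000):
    print(timeout_ms, evaluate_timeout(timeout_ms, pauses_ms))
```

Lowering the timeout shrinks the detection delay but lets more pauses count as
failures. Raising it does the reverse. No single value wins on both counts.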

There are open JIRAs in this area. For example, 
https://issues.apache.org/jira/browse/HBASE-1316

If you have ideas or code that might demonstrate, or do demonstrate, better
behavior in environments like EC2, we'd love to hear or see them!

   - Andy




