I'd say if the memcached model is working for you, stick with it. HBase (currently) caches whole blocks. With block caching enabled you can achieve tens of thousands of reqs/sec with a pretty small cluster. However, there's a catch: once your tables are so large that they can't all sit in memory at the same time, you'll see a change in behavior. User traffic tends to be very random access, which with block caching can cause a lot of thrashing with frequent cache evictions. We've seen this bring our cluster to its knees.
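To make the thrashing concrete, here is a rough simulation rather than a measurement of HBase. It assumes a plain LRU cache and uniformly random reads, which is a simplification of real traffic and of HBase's block cache; the function name and the block counts are made up for illustration.

```python
import random
from collections import OrderedDict


def hit_rate(num_blocks, cache_blocks, requests=20000, seed=0):
    """Fraction of uniformly random block reads served from an LRU cache.

    num_blocks and cache_blocks are hypothetical sizes, not HBase settings.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # keys are block ids, ordered oldest to newest
    hits = 0
    for _ in range(requests):
        block = rng.randrange(num_blocks)
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True            # load the block on a miss
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used
    return hits / requests


# Data set fits in the cache: only the initial cold misses.
fits = hit_rate(num_blocks=1000, cache_blocks=1000)

# Data set is 10x the cache: most reads miss and force an eviction.
too_big = hit_rate(num_blocks=10000, cache_blocks=1000)
```

Under these assumptions the hit rate is roughly the ratio of cache size to data size once the data no longer fits, so almost every read ends up going to disk.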
IMHO a better model is to persist things in HBase and then cache them with memcached, just as you would with any other data store. If you're looking for a spiffy memcached replacement, I'd recommend checking out Redis.

On Sat, Aug 18, 2012 at 3:12 AM, Lin Ma <[email protected]> wrote:
> Hello guys,
>
> In your experience, is it practical to use HBase directly for serving?
> Saying handle directly user traffic (tens of thousands QPS scale) behind
> Apache, and replace the role of memcached? I am not sure whether there are
> any known panic to replace memcached by using HBase? One issue I could
> think about is for a specific row range, only one active region server
> could handle the request, but in memcached, we can setup several memcached
> instance with duplicate content (all of them are active) to serve the same
> purpose under a VIP which could achieve better performance and scalability.
>
> Any advice or reference documents are appreciated. Thanks.
>
> regards,
> Lin
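For reference, the "persist in HBase, cache with memcached" model from the reply is the usual cache-aside pattern. A minimal sketch follows, with plain dicts standing in for both the memcached client and the HBase table so that it runs on its own; the class and method names are hypothetical, and a real setup would swap in actual client libraries.

```python
class CacheAside:
    """Read-through / invalidate-on-write wrapper around a backing store.

    `cache` and `store` are dict-like stand-ins (assumptions for this sketch),
    not real memcached or HBase clients.
    """

    def __init__(self, cache, store):
        self.cache = cache
        self.store = store
        self.store_reads = 0  # counts reads that fell through to the store

    def get(self, key):
        value = self.cache.get(key)
        if value is not None:
            return value                 # cache hit, the store is not touched
        self.store_reads += 1
        value = self.store.get(key)      # cache miss, read the durable copy
        if value is not None:
            self.cache[key] = value      # populate the cache for next time
        return value

    def put(self, key, value):
        self.store[key] = value          # write the durable copy first
        self.cache.pop(key, None)        # then drop any stale cached value


hbase_table = {"user:1": "alice"}        # stand-in for the persistent store
memcached = {}                           # stand-in for the cache tier
users = CacheAside(memcached, hbase_table)

first = users.get("user:1")              # miss: goes to the store
second = users.get("user:1")             # hit: served from the cache
users.put("user:1", "bob")               # invalidates the cached entry
third = users.get("user:1")              # miss again, sees the new value
```

Invalidating on write instead of updating the cache in place keeps the sketch simple: the next read repopulates the entry from the store.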
