Here are a few of my thoughts:

If possible, you might want to localize your data to a few regions and then
have exclusive access to those regions, so that external load does not impact
you.  I have heard that the write penalty on SSDs is quite high, but I think
they will still be better than spinning disks.  Also (from what I read a while
back), SSDs have a limited write endurance - a finite number of writes they
can take - so if you are write heavy, that may be an issue.
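
For example (a rough sketch only, assuming you control the table layout; the
table and family names here are made up and the API is the 0.94-era Java
client), you could create the table pre-split so the data lands on a known,
small set of regions; pinning those regions to dedicated region servers would
still be a separate, manual assignment step:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class PreSplitTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Hypothetical table "kv_lookup" with a single family "d".
        HTableDescriptor desc = new HTableDescriptor("kv_lookup");
        desc.addFamily(new HColumnDescriptor("d"));

        // 15 split points on the first key byte => 16 regions, so the whole
        // dataset lives in a small, known set of regions from day one.
        byte[][] splits = new byte[15][];
        for (int i = 0; i < 15; i++) {
          splits[i] = new byte[] { (byte) ((i + 1) * 16) };
        }
        admin.createTable(desc, splits);
        admin.close();
      }
    }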

I would presume that any cache built within HBase itself will suffer from the
same issues you described. OTOH, external caching can help, but then you need
to invest in it and maintain cache-to-source consistency - which might be
another issue.

If you are just doing KV lookups and no range scans, why not use a KV store
like Cassandra, or maybe explore other NoSQL solutions like MongoDB?

If your lookups exhibit temporal locality, an external, client-side cache pool
may help.
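
For instance, a read-through client-side cache with an LRU bound and a TTL is
one way to get both: the LRU keeps the hot working set in memory, and the TTL
puts an upper bound on how stale a cached value can be relative to the source.
A rough sketch (Guava cache plus the 0.94-era HBase client; the table, family
and qualifier names are made up, and missing rows would need extra handling
since Guava caches cannot hold nulls):

    import java.util.concurrent.TimeUnit;

    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheLoader;
    import com.google.common.cache.LoadingCache;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CachedLookup {
      private static final byte[] CF = Bytes.toBytes("d");   // hypothetical family
      private static final byte[] QUAL = Bytes.toBytes("v"); // hypothetical qualifier

      // Note: HTable is not thread-safe; a real client would use one per thread
      // or an HTablePool.
      private final HTable table;
      private final LoadingCache<String, byte[]> cache;

      public CachedLookup(Configuration conf) throws Exception {
        this.table = new HTable(conf, "kv_lookup");           // hypothetical table
        this.cache = CacheBuilder.newBuilder()
            .maximumSize(10000000L)                 // bound the cache to the hot working set
            .expireAfterWrite(60, TimeUnit.SECONDS) // TTL bounds cache-to-source staleness
            .build(new CacheLoader<String, byte[]>() {
              @Override
              public byte[] load(String key) throws Exception {
                // Cache miss: fall through to HBase.
                Result r = table.get(new Get(Bytes.toBytes(key)));
                return r.getValue(CF, QUAL);
              }
            });
      }

      public byte[] lookup(String key) throws Exception {
        // Served from memory when the key is hot, from HBase otherwise.
        return cache.get(key);
      }
    }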

My 2c,
Abhishek


-----Original Message-----
From: ddlat...@gmail.com [mailto:ddlat...@gmail.com] On Behalf Of Dave Latham
Sent: Friday, October 19, 2012 4:31 PM
To: user@hbase.apache.org
Subject: scaling a low latency service with HBase

I need to scale an internal service / datastore that is currently hosted on an 
HBase cluster and wanted to ask for advice from anyone out there who may have 
some to share.  The service does simple key value lookups on 20 byte keys to 
20-40 byte values.  It currently has about 5 billion entries (200GB), and 
processes about 40k random reads per second, and about 2k random writes per 
second.  It currently delivers a median response at 2ms, 90% at 20ms, 99% at 
200ms, 99.5% at 5000ms - but the mean is 58ms which is no longer meeting our 
needs very well.  It is persistent and highly available.  I need to measure its 
working set more closely, but I believe that around 20-30% (randomly 
distributed) of the data is accessed each day.  I want a system that can scale 
to at least 10x current levels (50 billion entries - 2TB, 400k requests per 
second) and achieve a mean < 5ms (ideally 1-2ms) and 99.5% < 50ms response time 
for reads while maintaining persistence and reasonably high availability 
(99.9%).  Writes would ideally be in the same range, but we could probably 
tolerate a mean more in the 20-30ms range.
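
To spell out the working-set arithmetic behind those numbers (the ~40
bytes/entry is just what the current 200GB / 5 billion figures imply):

    5e9 entries  x ~40 B/entry            ~= 200 GB       (today)
    50e9 entries x ~40 B/entry            ~= 2 TB         (10x target)
    20-30% of the data touched per day    ~= 400-600 GB   (working set at 10x)

So whatever tier serves the hot path would need very roughly half a terabyte
of RAM or SSD capacity at the 10x scale.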

Clearly for that latency, spinning disks won't cut it.  The current service is 
running out of an HBase cluster that is shared with many other things, and it 
degrades when those other things hit the disk and network hard.  The cluster 
has hundreds of nodes and this data fits in a small slice of block cache 
across most of them.  The concerns are that its performance is impacted by 
other loads and that, as it continues to grow, there may not be enough space 
in the current cluster's shared block cache.


So I'm looking for something that will serve out of memory (backed by disk for 
persistence) or from SSDs.  A few questions that I would love to hear answers 
for:

 - Does HBase sound like a good match as this grows?
 - Does anyone have experience running HBase over SSDs?  What sort of latency 
and requests per second have you been able to achieve?
 - Is anyone using a row cache on top of (or built into) HBase?  I think 
there's been a bit of discussion on occasion, but it hasn't gone very far.
There would be some overhead for each row, but it seems that if we were to 
continue to rely on memory + disks this could reduce the memory required.
(A rough sketch of the kind of thing I mean follows the questions below.)
 - Does anyone have alternate suggestions for such a service?
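
Here is a rough sketch of the kind of row cache I have in mind, mainly to make
the per-row overhead concrete: with 20 byte keys and 20-40 byte values, a
naive on-heap map spends roughly as much on object and entry overhead (map
entry, key wrapper, two array headers) as on the data itself, so a real
implementation would probably want a more compact or serialized layout.

    import java.nio.ByteBuffer;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Naive bounded LRU row cache keyed on the raw row key. On a 64-bit JVM
    // each entry carries on the order of 100+ bytes of overhead on top of the
    // 40-60 byte payload, which is why per-row overhead matters at these sizes.
    public class RowCache {
      private final int maxEntries;
      private final Map<ByteBuffer, byte[]> map;

      public RowCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder=true turns LinkedHashMap into an LRU via removeEldestEntry.
        this.map = new LinkedHashMap<ByteBuffer, byte[]>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<ByteBuffer, byte[]> eldest) {
            return size() > RowCache.this.maxEntries;
          }
        };
      }

      public synchronized byte[] get(byte[] rowKey) {
        // ByteBuffer gives content-based equals/hashCode for the byte[] key.
        return map.get(ByteBuffer.wrap(rowKey));
      }

      public synchronized void put(byte[] rowKey, byte[] value) {
        map.put(ByteBuffer.wrap(rowKey), value);
      }
    }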

Dave
