Re: scaling a low latency service with HBase

2012-10-22 Thread Dave Latham
 Here are a few of my thoughts:

 If possible, you might want to localize your data to a few regions and then 
 maybe have exclusive access to those regions. This way, external load will 
 not impact you.  I have heard that the write penalty of SSDs is quite high, 
 but I think they will still be better than spinning disks. Also (I read a 
 while back), with SSDs you get a quota of maximum possible writes, so if 
 you are write heavy, it may be an issue.

If the data lives only on a few regions, then it is only on a few
servers, which means it won't fit in RAM, so it comes back to SSDs.  I
have also read that there is a limited number of lifetime writes on
SSDs, and I'm very interested in how that interacts with HBase's write
pipeline (which was designed for spinning disks).  I would imagine
that each HLog sync would cause an SSD write at each DataNode in the
pipeline.  In that case I would expect wear leveling to give
sufficient life.
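
To make that wear leveling question concrete, here's a rough back-of-envelope
sketch (plain Java, just so the arithmetic is explicit).  Every number in it is
an assumption plugged in for illustration - the per-entry WAL overhead, the
write amplification from flushes and compactions, and the drive endurance
rating are guesses, not measurements:

/**
 * Back-of-envelope estimate of SSD lifetime writes under the HBase WAL +
 * flush/compaction path.  All inputs are assumptions, not measurements.
 */
public class SsdEnduranceSketch {
    public static void main(String[] args) {
        double writesPerSec = 2000;      // current load from this thread; the 10x target scales this linearly
        double bytesPerEntry = 100;      // ~20B key + ~40B value + assumed KeyValue/WALEdit overhead
        double writeAmplification = 5;   // assumed: WAL append + memstore flush + a few compaction rewrites

        double gbPerDay = writesPerSec * bytesPerEntry * writeAmplification * 86400 / 1e9;

        // Assumed drive: 200 GB SSD rated for ~3000 program/erase cycles.
        double lifetimeWriteGb = 200 * 3000;

        System.out.printf("~%.0f GB written per day per replica%n", gbPerDay);
        System.out.printf("~%.0f days (~%.1f years) to exhaust rated endurance at current load%n",
                lifetimeWriteGb / gbPerDay, lifetimeWriteGb / gbPerDay / 365);
    }
}

(Under those assumptions the current ~2k writes/sec works out to roughly
86 GB/day per replica and on the order of 19 years of rated endurance; at the
10x write target that shrinks to about 2 years, so the question isn't entirely
academic.)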


 I would presume any solution like a cache built within HBase will suffer 
 from the same issues you described. OTOH, external caching can help, but 
 then you need to invest there and maintain cache-to-source consistency - 
 which might be another issue.

 If you are just doing KV lookups and no range scans, why not just use KV 
 stores like Cassandra, or maybe explore other NoSQL solutions like Mongo, etc.?

Are there any that would have particular advantages when it came to SSDs?


 If your data lookups exhibit temporal locality, external client-side cache 
 pools may help.

 My 2c,
 Abhishek


 -Original Message-
 From: ddlat...@gmail.com [mailto:ddlat...@gmail.com] On Behalf Of Dave Latham
 Sent: Friday, October 19, 2012 4:31 PM
 To: user@hbase.apache.org
 Subject: scaling a low latency service with HBase

 I need to scale an internal service / datastore that is currently hosted on 
 an HBase cluster and wanted to ask for advice from anyone out there who may 
 have some to share.  The service does simple key value lookups on 20 byte 
 keys to 20-40 byte values.  It currently has about 5 billion entries (200GB), 
 and processes about 40k random reads per second, and about 2k random writes 
 per second.  It currently delivers a median response at 2ms, 90% at 20ms, 99% 
 at 200ms, 99.5% at 5000ms - but the mean is 58ms which is no longer meeting 
 our needs very well.  It is persistent and highly available.  I need to 
 measure its working set more closely, but I believe that around 20-30% 
 (randomly distributed) of the data is accessed each day.  I want a system 
 that can scale to at least 10x current levels (50 billion entries - 2TB, 400k 
 requests per second) and achieve a mean < 5ms (ideally 1-2ms) and 99.5% < 
 50ms response time for reads while maintaining persistence and reasonably 
 high availability (99.9%).  Writes would ideally be in the same range but 
 we could probably tolerate a mean more in the 20-30ms range.

 Clearly for that latency, spinning disks won't cut it.  The current service 
 is running out of an hbase cluster that is shared with many other things and 
 when those other things hit the disk and network hard is when it degrades.  
 The cluster has hundreds of nodes and this data fits in a small slice of 
 block cache across most of them.  The concerns are that its performance is 
 impacted by other loads and that as it continues to grow there may not be 
 enough space in the current cluster's shared block cache.


 So I'm looking for something that will serve out of memory (backed by disk 
 for persistence) or from SSDs.  A few questions that I would love to hear 
 answers for:

  - Does HBase sound like a good match as this grows?
  - Does anyone have experience running HBase over SSDs?  What sort of latency 
 and requests per second have you been able to achieve?
  - Is anyone using a row cache on top of (or built into) HBase?  I think 
 there's been a bit of discussion on occasion but it hasn't gone very far.
 There would be some overhead for each row.  It seems that if we were to 
 continue to rely on memory + disks this could reduce the memory required.
  - Does anyone have alternate suggestions for such a service?

 Dave


Re: scaling a low latency service with HBase

2012-10-22 Thread Dave Latham
On Fri, Oct 19, 2012 at 5:22 PM, Amandeep Khurana ama...@gmail.com wrote:
 Answers inline

 On Fri, Oct 19, 2012 at 4:31 PM, Dave Latham lat...@davelink.net wrote:

 I need to scale an internal service / datastore that is currently hosted on
 an HBase cluster and wanted to ask for advice from anyone out there who may
 have some to share.  The service does simple key value lookups on 20 byte
 keys to 20-40 byte values.  It currently has about 5 billion entries
 (200GB), and processes about 40k random reads per second, and about 2k
 random writes per second.  It currently delivers a median response at 2ms,
 90% at 20ms, 99% at 200ms, 99.5% at 5000ms - but the mean is 58ms which is
 no longer meeting our needs very well.  It is persistent and highly
 available.  I need to measure its working set more closely, but I believe
 that around 20-30% (randomly distributed) of the data is accessed each
 day.  I want a system that can scale to at least 10x current levels (50
 billion entries - 2TB, 400k requests per second) and achieve a mean < 5ms
 (ideally 1-2ms) and 99.5% < 50ms response time for reads while maintaining
 persistence and reasonably high availability (99.9%).  Writes would ideally
 be in the same range but we could probably tolerate a mean more in the
 20-30ms range.

 Clearly for that latency, spinning disks won't cut it.  The current service
 is running out of an hbase cluster that is shared with many other things
 and when those other things hit the disk and network hard is when it
 degrades.  The cluster has hundreds of nodes and this data fits in a
 small slice of block cache across most of them.  The concerns are that its
 performance is impacted by other loads and that as it continues to grow
 there may not be enough space in the current cluster's shared block cache.

 So I'm looking for something that will serve out of memory (backed by disk
 for persistence) or from SSDs.  A few questions that I would love to hear
 answers for:

  - Does HBase sound like a good match as this grows?


 Yes. The key to getting more predictable performance is to separate out
 workloads. What are the other things that are using the same physical
 hardware and impacting performance? Have you measured performance when
 nothing else is running on the cluster?

There are several other things sharing the cluster and using it more
heavily than this service - both online request handling as well as
some large batch MapReduce jobs.  When the large jobs aren't running,
the performance is acceptable, with mean reads typically in the 1-2ms
range (served out of block cache).



  - Does anyone have experience running HBase over SSDs?  What sort of
 latency and requests per second have you been able to achieve?


 I don't believe many people are actually running this in production yet.
 Some folks have done some research on this topic and posted blogs (eg:
 http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html)
 but there's not a whole lot more than that to go by at this point.

Thanks, that's a really helpful reference.  It sounds like it could be
a big gain over disks already but that the bottleneck would move from
IO to CPU and that there would be significant work to be done.



  - Is anyone using a row cache on top of (or built into) HBase?  I think
 there's been a bit of discussion on occasion but it hasn't gone very far.
 There would be some overhead for each row.  It seems that if we were to
 continue to rely on memory + disks this could reduce the memory required.
  - Does anyone have alternate suggestions for such a service?


 The biggest recommendation is to separate out the workloads and then start
 planning for more hardware or additional components to get better
 performance.

Right, that's why I'm looking to separate this service out.  However,
I'd like to go with a much smaller set of nodes for this particular
service rather than duplicating the large, expensive cluster.
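
As a rough illustration of what a much smaller dedicated cluster would have to
absorb, here is a tiny sizing sketch.  The node count and the per-entry storage
overhead factor are assumptions for illustration, not recommendations; only the
50 billion entries / ~2TB / 400k reads per second figures come from the numbers
above:

/** Spread the 10x target over an assumed small dedicated cluster. */
public class DedicatedClusterSizingSketch {
    public static void main(String[] args) {
        double entries = 50e9;          // 10x target
        double rawBytesPerEntry = 40;   // ~20B key + 20-40B value (~2TB raw)
        double overheadFactor = 3;      // assumed HBase KeyValue/row metadata overhead
        double readsPerSec = 400000;    // 10x target
        int nodes = 10;                 // assumed dedicated cluster size

        double storedTb = entries * rawBytesPerEntry * overheadFactor / 1e12;
        System.out.printf("~%.0f TB of hot data to keep in cache or on SSD%n", storedTb);
        System.out.printf("~%.0f GB of cache (or SSD) and ~%.0f reads/sec per node%n",
                storedTb * 1000 / nodes, readsPerSec / nodes);
    }
}

Under those assumptions a 10-node cluster would need on the order of 600 GB of
cache-or-SSD per node and ~40k reads/sec per node, which is why serving out of
RAM alone gets hard on a small node count and SSDs come into the picture.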


RE: scaling a low latency service with HBase

2012-10-19 Thread Pamecha, Abhishek
Here are a few of my thoughts:

If possible, you might want to localize your data to a few regions and then 
maybe have exclusive access to those regions. This way, external load will 
not impact you.  I have heard that the write penalty of SSDs is quite high, 
but I think they will still be better than spinning disks. Also (I read a 
while back), with SSDs you get a quota of maximum possible writes, so if you 
are write heavy, it may be an issue.
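
One way to keep the data on a small, known set of regions (so it could then sit
on a dedicated set of region servers) is to pre-split the table at creation
time.  A minimal sketch against the HBase 0.94-era client API - the table name,
family, block size, and split points below are made-up placeholders, not part
of the original discussion:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class PreSplitTableSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Hypothetical table for the 20-byte-key / 20-40-byte-value entries.
        HTableDescriptor desc = new HTableDescriptor("kv_lookup");
        HColumnDescriptor family = new HColumnDescriptor("d");
        family.setInMemory(true);        // hint the block cache to favor this family
        family.setBlocksize(8 * 1024);   // smaller blocks reduce read amplification for point gets
        desc.addFamily(family);

        // Pre-split on the first key byte into a fixed number of regions, so the
        // region count (and the servers hosting them) stays small and predictable.
        byte[][] splits = new byte[15][];
        for (int i = 1; i <= 15; i++) {
            splits[i - 1] = new byte[] { (byte) (i * 16) };
        }
        admin.createTable(desc, splits);
        admin.close();
    }
}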

I would presume any solution like a cache built within HBase will suffer from 
the same issues you described. OTOH, external caching can help, but then you 
need to invest there and maintain cache-to-source consistency - which might be 
another issue.

If you are just doing KV lookups and no range scans, why not just use KV stores 
like Cassandra, or maybe explore other NoSQL solutions like Mongo, etc.?

If your data lookups exhibit temporal locality, external client-side cache 
pools may help.

My 2c,
Abhishek


-Original Message-
From: ddlat...@gmail.com [mailto:ddlat...@gmail.com] On Behalf Of Dave Latham
Sent: Friday, October 19, 2012 4:31 PM
To: user@hbase.apache.org
Subject: scaling a low latency service with HBase

I need to scale an internal service / datastore that is currently hosted on an 
HBase cluster and wanted to ask for advice from anyone out there who may have 
some to share.  The service does simple key value lookups on 20 byte keys to 
20-40 byte values.  It currently has about 5 billion entries (200GB), and 
processes about 40k random reads per second, and about 2k random writes per 
second.  It currently delivers a median response at 2ms, 90% at 20ms, 99% at 
200ms, 99.5% at 5000ms - but the mean is 58ms which is no longer meeting our 
needs very well.  It is persistent and highly available.  I need to measure its 
working set more closely, but I believe that around 20-30% (randomly 
distributed) of the data is accessed each day.  I want a system that can scale 
to at least 10x current levels (50 billion entries - 2TB, 400k requests per 
second) and achieve a mean < 5ms (ideally 1-2ms) and 99.5% < 50ms response time 
for reads while maintaining persistence and reasonably high availability 
(99.9%).  Writes would ideally be in the same range but we could probably 
tolerate a mean more in the 20-30ms range.

Clearly for that latency, spinning disks won't cut it.  The current service is 
running out of an hbase cluster that is shared with many other things and when 
those other things hit the disk and network hard is when it degrades.  The 
cluster has hundreds of nodes and this data fits in a small slice of block 
cache across most of them.  The concerns are that its performance is impacted 
by other loads and that as it continues to grow there may not be enough space 
in the current cluster's shared block cache.


So I'm looking for something that will serve out of memory (backed by disk for 
persistence) or from SSDs.  A few questions that I would love to hear answers 
for:

 - Does HBase sound like a good match as this grows?
 - Does anyone have experience running HBase over SSDs?  What sort of latency 
and requests per second have you been able to achieve?
 - Is anyone using a row cache on top of (or built into) HBase?  I think 
there's been a bit of discussion on occasion but it hasn't gone very far.
There would be some overhead for each row.  It seems that if we were to 
continue to rely on memory + disks this could reduce the memory required.
 - Does anyone have alternate suggestions for such a service?

Dave


Re: scaling a low latency service with HBase

2012-10-19 Thread Amandeep Khurana
Answers inline

On Fri, Oct 19, 2012 at 4:31 PM, Dave Latham lat...@davelink.net wrote:

 I need to scale an internal service / datastore that is currently hosted on
 an HBase cluster and wanted to ask for advice from anyone out there who may
 have some to share.  The service does simple key value lookups on 20 byte
 keys to 20-40 byte values.  It currently has about 5 billion entries
 (200GB), and processes about 40k random reads per second, and about 2k
 random writes per second.  It currently delivers a median response at 2ms,
 90% at 20ms, 99% at 200ms, 99.5% at 5000ms - but the mean is 58ms which is
 no longer meeting our needs very well.  It is persistent and highly
 available.  I need to measure its working set more closely, but I believe
 that around 20-30% (randomly distributed) of the data is accessed each
 day.  I want a system that can scale to at least 10x current levels (50
 billion entries - 2TB, 400k requests per second) and achieve a mean < 5ms
 (ideally 1-2ms) and 99.5% < 50ms response time for reads while maintaining
 persistence and reasonably high availability (99.9%).  Writes would ideally
 be in the same range but we could probably tolerate a mean more in the
 20-30ms range.

 Clearly for that latency, spinning disks won't cut it.  The current service
 is running out of an hbase cluster that is shared with many other things
 and when those other things hit the disk and network hard is when it
 degrades.  The cluster has hundreds of nodes and this data fits in a
 small slice of block cache across most of them.  The concerns are that its
 performance is impacted by other loads and that as it continues to grow
 there may not be enough space in the current cluster's shared block cache.

 So I'm looking for something that will serve out of memory (backed by disk
 for persistence) or from SSDs.  A few questions that I would love to hear
 answers for:

  - Does HBase sound like a good match as this grows?


Yes. The key to getting more predictable performance is to separate out
workloads. What are the other things that are using the same physical
hardware and impacting performance? Have you measured performance when
nothing else is running on the cluster?


  - Does anyone have experience running HBase over SSDs?  What sort of
 latency and requests per second have you been able to achieve?


I don't believe many people are actually running this in production yet.
Some folks have done some research on this topic and posted blogs (eg:
http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html)
but there's not a whole lot more than that to go by at this point.


  - Is anyone using a row cache on top of (or built into) HBase?  I think
 there's been a bit of discussion on occasion but it hasn't gone very far.
 There would be some overhead for each row.  It seems that if we were to
 continue to rely on memory + disks this could reduce the memory required.
  - Does anyone have alternate suggestions for such a service?


The biggest recommendation is to separate out the workloads and then start
planning for more hardware or additional components to get better
performance.


 Dave



Re: scaling a low latency service with HBase

2012-10-19 Thread Andrew Purtell
What Amandeep said, and also:

You said your working set is randomly distributed but, if frequent
invalidation isn't a concern and read accesses are still clustered
temporally, an in-memory cache out in front of the cluster would smooth
over periods when the disks are busy servicing MR workload or whatever else
is going on. Another way of separating workloads.
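
For what it's worth, here is a minimal sketch of that kind of read-through
cache in front of HBase point gets, using a Guava LoadingCache and the 0.94-era
client API.  The table/family/qualifier names and the cache sizing are made-up
placeholders, and a real deployment would still have to deal with invalidation
on writes (the sketch just bounds staleness with a TTL):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

/** Client-side read-through cache over HBase gets (illustrative names only). */
public class CachedKvClient {
    private static final byte[] FAMILY = Bytes.toBytes("d");
    private static final byte[] QUALIFIER = Bytes.toBytes("v");

    private final HTable table;   // HTable is not thread-safe; a real client would pool these
    private final LoadingCache<ByteBuffer, byte[]> cache;

    public CachedKvClient(Configuration conf) throws IOException {
        this.table = new HTable(conf, "kv_lookup");     // hypothetical table name
        this.cache = CacheBuilder.newBuilder()
                .maximumSize(10000000)                  // bound memory; tune to the working set
                .expireAfterWrite(10, TimeUnit.MINUTES) // crude staleness bound instead of invalidation
                .build(new CacheLoader<ByteBuffer, byte[]>() {
                    @Override
                    public byte[] load(ByteBuffer key) throws IOException {
                        // Cache miss: fall through to HBase for the row.
                        Result r = table.get(new Get(key.array()));
                        byte[] v = r.getValue(FAMILY, QUALIFIER);
                        // Guava won't cache nulls, so represent a missing row as empty.
                        return v != null ? v : new byte[0];
                    }
                });
    }

    public byte[] lookup(byte[] key) throws Exception {
        // Wrap in a ByteBuffer so the cache key gets value-based equals/hashCode.
        return cache.get(ByteBuffer.wrap(key));
    }
}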

Regarding use of SSDs with HBase, this is an area I intend to get direct
experience with soon, and will report back findings as they become
available.


On Fri, Oct 19, 2012 at 5:22 PM, Amandeep Khurana ama...@gmail.com wrote:

 Answers inline

 On Fri, Oct 19, 2012 at 4:31 PM, Dave Latham lat...@davelink.net wrote:

  I need to scale an internal service / datastore that is currently hosted
 on
  an HBase cluster and wanted to ask for advice from anyone out there who
 may
  have some to share.  The service does simple key value lookups on 20 byte
  keys to 20-40 byte values.  It currently has about 5 billion entries
  (200GB), and processes about 40k random reads per second, and about 2k
  random writes per second.  It currently delivers a median response at
 2ms,
  90% at 20ms, 99% at 200ms, 99.5% at 5000ms - but the mean is 58ms which
 is
  no longer meeting our needs very well.  It is persistent and highly
  available.  I need to measure its working set more closely, but I believe
  that around 20-30% (randomly distributed) of the data is accessed each
  day.  I want a system that can scale to at least 10x current levels (50
  billion entries - 2TB, 400k requests per second) and achieve a mean < 5ms
  (ideally 1-2ms) and 99.5% < 50ms response time for reads while
 maintaining
  persistence and reasonably high availability (99.9%).  Writes would
 ideally
  be in the same range but we could probably tolerate a mean more in the
  20-30ms range.
 
  Clearly for that latency, spinning disks won't cut it.  The current
 service
  is running out of an hbase cluster that is shared with many other things
  and when those other things hit the disk and network hard is when it
  degrades.  The cluster has hundreds of nodes and this data fits in a
  small slice of block cache across most of them.  The concerns are that
 its
  performance is impacted by other loads and that as it continues to grow
  there may not be enough space in the current cluster's shared block
 cache.
 
  So I'm looking for something that will serve out of memory (backed by
 disk
  for persistence) or from SSDs.  A few questions that I would love to hear
  answers for:
 
   - Does HBase sound like a good match as this grows?
 

 Yes. The key to getting more predictable performance is to separate out
 workloads. What are the other things that are using the same physical
 hardware and impacting performance? Have you measured performance when
 nothing else is running on the cluster?


   - Does anyone have experience running HBase over SSDs?  What sort of
  latency and requests per second have you been able to achieve?
 

 I don't believe many people are actually running this in production yet.
 Some folks have done some research on this topic and posted blogs (eg:
 http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html)
 but there's not a whole lot more than that to go by at this point.


   - Is anyone using a row cache on top of (or built into) HBase?  I think
  there's been a bit of discussion on occasion but it hasn't gone very far.
  There would be some overhead for each row.  It seems that if we were to
  continue to rely on memory + disks this could reduce the memory required.
   - Does anyone have alternate suggestions for such a service?
 

 The biggest recommendation is to separate out the workloads and then start
 planning for more hardware or additional components to get better
 performance.


  Dave
 




-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)