Re: Cluster sizing guidelines

2014-07-19 Thread lars hofhansl
We can answer #3 at least: You can store about 2T of effective data per node in 
HBase, unless you have a mostly read-only load.
See here for the reasoning: 
http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html
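
For a rough sense of where the ~2T figure comes from, here is a back-of-the-envelope 
Python sketch roughly along the lines of that post. The config names in the comments 
are real HBase settings, but every number below is an illustrative assumption, not a 
recommendation:

    # Memstore-driven limit on writable data per RegionServer (illustrative numbers).
    heap_gb           = 16     # practical RegionServer heap (GC-limited)
    memstore_fraction = 0.4    # hbase.regionserver.global.memstore.upperLimit
    flush_size_mb     = 128    # hbase.hregion.memstore.flush.size
    region_size_gb    = 10     # configured max region size

    memstore_budget_mb = heap_gb * 1024 * memstore_fraction
    # Regions that can take writes while still flushing reasonably sized files:
    active_regions = memstore_budget_mb / flush_size_mb
    data_per_node_tb = active_regions * region_size_gb / 1024.0

    print("%.0f actively written regions -> ~%.1f TB per node"
          % (active_regions, data_per_node_tb))
    # With these assumptions: 51 regions -> ~0.5 TB if every region takes writes.
    # Larger regions, fewer actively written regions, or mostly read-only data
    # are what push the effective number toward the ~2 TB mentioned above.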

For #1 we were traditionally bound by disk/network IO (due to the 2-way 
replication). With SSDs and 10ge networks that is no longer true, though.

As Andy said, #1 and #2 depend heavily on your setup, and there are quite a few 
tuning knobs for particular setups.

We did some throughput tests with our setup. I'll see whether we can get those 
published. But note that these are valid only for our particular setup and 
network topology and our workloads, and are likely not generally useful.


Also, HBase is a row store. The number of rows or KeyValues per unit of data 
will also strongly influence performance; there is a per-KeyValue cost and 
even a per-row cost for both read and write throughput. It also depends on how 
many parallel clients you use, as HBase currently does not stream data between 
client/server or server/server and hence cannot keep the network pipes full 
unless multiple clients are used.
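
To make that last point concrete, a throughput test needs several writers running in 
parallel. Below is a minimal Python sketch of that idea using the third-party 
happybase Thrift client; the host, table name and column family are placeholders, it 
assumes an HBase Thrift gateway is running, and it is only meant to illustrate 
"multiple parallel clients", not to be a benchmark:

    # Drive HBase from several client threads so the network pipe stays fuller
    # than a single synchronous client could keep it.
    from concurrent.futures import ThreadPoolExecutor
    import os

    import happybase  # Thrift-based client; assumes a running HBase Thrift gateway

    THRIFT_HOST     = "thrift-gateway.example.com"   # placeholder
    NUM_CLIENTS     = 8                               # parallel writers
    ROWS_PER_CLIENT = 100000
    VALUE           = os.urandom(1024)                # ~1 KB payload per cell

    def writer(client_id):
        # One connection per thread; happybase connections are not thread-safe.
        conn = happybase.Connection(THRIFT_HOST)
        table = conn.table("perf_test")               # placeholder table, CF "f"
        with table.batch(batch_size=1000) as batch:
            for i in range(ROWS_PER_CLIENT):
                row = ("c%d-r%08d" % (client_id, i)).encode()
                batch.put(row, {b"f:q": VALUE})
        conn.close()
        return ROWS_PER_CLIENT

    with ThreadPoolExecutor(max_workers=NUM_CLIENTS) as pool:
        total = sum(pool.map(writer, range(NUM_CLIENTS)))
    print("wrote %d rows from %d parallel clients" % (total, NUM_CLIENTS))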


So it might be hard to present such metrics as a 2D graph.

-- Lars




 From: Andrew Purtell apurt...@apache.org
To: user@hbase.apache.org user@hbase.apache.org 
Sent: Wednesday, July 16, 2014 2:32 PM
Subject: Re: Cluster sizing guidelines
 

Those questions don't have pat answers. HBase has a few interesting
load-dependent tunables, and the ceiling you'll encounter depends as much on
the characteristics of the nodes (particularly, block devices) and the network
as on the software itself.

We can certainly, through experimentation, establish upper bounds on perf,
optimizing either for throughput at a given payload size or latency within
a given bound (your questions #1 and #2). I. e. using now-typical systems
with 32 cores, 64-128 GB of RAM (and a fair amount allocated to bucket
cache), and 2-4 solid state volumes, and a 10ge network, here are plots of
the measured upper bound of metric M on the y-axis over number of slave
cluster nodes on the X axis.

Open questions:
1. Which measurement tool and test automation?
2. Where can we get ~100 decent nodes for a realistic assessment?
3. Who's going to fund the test dev and testbed?




On Wed, Jul 16, 2014 at 1:41 PM, Amandeep Khurana ama...@gmail.com wrote:

 Thanks Lars.

 I'm curious how we'd answer questions like:
 1. How many nodes do I need to sustain a write throughput of N reqs/sec
 with payload of size M KB?
 2. How many nodes do I need to sustain a read throughput of N reqs/sec with
 payload of size M KB with a latency of X ms per read.
 3. How many nodes do I need to store N TB of total data with one of the
 above constraints?

 This goes into looking at the bottlenecks that need to be taken into
 account during write and read times and also the max number of regions and
 region size that a single region server can host.

 What are your thoughts on this?

 -Amandeep


 On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl la...@apache.org wrote:

  This is a somewhat fuzzy art.
 
  Some points to consider:
  1. All data is replicated three ways. Or in other words, if you run three
  RegionServer/Datanodes each machine will get 100% of the writes. If you
 run
  6, each gets 50% of the writes. From that aspect HBase clusters with less
  than 9 RegionServers are not really useful.
  2. As for the machines themselves. Just go with any reasonable machine,
  and pick the cheapest you can find. At least 8 cores, at least 32GB of
 RAM,
  at least 6 disks, no RAID needed. (we have machines with 12 cores in 2
  sockets, 96GB of RAM, 6 4TB drives, no HW RAID). HBase is not yet well
  tuned for SSDs.
 
 
  You also need to carefully consider your network topology. With HBase
  you'll see quite a bit of east-west traffic (i.e. between racks). 10ge is
  good if you have it. We have 1ge everywhere so far, and we found it to be
  the single biggest bottleneck for write performance.
 
 
  Also see this blog post about HBase memory sizing (shameless plug):
 
 http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html
 
 
  I'm planning a blog post about this topic with more details.
 
 
  -- Lars
 
 
 
  
   From: Amandeep Khurana ama...@gmail.com
  To: user@hbase.apache.org user@hbase.apache.org
  Sent: Tuesday, July 15, 2014 10:48 PM
  Subject: Cluster sizing guidelines
 
 
  Hi
 
  How do users usually go about sizing HBase clusters? What are the factors
  you take into account? What are typical hardware profiles you run with?
 Any
  data points you can share would help.
 
  Thanks
  Amandeep
 




-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: Cluster sizing guidelines

2014-07-19 Thread lars hofhansl
Yeah. Right direction. Correct on 3 counts. Should have read all email before I 
replied to your earlier one.





 From: Amandeep Khurana ama...@gmail.com
To: user@hbase.apache.org user@hbase.apache.org 
Sent: Thursday, July 17, 2014 11:36 AM
Subject: Re: Cluster sizing guidelines
 

On Wed, Jul 16, 2014 at 2:32 PM, Andrew Purtell apurt...@apache.org wrote:

 Those questions don't have pat answers. HBase has a few interesting load
 dependent tunables and the ceiling you'll encounter depends as much on the
 characteristics of the nodes (particularly, block devices) and the network,
 not merely the software.

 We can certainly, through experimentation, establish upper bounds on perf,
 optimizing either for throughput at a given payload size or latency within
 a given bound (your questions #1 and #2). I. e. using now-typical systems
 with 32 cores, 64-128 GB of RAM (and a fair amount allocated to bucket
 cache), and 2-4 solid state volumes, and a 10ge network, here are plots of
 the measured upper bound of metric M on the y-axis over number of slave
 cluster nodes on the X axis.


Agreed. I'm trying to figure out what guidelines we can establish for a
given hardware profile.

From what I've seen and understood so far, it's a balancing act between the
following factors for any given type of hardware:

1. Write throughput. You are basically bottlenecked on the WAL in this case.
2. Read latency. You want to keep as much in memory as you can if the
requirements demand low latency. How does off-heap cache play in here, and
what are our experiences in using that in production?
3. Total storage requirement. What's the amount of data you can store per
node? 12x3TB drives are becoming more common, but can HBase leverage that
level of storage density? 40GB regions * 100 regions per server (max) gets
you to 4TB. Replicated, that becomes 12TB (sketched below). This is pretty
much the max load you want to put on a single server from a memory standpoint
to achieve high write throughput or low read latency (factors #1 and #2).
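
Spelling out the arithmetic in point 3 (a quick Python sketch; the numbers are the
ones assumed above, and 3 is just the default HDFS replication factor):

    # Storage-density arithmetic from point 3 (illustrative numbers).
    region_size_gb     = 40
    regions_per_server = 100     # rough practical ceiling per RegionServer
    replication_factor = 3
    raw_tb_per_node    = 12 * 3  # 12 x 3TB drives

    hbase_data_tb = region_size_gb * regions_per_server / 1024.0   # ~3.9 TB
    raw_disk_tb   = hbase_data_tb * replication_factor             # ~11.7 TB

    print("~%.1f TB of HBase data per RS -> ~%.1f TB of raw disk (x%d)"
          % (hbase_data_tb, raw_disk_tb, replication_factor))
    print("vs. %d TB raw per 12x3TB node -> ~%.0f%% of the spindles used"
          % (raw_tb_per_node, 100.0 * raw_disk_tb / raw_tb_per_node))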

Am I thinking in the right direction here?






 Open questions:
 1. Which measurement tool and test automation?
 2. Where can we get ~100 decent nodes for a realistic assessment?
 3. Who's going to fund the test dev and testbed?



 On Wed, Jul 16, 2014 at 1:41 PM, Amandeep Khurana ama...@gmail.com
 wrote:

  Thanks Lars.
 
  I'm curious how we'd answer questions like:
  1. How many nodes do I need to sustain a write throughput of N reqs/sec
  with payload of size M KB?
  2. How many nodes do I need to sustain a read throughput of N reqs/sec
 with
  payload of size M KB with a latency of X ms per read.
  3. How many nodes do I need to store N TB of total data with one of the
  above constraints?
 
  This goes into looking at the bottlenecks that need to be taken into
  account during write and read times and also the max number of regions
 and
  region size that a single region server can host.
 
  What are your thoughts on this?
 
  -Amandeep
 
 
  On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl la...@apache.org wrote:
 
   This is a somewhat fuzzy art.
  
   Some points to consider:
   1. All data is replicated three ways. Or in other words, if you run
 three
   RegionServer/Datanodes each machine will get 100% of the writes. If you
  run
   6, each gets 50% of the writes. From that aspect HBase clusters with
 less
   than 9 RegionServers are not really useful.
   2. As for the machines themselves. Just go with any reasonable machine,
   and pick the cheapest you can find. At least 8 cores, at least 32GB of
  RAM,
   at least 6 disks, no RAID needed. (we have machines with 12 cores in 2
   sockets, 96GB of RAM, 6 4TB drives, no HW RAID). HBase is not yet well
   tuned for SSDs.
  
  
   You also need to carefully consider your network topology. With HBase
   you'll see quite a bit of east-west traffic (i.e. between racks). 10ge is
   good if you have it. We have 1ge everywhere so far, and we found it to be
   the single biggest bottleneck for write performance.
  
  
   Also see this blog post about HBase memory sizing (shameless plug):
  
 
 http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html
  
  
   I'm planning a blog post about this topic with more details.
  
  
   -- Lars
  
  
  
   
    From: Amandeep Khurana ama...@gmail.com
   To: user@hbase.apache.org user@hbase.apache.org
   Sent: Tuesday, July 15, 2014 10:48 PM
   Subject: Cluster sizing guidelines
  
  
   Hi
  
   How do users usually go about sizing HBase clusters? What are the
 factors
   you take into account? What are typical hardware profiles you run with?
  Any
   data points you can share would help.
  
   Thanks
   Amandeep
  
 



 --
 Best regards,

    - Andy

 Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via Tom White)


Re: Cluster sizing guidelines

2014-07-19 Thread Andrew Purtell
The i2.8xlarge and hs1.8xlarge EC2 instance types would provide an opportunity for 
testing what really happens today when you attempt a high-density storage 
architecture with HDFS and HBase. The hs1 type has 24 spinning disks. I think 
the i2.8xlarge better represents near-future challenges in effective 
utilization: it has 8 x 800 GB SSD and 244 GB of RAM. They would be hard to get 
ahold of and very expensive to operate, though.
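
For scale, the raw density of those two types against the ~2 TB of effective HBase
data per node mentioned earlier in the thread (a rough sketch; the i2 figures come
from the spec above, while the 2 TB-per-spindle size for the hs1 is an assumption
not stated here):

    # Raw storage per node vs. the ~2 TB of effective (HBase-level) data per node
    # discussed earlier in the thread. Replication would triple the raw footprint
    # of that effective data.
    instances = {
        "hs1.8xlarge": 24 * 2.0,   # 24 spinning disks, assumed 2 TB each
        "i2.8xlarge":  8 * 0.8,    # 8 x 800 GB SSD
    }
    effective_tb = 2.0
    for name, raw_tb in instances.items():
        print("%-12s %5.1f TB raw vs ~%.0f TB effective (~%.0f TB raw at x3)"
              % (name, raw_tb, effective_tb, effective_tb * 3))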


 On Jul 19, 2014, at 1:32 AM, lars hofhansl la...@apache.org wrote:
 
 Yeah. Right direction. Correct on 3 counts. Should have read all email before 
 I replied to your earlier one.
 
 
 
 
 
 From: Amandeep Khurana ama...@gmail.com
 To: user@hbase.apache.org user@hbase.apache.org 
 Sent: Thursday, July 17, 2014 11:36 AM
 Subject: Re: Cluster sizing guidelines
 
 
 On Wed, Jul 16, 2014 at 2:32 PM, Andrew Purtell apurt...@apache.org wrote:
 
 Those questions don't have pat answers. HBase has a few interesting load
 dependent tunables and the ceiling you'll encounter depends as much on the
 characteristics of the nodes (particularly, block devices) and the network,
 not merely the software.
 
 We can certainly, through experimentation, establish upper bounds on perf,
 optimizing either for throughput at a given payload size or latency within
 a given bound (your questions #1 and #2). I. e. using now-typical systems
 with 32 cores, 64-128 GB of RAM (and a fair amount allocated to bucket
 cache), and 2-4 solid state volumes, and a 10ge network, here are plots of
 the measured upper bound of metric M on the y-axis over number of slave
 cluster nodes on the X axis.
 
 Agreed. I'm trying to figure out what guidelines we can establish for a
 given hardware profile.
 
 From what I've seen and understood so far, it's a balancing act between the
 following factors for any given type of hardware:
 
 1. Write throughput. You are basically bottlenecked on the WAL in this case.
 2. Read latency. You want to keep as much in memory as you can if the
 requirements demand low latency. How does off-heap cache play in here, and
 what are our experiences in using that in production?
 3. Total storage requirement. What's the amount of data you can store per
 node? 12x3TB drives are becoming more common, but can HBase leverage that
 level of storage density? 40GB regions * 100 regions per server (max) gets
 you to 4TB. Replicated, that becomes 12TB. This is pretty much the max load
 you want to put on a single server from a memory standpoint to achieve
 high write throughput or low read latency (factors #1 and #2).
 
 Am I thinking in the right direction here?
 
 
 
 
 
 
 Open questions:
 1. Which measurement tool and test automation?
 2. Where can we get ~100 decent nodes for a realistic assessment?
 3. Who's going to fund the test dev and testbed?
 
 
 
 On Wed, Jul 16, 2014 at 1:41 PM, Amandeep Khurana ama...@gmail.com
 wrote:
 
 Thanks Lars.
 
 I'm curious how we'd answer questions like:
 1. How many nodes do I need to sustain a write throughput of N reqs/sec
 with payload of size M KB?
 2. How many nodes do I need to sustain a read throughput of N reqs/sec
 with
 payload of size M KB with a latency of X ms per read.
 3. How many nodes do I need to store N TB of total data with one of the
 above constraints?
 
 This goes into looking at the bottlenecks that need to be taken into
 account during write and read times and also the max number of regions
 and
 region size that a single region server can host.
 
 What are your thoughts on this?
 
 -Amandeep
 
 
 On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl la...@apache.org wrote:
 
 This is a somewhat fuzzy art.
 
 Some points to consider:
 1. All data is replicated three ways. Or in other words, if you run
 three
 RegionServer/Datanodes each machine will get 100% of the writes. If you
 run
 6, each gets 50% of the writes. From that aspect HBase clusters with
 less
 than 9 RegionServers are not really useful.
 2. As for the machines themselves. Just go with any reasonable machine,
 and pick the cheapest you can find. At least 8 cores, at least 32GB of
 RAM,
 at least 6 disks, no RAID needed. (we have machines with 12 cores in 2
 sockets, 96GB of RAM, 6 4TB drives, no HW RAID). HBase is not yet well
 tuned for SSDs.
 
 
 You also need to carefully consider your network topology. With HBase
 you'll see quite a bit of east-west traffic (i.e. between racks). 10ge is
 good if you have it. We have 1ge everywhere so far, and we found it to be
 the single biggest bottleneck for write performance.
 
 
 Also see this blog post about HBase memory sizing (shameless plug):
 http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html
 
 
 I'm planning a blog post about this topic with more details.
 
 
 -- Lars
 
 
 
 
   From: Amandeep Khurana ama...@gmail.com
 To: user@hbase.apache.org user@hbase.apache.org
 Sent: Tuesday, July 15, 2014 10:48 PM
 Subject: Cluster sizing guidelines
 
 
 Hi
 
 How do

Re: Cluster sizing guidelines

2014-07-17 Thread Andrew Purtell
We could at the very least come up with a set of experiments likely to
produce actionable data for users or potential users. We can defer the
question of how those experiments might come about until later.


On Thu, Jul 17, 2014 at 11:36 AM, Amandeep Khurana ama...@gmail.com wrote:

 On Wed, Jul 16, 2014 at 2:32 PM, Andrew Purtell apurt...@apache.org
 wrote:

  Those questions don't have pat answers. HBase has a few interesting load
  dependent tunables and the ceiling you'll encounter depends as much on
 the
  characteristics of the nodes (particularly, block devices) and the
 network,
  not merely the software.
 
  We can certainly, through experimentation, establish upper bounds on
 perf,
  optimizing either for throughput at a given payload size or latency
 within
  a given bound (your questions #1 and #2). I. e. using now-typical systems
  with 32 cores, 64-128 GB of RAM (and a fair amount allocated to bucket
  cache), and 2-4 solid state volumes, and a 10ge network, here are plots
 of
  the measured upper bound of metric M on the y-axis over number of slave
  cluster nodes on the X axis.
 

 Agreed. I'm trying to figure out what guidelines we can establish for a
 given hardware profile.

 From what I've seen and understood so far, it's a balancing act between the
 following factors for any given type of hardware:

 1. Write throughput. You are basically bottlenecked on the WAL in this
 case.
 2. Read latency. You want to keep as much in memory as you can if the
 requirements demand low latency. How does off-heap cache play in here, and
 what are our experiences in using that in production?
 3. Total storage requirement. What's the amount of data you can store per
 node? 12x3TB drives are becoming more common, but can HBase leverage that
 level of storage density? 40GB regions * 100 regions per server (max) gets
 you to 4TB. Replicated, that becomes 12TB. This is pretty much the max load
 you want to put on a single server from a memory standpoint to achieve
 high write throughput or low read latency (factors #1 and #2).

 Am I thinking in the right direction here?


 
  Open questions:
  1. Which measurement tool and test automation?
  2. Where can we get ~100 decent nodes for a realistic assessment?
  3. Who's going to fund the test dev and testbed?
 
 
 
  On Wed, Jul 16, 2014 at 1:41 PM, Amandeep Khurana ama...@gmail.com
  wrote:
 
   Thanks Lars.
  
   I'm curious how we'd answer questions like:
   1. How many nodes do I need to sustain a write throughput of N reqs/sec
   with payload of size M KB?
   2. How many nodes do I need to sustain a read throughput of N reqs/sec
  with
   payload of size M KB with a latency of X ms per read.
   3. How many nodes do I need to store N TB of total data with one of the
   above constraints?
  
   This goes into looking at the bottlenecks that need to be taken into
   account during write and read times and also the max number of regions
  and
   region size that a single region server can host.
  
   What are your thoughts on this?
  
   -Amandeep
  
  
   On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl la...@apache.org
 wrote:
  
This is a somewhat fuzzy art.
   
Some points to consider:
1. All data is replicated three ways. Or in other words, if you run
  three
RegionServer/Datanodes each machine will get 100% of the writes. If
 you
   run
6, each gets 50% of the writes. From that aspect HBase clusters with
  less
than 9 RegionServers are not really useful.
2. As for the machines themselves. Just go with any reasonable
 machine,
and pick the cheapest you can find. At least 8 cores, at least 32GB
 of
   RAM,
at least 6 disks, no RAID needed. (we have machines with 12 cores in
 2
sockets, 96GB of RAM, 6 4TB drives, no HW RAID). HBase is not yet
 well
tuned for SSDs.
   
   
You also need to carefully consider your network topology. With HBase
you'll see quite a bit of east-west traffic (i.e. between racks). 10ge is
good if you have it. We have 1ge everywhere so far, and we found it to be
the single biggest bottleneck for write performance.
   
   
Also see this blog post about HBase memory sizing (shameless plug):
   
  
 
 http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html
   
   
I'm planning a blog post about this topic with more details.
   
   
-- Lars
   
   
   

 From: Amandeep Khurana ama...@gmail.com
To: user@hbase.apache.org user@hbase.apache.org
Sent: Tuesday, July 15, 2014 10:48 PM
Subject: Cluster sizing guidelines
   
   
Hi
   
How do users usually go about sizing HBase clusters? What are the
  factors
you take into account? What are typical hardware profiles you run
 with?
   Any
data points you can share would help.
   
Thanks
Amandeep
   
  
 
 
 
  --
  Best regards,
 
 - Andy
 
  Problems worthy of attack prove their worth by hitting back. - Piet Hein
  (via Tom White)

Re: Cluster sizing guidelines

2014-07-16 Thread lars hofhansl
This is a somewhat fuzzy art.

Some points to consider:
1. All data is replicated three ways. In other words, if you run three 
RegionServer/DataNodes, each machine will get 100% of the writes. If you run 6, 
each gets 50% of the writes. From that aspect, HBase clusters with fewer than 9 
RegionServers are not really useful (see the sketch after this list).
2. As for the machines themselves: just go with any reasonable machine, and 
pick the cheapest you can find. At least 8 cores, at least 32GB of RAM, at 
least 6 disks, no RAID needed. (We have machines with 12 cores in 2 sockets, 
96GB of RAM, 6 x 4TB drives, no HW RAID.) HBase is not yet well tuned for SSDs.
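
Putting numbers on point 1 (a tiny Python sketch; 3 is simply the default HDFS
replication factor):

    # Share of the cluster's total write volume each DataNode absorbs when every
    # logical write lands on 3 machines (3-way HDFS replication).
    replication_factor = 3
    for nodes in (3, 6, 9, 12, 20):
        share = 100.0 * replication_factor / nodes
        print("%2d RegionServers/DataNodes -> each takes %3.0f%% of all writes"
              % (nodes, share))
    # 3 -> 100%, 6 -> 50%, 9 -> 33%: below about 9 nodes, added machines mostly
    # absorb replication overhead rather than adding usable write capacity.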


You also need to carefully consider your network topology. With HBase you'll 
see quite a bit of east-west traffic (i.e. between racks). 10ge is good if you 
have it. We have 1ge everywhere so far, and we found it to be the single 
biggest bottleneck for write performance.


Also see this blog post about HBase memory sizing (shameless plug): 
http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html


I'm planning a blog post about this topic with more details.


-- Lars




 From: Amandeep Khurana ama...@gmail.com
To: user@hbase.apache.org user@hbase.apache.org 
Sent: Tuesday, July 15, 2014 10:48 PM
Subject: Cluster sizing guidelines
 

Hi

How do users usually go about sizing HBase clusters? What are the factors
you take into account? What are typical hardware profiles you run with? Any
data points you can share would help.

Thanks
Amandeep

Re: Cluster sizing guidelines

2014-07-16 Thread Amandeep Khurana
Thanks Lars.

I'm curious how we'd answer questions like:
1. How many nodes do I need to sustain a write throughput of N reqs/sec
with payload of size M KB?
2. How many nodes do I need to sustain a read throughput of N reqs/sec with
payload of size M KB and a latency of X ms per read?
3. How many nodes do I need to store N TB of total data with one of the
above constraints?

This comes down to looking at the bottlenecks that need to be taken into
account at write and read time, and also at the max number of regions and the
region size that a single region server can host.
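
One rough way to frame questions 1-3 is as a max over per-constraint node counts,
given per-node numbers that would have to come from exactly this kind of benchmark.
A small Python sketch; the per-node figures below are placeholders, not measurements:

    import math

    # Hypothetical per-node capabilities -- stand-ins for measured numbers.
    WRITE_REQS_PER_NODE = 10000    # sustained writes/sec per RegionServer
    READ_REQS_PER_NODE  = 5000     # sustained reads/sec within the latency bound
    DATA_TB_PER_NODE    = 2.0      # effective (unreplicated) data per node

    def nodes_needed(write_rps=0, read_rps=0, data_tb=0):
        """Node count is driven by whichever constraint is tightest."""
        return max(
            int(math.ceil(float(write_rps) / WRITE_REQS_PER_NODE)),
            int(math.ceil(float(read_rps) / READ_REQS_PER_NODE)),
            int(math.ceil(float(data_tb) / DATA_TB_PER_NODE)),
            3,  # never fewer nodes than the replication factor
        )

    print(nodes_needed(write_rps=200000, data_tb=50))   # -> 25, storage-bound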

What are your thoughts on this?

-Amandeep


On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl la...@apache.org wrote:

 This is a somewhat fuzzy art.

 Some points to consider:
 1. All data is replicated three ways. Or in other words, if you run three
 RegionServer/Datanodes each machine will get 100% of the writes. If you run
 6, each gets 50% of the writes. From that aspect HBase clusters with less
 than 9 RegionServers are not really useful.
 2. As for the machines themselves. Just go with any reasonable machine,
 and pick the cheapest you can find. At least 8 cores, at least 32GB of RAM,
 at least 6 disks, no RAID needed. (we have machines with 12 cores in 2
 sockets, 96GB of RAM, 6 4TB drives, no HW RAID). HBase is not yet well
 tuned for SSDs.


 You also need to carefully consider your network topology. With HBase
 you'll see quite a bit of east-west traffic (i.e. between racks). 10ge is good
 if you have it. We have 1ge everywhere so far, and we found it to be the
 single biggest bottleneck for write performance.


 Also see this blog post about HBase memory sizing (shameless plug):
 http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html


 I'm planning a blog post about this topic with more details.


 -- Lars



 
  From: Amandeep Khurana ama...@gmail.com
 To: user@hbase.apache.org user@hbase.apache.org
 Sent: Tuesday, July 15, 2014 10:48 PM
 Subject: Cluster sizing guidelines


 Hi

 How do users usually go about sizing HBase clusters? What are the factors
 you take into account? What are typical hardware profiles you run with? Any
 data points you can share would help.

 Thanks
 Amandeep



Cluster sizing guidelines

2014-07-15 Thread Amandeep Khurana
Hi

How do users usually go about sizing HBase clusters? What are the factors
you take into account? What are typical hardware profiles you run with? Any
data points you can share would help.

Thanks
Amandeep