We could at the very least come up with a set of experiments likely to
produce actionable data for users or potential users. We can defer the
question of how those experiments might come about until later.


On Thu, Jul 17, 2014 at 11:36 AM, Amandeep Khurana <ama...@gmail.com> wrote:

> On Wed, Jul 16, 2014 at 2:32 PM, Andrew Purtell <apurt...@apache.org>
> wrote:
>
> > Those questions don't have pat answers. HBase has a few interesting
> > load-dependent tunables, and the ceiling you'll encounter depends as much
> > on the characteristics of the nodes (particularly the block devices) and
> > the network as on the software itself.
> >
> > We can certainly, through experimentation, establish upper bounds on perf,
> > optimizing either for throughput at a given payload size or for latency
> > within a given bound (your questions #1 and #2). I.e., using now-typical
> > systems with 32 cores, 64-128 GB of RAM (with a fair amount allocated to
> > bucket cache), 2-4 solid state volumes, and a 10GbE network, here are
> > plots of the measured upper bound of metric M on the y-axis over the
> > number of slave cluster nodes on the x-axis.
> >
>
> Agreed. I'm trying to figure out what guidelines we can establish for a
> given hardware profile.
>
> From what I've seen and understood so far, it's a balancing act between the
> following factors for any given type of hardware:
>
> 1. Write throughput. You are basically bottlenecked on the WAL in this
> case.
> 2. Read latency. You want to keep as much data in memory as possible if the
> requirements demand low latency. How does off-heap cache play in here, and
> what are our experiences using it in production?
> 3. Total storage requirement. What's the amount of data you can store per
> node? 12 x 3TB drives are becoming more common, but can HBase leverage that
> level of storage density? 40GB regions * 100 regions per server (max) gets
> you to roughly 4TB. Replicated, that becomes 12TB (see the sketch after
> this list). This is pretty much the max load you want to put on a single
> server from a memory standpoint to achieve high write throughput or low
> read latency (factors #1 and #2).
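>
> A back-of-the-envelope sketch of that arithmetic (the region size, region
> count, and replication factor below are illustrative assumptions, not HBase
> limits):
>
> // Back-of-the-envelope per-node storage estimate. The numbers are
> // illustrative assumptions, not HBase limits.
> public class NodeStorageEstimate {
>     public static void main(String[] args) {
>         double regionSizeGb = 40;    // assumed max region size
>         int regionsPerServer = 100;  // assumed practical max regions per RS
>         int hdfsReplication = 3;     // default HDFS replication factor
>
>         double servedPerNodeTb = regionSizeGb * regionsPerServer / 1024.0;
>         double rawPerNodeTb = servedPerNodeTb * hdfsReplication;
>
>         System.out.printf("Data served per RS: ~%.1f TB%n", servedPerNodeTb);
>         System.out.printf("Raw disk consumed per node (x%d): ~%.1f TB%n",
>                 hdfsReplication, rawPerNodeTb);
>     }
> }
>
> That ~12TB of raw disk is well below what 12 x 3TB spindles can hold, which
> is exactly the storage-density question above.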
>
> Am I thinking in the right direction here?
>
>
> >
> > Open questions:
> > 1. Which measurement tool and test automation?
> > 2. Where can we get ~100 decent nodes for a realistic assessment?
> > 3. Who's going to fund the test dev and testbed?
> >
> >
> >
> > On Wed, Jul 16, 2014 at 1:41 PM, Amandeep Khurana <ama...@gmail.com>
> > wrote:
> >
> > > Thanks Lars.
> > >
> > > I'm curious how we'd answer questions like:
> > > 1. How many nodes do I need to sustain a write throughput of N reqs/sec
> > > with payload of size M KB?
> > > 2. How many nodes do I need to sustain a read throughput of N reqs/sec
> > > with payload of size M KB with a latency of X ms per read?
> > > 3. How many nodes do I need to store N TB of total data with one of the
> > > above constraints?
> > >
> > > This comes down to looking at the bottlenecks that come into play during
> > > writes and reads, and also the max number of regions and the region size
> > > that a single region server can host.
> > >
> > > What are your thoughts on this?
> > >
> > > -Amandeep
> > >
> > >
> > > On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl <la...@apache.org> wrote:
> > >
> > > > This is a somewhat fuzzy art.
> > > >
> > > > Some points to consider:
> > > > 1. All data is replicated three ways. In other words, if you run three
> > > > RegionServers/DataNodes, each machine will get 100% of the writes. If
> > > > you run 6, each gets 50% of the writes (see the sketch after this
> > > > list). From that aspect, HBase clusters with fewer than 9 RegionServers
> > > > are not really useful.
> > > > 2. As for the machines themselves: just go with any reasonable machine
> > > > and pick the cheapest you can find. At least 8 cores, at least 32GB of
> > > > RAM, at least 6 disks, no RAID needed. (We have machines with 12 cores
> > > > in 2 sockets, 96GB of RAM, 6 x 4TB drives, no HW RAID.) HBase is not
> > > > yet well tuned for SSDs.
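> > > >
> > > > A minimal sketch of the fan-out math in point 1 (the node counts are
> > > > just examples):
> > > >
> > > > // Share of the cluster-wide write volume each DataNode absorbs under
> > > > // 3-way HDFS replication. Illustrative only.
> > > > public class WriteFanout {
> > > >     public static void main(String[] args) {
> > > >         int replication = 3;
> > > >         for (int nodes : new int[] {3, 6, 9, 12}) {
> > > >             double share = 100.0 * replication / nodes;
> > > >             System.out.printf("%d nodes -> each node sees ~%.0f%% of all writes%n",
> > > >                     nodes, share);
> > > >         }
> > > >     }
> > > > }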
> > > >
> > > >
> > > > You also need to consider your network topology carefully. With HBase
> > > > you'll see quite a bit of east-west traffic (i.e. between racks). 10GbE
> > > > is good if you have it. We have 1GbE everywhere so far, and we found it
> > > > to be the single biggest bottleneck for write performance.
> > > >
> > > >
> > > > Also see this blog post about HBase memory sizing (shameless plug):
> > > >
> > > > http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html
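> > > >
> > > > One piece of that memory-sizing picture is how many regions a server
> > > > can actively take writes for before memstores hit the global limit. A
> > > > minimal sketch, assuming values close to stock defaults (the heap size,
> > > > memstore fraction, and flush size are assumptions, not recommendations):
> > > >
> > > > // Rough upper bound on actively written regions per RegionServer, from
> > > > // the global memstore budget vs. per-region flush size. Assumed values.
> > > > public class MemstoreHeadroom {
> > > >     public static void main(String[] args) {
> > > >         double heapGb = 32;                   // assumed RS heap
> > > >         double globalMemstoreFraction = 0.4;  // heap share for memstores
> > > >         double flushSizeMb = 128;             // per-region flush size
> > > >         int writtenFamilies = 1;              // families actively written
> > > >
> > > >         double budgetMb = heapGb * 1024 * globalMemstoreFraction;
> > > >         double activeRegions = budgetMb / (flushSizeMb * writtenFamilies);
> > > >
> > > >         System.out.printf("~%.0f regions can take writes before memstore pressure%n",
> > > >                 activeRegions);
> > > >     }
> > > > }
> > > >
> > > > With those numbers you land right around the 100-regions-per-server
> > > > figure discussed above.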
> > > >
> > > >
> > > > I'm planning a blog post about this topic with more details.
> > > >
> > > >
> > > > -- Lars
> > > >
> > > >
> > > >
> > > > ________________________________
> > > >  From: Amandeep Khurana <ama...@gmail.com>
> > > > To: "user@hbase.apache.org" <user@hbase.apache.org>
> > > > Sent: Tuesday, July 15, 2014 10:48 PM
> > > > Subject: Cluster sizing guidelines
> > > >
> > > >
> > > > Hi
> > > >
> > > > How do users usually go about sizing HBase clusters? What are the
> > factors
> > > > you take into account? What are typical hardware profiles you run
> with?
> > > Any
> > > > data points you can share would help.
> > > >
> > > > Thanks
> > > > Amandeep
> > > >
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
