Hi Juhani,

By storefile behavior I meant that you look at the metrics and check the
number of store files over time, and see whether you are bounded (the files
increase and decrease all the time) or not. If this is not the case (and the
number of store files increases all the time), HBase will throttle the requests.
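
To see when that throttling kicks in, the knobs to look at are the blocking
store files settings; the values below are just the usual defaults as far as I
remember, so double-check them against your hbase-default.xml:

  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>7</value>
  </property>
  <property>
    <name>hbase.hstore.blockingWaitTime</name>
    <value>90000</value>
  </property>
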
128-256 bytes per request, grouped in batches of 10, is not much data; I have
a data set where each request is approx 4K and insert times are 7-10 ms.
Do you see this latency problem on inserts throughout the whole test, or only
at certain times?

Did you check your network latency?
BTW, batching is not supported by YCSB, so when you say a set of 10 puts, do
you mean the client-side table buffer? In my tests it is disabled.
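
If it is the client-side write buffer you mean, this is the kind of thing I am
referring to: a minimal sketch against the 0.90/0.92 HTable API, where the
table name, column family and buffer size are just placeholders:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class BufferedPuts {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "usertable");
      table.setAutoFlush(false);                  // buffer puts on the client side
      table.setWriteBufferSize(2 * 1024 * 1024);  // flush once ~2MB is buffered

      for (int i = 0; i < 10; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("value-" + i));
        table.put(put);                           // no RPC yet, just fills the buffer
      }
      table.flushCommits();                       // buffered puts go out as multi() calls
      table.close();
    }
  }

With autoFlush left at its default of true, every put() is its own RPC, which
is where per-request latency really shows up.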

Mikael.S


On Mon, Mar 26, 2012 at 6:58 AM, Matt Corgan <mcor...@hotpads.com> wrote:

> When you increased regions on your previous test, did it start maxing out
> CPU?  What improvement did you see?
>
> Have you tried increasing the memstore flush size to something like 512MB?
> Maybe you're blocked on flushes.  40,000 (4,000/server) is pretty slow for
> a disabled WAL I think, especially with batch size of 10.  If you increase
> the write batch size to 1000, how much does your write throughput increase?
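
For reference, the flush size Matt suggests would go into hbase-site.xml
roughly like this (536870912 bytes is 512MB, and the value is only an example):

  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>536870912</value>
  </property>

If you raise it, keep an eye on hbase.regionserver.global.memstore.upperLimit
as well, since many large memstores can still hit that global cap and force
early flushes.
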
>
>
> On Fri, Mar 23, 2012 at 3:48 AM, Juhani Connolly <juha...@gmail.com>
> wrote:
>
> > Also, the latency on requests is extremely long. If we group them into
> > sets of 10 puts (128-256 bytes each) before flushing the client table,
> > latency is over 1 second.
> >
> > We get entries like this in our logs:
> > 22:17:51,010 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow):
> > {"processingtimems":16692,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@65312e3b), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.172.109.3:42725","starttimems":1332335854317,"queuetimems":6387,"class":"HRegionServer","responsesize":0,"method":"multi"}
> >
> > Any suggestions as to where we should be digging?
> >
> > On Fri, Mar 23, 2012 at 4:40 PM, Juhani Connolly <juha...@gmail.com>
> > wrote:
> > > Status update:
> > >
> > > - We moved to cdh 4b1, so hbase 0.92 and hdfs 0.23 (until now we were
> > > using the 0.20.2 series)
> > > - Ran the tests with 256/512 regions; the numbers do appear to scale,
> > > which is good.
> > >
> > > BUT, our write throughput has gone down the drain. If we disable WAL
> > > writes, we still get nearly 40,000 a second, but with it on, we're
> > > lucky to get more than 12,000. Before, we were getting as high as
> > > 70,000 when grouping puts together. I have set up log collection, and
> > > am not finding anything unusual in the logs.
> > >
> > > Mikael: One of the tests is the ycsb one where we just let it choose
> > > the size. Our own custom test has a configurable size, but we have
> > > been testing with entries that are 128-256 bytes per entry, as this is
> > > what we expect in our application. What exactly should we be looking
> > > at with the storefiles?
> > >
> > > On Wed, Mar 21, 2012 at 2:29 PM, Mikael Sitruk <
> mikael.sit...@gmail.com>
> > wrote:
> > >> Juhani,
> > >> Can you look at the storefiles and tell how they behave during the test?
> > >> What is the size of the data you insert/update?
> > >> Mikael
> > >> On Mar 20, 2012 8:10 PM, "Juhani Connolly" <juha...@gmail.com> wrote:
> > >>
> > >>> Hi Matt,
> > >>>
> > >>> This is something we haven't tested much; we were always running with
> > >>> about 32 regions, which gave enough coverage for an even spread over
> > >>> all machines.
> > >>> I will run our tests with enough regions per server to cover all cores
> > >>> and get back to the mailing list.
> > >>>
> > >>> On Tue, Mar 20, 2012 at 1:55 AM, Matt Corgan <mcor...@hotpads.com>
> > wrote:
> > >>> > I'd be curious to see what happens if you split the table into 1 region
> > >>> > per CPU core, so 24 cores * 11 servers = 264 regions.  Each region has 1
> > >>> > memstore, which is a ConcurrentSkipListMap, and you're currently hitting
> > >>> > each CSLM with 8 cores, which might be too contentious.  Normally in
> > >>> > production you would want multiple memstores per CPU core.
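
As an illustration of what Matt describes, pre-splitting the table at creation
time could look roughly like this (the table name, column family and key range
are placeholders and assume fairly uniform row keys):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.util.Bytes;

  public class PreSplitTable {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);
      HTableDescriptor desc = new HTableDescriptor("usertable");
      desc.addFamily(new HColumnDescriptor("f"));
      // 24 cores * 11 servers = 264 regions, split evenly between the given
      // start and end keys.
      admin.createTable(desc,
          Bytes.toBytes("00000000"),
          Bytes.toBytes("99999999"),
          264);
    }
  }
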
> > >>> >
> > >>> >
> > >>> > On Mon, Mar 19, 2012 at 5:31 AM, Juhani Connolly <
> juha...@gmail.com>
> > >>> wrote:
> > >>> >
> > >>> >> Actually we did try running off two machines, both running our own
> > >>> >> tests in parallel. Unfortunately the throughput just got split between
> > >>> >> them, resulting in the same total. We also did the same thing with iperf
> > >>> >> running from each machine to another machine, indicating 800Mb of
> > >>> >> additional throughput between each pair of machines.
> > >>> >> However, we didn't try these tests very thoroughly, so I will revisit
> > >>> >> them as soon as I get back to the office, thanks.
> > >>> >>
> > >>> >> On Mon, Mar 19, 2012 at 9:21 PM, Christian Schäfer <
> > >>> syrious3...@yahoo.de>
> > >>> >> wrote:
> > >>> >> > Referring to my experience, I expect the client to be the
> > >>> >> > bottleneck, too.
> > >>> >> >
> > >>> >> > So try to increase the count of client machines (not client threads),
> > >>> >> > each with its own unshared network interface.
> > >>> >> >
> > >>> >> > In my case I could double write throughput by doubling the client
> > >>> >> > machine count, with a much smaller system than yours (5 machines,
> > >>> >> > 4 GB RAM each).
> > >>> >> >
> > >>> >> > Good Luck
> > >>> >> > Chris
> > >>> >> >
> > >>> >> >
> > >>> >> >
> > >>> >> > ________________________________
> > >>> >> >  From: Juhani Connolly <juha...@gmail.com>
> > >>> >> > To: user@hbase.apache.org
> > >>> >> > Sent: 13:02 Monday, 19 March 2012
> > >>> >> > Subject: Re: 0.92 and Read/writes not scaling
> > >>> >> >
> > >>> >> > I was concerned that may be the case too, which is why we ran the
> > >>> >> > ycsb tests in addition to our application-specific and general
> > >>> >> > performance tests. Checking profiles of the execution just showed the
> > >>> >> > vast majority of time spent waiting for responses. These were all run
> > >>> >> > with 400 threads (though we tried more/less just in case).
> > >>> >> > 2012/03/19 20:57 "Mingjian Deng" <koven2...@gmail.com>:
> > >>> >> >
> > >>> >> >> @Juhani:
> > >>> >> >> How many clients did you test? Maybe the bottleneck was the client?
> > >>> >> >>
> > >>> >> >> 2012/3/19 Ramkrishna.S.Vasudevan <
> > ramkrishna.vasude...@huawei.com>
> > >>> >> >>
> > >>> >> >> > Hi Juhani
> > >>> >> >> >
> > >>> >> >> > Can you tell us more about how the regions are balanced?
> > >>> >> >> > Are you overloading only a specific region server?
> > >>> >> >> >
> > >>> >> >> > Regards
> > >>> >> >> > Ram
> > >>> >> >> >
> > >>> >> >> > > -----Original Message-----
> > >>> >> >> > > From: Juhani Connolly [mailto:juha...@gmail.com]
> > >>> >> >> > > Sent: Monday, March 19, 2012 4:11 PM
> > >>> >> >> > > To: user@hbase.apache.org
> > >>> >> >> > > Subject: 0.92 and Read/writes not scaling
> > >>> >> >> > >
> > >>> >> >> > > Hi,
> > >>> >> >> > >
> > >>> >> >> > > We're running into a brick wall where our throughput numbers will
> > >>> >> >> > > not scale as we increase server counts, both using custom in-house
> > >>> >> >> > > tests and ycsb.
> > >>> >> >> > >
> > >>> >> >> > > We're using hbase 0.92 on hadoop 0.20.2 (we also experienced the
> > >>> >> >> > > same issues using 0.90 before switching our testing to this
> > >>> >> >> > > version).
> > >>> >> >> > >
> > >>> >> >> > > Our cluster consists of:
> > >>> >> >> > > - Namenode and hmaster on separate servers, 24 cores, 64GB
> > >>> >> >> > > - up to 11 datanode/regionservers: 24 cores, 64GB, 4 * 1TB disks
> > >>> >> >> > > (hope to get this changed)
> > >>> >> >> > >
> > >>> >> >> > > We have adjusted our gc settings, and mslabs:
> > >>> >> >> > >
> > >>> >> >> > >   <property>
> > >>> >> >> > >     <name>hbase.hregion.memstore.mslab.enabled</name>
> > >>> >> >> > >     <value>true</value>
> > >>> >> >> > >   </property>
> > >>> >> >> > >
> > >>> >> >> > >   <property>
> > >>> >> >> > >     <name>hbase.hregion.memstore.mslab.chunksize</name>
> > >>> >> >> > >     <value>2097152</value>
> > >>> >> >> > >   </property>
> > >>> >> >> > >
> > >>> >> >> > >   <property>
> > >>> >> >> > >     <name>hbase.hregion.memstore.mslab.max.allocation</name>
> > >>> >> >> > >     <value>1024768</value>
> > >>> >> >> > >   </property>
> > >>> >> >> > >
> > >>> >> >> > > hdfs xceivers is set to 8192
> > >>> >> >> > >
> > >>> >> >> > > We've experimented with a variety of handler counts for the
> > >>> >> >> > > namenode, datanodes and regionservers with no changes in throughput.
> > >>> >> >> > >
> > >>> >> >> > > For testing with ycsb, we do the following each time (with nothing
> > >>> >> >> > > else using the cluster):
> > >>> >> >> > > - truncate the test table
> > >>> >> >> > > - add a small amount of data, then split the table into 32 regions
> > >>> >> >> > > and call balancer from the shell
> > >>> >> >> > > - load 10m rows
> > >>> >> >> > > - do a 1:2:7 insert:update:read test with 10 million rows (64k/sec)
> > >>> >> >> > > - do a 5:5 insert:update test with 10 million rows (23k/sec)
> > >>> >> >> > > - do a pure read test with 10 million rows (75k/sec)
> > >>> >> >> > >
> > >>> >> >> > > We have observed ganglia, iostat -d -x, iptraf, top, dstat and a
> > >>> >> >> > > variety of other diagnostic tools, and network/io/cpu/memory
> > >>> >> >> > > bottlenecks seem highly unlikely as none of them are ever seriously
> > >>> >> >> > > taxed. This leads me to assume this is some kind of locking issue?
> > >>> >> >> > > Delaying WAL flushes gives a small throughput bump but it doesn't
> > >>> >> >> > > scale.
> > >>> >> >> > >
> > >>> >> >> > > There also don't seem to be many figures around to compare ours to.
> > >>> >> >> > > We can get our throughput numbers higher with tricks like not
> > >>> >> >> > > writing the WAL, delaying flushes, or batching requests, but nothing
> > >>> >> >> > > seems to scale with additional slaves.
> > >>> >> >> > > Could anyone provide guidance as to what may be preventing
> > >>> >> >> > > throughput figures from scaling as we increase our slave count?
> > >>> >> >> >
> > >>> >> >> >
> > >>> >> >>
> > >>> >>
> > >>>
> >
>



-- 
Mikael.S
