Another thing that could help you stay under the '1s SLA': enable direct byte buffers for LruBlockCache. Have a look at HBASE-4027.
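Separately, on Vladimir's suggestion (quoted further down) to issue the same read against two region servers and take the first response: here is a minimal sketch of that pattern, not a tested HBase client. The two "reads" are simulated with sleeps; `HedgedRead`, `firstResponse` and `simulatedRead` are hypothetical names I made up for illustration, and in practice each Callable would wrap a Get against one copy of the data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HedgedRead {

    // Runs all reads in parallel and returns the first one that completes
    // successfully; throws TimeoutException if none finishes in time.
    public static <T> T firstResponse(List<Callable<T>> reads, long timeoutMs)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(reads.size());
        try {
            // invokeAny returns one successful result and cancels the other tasks.
            return pool.invokeAny(reads, timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow();
        }
    }

    // Hypothetical stand-in for a read against one copy of the data:
    // sleeps for delayMs, then returns the given value.
    public static Callable<String> simulatedRead(final String value, final long delayMs) {
        return () -> {
            Thread.sleep(delayMs);
            return value;
        };
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> reads = Arrays.asList(
                simulatedRead("copy-A", 2000),  // slow copy, e.g. stuck behind a GC pause
                simulatedRead("copy-B", 10));   // healthy copy
        // The caller gets the fast copy's answer well inside the 1000 ms budget.
        System.out.println(firstResponse(reads, 1000));
    }
}
```

For scale, using the numbers from the original post: 20 slow reads out of 25 million is p = 8e-7 per read. If the two copies were slow independently, both being slow at once would be p * p = 6.4e-13. Independence is an assumption, and correlated stalls (both servers pausing for GC, a shared slow datanode) would make the real figure worse.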

On Thu, Aug 29, 2013 at 9:27 PM, Kiru Pakkirisamy <[email protected]> wrote:

> Yes, in that case, it matters. I was talking about a case where you are mostly serving from cache.
>
> Regards,
> - kiru
>
> Kiru Pakkirisamy | webcloudtech.wordpress.com
>
> ________________________________
> From: Saurabh Yahoo <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Sent: Thursday, August 29, 2013 12:09 PM
> Subject: Re: experiencing high latency for few reads in HBase
>
> Thanks Kiru.
>
> We have 10TB of data on disk. It would not fit in memory. Also, the first time, HBase needs to read from the disk. And it has to go through the network to read the blocks which are stored on other data nodes.
>
> So in my opinion, locality matters.
>
> Thanks,
> Saurabh.
>
> On Aug 29, 2013, at 2:33 PM, Kiru Pakkirisamy <[email protected]> wrote:
>
> > But locality index should not matter, right, if you are mostly IN_MEMORY and you are running the test after a few runs to make sure they are already in IN_MEMORY (i.e. blockCacheHit is high or blockCacheMiss is low) (?)
> >
> > Regards,
> > - kiru
> >
> > Kiru Pakkirisamy | webcloudtech.wordpress.com
> >
> > ________________________________
> > From: Vladimir Rodionov <[email protected]>
> > To: "[email protected]" <[email protected]>
> > Sent: Thursday, August 29, 2013 11:11 AM
> > Subject: RE: experiencing high latency for few reads in HBase
> >
> > Usually, either a cluster restart or a major compaction helps improve the locality index.
> > There is an issue in region assignment after table disable/enable in 0.94.x (x < 11) which
> > breaks HDFS locality. Fixed in 0.94.11.
> >
> > You can write your own routine to manually "localize" a particular table using the public HBase Client API.
> >
> > But this won't help you to stay within 1 sec anyway.
> >
> > Best regards,
> > Vladimir Rodionov
> > Principal Platform Engineer
> > Carrier IQ, www.carrieriq.com
> > e-mail: [email protected]
> >
> > ________________________________________
> > From: Saurabh Yahoo [[email protected]]
> > Sent: Thursday, August 29, 2013 10:52 AM
> > To: [email protected]
> > Cc: [email protected]
> > Subject: Re: experiencing high latency for few reads in HBase
> >
> > Thanks Vlad.
> >
> > Quick question. I notice hdfsBlocksLocalityIndex is around 50 in all region servers.
> >
> > Could that be a problem? If it is, how to solve that? We already ran the major compaction after ingesting the data.
> >
> > Thanks,
> > Saurabh.
> >
> > On Aug 29, 2013, at 12:17 PM, Vladimir Rodionov <[email protected]> wrote:
> >
> >> Yes. HBase won't guarantee strict sub-second latency.
> >>
> >> Best regards,
> >> Vladimir Rodionov
> >> Principal Platform Engineer
> >> Carrier IQ, www.carrieriq.com
> >> e-mail: [email protected]
> >>
> >> ________________________________________
> >> From: Saurabh Yahoo [[email protected]]
> >> Sent: Thursday, August 29, 2013 2:49 AM
> >> To: [email protected]
> >> Cc: [email protected]
> >> Subject: Re: experiencing high latency for few reads in HBase
> >>
> >> Hi Vlad,
> >>
> >> We do have a strict latency requirement as it is financial data requiring direct access from clients.
> >>
> >> Are you saying that it is not possible to achieve sub-second latency using HBase (because it is based on Java)?
> >>
> >> On Aug 28, 2013, at 8:10 PM, Vladimir Rodionov <[email protected]> wrote:
> >>
> >>> Increasing Java heap size will make latency worse, actually.
> >>> You can't guarantee 1 sec max latency if you run a Java app (unless your heap size is much less than 1GB).
> >>> I have never heard about a strict maximum latency limit. Usually it's 99%, 99.9% or 99.99% query percentiles.
> >>>
> >>> You can greatly reduce your 99.xxx% percentile latency by storing your data in 2 replicas on two different region servers.
> >>> Issue two read operations to those two region servers in parallel and take the first response. Probability theory states that the probability
> >>> of two independent events (slow requests) both occurring is the product of the events' probabilities.
> >>>
> >>> Best regards,
> >>> Vladimir Rodionov
> >>> Principal Platform Engineer
> >>> Carrier IQ, www.carrieriq.com
> >>> e-mail: [email protected]
> >>>
> >>> ________________________________________
> >>> From: Saurabh Yahoo [[email protected]]
> >>> Sent: Wednesday, August 28, 2013 4:18 PM
> >>> To: [email protected]
> >>> Subject: Re: experiencing high latency for few reads in HBase
> >>>
> >>> Thanks Kiru,
> >>>
> >>> Scan is not an option for our use cases. Our reads are pretty random.
> >>>
> >>> Any other suggestion to bring down the latency?
> >>>
> >>> Thanks,
> >>> Saurabh.
> >>>
> >>> On Aug 28, 2013, at 7:01 PM, Kiru Pakkirisamy <[email protected]> wrote:
> >>>
> >>>> Saurabh, we are able to 600K rowxcolumns in 400 msec. We have put what was a 40 million row table as 400K rows and columns. We Get about 100 of the rows from this 400K, do quite a bit of calculations in the coprocessor (almost a group-order by) and return in this time.
> >>>> Maybe you should consider replacing the MultiGets with Scan with Filter. I like the FuzzyRowFilter even though you might need to match with the exact key. It works only with fixed-length keys.
> >>>> (I do have an issue right now, it is not scaling to multiple clients.)
> >>>>
> >>>> Regards,
> >>>> - kiru
> >>>>
> >>>> Kiru Pakkirisamy | webcloudtech.wordpress.com
> >>>>
> >>>> ________________________________
> >>>> From: Saurabh Yahoo <[email protected]>
> >>>> To: "[email protected]" <[email protected]>
> >>>> Cc: "[email protected]" <[email protected]>
> >>>> Sent: Wednesday, August 28, 2013 3:20 PM
> >>>> Subject: Re: experiencing high latency for few reads in HBase
> >>>>
> >>>> Thanks Kiru. We need less than 1 sec latency.
> >>>>
> >>>> We are using both multiGet and get.
> >>>>
> >>>> We have three concurrent clients running 10 threads each (that makes 30 concurrent clients in total).
> >>>>
> >>>> Thanks,
> >>>> Saurabh.
> >>>>
> >>>> On Aug 28, 2013, at 4:30 PM, Kiru Pakkirisamy <[email protected]> wrote:
> >>>>
> >>>>> Right, 4 sec is good.
> >>>>> @Saurabh - so your read is getting 20 out of 25 million rows? Is this a Get or a Scan?
> >>>>> BTW, in this stress test how many concurrent clients do you have?
> >>>>>
> >>>>> Regards,
> >>>>> - kiru
> >>>>>
> >>>>> ________________________________
> >>>>> From: Vladimir Rodionov <[email protected]>
> >>>>> To: "[email protected]" <[email protected]>
> >>>>> Sent: Wednesday, August 28, 2013 12:15 PM
> >>>>> Subject: RE: experiencing high latency for few reads in HBase
> >>>>>
> >>>>> 1. 4 sec max latency is not that bad taking into account a 12GB heap. It can be much larger. What is your SLA?
> >>>>> 2. Block evictions are the result of a poor cache hit rate and the root cause of periodic stop-the-world GC pauses (the max
> >>>>> latencies you have been observing in the test).
> >>>>> 3. Block cache consists of 3 parts (25% young generation, 50% - tenured, 25% - permanent). The permanent part is for CFs with
> >>>>> IN_MEMORY = true (you can specify this when you create a CF). A block is first stored in 'young gen' space, then gets promoted to 'tenured gen' space
> >>>>> (or gets evicted).
> >>>>> Maybe your 'perm gen' space is underutilized? This is exactly 25% of 4GB (1GB). Although HBase LruBlockCache should use all the space allocated for block cache -
> >>>>> there is no guarantee (as usual). If you don't have in_memory column families you may decrease
> >>>>>
> >>>>> Best regards,
> >>>>> Vladimir Rodionov
> >>>>> Principal Platform Engineer
> >>>>> Carrier IQ, www.carrieriq.com
> >>>>> e-mail: [email protected]
> >>>>>
> >>>>> ________________________________________
> >>>>> From: Saurabh Yahoo [[email protected]]
> >>>>> Sent: Wednesday, August 28, 2013 5:10 AM
> >>>>> To: [email protected]
> >>>>> Subject: experiencing high latency for few reads in HBase
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> We are running a stress test in our 5 node cluster and we are getting the expected mean latency of 10ms. But we are seeing around 20 reads out of 25 million reads having latency of more than 4 seconds. Can anyone provide insight into what we can do to meet a sub-second SLA for each and every read?
> >>>>>
> >>>>> We observe the following things -
> >>>>>
> >>>>> 1. Reads are evenly distributed among 5 nodes. CPUs remain under 5% utilized.
> >>>>>
> >>>>> 2. We have a 4gb block cache (30% block cache out of 12gb) set up. 3gb of block cache got filled up but around 1gb remained free. There are a large number of cache evictions.
> >>>>>
> >>>>> Questions to experts -
> >>>>>
> >>>>> 1. If there is still 1gb of free block cache available, why is hbase evicting blocks from the cache?
> >>>>>
> >>>>> 4. We are seeing memory go up to 10gb three times before dropping sharply to 5gb.
> >>>>>
> >>>>> Any help is highly appreciated,
> >>>>>
> >>>>> Thanks,
> >>>>> Saurabh.

--
Adrien Mogenet
http://www.borntosegfault.com
