Re: Early comparisons between 0.90 and 0.92

Jean-Daniel Cryans Thu, 15 Dec 2011 12:18:48 -0800

Like I said, I'm using local reads (HDFS-2246) and the data is
_already in the OS cache_.


J-D

2011/12/15 Vladimir Rodionov <[email protected]>:
> 200K random reads is way, way above what we see in production.
>
> Same for the 1.1M rows/sec scan: we get 10-20K rows per sec at most when
> running 'count' from the HBase shell.
>
> Is there any magic recipe I am not aware of yet?
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
> From: [email protected] [[email protected]] On Behalf Of Jean-Daniel Cryans 
> [[email protected]]
> Sent: Wednesday, December 14, 2011 7:20 PM
> To: [email protected]
> Subject: Re: Early comparisons between 0.90 and 0.92
>
> Yes and yes.
>
> J-D
> On Dec 14, 2011 5:52 PM, "Matt Corgan" <[email protected]> wrote:
>
>> Regions are major compacted and have empty memstores, so no merging of
>> stores when reading?
>>
>>
>> 2011/12/14 Jean-Daniel Cryans <[email protected]>
>>
>> > Yes sorry 1.1M
>> >
>> > This is PE, the table is set to a block size of 4KB and block caching
>> > is disabled. Nothing else special in there.
>> >
>> > J-D
>> >
>> > 2011/12/14  <[email protected]>:
>> > > Thanks for the info, J-D.
>> > >
>> > > I guess the 1.1 below is in millions.
>> > >
>> > > Can you tell us more about your tables - bloom filters, etc ?
>> > >
>> > >
>> > >
>> > > On Dec 14, 2011, at 5:26 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> > >
>> > >> Hey guys,
>> > >>
>> > >> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
>> > >> regarding reads. The absolute numbers are kinda irrelevant but the
>> > >> differences matter. BTW this is on CDH3u3 with random reads.
>> > >>
>> > >> In 0.90.0, scanning 50M rows that are in the OS cache I go up to about
>> > >> 1.7M rows scanned per second.
>> > >>
>> > >> In 0.92.0, scanning those same rows (meaning that I didn't run
>> > >> compactions after migrating so it's picking the same data from the OS
>> > >> cache), I scan about 1.1 rows per second.
>> > >>
>> > >> 0.92 is roughly 35% slower when scanning (1.1M vs. 1.7M rows per second).
>> > >>
>> > >> In 0.90.0 random reading 50M rows that are OS cached I can do about
>> > >> 200k reads per second.
>> > >>
>> > >> In 0.92.0, again with those same rows, I can go up to 260k per second.
>> > >>
>> > >> 0.92 is 30% faster when random reading.
>> > >>
>> > >> I've been playing with that data set for a while and the numbers in
>> > >> 0.92.0 when using HFileV1 or V2 are pretty much the same, meaning that
>> > >> something else changed or the code that's generic to both did.
>> > >>
>> > >>
>> > >> I'd like to be able to associate those differences to code changes in
>> > >> order to understand what's going on. I would really appreciate it if
>> > >> others also took some time to test it out or to think about what could
>> > >> cause this.
>> > >>
>> > >> Thx,
>> > >>
>> > >> J-D
>> >
>>
>
> Confidentiality Notice:  The information contained in this message, including 
> any attachments hereto, may be confidential and is intended to be read only 
> by the individual or entity to whom this message is addressed. If the reader 
> of this message is not the intended recipient or an agent or designee of the 
> intended recipient, please note that any review, use, disclosure or 
> distribution of this message or its attachments, in any form, is strictly 
> prohibited.  If you have received this message in error, please immediately 
> notify the sender and/or [email protected] and delete or destroy 
> any copy of this message and its attachments.
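
For anyone who wants to sanity-check the scan and random-read percentages
discussed in this thread, here is a minimal sketch. It only reuses the
throughput figures quoted above (1.7M vs. 1.1M rows/sec, 200k vs. 260k
reads/sec); the helper name relative_change is made up for illustration and is
not part of PE or HBase. It shows that "how much slower" depends on whether you
compare throughput or time per row.

```python
def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Throughput figures quoted in the thread (operations per second).
scan_090, scan_092 = 1.7e6, 1.1e6   # rows scanned per second
read_090, read_092 = 200e3, 260e3   # random reads per second

# Change in scan throughput going from 0.90 to 0.92.
scan_throughput = relative_change(scan_090, scan_092)

# Change in time spent per scanned row (the inverse of throughput).
scan_time = relative_change(1 / scan_090, 1 / scan_092)

# Change in random-read throughput going from 0.90 to 0.92.
read_throughput = relative_change(read_090, read_092)

print(f"scan throughput change:        {scan_throughput:+.1f}%")
print(f"scan time-per-row change:      {scan_time:+.1f}%")
print(f"random-read throughput change: {read_throughput:+.1f}%")
```

With these inputs the scan throughput drops by about 35% while the time per
row grows by about 55%, and random-read throughput rises by 30%.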
