Trying this now. J-D
On Thu, Dec 15, 2011 at 11:35 AM, Lars <[email protected]> wrote:
> Do you see the same slowdown with the default 64k block size?
>
> Lars <[email protected]> wrote:
>
>> I'll be busy today... I'll double-check my scanning-related changes as soon
>> as I can.
>>
>> Jean-Daniel Cryans <[email protected]> wrote:
>>
>>> Yes and yes.
>>>
>>> J-D
>>> On Dec 14, 2011 5:52 PM, "Matt Corgan" <[email protected]> wrote:
>>>
>>>> Regions are major compacted and have empty memstores, so no merging of
>>>> stores when reading?
>>>>
>>>> 2011/12/14 Jean-Daniel Cryans <[email protected]>
>>>>
>>>> > Yes, sorry, 1.1M.
>>>> >
>>>> > This is PE, the table is set to a block size of 4KB, and block caching
>>>> > is disabled. Nothing else special in there.
>>>> >
>>>> > J-D
>>>> >
>>>> > 2011/12/14 <[email protected]>:
>>>> > > Thanks for the info, J-D.
>>>> > >
>>>> > > I guess the 1.1 below is in millions.
>>>> > >
>>>> > > Can you tell us more about your tables - bloom filters, etc.?
>>>> > >
>>>> > > On Dec 14, 2011, at 5:26 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>> > >
>>>> > >> Hey guys,
>>>> > >>
>>>> > >> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
>>>> > >> regarding reads. The numbers are kinda irrelevant, but the differences
>>>> > >> are. BTW, this is on CDH3u3 with random reads.
>>>> > >>
>>>> > >> In 0.90.5, scanning 50M rows that are in the OS cache, I go up to about
>>>> > >> 1.7M rows scanned per second.
>>>> > >>
>>>> > >> In 0.92.0, scanning those same rows (meaning that I didn't run
>>>> > >> compactions after migrating, so it's picking up the same data from the OS
>>>> > >> cache), I scan about 1.1 rows per second.
>>>> > >>
>>>> > >> 0.92 is 50% slower when scanning.
>>>> > >>
>>>> > >> In 0.90.5, random reading 50M rows that are OS-cached, I can do about
>>>> > >> 200k reads per second.
>>>> > >>
>>>> > >> In 0.92.0, again with those same rows, I can go up to 260k per second.
>>>> > >>
>>>> > >> 0.92 is 30% faster when random reading.
>>>> > >>
>>>> > >> I've been playing with that data set for a while, and the numbers in
>>>> > >> 0.92.0 when using HFileV1 or V2 are pretty much the same, meaning that
>>>> > >> something else changed, or the code that's generic to both did.
>>>> > >>
>>>> > >> I'd like to be able to associate those differences with code changes in
>>>> > >> order to understand what's going on. I would really appreciate it if
>>>> > >> others also took some time to test it out or to think about what could
>>>> > >> cause this.
>>>> > >>
>>>> > >> Thx,
>>>> > >>
>>>> > >> J-D
>>>> >
>>>>
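
[Editor's note: for anyone trying to reproduce the setup described in the thread, below is a minimal sketch of creating a table with a 4KB block size and block caching disabled, then timing a full scan with client-side block caching off. It assumes the 0.92-era HBase Java client API; the "TestTable"/"info" names are PerformanceEvaluation's defaults and are used here only for illustration, not taken from J-D's actual run.]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScanBenchSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();

    // Create the table with a 4KB block size and the block cache disabled,
    // roughly matching the settings mentioned in the thread.
    HBaseAdmin admin = new HBaseAdmin(conf);
    HColumnDescriptor family = new HColumnDescriptor("info");
    family.setBlocksize(4 * 1024);        // default is 64KB
    family.setBlockCacheEnabled(false);   // serve reads from HDFS / OS cache only
    HTableDescriptor desc = new HTableDescriptor("TestTable");
    desc.addFamily(family);
    admin.createTable(desc);

    // Time a full scan, keeping blocks out of the cache on the scan side too.
    HTable table = new HTable(conf, "TestTable");
    Scan scan = new Scan();
    scan.setCaching(1000);      // rows fetched per RPC
    scan.setCacheBlocks(false);
    long rows = 0;
    long start = System.currentTimeMillis();
    ResultScanner scanner = table.getScanner(scan);
    for (Result r : scanner) {
      rows++;
    }
    scanner.close();
    long elapsed = System.currentTimeMillis() - start;
    System.out.println(rows + " rows in " + elapsed + " ms = "
        + (rows * 1000L / Math.max(elapsed, 1)) + " rows/sec");
    table.close();
  }
}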
