The numbers themselves are irrelevant for this discussion since I'm comparing two almost identical setups to find out why there's a difference. But since you're asking nicely:
14 slave nodes, 2x E5520, 24GB of RAM (only 1GB given to HBase), 4 SATA 7200rpm disks.

These are the command lines I'm using:

To load:
hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 50

To scan:
hbase org.apache.hadoop.hbase.PerformanceEvaluation scan 50

To read:
hbase org.apache.hadoop.hbase.PerformanceEvaluation randomRead 50

(The table itself is set to a 4KB block size with block caching disabled, as discussed further down; a shell sketch of those settings follows the quoted thread.)

I'm using 35 mappers per machine. For the random read test I assume the clients have to go over the network 13/14 of the time, since with 14 slaves a given region is local to any one client only about 1 time in 14. For the scan, locality should be good, but in any case we don't have a top-of-rack bottleneck.

After the initial loading I major compact. For all the tests the regions remain on the same machines, even across 0.90 and 0.92.

BTW, PerformanceEvaluation (which we call PE) uses 1KB values. The size of a KV is 1.5KB on average according to the HFile tool.

Hope this helps,

J-D

On Thu, Dec 15, 2011 at 12:17 PM, Matt Corgan <[email protected]> wrote:
> 260k random reads per second is a lot... is that on one node? how many
> client threads? and is the client going over the network, is it on the
> datanode, or are you using a specialized test where they're in the same
> process?
>
>
> On Thu, Dec 15, 2011 at 11:35 AM, Lars <[email protected]> wrote:
>
>> Do you see the same slowdown with the default 64k block size?
>>
>> Lars <[email protected]> wrote:
>>
>> >I'll be busy today... I'll double check my scanning related changes as soon as I can.
>> >
>> >Jean-Daniel Cryans <[email protected]> wrote:
>> >
>> >>Yes and yes.
>> >>
>> >>J-D
>> >>On Dec 14, 2011 5:52 PM, "Matt Corgan" <[email protected]> wrote:
>> >>
>> >>> Regions are major compacted and have empty memstores, so no merging of
>> >>> stores when reading?
>> >>>
>> >>>
>> >>> 2011/12/14 Jean-Daniel Cryans <[email protected]>
>> >>>
>> >>> > Yes sorry 1.1M
>> >>> >
>> >>> > This is PE, the table is set to a block size of 4KB and block caching
>> >>> > is disabled. Nothing else special in there.
>> >>> >
>> >>> > J-D
>> >>> >
>> >>> > 2011/12/14 <[email protected]>:
>> >>> > > Thanks for the info, J-D.
>> >>> > >
>> >>> > > I guess the 1.1 below is in millions.
>> >>> > >
>> >>> > > Can you tell us more about your tables - bloom filters, etc?
>> >>> > >
>> >>> > >
>> >>> > > On Dec 14, 2011, at 5:26 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >>> > >
>> >>> > >> Hey guys,
>> >>> > >>
>> >>> > >> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
>> >>> > >> regarding reads. The numbers are kinda irrelevant but the differences
>> >>> > >> are. BTW this is on CDH3u3 with random reads.
>> >>> > >>
>> >>> > >> In 0.90.0, scanning 50M rows that are in the OS cache I go up to about
>> >>> > >> 1.7M rows scanned per second.
>> >>> > >>
>> >>> > >> In 0.92.0, scanning those same rows (meaning that I didn't run
>> >>> > >> compactions after migrating so it's picking the same data from the OS
>> >>> > >> cache), I scan about 1.1 rows per second.
>> >>> > >>
>> >>> > >> 0.92 is 50% slower when scanning.
>> >>> > >>
>> >>> > >> In 0.90.0 random reading 50M rows that are OS cached I can do about
>> >>> > >> 200k reads per second.
>> >>> > >>
>> >>> > >> In 0.92.0, again with those same rows, I can go up to 260k per second.
>> >>> > >>
>> >>> > >> 0.92 is 30% faster when random reading.
>> >>> > >>
>> >>> > >> I've been playing with that data set for a while and the numbers in
>> >>> > >> 0.92.0 when using HFileV1 or V2 are pretty much the same meaning that
>> >>> > >> something else changed or the code that's generic to both did.
>> >>> > >>
>> >>> > >> I'd like to be able to associate those differences to code changes in
>> >>> > >> order to understand what's going on. I would really appreciate if
>> >>> > >> others also took some time to test it out or to think about what could
>> >>> > >> cause this.
>> >>> > >>
>> >>> > >> Thx,
>> >>> > >>
>> >>> > >> J-D
>> >>>
>>
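For anyone trying to reproduce the setup: the table configuration described in the thread (4KB block size, block caching disabled, major compaction after loading) could be applied from the HBase shell roughly as sketched below. This is only a sketch under assumptions: the table and family names ('TestTable' and 'info') are the usual PerformanceEvaluation defaults and are not spelled out in the thread, so confirm them with 'describe' before altering anything, and note that on 0.90/0.92 the table has to be disabled for the schema change.

  # inspect the current schema; family name assumed to be PE's default 'info'
  describe 'TestTable'

  # set a 4KB block size and turn off the block cache for that family
  disable 'TestTable'
  alter 'TestTable', {NAME => 'info', BLOCKSIZE => '4096', BLOCKCACHE => 'false'}
  enable 'TestTable'

  # after the sequentialWrite load, compact each region down to a single store file
  major_compact 'TestTable'

The load, scan and random read runs themselves are the exact PerformanceEvaluation invocations listed at the top of this message.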
