Thanks for the info, J-D. 

I guess the 1.1 below is in millions. 

Can you tell us more about your tables: bloom filters, etc.?
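
For reference, here is a quick sketch of the relative differences implied by the quoted numbers. It assumes the 1.1 figure really is 1.1M rows per second (my guess above, not confirmed), and the helper name pct_change is just for illustration:

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) * 100 / old


# Scan throughput in rows per second: 0.90 vs 0.92.
# ASSUMPTION: "1.1" in the quoted mail means 1.1M rows per second.
scan_change = pct_change(1.7e6, 1.1e6)

# Random-read throughput in reads per second: 0.90 vs 0.92.
read_change = pct_change(200_000, 260_000)

print(f"scan: {scan_change:.1f}%")         # scan: -35.3%
print(f"random read: {read_change:.1f}%")  # random read: 30.0%
```

Measured against 0.90, the scan drop would be closer to 35% than 50% under that assumption; the 30% gain for random reads matches the quoted figure.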



On Dec 14, 2011, at 5:26 PM, Jean-Daniel Cryans <[email protected]> wrote:

> Hey guys,
> 
> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
> regarding reads. The numbers are kinda irrelevant but the differences
> are. BTW this is on CDH3u3 with random reads.
> 
> In 0.90.0, scanning 50M rows that are in the OS cache I go up to about
> 1.7M rows scanned per second.
> 
> In 0.92.0, scanning those same rows (meaning that I didn't run
> compactions after migrating so it's picking the same data from the OS
> cache), I scan about 1.1 rows per second.
> 
> 0.92 is 50% slower when scanning.
> 
> In 0.90.0 random reading 50M rows that are OS cached I can do about
> 200k reads per second.
> 
> In 0.92.0, again with those same rows, I can go up to 260k per second.
> 
> 0.92 is 30% faster when random reading.
> 
> I've been playing with that data set for a while and the numbers in
> 0.92.0 when using HFileV1 or V2 are pretty much the same, meaning that
> something else changed or the code that's generic to both did.
> 
> 
> I'd like to be able to associate those differences to code changes in
> order to understand what's going on. I would really appreciate if
> others also took some time to test it out or to think about what could
> cause this.
> 
> Thx,
> 
> J-D
