Thanks for the info, J-D. I guess the 1.1 below is in millions.
Can you tell us more about your tables (bloom filters, etc.)?

On Dec 14, 2011, at 5:26 PM, Jean-Daniel Cryans <[email protected]> wrote:

> Hey guys,
>
> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
> regarding reads. The numbers are kinda irrelevant but the differences
> are. BTW this is on CDH3u3 with random reads.
>
> In 0.90.0, scanning 50M rows that are in the OS cache I go up to about
> 1.7M rows scanned per second.
>
> In 0.92.0, scanning those same rows (meaning that I didn't run
> compactions after migrating so it's picking the same data from the OS
> cache), I scan about 1.1 rows per second.
>
> 0.92 is 50% slower when scanning.
>
> In 0.90.0 random reading 50M rows that are OS cached I can do about
> 200k reads per second.
>
> In 0.92.0, again with those same rows, I can go up to 260k per second.
>
> 0.92 is 30% faster when random reading.
>
> I've been playing with that data set for a while and the numbers in
> 0.92.0 when using HFileV1 or V2 are pretty much the same meaning that
> something else changed or the code that's generic to both did.
>
>
> I'd like to be able to associate those differences to code changes in
> order to understand what's going on. I would really appreciate if
> others also took some time to test it out or to think about what could
> cause this.
>
> Thx,
>
> J-D
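
As a quick sanity check on the percentages quoted above, here is a minimal sketch of the arithmetic. It assumes the 0.92 scan figure is 1.1M rows per second (the message only says "1.1"); the variable and function names are made up for illustration.

```python
def percent_change(old, new):
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0

# Throughput figures taken from the quoted message.
scan_090 = 1_700_000   # rows scanned per second on 0.90
scan_092 = 1_100_000   # ASSUMPTION: "1.1" is read as 1.1M rows per second
read_090 = 200_000     # random reads per second on 0.90
read_092 = 260_000     # random reads per second on 0.92

scan_change = percent_change(scan_090, scan_092)       # 0.92 relative to 0.90
read_change = percent_change(read_090, read_092)       # 0.92 relative to 0.90
scan_090_vs_092 = percent_change(scan_092, scan_090)   # 0.90 relative to 0.92

print(f"scan, 0.92 vs 0.90:        {scan_change:+.1f}%")
print(f"random read, 0.92 vs 0.90: {read_change:+.1f}%")
print(f"scan, 0.90 vs 0.92:        {scan_090_vs_092:+.1f}%")
```

Under that assumption the random-read gain does come out at 30%, while the scan slowdown is about 35% when measured against 0.90. The "50%" figure is closer to how much faster 0.90 is than 0.92 (about 55%).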
