Like I said, I'm using local reads (HDFS-2246) and the data is _already in the OS cache_.
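[Editor's note: the "local reads" referred to here are HDFS-2246 short-circuit reads. A minimal hdfs-site.xml sketch of how they were enabled on CDH3-era clusters is below; the property names are the HDFS-2246 ones, and the "hbase" username is a placeholder assumption, not taken from the thread.]

```xml
<!-- Enable short-circuit local reads (HDFS-2246): the DFS client reads
     block files directly from local disk instead of going through the
     DataNode, so repeated reads are served from the OS page cache. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<!-- User permitted to open block files directly; "hbase" is a
     placeholder for whatever user the RegionServers run as. -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>
```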
J-D

2011/12/15 Vladimir Rodionov <[email protected]>:
> 200K random reads is way, way above what we see in production.
>
> 1.1M row scan - as well. 10-20K per sec max when you run 'count' from the
> HBase shell.
>
> Is there any magic recipe I am not aware of yet?
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
> From: [email protected] [[email protected]] On Behalf Of
> Jean-Daniel Cryans [[email protected]]
> Sent: Wednesday, December 14, 2011 7:20 PM
> To: [email protected]
> Subject: Re: Early comparisons between 0.90 and 0.92
>
> Yes and yes.
>
> J-D
>
> On Dec 14, 2011 5:52 PM, "Matt Corgan" <[email protected]> wrote:
>
>> Regions are major compacted and have empty memstores, so no merging of
>> stores when reading?
>>
>> 2011/12/14 Jean-Daniel Cryans <[email protected]>
>>
>> > Yes, sorry, 1.1M.
>> >
>> > This is PE, the table is set to a block size of 4KB and block caching
>> > is disabled. Nothing else special in there.
>> >
>> > J-D
>> >
>> > 2011/12/14 <[email protected]>:
>> > > Thanks for the info, J-D.
>> > >
>> > > I guess the 1.1 below is in millions.
>> > >
>> > > Can you tell us more about your tables - bloom filters, etc.?
>> > >
>> > > On Dec 14, 2011, 5:26 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> > >
>> > >> Hey guys,
>> > >>
>> > >> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
>> > >> regarding reads. The numbers are kind of irrelevant but the
>> > >> differences are. BTW this is on CDH3u3 with random reads.
>> > >>
>> > >> In 0.90.5, scanning 50M rows that are in the OS cache, I get up to
>> > >> about 1.7M rows scanned per second.
>> > >>
>> > >> In 0.92.0, scanning those same rows (meaning that I didn't run
>> > >> compactions after migrating, so it's picking up the same data from
>> > >> the OS cache), I scan about 1.1 rows per second.
>> > >>
>> > >> 0.92 is 50% slower when scanning.
>> > >>
>> > >> In 0.90.5, random reading 50M rows that are OS cached, I can do
>> > >> about 200k reads per second.
>> > >>
>> > >> In 0.92.0, again with those same rows, I can go up to 260k per
>> > >> second.
>> > >>
>> > >> 0.92 is 30% faster when random reading.
>> > >>
>> > >> I've been playing with that data set for a while, and the numbers
>> > >> in 0.92.0 when using HFile V1 or V2 are pretty much the same,
>> > >> meaning that something else changed, or code that's generic to
>> > >> both did.
>> > >>
>> > >> I'd like to be able to associate those differences with code
>> > >> changes in order to understand what's going on. I would really
>> > >> appreciate it if others also took some time to test it out or to
>> > >> think about what could cause this.
>> > >>
>> > >> Thx,
>> > >>
>> > >> J-D
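[Editor's note: the "PE" J-D mentions is HBase's bundled PerformanceEvaluation benchmark. A sketch of how such runs are typically launched follows; the client count of 10 is an arbitrary example and the exact options differ between 0.90 and 0.92, so treat this as illustrative usage rather than the thread's actual invocation.]

```
# Load a data set, then benchmark scans and random reads against it.
# Usage: hbase org.apache.hadoop.hbase.PerformanceEvaluation <command> <nclients>
hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 10
hbase org.apache.hadoop.hbase.PerformanceEvaluation scan 10
hbase org.apache.hadoop.hbase.PerformanceEvaluation randomRead 10
```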