Flushing, at least when I try it now, long after I stopped writing, doesn't seem to have any effect.
In my log I see this:

2011-05-03 08:57:55,384 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.39 GB, free=897.87 MB, max=4.27 GB, blocks=54637, accesses=89411811, hits=75769916, hitRatio=84.74%%, cachingAccesses=83656318, cachingHits=75714473, cachingHitsRatio=90.50%%, evictions=1135, evicted=7887205, evictedPerRun=6949.0791015625

and every 30 seconds or so something like this:

2011-05-03 08:58:07,900 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 436.92 MB of total=3.63 GB
2011-05-03 08:58:07,947 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=436.95 MB, total=3.2 GB, single=931.65 MB, multi=2.68 GB, memory=3.69 KB

Now, if the entire working set I'm reading is 100MB in size, why would it have to evict 436MB just to get it filled back in 30 seconds?

Also, what is a good value for hfile.block.cache.size? I have it at .35 now, but with 12.5GB of RAM available for the region servers it seems I should be able to set it much higher.

-eran

On Mon, May 2, 2011 at 22:14, Jean-Daniel Cryans <[email protected]> wrote:
> It might be the slow memstore issue... after inserting your dataset
> issue a flush on your table in the shell, wait a few seconds, then
> start reading. Someone else on the mailing list recently saw this type
> of issue.
>
> Regarding the block caching logging, here's what I see in my logs:
>
> 2011-05-02 10:05:38,718 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction started; Attempting to free 303.77 MB of total=2.52 GB
> 2011-05-02 10:05:38,751 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction completed; freed=303.8 MB, total=2.22 GB, single=755.67 MB,
> multi=1.76 GB, memory=0 KB
> 2011-05-02 10:07:18,737 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=2.27
> GB, free=718.03 MB, max=2.97 GB, blocks=36450, accesses=1056364760,
> hits=939002423, hitRatio=88.88%%, cachingAccesses=967172747,
> cachingHits=932095548, cachingHitsRatio=96.37%%, evictions=7801,
> evicted=35040749, evictedPerRun=4491.8276367187
>
> Keep in mind that currently we don't have like a moving average for
> the percentages so at some point those numbers are set in stone...
>
> The handler config is only good if you are using a ton of clients,
> which doesn't seem to be the case (at least now).
>
> J-D
>
> On Wed, Apr 27, 2011 at 6:42 AM, Eran Kutner <eran@> wrote:
>> I must say the more I play with it the more baffled I am with the
>> results. I ran the read test again today after not touching the
>> cluster for a couple of days and now I'm getting the same high read
>> numbers (10-11K reads/sec per server with some servers reaching even
>> 15K r/s) if I read 1, 10, 100 or even 1000 rows from every key space,
>> however 5000 rows yielded a read rate of only 3K rows per second, even
>> after a very long time. Just to be clear I'm always random reading a
>> single row in every request, the number of rows I'm talking about are
>> the ranges of rows within each key space that I'm randomly selecting
>> my keys from.
>>
>> St.Ack - to answer your questions:
>>
>> Writing from two machines increased the total number of writes per
>> second by about 10%, maybe less. Reads showed 15-20% increase when run
>> from 2 machines.
>>
>> I already had most of the performance tuning recommendations
>> implemented (garbage collection, using the new memory slabs feature,
>> using LZO) when I ran my previous test, the only config I didn't have
>> is "hbase.regionserver.handler.count", I changed it to 128, or 16
>> threads per core, which seems like a reasonable number and tried
>> inserting to the same key ranges as before, it didn't seem to have
>> made any difference in the total performance.
>>
>> My keys are about 15 bytes long.
>>
>> As for caching I can't find those cache hit ratio numbers in my logs,
>> do they require a special parameter to enable them? That said, my
>> calculations show that the entire data set I'm randomly reading should
>> easily fit in the servers' memory. Each row has 15 bytes of key + 128
>> bytes of data + overhead - let's say 200 bytes. If I'm reading 5000
>> rows from each key space and have a total of 100 key spaces that's
>> 100*5000*200=100000000B=100MB. This is spread across 5 servers with
>> 16GB of RAM, out of which 12.5GB are allocated to the region servers.
>>
>> -eran
>
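
The arithmetic in this thread can be checked with a short script. This is only a back-of-the-envelope sketch, not anything taken from HBase: the function names are made up for illustration, the "GB"/"MB" figures in the logs are assumed to be binary units, and the 0.85 / 0.75 eviction factors are an assumption inferred from the logged totals (0.85 * 4.27 GB is about 3.63 GB, 0.75 * 4.27 GB is about 3.2 GB); the thread itself does not confirm them.

```python
# Back-of-the-envelope checks for the figures quoted in this thread.
# Function names are illustrative only; nothing here calls HBase.

GIB = 1024 ** 3


def working_set_bytes(key_spaces, rows_per_key_space, bytes_per_row):
    """Raw size of the rows being read, ignoring block granularity."""
    return key_spaces * rows_per_key_space * bytes_per_row


def eviction_window_mib(cache_max_gib, acceptable=0.85, minimum=0.75):
    """Return (start, target, freed) in MiB for one eviction run.

    ASSUMPTION: eviction starts once the cache reaches `acceptable` * max
    and frees down to `minimum` * max. The 0.85 / 0.75 defaults are
    inferred from the log lines in the thread, not stated in it.
    """
    start = cache_max_gib * acceptable * 1024
    target = cache_max_gib * minimum * 1024
    return start, target, start - target


def average_block_kib(cache_total_gib, block_count):
    """Average cached block size implied by an 'LRU Stats' line."""
    return cache_total_gib * GIB / block_count / 1024


# Eran's estimate: 100 key spaces * 5000 rows * ~200 bytes per row.
print("working set bytes:", working_set_bytes(100, 5000, 200))

# Eran's cache: max=4.27 GB. Compare with "Attempting to free 436.92 MB
# of total=3.63 GB" and "total=3.2 GB" after eviction.
start, target, freed = eviction_window_mib(4.27)
print("eviction starts near %.0f MiB, frees about %.0f MiB" % (start, freed))

# Eran's stats line: total=3.39 GB held in blocks=54637.
print("average block size: %.1f KiB" % average_block_kib(3.39, 54637))
```

If these assumptions hold, the amount freed per run is a fixed fraction of the cache maximum rather than a function of the working set, and the cache is holding blocks of roughly 65 KB each. The 100MB estimate counts 200-byte rows, while the cache accounts for whole blocks, which may be part of why the two numbers look so far apart.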
