Yes, it's all about the block cache. IN_MEMORY is a useful tool as well, but be careful: you can choke out other regions/tables. (A sketch of the relevant settings follows below the quoted thread.)
-ryan

On Tue, Sep 14, 2010 at 12:07 AM, Abhijit Pol <[email protected]> wrote:
> @Ryan
> when you mentioned caching and lots of RAM, did you mean giving it to the
> block cache or to the memstore?
>
> we have a table with two column families, A and B. For column family A we
> have set "IN_MEMORY" => true, and we have multiple 64GB RAM machines on
> which we would like to hold this column family in RAM for better read
> performance. A gets some writes and lots of read requests.
>
> our understanding is that the memstore holds writes (and serves reads as
> long as the keys are still in the memstore). once the memstore is flushed,
> it is the block cache that gives us in-memory access. If so, should the
> block cache get a bigger share of the heap than the memstore?
>
> Thanks,
> Abhi
>
>
> On Thu, Aug 19, 2010 at 9:20 PM, Ryan Rawson <[email protected]> wrote:
>
>> Due to DFS client issues, things are not quite as good as they should be.
>> They are being worked on, so it will get resolved in time.
>>
>> In the meantime, the key to fast access is caching... RAM, RAM, RAM.
>>
>> -ryan
>>
>> On Thu, Aug 19, 2010 at 10:15 AM, Abhijit Pol <[email protected]>
>> wrote:
>> > We are using the HBase 0.20.5 drop with the latest Cloudera Hadoop
>> > distribution.
>> >
>> > - We are hitting a 3-node HBase cluster from a client that has 10
>> > threads, each with a thread-local copy of the HTable client object and
>> > an established connection to the server.
>> > - Each of the 10 threads issues 10,000 read requests for keys randomly
>> > selected from a pool of 1,000 keys. All keys are present in HBase, and
>> > the table is pinned in memory (to make sure we don't have any disk
>> > seeks).
>> > - If we run this test with 10 threads, the average latency seen by the
>> > client is 8ms (excluding the initial setup time for the 10
>> > connections). But if we increase the number of threads to 100, 250,
>> > and 500, we see increasing latencies of 26ms, 51ms, and 90ms.
>> > - We have enabled HBase metrics on the region servers (RS), and
>> > "get_avg_time" on all RS stays between 5-15ms in all tests,
>> > consistently.
>> >
>> > Is this expected? Any tips for getting consistent performance below
>> > 20ms?
>> >
>> >
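For reference, a minimal sketch of the settings the thread is discussing: marking a column family IN_MEMORY and splitting region-server heap between the block cache and the memstores. The table name, the use of createTable (rather than altering the existing table), and the heap fractions in the comments are illustrative assumptions, not values from the thread; check the hbase-default.xml of your release for the exact keys and defaults.

```java
// Sketch only: "mytable" and the heap fractions below are illustrative,
// not details from the thread. Uses the 0.20-era HBase Java client API.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class InMemoryFamilySetup {
  public static void main(String[] args) throws Exception {
    // The server-side heap split lives in hbase-site.xml on the region
    // servers, not in client code. Reads of flushed data are served from the
    // block cache, so a read-heavy workload usually wants the larger share
    // there, e.g. (fractions of region-server heap, illustrative only):
    //   hfile.block.cache.size                        0.4
    //   hbase.regionserver.global.memstore.upperLimit 0.35
    // The two fractions plus working overhead must stay well below 1.0.

    HBaseConfiguration conf = new HBaseConfiguration();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("mytable");

    // IN_MEMORY is a priority hint to the block cache, not a hard pin:
    // under memory pressure these blocks can still be evicted, and an
    // oversized in-memory family can crowd out blocks belonging to other
    // tables/regions, which is the caution given at the top of this reply.
    HColumnDescriptor familyA = new HColumnDescriptor("A");
    familyA.setInMemory(true);
    familyA.setBlockCacheEnabled(true);
    desc.addFamily(familyA);

    HColumnDescriptor familyB = new HColumnDescriptor("B");
    desc.addFamily(familyB);

    admin.createTable(desc);
  }
}
```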

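And a rough sketch of the kind of read test described in the quoted thread: N client threads, each with its own HTable instance, issuing gets for keys drawn at random from a small fixed pool and timing them client-side. Table, family, and key names are hypothetical; this is not the original test harness.

```java
// Minimal sketch of the multi-threaded random-read test described above.
// "mytable", family "A", and the "key-<n>" row keys are assumptions.
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class RandomGetBench {
  static final int THREADS = 10;            // vary: 10, 100, 250, 500
  static final int GETS_PER_THREAD = 10000;
  static final int KEY_POOL = 1000;

  public static void main(String[] args) throws Exception {
    final HBaseConfiguration conf = new HBaseConfiguration();
    final AtomicLong totalNanos = new AtomicLong();

    Thread[] workers = new Thread[THREADS];
    for (int t = 0; t < THREADS; t++) {
      workers[t] = new Thread(new Runnable() {
        public void run() {
          try {
            // One HTable per thread: HTable is not safe for concurrent use.
            HTable table = new HTable(conf, "mytable");
            Random rnd = new Random();
            for (int i = 0; i < GETS_PER_THREAD; i++) {
              byte[] row = Bytes.toBytes("key-" + rnd.nextInt(KEY_POOL));
              Get get = new Get(row);
              get.addFamily(Bytes.toBytes("A"));
              long start = System.nanoTime();
              Result r = table.get(get);
              totalNanos.addAndGet(System.nanoTime() - start);
              if (r.isEmpty()) throw new IllegalStateException("missing row");
            }
          } catch (Exception e) {
            throw new RuntimeException(e);
          }
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) w.join();

    double avgMs = totalNanos.get() / 1e6 / (THREADS * (long) GETS_PER_THREAD);
    System.out.println("avg client-side get latency: " + avgMs + " ms");
  }
}
```

Varying THREADS across 10, 100, 250, and 500 reproduces the runs described above, and the printed client-side average can then be compared against the "get_avg_time" metric reported by the region servers.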