And it will probably evict everyone else that was already present. Hello latency.

J-D

On Thu, Nov 17, 2011 at 2:08 PM, lars hofhansl <[email protected]> wrote:
> Hi Sam,
> The idea is that the entire result of the scan will not fit into the cache if
> the scan scans a "reasonable" number of cells, and hence it unlikely that
> another scan will hit cached blocks before they get evicted, especially when
> using an LRU cache.
>
> -- Lars
>
>
> ----- Original Message -----
> From: Sam Seigal <[email protected]>
> To: [email protected]
> Cc:
> Sent: Thursday, November 17, 2011 1:44 PM
> Subject: block caching
>
> I have a table that I only use for generating indexes. It rarely will
> have random reads, but will have M/R jobs running against it
> constantly for generating indexes. Even the index table, random reads
> will be rare. It will mostly be used for scanning blocks of data.
>
>
> According to HBase The Definitive Guide
>
> "As HBase reads entire blocks of data for efficient IO usage it
> retains these blocks in an in-memory cache, so that subsequent reads
> do not need any disk operation. The default of true enables the block
> cache for every read operation. But if your use-case only ever has
> sequential reads on a particular column family it is advisable to
> disable it from polluting the block cache by setting the block cache
> enabled flag to false. "
>
> "There are other options you can use to influence how the block cache
> is used, for example during a scan operation. This is useful during
> full table scans so that you do not cause a major churn on the cache.
> See the section called “Configuration” for more information about this
> feature."
>
> "Scan instances can be set to use the block cache in the region server
> via the setCacheBlocks() method. For scans used with MapReduce jobs,
> this should be false. For frequently accessed rows, it is advisable to
> use the block cache."
>
>
> What is the reasoning behind the above ? Why is using a block cache
> for M/R jobs not a good idea if it is doing full table scans ?
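
To make the eviction argument in this thread concrete, here is a small simulation. It is only a sketch and does not use HBase at all: the LruCache class, the hotBlocksSurviving method, and the block counts are made up for illustration, and the cache is a plain LinkedHashMap in access order standing in for the region server's block cache. The cacheScanBlocks flag loosely plays the role that setCacheBlocks() plays for a real Scan.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BlockCacheChurn {

    /** Tiny LRU cache keyed by block id; a stand-in for a real block cache. */
    static final class LruCache extends LinkedHashMap<String, Boolean> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // accessOrder = true: eldest entry is the least recently used
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
            return size() > capacity; // evict once the cache is over capacity
        }
    }

    /**
     * Warms the cache with "hot" blocks (the random-read working set), runs one
     * sequential scan over scanBlocks distinct blocks, and returns how many hot
     * blocks are still cached afterwards.
     */
    static int hotBlocksSurviving(int capacity, int hotBlocks, int scanBlocks,
                                  boolean cacheScanBlocks) {
        LruCache cache = new LruCache(capacity);
        for (int i = 0; i < hotBlocks; i++) {
            cache.put("hot-" + i, Boolean.TRUE);
        }
        for (int i = 0; i < scanBlocks; i++) {
            // Each scanned block is read exactly once, so caching it buys nothing.
            if (cacheScanBlocks) {
                cache.put("scan-" + i, Boolean.TRUE);
            }
        }
        int surviving = 0;
        for (int i = 0; i < hotBlocks; i++) {
            if (cache.containsKey("hot-" + i)) { // containsKey does not touch LRU order
                surviving++;
            }
        }
        return surviving;
    }

    public static void main(String[] args) {
        // Cache holds 100 blocks, 50 of them hot; the scan reads 10,000 blocks.
        System.out.println(hotBlocksSurviving(100, 50, 10_000, true));  // prints 0
        System.out.println(hotBlocksSurviving(100, 50, 10_000, false)); // prints 50
    }
}
```

With scan blocks cached, the scan pushes out every hot block and leaves only the tail of the scan behind, which the next full scan (starting from the first block again) will not hit either. With caching off for the scan, the hot set is untouched.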
