Have you added the following when passing the Scan to your job?

scan.setCacheBlocks(false);
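For context, a minimal sketch of a scan-based MR job setup using the standard HBase client API (the table name, mapper class, and output types here are placeholders, and this obviously needs the HBase jars and a cluster to run):

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;

Scan scan = new Scan();
// setCaching(int) controls how many ROWS each scanner RPC fetches.
// It takes an int and has nothing to do with the block cache.
scan.setCaching(500);
// setCacheBlocks(boolean) is the call that keeps scanned blocks out of
// the region server's block cache.
scan.setCacheBlocks(false);

Job job = Job.getInstance();
// "mytable" and MyMapper are placeholders for your table and mapper.
TableMapReduceUtil.initTableMapperJob(
    "mytable", scan, MyMapper.class,
    NullWritable.class, NullWritable.class, job);
```

Note the two methods are easy to confuse: setCaching is about scanner batching (RPC chattiness), setCacheBlocks is about block-cache pollution.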
BTW the image didn't go through. Consider putting it on a third-party site.

On Mon, Jun 2, 2014 at 12:55 PM, Matt K <[email protected]> wrote:

> Hi all,
>
> We are running a number of Map/Reduce jobs on top of HBase. We are not
> using HBase for any of its realtime capabilities, only for
> batch-processing. So we aren't doing lookups, just scans.
>
> Each one of our jobs has *scan.setCaching(false)* to turn off
> block-caching, since each block will only be accessed once.
>
> We recently started using Cloudera Manager, and I'm seeing something that
> doesn't add up. See image below. It's clear from the graphs that Block
> Cache is being used currently, and blocks are being cached and evicted.
>
> We do have *hfile.block.cache.size* set to 0.4 (default), but my
> understanding is that the jobs setting scan.setCaching(false) should
> override this. Since it's set in every job, there should be no blocks being
> cached.
>
> Can anyone help me understand what we're seeing?
>
> Thanks,
>
> -Matt
>
> [image: Inline image 1]
