Hi, thanks for the responses.

Ted - when I said "scan.setCaching", I meant "scan.setCacheBlocks(false)".
That's what I get for not copying/pasting directly from code :)
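
For the archives, here's a minimal sketch of the distinction between the
two calls (the table name and mapper below are placeholders, not our
actual code):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.mapreduce.Job;

    public class ScanJobSetup {

        // Placeholder mapper -- stands in for the real map logic.
        static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {}

        public static Job createJob() throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = Job.getInstance(conf, "hbase-scan-job");
            job.setJarByClass(ScanJobSetup.class);

            Scan scan = new Scan();
            // setCaching takes an int: rows fetched per RPC, not a cache toggle.
            scan.setCaching(500);
            // setCacheBlocks(false) is the call that keeps scanned blocks
            // out of the region server's block cache.
            scan.setCacheBlocks(false);

            // "mytable" is a placeholder table name.
            TableMapReduceUtil.initTableMapperJob(
                "mytable", scan, MyMapper.class,
                ImmutableBytesWritable.class, Result.class, job);
            return job;
        }
    }

Note that hfile.block.cache.size is a server-side setting (the fraction of
region server heap reserved for the block cache), so setCacheBlocks(false)
doesn't override it -- it just tells this scan not to populate the cache,
which other readers may still be using.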

I added a link to the graphs here:
https://drive.google.com/file/d/0B3ZQ0nMNMFxCOHZNZVFsWEhCOUU/edit?usp=sharing

Bryan - I believe you're right, but wanted to confirm.

Thanks,
-Matt


On Mon, Jun 2, 2014 at 4:09 PM, Ted Yu <[email protected]> wrote:

> Have you added the following when passing the Scan to your job?
>
> scan.setCacheBlocks(false);
>
> BTW, the image didn't go through.
> Consider putting the image on a third-party site.
>
> On Mon, Jun 2, 2014 at 12:55 PM, Matt K <[email protected]> wrote:
>
> > Hi all,
> >
> > We are running a number of Map/Reduce jobs on top of HBase. We are not
> > using HBase for any of its realtime capabilities, only for
> > batch-processing. So we aren't doing lookups, just scans.
> >
> > Each one of our jobs has *scan.setCaching(false)* to turn off
> > block-caching, since each block will only be accessed once.
> >
> > We recently started using Cloudera Manager, and I’m seeing something that
> > doesn’t add up. See image below. It’s clear from the graphs that Block
> > Cache is being used currently, and blocks are being cached and evicted.
> >
> > We do have *hfile.block.cache.size* set to 0.4 (default), but my
> > understanding is that the jobs setting scan.setCaching(false) should
> > override this. Since it’s set in every job, there should be no blocks
> being
> > cached.
> >
> > Can anyone help me understand what we’re seeing?
> >
> > Thanks,
> >
> > -Matt
> >
> > [image: Inline image 1]
> >
>



-- 
www.calcmachine.com - easy online calculator.
