This is definitely a map-increase job. I could try a combiner, but I don't think it would help. My keys are small compared to my values, and the values must be kept separate when they are accumulated in the reducer--they can't be combined into some smaller form, i.e. they are more like bitmaps than word counts. So the only I/O a combiner would save me is the duplication of the (relatively small) keys plus Hadoop's per-pair overhead, which is going to be swamped by the values themselves.
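
For contrast, here is a minimal sketch of the word-count case, where a combiner does pay off because many values collapse into a single partial sum per key. The class and type names are illustrative, not from my actual job:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// A combiner is just a Reducer run on the map side. Summing is safe here
// because addition is associative and commutative, so partial sums are
// still correct when the real reducer adds them up again.
public class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();  // N pairs for this key shrink to 1 pair
        }
        context.write(key, new IntWritable(sum));
    }
}

It would be wired in with job.setCombinerClass(SumCombiner.class). In my case the loop body would have to re-emit every value unchanged, which is exactly why I don't expect a combiner to buy me much.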
On Thu, Sep 29, 2011 at 4:29 PM, Lance Norskog <[email protected]> wrote:

> When in doubt, go straight to the owner of a fact. The operating system is
> what really knows disk I/O.
>
> "my mapper job--which may write multiple <key, value> pairs for each one it
> receives--is writing too many" - ah, a map-increase job :) This is what
> Combiners are for: to keep explosions of data from hitting the network by
> combining on the mapper machine.
>
> On Thu, Sep 29, 2011 at 4:15 PM, W.P. McNeill <[email protected]> wrote:
>
> > I have a problem where certain Hadoop jobs take prohibitively long to
> > run. My hypothesis is that I am generating more I/O than my cluster can
> > handle and I need to substantiate this. I am looking closely at the
> > Map-Reduce framework counters because I think they contain the
> > information I need, but I don't understand what the various File System
> > Counters are telling me. Is there a pointer to a list of exactly what
> > all these counters mean? (So far my online research has only turned up
> > other people asking the same question.)
> >
> > In particular, I suspect that my mapper job--which may write multiple
> > <key, value> pairs for each one it receives--is writing too many, and
> > the values are too large, but I'm not sure how to test this
> > quantitatively.
> >
> > Specific questions:
> >
> > 1. I assume "Map input records" is the total of all <key, value> pairs
> >    coming into the mappers and "Map output records" is the total of all
> >    <key, value> pairs written by the mappers. Is this correct?
> > 2. What is "Map output bytes"? Is this the total number of bytes in all
> >    the <key, value> pairs written by the mappers?
> > 3. How would I calculate a corresponding "Map input bytes"? Why doesn't
> >    that counter exist?
> > 4. What is the relationship among the FILE|HDFS_BYTES_READ|WRITTEN
> >    counters? What exactly do they mean, and how do they relate to the
> >    "Map output bytes" counter?
> > 5. Sometimes the FILE bytes read and written values are an order of
> >    magnitude larger than the corresponding HDFS values, and sometimes
> >    it's the other way around. How do I go about interpreting this?
>
> --
> Lance Norskog
> [email protected]
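
For anyone who finds this thread later: one quick way to see exactly which counters your job recorded is to dump them all after it finishes. Here is a minimal sketch against the new (org.apache.hadoop.mapreduce) API; group and display names have shifted between Hadoop releases, so treat the specifics as assumptions and check your version:

import java.io.IOException;

import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

public class CounterDump {
    // Call this after job.waitForCompletion(...) returns.
    public static void dump(Job job) throws IOException {
        Counters counters = job.getCounters();
        // Walk every group the framework recorded, including the
        // "FileSystemCounters" group (FILE/HDFS_BYTES_READ/WRITTEN) and
        // the "Map-Reduce Framework" group ("Map input records", etc.).
        for (CounterGroup group : counters) {
            for (Counter counter : group) {
                System.out.printf("%s\t%s\t%d%n",
                        group.getDisplayName(),
                        counter.getDisplayName(),
                        counter.getValue());
            }
        }
    }
}

As a rough rule of thumb, FILE bytes read/written that dwarf the HDFS counters usually point at map-side spills and the shuffle, which go through local disk rather than HDFS.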
