On Tue, Sep 1, 2015 at 11:29 AM, Riesland, Zack <zack.riesl...@sensus.com> wrote:
> You say I can find information about spills in the job counters. Are you
> talking about “failed” map tasks, or is there something else that will help
> me identify spill scenarios?

"Spilled Records" is a counter that is available at the job level and at the individual task level -- you can see it in the Counters view of a job or task in the web interface of the YARN Resource Manager or History Server. It is only included if the task was successful (this is the case for all counters in MapReduce jobs), so it isn't related to failed map tasks. "Map Output Records" is the other counter that you'll want to compare it with.

- Gabriel
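
As a rough sketch of the comparison described above: if "Spilled Records" for a map task is noticeably higher than "Map Output Records", that usually suggests records were written to disk more than once. The numbers below are made up for illustration, the dict is assumed to be filled in by hand from the Counters view, and `spill_ratio` is a hypothetical helper, not part of Hadoop.

```python
def spill_ratio(counters):
    """Return Spilled Records / Map Output Records, or None if there was no map output."""
    map_output = counters.get("MAP_OUTPUT_RECORDS", 0)
    spilled = counters.get("SPILLED_RECORDS", 0)
    if map_output == 0:
        return None
    return spilled / map_output


# Hypothetical values copied by hand from the Counters view of one map task.
task_counters = {
    "MAP_OUTPUT_RECORDS": 1_000_000,
    "SPILLED_RECORDS": 3_000_000,
}

ratio = spill_ratio(task_counters)
# A ratio close to 1 means each record was spilled about once;
# a clearly higher ratio hints at multiple spill passes.
print(f"spill ratio: {ratio:.2f}")
```

Note that at the job level "Spilled Records" also includes reduce-side spills, so this ratio is easiest to interpret on the counters of an individual map task.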