Is there a way to identify the input file a mapper was running on when it failed? When a large job fails because of bad input lines, I have to resort to rerunning the entire job to isolate a single bad line, since the log doesn't record which file that mapper was processing.
Basically, I would like to be able to do one of the following:

1. Find the file that a mapper was running on when it failed
2. Find the block that a mapper was running on when it failed (and be able to map block IDs back to file names)

I haven't been able to find any documentation on facilities to accomplish either (1) or (2), so I'm hoping someone on this list will have a suggestion. I am using the Hadoop streaming API on Hadoop 0.18.2.

-Jason