Hello,

The MapRunner class looks promising. I noticed it is in the deprecated mapred package, but I didn't see an equivalent class in the mapreduce package. Is this going to be ported to mapreduce, or is it no longer supported?

Thanks!
~Ed

On Thu, Oct 21, 2010 at 6:36 AM, Harsh J <[email protected]> wrote:
> If it occurs eventually as your record reader reads it, then you may
> use a MapRunner class instead of a Mapper IFace/Subclass. This way,
> you may try/catch over the record reader itself, and call your map
> function only on valid next()s. I think this ought to work.
>
> You can set it via JobConf.setMapRunnerClass(...).
>
> Ref: MapRunner API @
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/MapRunner.html
>
> On Wed, Oct 20, 2010 at 4:14 AM, ed <[email protected]> wrote:
> > Hello,
> >
> > I have a simple map-reduce job that reads in zipped files and converts
> > them to lzo compression. Some of the files are not properly zipped, which
> > results in Hadoop throwing a "java.io.EOFException: Unexpected end of
> > input stream" error and causes the job to fail. Is there a way to catch
> > this exception and tell Hadoop to just ignore the file and move on? I
> > think the exception is being thrown by the class reading in the Gzip file
> > and not my mapper class. Is this correct? Is there a way to handle this
> > type of error gracefully?
> >
> > Thank you!
> >
> > ~Ed
>
> --
> Harsh J
> www.harshj.com
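For anyone following along, here is a rough sketch of what the suggested approach could look like against the old mapred API. The class name, key/value types (assuming something like TextInputFormat), counter names, and the placeholder per-record logic are my own illustrative assumptions, not tested code from this thread:

import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapRunnable;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

/**
 * Sketch of a MapRunnable (the interface MapRunner itself implements) that
 * drives the record reader directly so a truncated/corrupt gzip split can be
 * skipped instead of failing the task.
 */
public class SkipCorruptMapRunner
    implements MapRunnable<LongWritable, Text, Text, Text> {

  @Override
  public void configure(JobConf conf) {
    // No configuration needed for this sketch.
  }

  @Override
  public void run(RecordReader<LongWritable, Text> input,
                  OutputCollector<Text, Text> output,
                  Reporter reporter) throws IOException {
    LongWritable key = input.createKey();
    Text value = input.createValue();
    try {
      while (true) {
        boolean more;
        try {
          // next() is where a bad gzip stream typically blows up.
          more = input.next(key, value);
        } catch (EOFException e) {
          // Corrupt/truncated input: count it and abandon this split.
          reporter.incrCounter("SkipCorrupt", "TruncatedSplits", 1);
          break;
        }
        if (!more) {
          break;
        }
        // ... normal per-record map logic goes here (placeholder) ...
        output.collect(value, value);
      }
    } finally {
      input.close();
    }
  }
}

You would then register it in the job setup with conf.setMapRunnerClass(SkipCorruptMapRunner.class). Note that if the exception is thrown while the record reader (or its decompressor) is being constructed rather than inside next(), this try/catch would not see it.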
