If the exception is thrown while the record reader is reading the input, you can use a MapRunner class instead of a plain Mapper interface/subclass. That way you drive the record reader yourself, can wrap its next() calls in a try/catch, and call your map function only on records that were read successfully. I think this ought to work; a rough sketch follows.
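Untested sketch (the class and counter names are made up, adjust the generics to your job); it skips the rest of a split when next() throws an EOFException, e.g. from a truncated gzip file:

import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapRunnable;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.util.ReflectionUtils;

// Skips the remainder of a split when the record reader hits an
// EOFException (e.g. a truncated gzip file) instead of failing the task.
public class SkipTruncatedInputRunner<K1, V1, K2, V2>
    implements MapRunnable<K1, V1, K2, V2> {

  private Mapper<K1, V1, K2, V2> mapper;

  @SuppressWarnings("unchecked")
  public void configure(JobConf job) {
    // Instantiate the job's configured mapper, same as MapRunner does.
    this.mapper = (Mapper<K1, V1, K2, V2>)
        ReflectionUtils.newInstance(job.getMapperClass(), job);
  }

  public void run(RecordReader<K1, V1> input,
                  OutputCollector<K2, V2> output,
                  Reporter reporter) throws IOException {
    try {
      K1 key = input.createKey();
      V1 value = input.createValue();
      while (true) {
        try {
          if (!input.next(key, value)) {
            break; // clean end of the split
          }
        } catch (EOFException e) {
          // Corrupt/truncated input: count it and skip the rest of
          // this split rather than killing the task.
          reporter.incrCounter("SkipTruncatedInputRunner",
              "truncated-inputs", 1);
          break;
        }
        mapper.map(key, value, output, reporter);
      }
    } finally {
      mapper.close();
    }
  }
}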
You can set it via JobConf.setMapRunnerClass(...). Ref: MapRunner API @ http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/MapRunner.html

On Wed, Oct 20, 2010 at 4:14 AM, ed <[email protected]> wrote:
> Hello,
>
> I have a simple map-reduce job that reads in zipped files and converts
> them to lzo compression. Some of the files are not properly zipped, which
> results in Hadoop throwing a "java.io.EOFException: Unexpected end of
> input stream" error and causes the job to fail. Is there a way to catch
> this exception and tell Hadoop to just ignore the file and move on? I
> think the exception is being thrown by the class reading in the Gzip
> file, not by my mapper class. Is this correct? Is there a way to handle
> this type of error gracefully?
>
> Thank you!
>
> ~Ed

--
Harsh J
www.harshj.com
