Thanks Tom! Didn't see your post before posting =)

On Thu, Oct 21, 2010 at 1:28 PM, ed <[email protected]> wrote:
> Sorry to keep spamming this thread. It looks like the correct way to
> implement MapRunnable using the new mapreduce classes (instead of the
> deprecated mapred) is to override the run() method of the mapper class.
> This is actually nice and convenient since everyone should already be
> using the Mapper class (org.apache.hadoop.mapreduce.Mapper<KEYIN, VALUEIN,
> KEYOUT, VALUEOUT>) for their mappers.
>
> ~Ed
>
> On Thu, Oct 21, 2010 at 12:14 PM, ed <[email protected]> wrote:
>
>> Just checked the Hadoop 0.21.0 API docs (I was looking in the wrong docs
>> before) and it doesn't look like MapRunner is deprecated, so I'll try
>> catching the error there and will report back if it's a good solution.
>> Thanks!
>>
>> ~Ed
>>
>> On Thu, Oct 21, 2010 at 11:23 AM, ed <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> The MapRunner class looks promising. I noticed it is in the deprecated
>>> mapred package but I didn't see an equivalent class in the mapreduce
>>> package. Is this going to be ported to mapreduce or is it no longer
>>> being supported? Thanks!
>>>
>>> ~Ed
>>>
>>> On Thu, Oct 21, 2010 at 6:36 AM, Harsh J <[email protected]> wrote:
>>>
>>>> If it occurs eventually as your record reader reads it, then you may
>>>> use a MapRunner class instead of a Mapper interface/subclass. This
>>>> way, you can try/catch over the record reader itself and call your
>>>> map function only on valid next() calls. I think this ought to work.
>>>>
>>>> You can set it via JobConf.setMapRunnerClass(...).
>>>>
>>>> Ref: MapRunner API @
>>>> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/MapRunner.html
>>>>
>>>> On Wed, Oct 20, 2010 at 4:14 AM, ed <[email protected]> wrote:
>>>> > Hello,
>>>> >
>>>> > I have a simple MapReduce job that reads in zipped files and
>>>> > converts them to LZO compression. Some of the files are not
>>>> > properly zipped, which results in Hadoop throwing a
>>>> > "java.io.EOFException: Unexpected end of input stream" error and
>>>> > causes the job to fail. Is there a way to catch this exception and
>>>> > tell Hadoop to just ignore the file and move on? I think the
>>>> > exception is being thrown by the class reading in the gzip file and
>>>> > not my mapper class. Is this correct? Is there a way to handle this
>>>> > type of error gracefully?
>>>> >
>>>> > Thank you!
>>>> >
>>>> > ~Ed
>>>>
>>>> --
>>>> Harsh J
>>>> www.harshj.com
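
A minimal sketch of the run()-override approach described at the top of the
thread, against the new org.apache.hadoop.mapreduce API in Hadoop 0.21. The
class name, the pass-through map(), and the counter names are illustrative,
not from this thread; the point is that context.nextKeyValue() is where a
truncated gzip stream throws, so guarding that loop lets the task finish
cleanly instead of failing the job:

import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Illustrative mapper: same loop as the default Mapper.run(), but the
// record-reader calls are guarded so a truncated gzip stream ends the
// task cleanly instead of killing the job.
public class SkipCorruptGzipMapper
        extends Mapper<LongWritable, Text, LongWritable, Text> {

    @Override
    public void run(Context context) throws IOException, InterruptedException {
        setup(context);
        try {
            // nextKeyValue() is where the decompressor reads the underlying
            // stream, so the EOFException from a bad file surfaces here.
            while (context.nextKeyValue()) {
                map(context.getCurrentKey(), context.getCurrentValue(), context);
            }
        } catch (EOFException e) {
            // Corrupt or truncated input: record it and move on.
            context.getCounter("SkipCorruptGzipMapper", "corrupt-files")
                   .increment(1);
        } finally {
            cleanup(context);
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Pass-through map; the real job's logic would go here.
        context.write(key, value);
    }
}

Since gzip is not splittable, each map task gets a whole file as its split,
so catching the EOFException here skips just the remainder of the one bad
file while the other tasks proceed normally.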

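And a comparable sketch of Harsh's MapRunner suggestion for the older mapred
API, per the 0.20.2 MapRunnable interface linked above. Again, the class and
counter names are made up for illustration; it mirrors the stock MapRunner
loop with the record-reader calls guarded:

import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapRunnable;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.util.ReflectionUtils;

// Illustrative runner: one bad gzip file skips the rest of its split
// instead of failing the whole job.
public class SkipCorruptGzipRunner
        implements MapRunnable<LongWritable, Text, LongWritable, Text> {

    private Mapper<LongWritable, Text, LongWritable, Text> mapper;

    @SuppressWarnings("unchecked")
    @Override
    public void configure(JobConf job) {
        // Instantiate whatever Mapper the job has configured.
        this.mapper = ReflectionUtils.newInstance(job.getMapperClass(), job);
    }

    @Override
    public void run(RecordReader<LongWritable, Text> input,
                    OutputCollector<LongWritable, Text> output,
                    Reporter reporter) throws IOException {
        try {
            LongWritable key = input.createKey();
            Text value = input.createValue();
            try {
                // next() drives the decompressor, so a truncated gzip
                // stream throws EOFException out of this call.
                while (input.next(key, value)) {
                    mapper.map(key, value, output, reporter);
                }
            } catch (EOFException e) {
                // Skip the rest of this corrupt split and finish cleanly.
                reporter.incrCounter("SkipCorruptGzipRunner",
                                     "corrupt-files", 1);
            }
        } finally {
            mapper.close();
        }
    }
}

It would be wired in with conf.setMapRunnerClass(SkipCorruptGzipRunner.class),
as Harsh notes; the runner then wraps whatever Mapper the job already uses.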