Create a UDF that verifies the format, and run the input through a filtering step first. If you would like to save the malformed records so you can look at them later, you can use the SPLIT operator to route the good records to your regular workflow and the bad records to somewhere on HDFS.
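Roughly, the check inside such a UDF could look like the sketch below. The class name DateFormatCheck and the yyyy-MM-dd pattern are my assumptions, since I don't know what your input looks like; swap in whatever format your dates are supposed to have. It is plain Java so you can try it without Pig on the classpath. In a real UDF you would call it from the exec() method of a class extending Pig's FilterFunc.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper: the class name and the date pattern are assumptions,
// not anything taken from piggybank.
public class DateFormatCheck {

    // Assumed input format; change it to match your data.
    private static final String PATTERN = "yyyy-MM-dd";

    // Returns true only if the whole string parses as a real calendar date.
    public static boolean isValid(String value) {
        if (value == null) {
            return false;
        }
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN);
        fmt.setLenient(false); // reject impossible dates such as 2011-02-30
        ParsePosition pos = new ParsePosition(0);
        Date parsed = fmt.parse(value, pos);
        // parse() returns null on failure; also reject trailing garbage.
        return parsed != null && pos.getIndex() == value.length();
    }

    // Possible use from Pig once wrapped in a FilterFunc (here assumed to be
    // registered as IsValidDate; relation and field names are made up):
    //
    //   SPLIT raw INTO good IF IsValidDate(dt), bad IF NOT IsValidDate(dt);
    //   STORE bad INTO 'some/hdfs/path';
}
```

Since the check returns false instead of throwing, a bad record just ends up in the "bad" branch and the job keeps running.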
-D

On Mon, Jan 10, 2011 at 9:58 PM, hadoop n00b <[email protected]> wrote:
> Hello,
>
> I have a pig script that uses piggy bank to calculate date differences.
> Sometimes, when I get a weird date or wrong format in the input, the script
> throws an error and aborts.
>
> Is there a way I could trap these errors and move on without stopping the
> execution?
>
> Thanks
>
> PS: I'm using CDH2 with Pig 0.5
