Yes, you understood my task correctly. What is putNext? I'm new to Pig and haven't written any custom UDFs.

2011/11/8 pablomar <[email protected]>

> sorry, I didn't understand completely
>
> do you want to read a line and, if the date is invalid (performing
> IsoToUnix directly, without a regex before it), skip it? is that it?
> if yes, you can replace the field with your converted date (unix
> format), and if it fails put a null or nothing
>
> I mean, in your overridden putNext, you have your individual columns,
> you can try to convert the date in there and put your unix date in
> the output.
>
> sorry if I misunderstood your problem again
>
> On 11/8/11, Rauan Maemirov <[email protected]> wrote:
> > Sure, but now I'm just omitting the rows _after_ regex matching.
> > What I want to do is to avoid additional filtering by regex and ignore
> > invalid rows right after unsuccessful IsoToUnix().
> >
> > 2011/11/8 pablomar <[email protected]>
> >
> >> can you write something else (a null, for example) in your putNext
> >> method for that field when the date is invalid ?
> >>
> >> On 11/8/11, Rauan Maemirov <[email protected]> wrote:
> >> > Well, I solved this issue via regex matching, but I wonder if it's
> >> > too costly.
> >> > Is there any way to ignore exceptions and move on just by omitting
> >> > the wrong tuples?
> >> >
> >> > 2011/11/8 Rauan Maemirov <[email protected]>
> >> >
> >> >> Hi, all. I've got a custom log (CSV, comma-delimited) with ISO dates.
> >> >> Sometimes log writing lags and I get exceptions caused by a wrong
> >> >> ISO date format.
> >> >> Here's the exception: https://gist.github.com/1347406. (The date is
> >> >> the last "parameter" in the row, and it's incorrectly overwritten at
> >> >> the end by another string.)
> >> >>
> >> >> The question is: how can I filter out all wrong dates, or at least
> >> >> force Pig to ignore them instead of failing?
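
For reference, the "put a null when the date is invalid" idea discussed in this thread can be sketched in plain Java. This is only an illustration under assumptions: it uses java.time instead of Pig's IsoToUnix, it expects ISO-8601 date-times that carry an offset (such as a trailing Z), and the class name SafeIsoToUnix and the method toUnixMillisOrNull are hypothetical names that do not come from Pig or from the thread.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical helper: not part of Pig or piggybank.
public class SafeIsoToUnix {

    // Returns epoch milliseconds for an ISO-8601 date-time with an offset,
    // or null when the input is missing or cannot be parsed.
    public static Long toUnixMillisOrNull(String iso) {
        if (iso == null) {
            return null;
        }
        try {
            return OffsetDateTime.parse(iso.trim()).toInstant().toEpochMilli();
        } catch (DateTimeParseException e) {
            // Swallow the parse failure so one bad row does not fail the job.
            return null;
        }
    }

    public static void main(String[] args) {
        // A well-formed date yields a number; a truncated one yields null.
        System.out.println(toUnixMillisOrNull("2011-11-08T10:15:30Z"));
        System.out.println(toUnixMillisOrNull("2011-11-08T10:1some other string"));
    }
}
```

The same try/catch could presumably live inside the exec() method of a custom EvalFunc, so that rows with a null result can be dropped afterwards with a FILTER ... IS NOT NULL step instead of a regex pre-check.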
