What is the last field in your output?

(1,event1,3)
(1,event2,2)
(1,event3,1)

On Thu, Jan 26, 2012 at 4:02 PM, Grig Gheorghiu <[email protected]> wrote:
> Let's say I have this dataset:
>
> 1,undefined,text1
> 1,,text2
> 1,event1,text3
> 1,undefined,text4
> 1,event2,text5
> 1,event3,text6
>
> I would like to group by 1st value, but not quite an ordinary
> grouping. I would like all lines that contain either an empty value or
> 'undefined' on the 2nd position to be rolled up in the first line that
> contains a proper value in the 2nd position. So basically I'd like to
> obtain this relation:
>
> (1,event1,3)
> (1,event2,2)
> (1,event3,1)
>
> (where the 3rd value is the count of lines that were seen before a
> proper 'event' line was seen).
>
> Is this possible with Pig?
>
> Thanks!
>
> Grig
>
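
To pin down what the third field means, here is a rough sketch of the roll-up from the quoted message in plain Python (not Pig Latin). It assumes the count includes the event line itself, which is the only reading that gives 3, 2, 1 for the sample data, and it assumes the lines are processed in their original order. The function name `roll_up` is made up for illustration.

```python
def roll_up(rows):
    """Roll placeholder lines into the next proper event line, per key.

    rows: iterable of (key, event, text) tuples, in input order.
    Returns a list of (key, event, count) tuples, where count is the
    number of lines consumed since the previous proper event for that
    key, including the event line itself (an assumption, see above).
    """
    result = []
    pending = {}  # key -> lines seen since the last proper event
    for key, event, _text in rows:
        pending[key] = pending.get(key, 0) + 1
        # An empty string or 'undefined' is a placeholder, not an event.
        if event not in ("", "undefined"):
            result.append((key, event, pending[key]))
            pending[key] = 0
    # Placeholder lines after the last proper event are dropped here;
    # the original message does not say what should happen to them.
    return result


rows = [
    (1, "undefined", "text1"),
    (1, "", "text2"),
    (1, "event1", "text3"),
    (1, "undefined", "text4"),
    (1, "event2", "text5"),
    (1, "event3", "text6"),
]
print(roll_up(rows))
```

The result depends entirely on line order, so anything equivalent in Pig would presumably need a way to keep or reconstruct that order within each group. That part is a guess about how this would translate, not a tested Pig script.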
