It is the number of lines seen up to and including the line with a proper event value (3 lines for event1, 2 for event2, 1 for event3).
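
To make the counting rule concrete, here is a minimal sketch of the roll-up in plain Python. It is not a Pig script. The function name `roll_up` is hypothetical, and it assumes the lines are processed in their original order and that a "proper" value is anything other than an empty string or 'undefined'.

```python
def roll_up(lines):
    """Return (key, event, count) for each line with a proper event value.

    count is the number of lines seen for that key since the previous
    proper event, including the proper event line itself.
    """
    pending = {}  # key -> lines seen since the last proper event
    result = []
    for line in lines:
        # Split into at most 3 fields so commas in the text field are kept.
        key, event, _text = line.split(",", 2)
        pending[key] = pending.get(key, 0) + 1
        # Assumption: empty and 'undefined' are the only non-proper values.
        if event not in ("", "undefined"):
            result.append((key, event, pending[key]))
            pending[key] = 0  # start counting again for the next event
    return result


dataset = [
    "1,undefined,text1",
    "1,,text2",
    "1,event1,text3",
    "1,undefined,text4",
    "1,event2,text5",
    "1,event3,text6",
]

for row in roll_up(dataset):
    print(row)
```

The result depends on input order, so any Pig version of this would presumably need to preserve or impose an ordering within each group before counting.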

On Thu, Jan 26, 2012 at 4:06 PM, Prashant Kommireddi <[email protected]> wrote:
> What is the last field in your output?
>
> (1,event1,3)
> (1,event2,2)
> (1,event3,1)
>
> On Thu, Jan 26, 2012 at 4:02 PM, Grig Gheorghiu <[email protected]> wrote:
>
>> Let's say I have this dataset:
>>
>> 1,undefined,text1
>> 1,,text2
>> 1,event1,text3
>> 1,undefined,text4
>> 1,event2,text5
>> 1,event3,text6
>>
>> I would like to group by 1st value, but not quite an ordinary
>> grouping. I would like all lines that contain either an empty value or
>> 'undefined' on the 2nd position to be rolled up in the first line that
>> contains a proper value in the 2nd position. So basically I'd like to
>> obtain this relation:
>>
>> (1,event1,3)
>> (1,event2,2)
>> (1,event3,1)
>>
>> (where the 3rd value is the count of lines that were seen before a
>> proper 'event' line was seen).
>>
>> Is this possible with Pig?
>>
>> Thanks!
>>
>> Grig
>>
