What is the last field in your output?

(1,event1,3)
(1,event2,2)
(1,event3,1)

On Thu, Jan 26, 2012 at 4:02 PM, Grig Gheorghiu <[email protected]> wrote:

> Let's say I have this dataset:
>
> 1,undefined,text1
> 1,,text2
> 1,event1,text3
> 1,undefined,text4
> 1,event2,text5
> 1,event3,text6
>
> I would like to group by 1st value, but not quite an ordinary
> grouping. I would like all lines that contain either an empty value or
> 'undefined' on the 2nd position to be rolled up in the first line that
> contains a proper value in the 2nd position. So basically I'd like to
> obtain this relation:
>
> (1,event1,3)
> (1,event2,2)
> (1,event3,1)
>
> (where the 3rd value is the count of lines that were seen before a
> proper 'event' line was seen).
>
> Is this possible with Pig?
>
> Thanks!
>
> Grig
>
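
To pin down what the third field means, here is a minimal Python sketch of the roll-up described in the quoted message. It is not Pig and not anyone's actual implementation. The function name roll_up and the placeholders parameter are hypothetical. It assumes the rows arrive in the order shown, that the count includes the event line itself (the only reading that gives 3, 2, 1 for the sample data), and that trailing placeholder lines with no later event are dropped.

```python
def roll_up(rows, placeholders=("", "undefined")):
    """Fold placeholder rows into the next row with a proper event.

    rows: iterable of (key, event, text) tuples, already in input order.
    Returns a list of (key, event, count) tuples, where count is the number
    of rows consumed up to and including the event row (an assumption
    inferred from the sample output in the thread).
    """
    result = []
    pending = {}  # key -> rows seen since the last proper event for that key
    for key, event, _text in rows:
        pending[key] = pending.get(key, 0) + 1
        if event not in placeholders:
            result.append((key, event, pending[key]))
            pending[key] = 0  # start counting afresh for the next event
    return result


# Sample data from the quoted message.
rows = [
    ("1", "undefined", "text1"),
    ("1", "", "text2"),
    ("1", "event1", "text3"),
    ("1", "undefined", "text4"),
    ("1", "event2", "text5"),
    ("1", "event3", "text6"),
]

for row in roll_up(rows):
    print(row)
```

Since the result depends on row order, a Pig version would presumably need an explicit ordering field in the data, because a relation does not guarantee order on its own.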
