Could you even do it with a UDF? In a regular programming language
you can easily do it with a sentinel that you keep track of, but in
Pig I can't figure it out.
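
A minimal sketch of the sentinel idea in plain Python, applied to the
sample dataset quoted below. This is not Pig code: the function name
roll_up and the tuple layout are hypothetical, chosen only for
illustration. It also assumes the rows arrive in their original order,
which a Pig bag does not guarantee by itself, so a real UDF would
probably need an explicit ordering field.

```python
def roll_up(rows):
    """Roll 'undefined'/empty rows into the next proper event row.

    rows: iterable of (key, event, text) tuples, already in input order
    (an assumption; hypothetical layout for this sketch).
    Returns a list of (key, event, count) tuples, where count is the
    number of lines seen up to and including the proper event line.
    """
    pending = {}  # per-key running count since the last proper event
    result = []
    for key, event, _text in rows:
        pending[key] = pending.get(key, 0) + 1
        # A "proper" event is anything other than empty or 'undefined'.
        if event and event != "undefined":
            result.append((key, event, pending[key]))
            pending[key] = 0  # reset the counter for this key
    # Trailing rows with no proper event after them are dropped here.
    return result


if __name__ == "__main__":
    sample = [
        ("1", "undefined", "text1"),
        ("1", "", "text2"),
        ("1", "event1", "text3"),
        ("1", "undefined", "text4"),
        ("1", "event2", "text5"),
        ("1", "event3", "text6"),
    ]
    for row in roll_up(sample):
        print(row)
```

The same loop could in principle form the body of a UDF that receives
each group's bag, provided the bag is sorted first.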

On Thu, Jan 26, 2012 at 4:24 PM, Prashant Kommireddi
<[email protected]> wrote:
> Grig, I am afraid there is nothing built into Pig to do this.
>
> On Thu, Jan 26, 2012 at 4:08 PM, Grig Gheorghiu
> <[email protected]> wrote:
>
>> The count of lines seen up to and including a proper event value (3
>> lines for event1, 2 for event2, 1 for event3).
>>
>> On Thu, Jan 26, 2012 at 4:06 PM, Prashant Kommireddi
>> <[email protected]> wrote:
>> > What is the last field in your output?
>> >
>> > (1,event1,3)
>> > (1,event2,2)
>> > (1,event3,1)
>> >
>> > On Thu, Jan 26, 2012 at 4:02 PM, Grig Gheorghiu
>> > <[email protected]> wrote:
>> >
>> >> Let's say I have this dataset:
>> >>
>> >> 1,undefined,text1
>> >> 1,,text2
>> >> 1,event1,text3
>> >> 1,undefined,text4
>> >> 1,event2,text5
>> >> 1,event3,text6
>> >>
>> >> I would like to group by 1st value, but not quite an ordinary
>> >> grouping. I would like all lines that contain either an empty value or
>> >> 'undefined' on the 2nd position to be rolled up in the first line that
>> >> contains a proper value in the 2nd position. So basically I'd like to
>> >> obtain this relation:
>> >>
>> >> (1,event1,3)
>> >> (1,event2,2)
>> >> (1,event3,1)
>> >>
>> >> (where the 3rd value is the count of lines that were seen before a
>> >> proper 'event' line was seen).
>> >>
>> >> Is this possible with Pig?
>> >>
>> >> Thanks!
>> >>
>> >> Grig
>> >>
>>
