Seems like you are looking to group by "id" and get the MIN and MAX timestamp for each group?
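
If a plain MIN/MAX per id is not enough (on the sample rows quoted below it would collapse yyy into 1,7 rather than keeping 1,2 and 5,7), the merge step a UDF would have to perform looks roughly like this. This is only a sketch in Python, not Pig code: the function name merge_sessions and the max_gap parameter are made up for illustration, and the rule "two rows belong to the same session when the next start is at most 1 after the previous end" is an assumption inferred from the example output.

```python
from itertools import groupby
from operator import itemgetter


def merge_sessions(rows, max_gap=1):
    """Merge (id, start, end) rows that form one continuous session.

    Assumption: a row continues the previous one for the same id when
    its start is at most `max_gap` after the previous end.
    """
    merged = []
    # Sorting the tuples orders them by id, then start, then end.
    for sid, group in groupby(sorted(rows), key=itemgetter(0)):
        cur_start = cur_end = None
        for _, start, end in group:
            if cur_start is None:
                # First row for this id opens a session.
                cur_start, cur_end = start, end
            elif start - cur_end <= max_gap:
                # Close enough to the open session: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap too large: emit the open session, start a new one.
                merged.append((sid, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((sid, cur_start, cur_end))
    return merged


rows = [
    ("xxx", 1, 3), ("xxx", 4, 7),
    ("yyy", 1, 2), ("yyy", 5, 7),
    ("zzz", 6, 7), ("zzz", 7, 10),
]
for row in merge_sessions(rows):
    print(",".join(str(field) for field in row))
```

In Pig the same loop could sit inside a UDF applied to each group after a GROUP BY id, with the bag sorted by start first.
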
On Thu, Aug 30, 2012 at 1:00 AM, Marco Cadetg <[email protected]> wrote:
> Hi there,
>
> I have some user sessions which look something like the following:
>
> id:chararray, start:long(unix timestamp), end:long(unix timestamp)
> xxx,1,3
> xxx,4,7
> yyy,1,2
> yyy,5,7
> zzz,6,7
> zzz,7,10
>
> I would like to combine the rows which belong to a continuous session,
> e.g. in my example the result should be the following:
> xxx,1,7
> yyy,1,2
> yyy,5,7
> zzz,6,10
>
> I guess there is no way to do this directly in Pig but rather by using a
> UDF. Can someone give me a pointer on how you would achieve this?
>
> Thanks,
> -Marco
