Thank you so much, Gianmarco and Nitin.

On Thu, Oct 9, 2014 at 11:43 AM, Gianmarco De Francisci Morales <
g...@apache.org> wrote:

> I guess one way to do this is to use RANK twice, once on the original
> relation and once on the original relation minus its first point. Then
> join on the rank and subtract.
>
> A = load 'data' as (timestamp:long);
> B = filter A by timestamp > 20141014120523L; -- remove the first point
> C = RANK A by timestamp;
> D = RANK B by timestamp;
> E = JOIN C by $0, D by $0; -- join on the rank
> -- rank k in D is record k+1 of A, so D minus C is the gap to the previous record
> F = foreach E generate D::timestamp - C::timestamp;
>
>
> Disclaimer: the script is just off the top of my head and is not tested.
>
> Cheers,
>
> --
> Gianmarco
>
> On 8 October 2014 09:01, Krishna Kalyan <krishnakaly...@gmail.com> wrote:
>
> > Hi Everybody,
> >
> > Input file: records are sorted by timestamp.
> > Expected input file size: 2-3 TB
> >
> > timestamp
> > ==============
> > 20141014120523
> > 20141014120534
> > 20141014120537
> > 20141014120542
> > 20141014120549
> > 20141014120555
> > 20141014120565
> > 20141014120570
> > 20141014120512
> > ...
> > ...
> >
> >
> > Using Pig, I need to find the time difference between the timestamp of
> > the Nth record and that of the (N-1)th record
> > (20141014120534 - 20141014120523 = 11 secs).
> > I need to go through all the records and get each one's time difference
> > from the previous record.
> >
> > Example Output
> > 0
> > 11
> > 3
> > 5
> > ...
> >
> > Please guide.
> >
> > Regards,
> > Krishna Kalyan
> >
>
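
A possible variant of the rank-and-join idea quoted above, as an untested sketch: rank once, shift the rank by one, and self-join, so no hard-coded first timestamp is needed. It also converts the values with ToDate before subtracting, because plain numeric subtraction of yyyyMMddHHmmss values only gives seconds within the same minute (20141014120601 - 20141014120555 is 46 as a number, but the real gap is 6 secs). Assumptions: Pig 0.11 or later (for RANK, ToDate and SecondsBetween), one timestamp per line in 'data', and timestamps that are distinct and valid clock times (sample values such as 20141014120565 would not parse). The aliases R, P, J, D and the field names ts, prev_ts, next_rank, n, diff_secs are made up for illustration.

```pig
-- Assumption: 'data' holds one yyyyMMddHHmmss timestamp per line.
A = LOAD 'data' AS (ts:chararray);

-- RANK prepends a position column; for relation A it is named rank_A.
-- Assumption: timestamps are distinct, so ranks run 1, 2, 3, ... with no ties.
R = RANK A BY ts ASC;

-- Shift every rank by one so that record N-1 lines up with record N.
P = FOREACH R GENERATE rank_A + 1 AS next_rank, ts AS prev_ts;

-- Inner join: each record is paired with its predecessor.
J = JOIN R BY rank_A, P BY next_rank;

-- SecondsBetween takes the later datetime first to give a positive gap.
D = FOREACH J GENERATE R::rank_A AS n,
        SecondsBetween(ToDate(R::ts, 'yyyyMMddHHmmss'),
                       ToDate(P::prev_ts, 'yyyyMMddHHmmss')) AS diff_secs;
```

The first record has no predecessor, so the inner join drops it; the leading 0 in the expected output would need separate handling, for example a left outer join with a null check.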
