Thank you so much, Gianmarco and Nitin.
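
For anyone following the thread, here is a minimal Python sketch of the rank-and-join idea quoted below, run on the first few sample timestamps. It is only an in-memory illustration of the logic, not the Pig script itself: the function name `rank_join_differences` is hypothetical, and the 0 for the first record is an assumption taken from the example output. Note that plain subtraction of these numeric timestamps only gives seconds when both records fall within the same minute.

```python
def rank_join_differences(timestamps):
    """Pair each record with its predecessor by matching rank r to rank r - 1."""
    ordered = sorted(timestamps)
    # rank -> timestamp, with ranks starting at 1 (as RANK ... BY timestamp would assign)
    ranked = {rank: ts for rank, ts in enumerate(ordered, start=1)}
    # Assumption: the first record has no predecessor, so it reports 0
    diffs = [0]
    for rank in range(2, len(ordered) + 1):
        # "Join" rank r with rank r - 1 and subtract
        diffs.append(ranked[rank] - ranked[rank - 1])
    return diffs


sample = [20141014120523, 20141014120534, 20141014120537, 20141014120542]
print(rank_join_differences(sample))  # [0, 11, 3, 5]
```

In Pig the same pairing comes from joining two ranked relations whose ranks are offset by one, which avoids any per-record loop.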

On Thu, Oct 9, 2014 at 11:43 AM, Gianmarco De Francisci Morales <g...@apache.org> wrote:

> I guess one way to do this is to use RANK twice, once on the original
> relationship, and once on the original relationship \ the first point. Then
> join on the rank and subtract.
>
> A = load 'data' as (timestamp:long);
> B = filter A by timestamp > 20141014120523; -- remove the first point
> C = RANK A by timestamp;
> D = RANK B by timestamp;
> E = JOIN C by $0, D by $0; -- join on the rank
> -- rank r in D is rank r+1 in A, so this is next minus previous
> F = foreach E generate D::timestamp - C::timestamp;
>
> Disclaimer: the script is just off the top of my head and is not tested.
>
> Cheers,
>
> --
> Gianmarco
>
> On 8 October 2014 09:01, Krishna Kalyan <krishnakaly...@gmail.com> wrote:
>
> > Hi Everybody,
> >
> > Input file: records are sorted based on the timestamp.
> > Expected input file size: 2-3 TB
> >
> > timestamp
> > ==============
> > 20141014120523
> > 20141014120534
> > 20141014120537
> > 20141014120542
> > 20141014120549
> > 20141014120555
> > 20141014120565
> > 20141014120570
> > 20141014120512
> > ...
> > ...
> >
> > Using Pig, I need to find the time difference between the Nth record's
> > timestamp and the (N-1)th record's timestamp
> > (20141014120534 - 20141014120523 = 11 secs).
> > I need to loop through all the records to get the time difference from
> > the previous record.
> >
> > Example Output
> > 0
> > 11
> > 3
> > 5
> > ...
> >
> > Please guide.
> >
> > Regards,
> > Krishna Kalyan