I haven't understood your data/schema.

I am hoping this is close to what you are trying to solve.
Assuming a relation inp with schema (timestamp:int, user, url):

user_url_group = group inp by (user, url);
session_duration = foreach user_url_group generate
    group.user as user, group.url as url,
    MAX(inp.timestamp) - MIN(inp.timestamp) as duration;

For each (user, url) pair this gives the time between the first and the last hit.
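
To sanity-check the result outside Pig, here is a minimal Python sketch of the same group-then-MAX-minus-MIN computation, run on the sample records from the question. The function name session_durations and the (user, url, timestamp) tuple layout are my own assumptions for illustration, not part of your script.

```python
from collections import defaultdict

# Sample (user, url, timestamp) records copied from the question.
records = [
    ("1010046645226466896", "http://www.url.com/", 1277793285),
    ("1010046645226466896", "http:///www.url.com/?image=580", 1277793315),
    ("1010046645226466896", "http:///www.url.com/?image=582", 1277793359),
    ("1010046645226466896", "http:///www.url.com/?image=582", 1277793470),
    ("1010046645226466896", "http:///www.url.com/?image=585", 1277793387),
]


def session_durations(rows):
    """Return {(user, url): max(timestamp) - min(timestamp)}.

    Hypothetical helper mirroring the Pig GROUP ... BY (user, url)
    followed by MAX(timestamp) - MIN(timestamp).
    """
    grouped = defaultdict(list)
    for user, url, timestamp in rows:
        grouped[(user, url)].append(timestamp)
    return {key: max(ts) - min(ts) for key, ts in grouped.items()}


durations = session_durations(records)
for (user, url), duration in sorted(durations.items()):
    print(user, url, duration)
```

Note that a (user, url) pair with only a single hit comes out with a duration of 0, since the maximum and minimum timestamps are the same value. In the sample only the ?image=582 page has two hits, so it is the only one with a non-zero duration (111 seconds).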

-Thejas



On 1/25/12 2:12 AM, David Houston wrote:
Hi,

I have a group of records that is output like the below.

((1010046645226466896,http://www.url.com/),1277793285)
((1010046645226466896,http:///www.url.com/?image=580),1277793315)
((1010046645226466896,http:///www.url.com/?image=582),1277793359)
((1010046645226466896,http:///www.url.com/?image=582),1277793470)
((1010046645226466896,http:///www.url.com/?image=585),1277793387)


The code that gets me here is:

ht = FOREACH A GENERATE CONCAT(visid_high,visid_low) AS guid, service,
     hit_time_gmt, page_url AS url;

grpd = GROUP ht BY (guid, url) PARALLEL 20;

B = FOREACH grpd {
    t = DISTINCT ht.hit_time_gmt;
    GENERATE group, flatten(t);
}


What I'm having difficulty with is working out how to subtract each hit time
from the next one to work out how long a user spent on each page.

Any help would be greatly appreciated.


Many thanks

Dave
