I haven't fully understood your data/schema, but I am hoping this is close
to what you are trying to solve:
-- assuming inp has the schema (timestamp:int, user, url)
user_url_group = GROUP inp BY (user, url);
session_duration = FOREACH user_url_group GENERATE group.user AS user,
    group.url AS url, MAX(inp.timestamp) - MIN(inp.timestamp) AS duration;
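
For completeness, here is a rough end-to-end sketch of the same idea. The
input path 'hits.tsv', the tab delimiter, and the chararray types for user
and url are assumptions on my part, not something from your script, so adjust
them to match your data. Note that this measures the span between the first
and last hit on the same (user, url) pair, so a page with a single hit comes
out with a duration of 0.

```pig
-- Hypothetical input: tab-separated lines of timestamp, user, url.
-- 'hits.tsv' is a placeholder path, not taken from the original script.
inp = LOAD 'hits.tsv' USING PigStorage('\t')
      AS (timestamp:int, user:chararray, url:chararray);

-- One bag of records per (user, url) pair.
user_url_group = GROUP inp BY (user, url);

-- Last hit minus first hit on the same page, in the units of timestamp.
session_duration = FOREACH user_url_group GENERATE
    group.user AS user,
    group.url AS url,
    MAX(inp.timestamp) - MIN(inp.timestamp) AS duration;

DUMP session_duration;
```

With your sample records, the ?image=582 page has hits at 1277793359 and
1277793470, so its duration would be 111.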
-Thejas
On 1/25/12 2:12 AM, David Houston wrote:
Hi,
I have a group of records that gets output like the below.
((1010046645226466896,http://www.url.com/),1277793285)
((1010046645226466896,http:///www.url.com/?image=580),1277793315)
((1010046645226466896,http:///www.url.com/?image=582),1277793359)
((1010046645226466896,http:///www.url.com/?image=582),1277793470)
((1010046645226466896,http:///www.url.com/?image=585),1277793387)
The code that gets me here is:
ht = FOREACH A GENERATE CONCAT(visid_high,visid_low) AS guid, service,
    hit_time_gmt, page_url AS url;
grpd = GROUP ht BY (guid, url) PARALLEL 20;
B = FOREACH grpd {
    t = DISTINCT ht.hit_time_gmt;
    GENERATE group, FLATTEN(t);
}
What I'm having difficulty doing is working out how I would subtract the next
value from the last to work out how long a user spent on each page.
Any help would be greatly appreciated.
Many thanks
Dave