Forwarding to the analytics list for reference.

---------- Forwarded message ---------
From: Ho Chung <[email protected]>
Date: Mon, Mar 15, 2021 at 11:45 AM
Subject: Re: [Analytics] About: refine_webrequest.hql
To: Joseph Allemandou <[email protected]>

Hello

Thanks for your reply.

I had been researching your Analytics team's public discussion history and the Wikitech pages about the webrequest timestamp:
https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest
https://phabricator.wikimedia.org/T212529

At that time I was in doubt: you use Java technology, but your Hive version did not support Java before October 2018. The wmf.webrequest file is located in Hive. My question was whether, when the privacy data of readers is collected, the timestamp uses the reader's computer system clock instead of the Wikipedia server clock at the moment the page is read and browsed.

Now it is clearer to me. On your team's public discussion page, Ottomata said that all times are in UTC. It is just that you technicians do not want to unify how the timestamp format is expressed, but in fact all of them use UTC.

On Mon, Mar 15, 2021 at 16:14, Joseph Allemandou <[email protected]> wrote:

> Hi,
> the `dt` field is the time in UTC (no timezone specified) at which the
> request ends being processed by Varnish.
> Cheers
> Joseph
>
> On Mon, Mar 15, 2021 at 8:36 AM Luca Toscano <[email protected]>
> wrote:
>
>> +A mailing list for the Analytics Team at WMF and everybody who has an
>> interest in Wikipedia and analytics. <[email protected]>
>>
>> Hi!
>>
>> I added the Analytics mailing list in Cc so other people can chime in,
>> this is the canonical way to follow up with us and the community, please
>> avoid direct email if possible :)
>>
>> Thanks!
>>
>> Luca
>>
>> On Sat, Mar 13, 2021 at 10:57 PM Ho Chung <[email protected]> wrote:
>>
>>> Hello
>>>
>>> I have a question about refine_webrequest.hql.
>>>
>>> Does the timestamp in this file use UTC?
>>>
>>> Does this file connect wmf_raw.webrequest and wmf.webrequest?
>>>
>>> I ask because I can't see the code adding a Z or a +/- time zone offset:
>>>
>>> -- Hack to get a correct timestamp because of hive inconsistent conversion
>>> CAST(unix_timestamp(dt, "yyyy-MM-dd'T'HH:mm:ss") * 1.0 as timestamp) as ts,
>>>
>>> https://github.com/wikimedia/analytics-refinery/blob/master/oozie/webrequest/load/refine_webrequest.hql
>>>
>>> I have been emailing Wikimedia legal about this request for 3 months and
>>> they are not sure. Can you answer me clearly?
>>>
>>> If it does not use UTC, does it use your server clock or my computer clock?
>>>
>>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>
> --
> Joseph Allemandou (joal) (he / him)
> Staff Data Engineer
> Wikimedia Foundation
>

--
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation
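
For readers of the archive, here is a minimal Python sketch of what "UTC, no timezone specified" means for a `dt` string laid out like the pattern in the quoted Hive line. It is an illustration only, not the refinery's code. The helper names and the sample value are made up, and it assumes `dt` looks like `2021-03-15T11:45:00` with no `Z` or offset suffix.

```python
from datetime import datetime, timezone

# Python equivalent of the Hive pattern "yyyy-MM-dd'T'HH:mm:ss" quoted above.
DT_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_dt_as_utc(dt: str) -> datetime:
    """Parse a zone-less dt string and label it as UTC (hypothetical helper)."""
    naive = datetime.strptime(dt, DT_FORMAT)  # carries no zone information yet
    # Attach UTC explicitly so the reader's local clock or zone plays no part.
    return naive.replace(tzinfo=timezone.utc)


def to_epoch_seconds(dt: str) -> int:
    """Seconds since 1970-01-01T00:00:00 UTC for a zone-less dt string."""
    return int(parse_dt_as_utc(dt).timestamp())


if __name__ == "__main__":
    sample = "2021-03-15T11:45:00"  # made-up sample value, not real data
    print(parse_dt_as_utc(sample).isoformat())
    print(to_epoch_seconds(sample))
```

The string carries no zone marker, so the UTC meaning comes from how the field is defined on the server side rather than from anything in the text. The sketch makes that convention explicit when parsing instead of relying on the parsing machine's local zone.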
