I tried moving Zero analytics to the new table and decided to test the wonderful new fields like agent_type ... and they only work on the most recent hours of data ((
https://phabricator.wikimedia.org/T95806

On Fri, Apr 10, 2015 at 8:51 PM, Yuri Astrakhan <[email protected]> wrote:

> Please clarify why the field "is_zero" is needed, as it is nothing more
> than a test for ("zero=" in x_analytics). Does having this field
> significantly improve performance for zero queries, e.g. "select count(*)
> from requests where iszero = true" ? Because otherwise it simply identifies
> "zero partner" traffic, not "was that request actually zero rated or not".
>
> Thanks!
>
> On Fri, Apr 10, 2015 at 5:16 PM, Oliver Keyes <[email protected]> wrote:
>
>> Cool!
>>
>> On 10 April 2015 at 17:12, Joseph Allemandou <[email protected]> wrote:
>> > Yes Oliver, the agent_type = spider includes IsCrawler UDF.
>> >
>> > On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes <[email protected]> wrote:
>> >>
>> >> What does agent-type add? In the sense that if we're pre-parsing the
>> >> user agent, surely the difference is between "WHERE agent_type !=
>> >> 'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
>> >> Does agent_type include the isCrawler UDF results?
>> >>
>> >> On 10 April 2015 at 16:47, Joseph Allemandou <[email protected]> wrote:
>> >> > And I forgot one field :
>> >> >
>> >> > is_zero - True if a request is made on a zero provider.
>> >> >
>> >> > On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia <[email protected]> wrote:
>> >> >>
>> >> >> Hi Joseph,
>> >> >>
>> >> >> Thanks for the update, and for doing this. These three items make
>> >> >> the analysis of the data much easier on our end. We've had many
>> >> >> requests in the past that required agent_type and access_method
>> >> >> information and having them readily available is awesome! :-)
>> >> >>
>> >> >> Have a great weekend!
>> >> >>
>> >> >> Leila
>> >> >>
>> >> >> On Fri, Apr 10, 2015 at 1:21 PM, Joseph Allemandou
>> >> >> <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi Analytics people,
>> >> >>>
>> >> >>> Today happens another bunch of addition to the refined webrequest
>> >> >>> table in hive.
>> >> >>> Now the table contains:
>> >> >>>
>> >> >>> ts - The unix timestamp (milliseconds) version of the dt date
>> >> >>> access_method - The method used to access the site, being one of
>> >> >>>     the three [mobile app | mobile web | desktop]
>> >> >>> agent_type - To differentiate easily between spiders and users
>> >> >>>     (more values may be added later).
>> >> >>>
>> >> >>> These additions are based on the "tags", as defined here:
>> >> >>> https://meta.wikimedia.org/wiki/Research:Page_view
>> >> >>>
>> >> >>> Have a good weekend !
>> >> >>>
>> >> >>> --
>> >> >>> Joseph Allemandou
>> >> >>> Data Engineer @ Wikimedia Foundation
>> >> >>> IRC: joal
>> >> >>>
>> >> >>> _______________________________________________
>> >> >>> Analytics mailing list
>> >> >>> [email protected]
>> >> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>
>> >> --
>> >> Oliver Keyes
>> >> Research Analyst
>> >> Wikimedia Foundation
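
To make the is_zero question in the quoted thread concrete, here is a minimal Python sketch of the test as it is described there ("zero=" in x_analytics), applied to a few rows in memory. The function names and the sample header values are made up for illustration, and this is not how the refined table actually computes the field.

```python
def is_zero_partner(x_analytics: str) -> bool:
    # The test quoted in the thread: "zero=" in x_analytics.
    # It only says the request came through a zero partner,
    # not whether the request was actually zero rated.
    return "zero=" in x_analytics


def count_zero_partner(x_analytics_values) -> int:
    # Rough in-memory stand-in for
    # "select count(*) from requests where is_zero = true".
    return sum(1 for value in x_analytics_values if is_zero_partner(value))


if __name__ == "__main__":
    # Hypothetical x_analytics values, for illustration only.
    sample = ["zero=250-99;https=1", "https=1", ""]
    print(count_zero_partner(sample))
```

If a precomputed is_zero column gives the same answer as this substring test, the remaining question is only whether it makes such counts noticeably cheaper.
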
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
