I tried to move Zero analytics to the new table and decided to test the
wonderful new fields like agent_type ... and it only works on the most
recent hours of data ((

https://phabricator.wikimedia.org/T95806

On Fri, Apr 10, 2015 at 8:51 PM, Yuri Astrakhan <[email protected]>
wrote:

> Please clarify why the field "is_zero" is needed, as it is nothing more
> than a test for ("zero=" in x_analytics). Does having this field
> significantly improve performance for zero queries, e.g. "select count(*)
> from requests where is_zero = true"? Because otherwise it simply identifies
> "zero partner" traffic, not "was that request actually zero rated or not".
>
> Thanks!
>
> On Fri, Apr 10, 2015 at 5:16 PM, Oliver Keyes <[email protected]>
> wrote:
>
>> Cool!
>>
>> On 10 April 2015 at 17:12, Joseph Allemandou <[email protected]>
>> wrote:
>> > Yes Oliver, agent_type = spider includes the IsCrawler UDF results.
>> >
>> > On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes <[email protected]>
>> > wrote:
>> >>
>> >> What does agent_type add? In the sense that if we're pre-parsing the
>> >> user agent, surely the difference is between "WHERE agent_type !=
>> >> 'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
>> >> Does agent_type include the isCrawler UDF results?
>> >>
>> >> On 10 April 2015 at 16:47, Joseph Allemandou <[email protected]>
>> >> wrote:
>> >> > And I forgot one field:
>> >> >
>> >> > is_zero - True if a request is made on a zero provider.
>> >> >
>> >> >
>> >> > On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia <[email protected]>
>> >> > wrote:
>> >> >>
>> >> >> Hi Joseph,
>> >> >>
>> >> >>    Thanks for the update, and for doing this. These three items make the
>> >> >> analysis of the data much easier on our end. We've had many requests in the
>> >> >> past that required agent_type and access_method information and having them
>> >> >> readily available is awesome! :-)
>> >> >>
>> >> >> Have a great weekend!
>> >> >>
>> >> >> Leila
>> >> >>
>> >> >> On Fri, Apr 10, 2015 at 1:21 PM, Joseph Allemandou
>> >> >> <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi Analytics people,
>> >> >>>
>> >> >>> Another batch of additions landed today in the refined webrequest
>> >> >>> table in Hive. The table now also contains:
>> >> >>>
>> >> >>> ts - The unix timestamp (milliseconds) version of the dt date
>> >> >>> access_method - The method used to access the site, one of
>> >> >>> [mobile app | mobile web | desktop]
>> >> >>> agent_type - To differentiate easily between spiders and users
>> >> >>> (more values may be added later).
>> >> >>>
>> >> >>> These additions are based on the "tags", as defined here:
>> >> >>> https://meta.wikimedia.org/wiki/Research:Page_view
>> >> >>>
>> >> >>> Have a good weekend!
>> >> >>>
>> >> >>> --
>> >> >>> Joseph Allemandou
>> >> >>> Data Engineer @ Wikimedia Foundation
>> >> >>> IRC: joal
>> >> >>>
>> >> >>> _______________________________________________
>> >> >>> Analytics mailing list
>> >> >>> [email protected]
>> >> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >> >>>
>> >> >>
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Joseph Allemandou
>> >> > Data Engineer @ Wikimedia Foundation
>> >> > IRC: joal
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Oliver Keyes
>> >> Research Analyst
>> >> Wikimedia Foundation
>> >>
>> >
>> >
>> >
>> >
>> > --
>> > Joseph Allemandou
>> > Data Engineer @ Wikimedia Foundation
>> > IRC: joal
>> >
>> >
>>
>>
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>>
>>
>
>