Cool!

On 10 April 2015 at 17:12, Joseph Allemandou <[email protected]> wrote:
> Yes Oliver, agent_type = 'spider' includes the IsCrawler UDF results.
>
> On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes <[email protected]> wrote:
>>
>> What does agent_type add? In the sense that if we're pre-parsing the
>> user agent, surely the difference is between "WHERE agent_type !=
>> 'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
>> Does agent_type include the isCrawler UDF results?
>>
>> On 10 April 2015 at 16:47, Joseph Allemandou <[email protected]> wrote:
>> > And I forgot one field:
>> >
>> > is_zero - True if a request is made on a zero provider.
>> >
>> >
>> > On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia <[email protected]> wrote:
>> >>
>> >> Hi Joseph,
>> >>
>> >>    Thanks for the update, and for doing this. These three items make the
>> >> analysis of the data much easier on our end. We've had many requests in the
>> >> past that required agent_type and access_method information, and having them
>> >> readily available is awesome! :-)
>> >>
>> >> Have a great weekend!
>> >>
>> >> Leila
>> >>
>> >> On Fri, Apr 10, 2015 at 1:21 PM, Joseph Allemandou <[email protected]> wrote:
>> >>>
>> >>> Hi Analytics people,
>> >>>
>> >>> Today another batch of additions landed in the refined webrequest table
>> >>> in Hive.
>> >>> The table now contains:
>> >>>
>> >>> ts - The unix timestamp (milliseconds) version of the dt date
>> >>> access_method - The method used to access the site, one of
>> >>> [mobile app | mobile web | desktop]
>> >>> agent_type - To differentiate easily between spiders and users (more
>> >>> values may be added later).
>> >>>
>> >>> These additions are based on the "tags", as defined here:
>> >>> https://meta.wikimedia.org/wiki/Research:Page_view
>> >>>
>> >>> Have a good weekend!
>> >>>
>> >>> --
>> >>> Joseph Allemandou
>> >>> Data Engineer @ Wikimedia Foundation
>> >>> IRC: joal
>> >>>
>> >>> _______________________________________________
>> >>> Analytics mailing list
>> >>> [email protected]
>> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>>
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Joseph Allemandou
>> > Data Engineer @ Wikimedia Foundation
>> > IRC: joal
>> >
>>
>>
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>
>
>
>
> --
> Joseph Allemandou
> Data Engineer @ Wikimedia Foundation
> IRC: joal
>
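
For anyone comparing the two filters discussed in the quoted thread, here is a minimal Python sketch of why they can return different row counts. It assumes, based only on Joseph's answer, that agent_type is 'spider' whenever either the parsed device_family is 'Spider' or the IsCrawler check matches. The sample rows and the helper names (keep_by_agent_type, keep_by_device_family) are hypothetical and not taken from the actual table or refinery code.

```python
# Hypothetical rows; the values are made up for illustration only.
rows = [
    # Ordinary user traffic.
    {"agent_type": "user", "user_agent_map": {"device_family": "Other"}},
    # Crawler recognised by the user agent parser.
    {"agent_type": "spider", "user_agent_map": {"device_family": "Spider"}},
    # Assumed case: crawler caught only by the IsCrawler check.
    {"agent_type": "spider", "user_agent_map": {"device_family": "Other"}},
]


def keep_by_agent_type(row):
    """Mirror of: WHERE agent_type != 'spider'."""
    return row["agent_type"] != "spider"


def keep_by_device_family(row):
    """Mirror of: WHERE user_agent_map['device_family'] != 'Spider'."""
    return row["user_agent_map"]["device_family"] != "Spider"


by_agent_type = [r for r in rows if keep_by_agent_type(r)]
by_device_family = [r for r in rows if keep_by_device_family(r)]

# Under the assumption above, the agent_type filter drops more rows.
print(len(by_agent_type), len(by_device_family))
```

With these made-up rows, the agent_type filter keeps one row and the device_family filter keeps two, which is the difference the extra crawler check would make if the assumption holds.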



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation

