Yes Oliver, agent_type = 'spider' includes the isCrawler UDF results.
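
To illustrate the difference between the two filters, here is a minimal Python sketch. It is not the actual Hive code: the regex standing in for the isCrawler UDF, the 'user' value, the helper names, and the sample rows are all assumptions made up for this example. It only shows that filtering on agent_type also excludes requests the crawler check flags even when device_family is not 'Spider'.

```python
import re

# Hypothetical stand-in for the isCrawler UDF; the real pattern is not
# given in this thread, so this regex is an assumption for illustration.
CRAWLER_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def is_crawler(user_agent: str) -> bool:
    """Return True if the user agent string looks like a crawler."""
    return bool(CRAWLER_PATTERN.search(user_agent))


def agent_type(user_agent: str, device_family: str) -> str:
    """Assumed rule: 'spider' if either the parsed device family or the
    crawler check says so, otherwise 'user' (an assumed label)."""
    if device_family == "Spider" or is_crawler(user_agent):
        return "spider"
    return "user"


# Made-up sample requests (not real data).
rows = [
    {"user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/37.0",
     "device_family": "Other"},
    {"user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)",
     "device_family": "Spider"},
    {"user_agent": "ExampleFetcher-crawler/0.1",
     "device_family": "Other"},
]

# Equivalent of: WHERE agent_type != 'spider'
by_agent_type = [
    r for r in rows
    if agent_type(r["user_agent"], r["device_family"]) != "spider"
]

# Equivalent of: WHERE user_agent_map['device_family'] != 'Spider'
by_device_family = [r for r in rows if r["device_family"] != "Spider"]

print(len(by_agent_type), len(by_device_family))
```

In this sketch the third row passes the device_family filter but is dropped by the agent_type filter, which is the extra coverage the field is meant to add.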

On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes <[email protected]> wrote:

> What does agent-type add? In the sense that if we're pre-parsing the
> user agent, surely the difference is between "WHERE agent_type !=
> 'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
> Does agent_type include the isCrawler UDF results?
>
> On 10 April 2015 at 16:47, Joseph Allemandou <[email protected]>
> wrote:
> > And I forgot one field:
> >
> > is_zero - True if a request is made on a zero provider.
> >
> >
> > On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia <[email protected]> wrote:
> >>
> >> Hi Joseph,
> >>
> >>    Thanks for the update, and for doing this. These three items make the
> >> analysis of the data much easier on our end. We've had many requests in the
> >> past that required agent_type and access_method information and having them
> >> readily available is awesome! :-)
> >>
> >> Have a great weekend!
> >>
> >> Leila
> >>
> >> On Fri, Apr 10, 2015 at 1:21 PM, Joseph Allemandou
> >> <[email protected]> wrote:
> >>>
> >>> Hi Analytics people,
> >>>
> >>> Today brings another batch of additions to the refined webrequest table
> >>> in Hive.
> >>> The table now contains:
> >>>
> >>> ts - The Unix timestamp (in milliseconds) corresponding to the dt date
> >>> access_method - The method used to access the site, one of three
> >>> values [mobile app | mobile web | desktop]
> >>> agent_type - To differentiate easily between spiders and users (more
> >>> values may be added later).
> >>>
> >>> These additions are based on the "tags", as defined here:
> >>> https://meta.wikimedia.org/wiki/Research:Page_view
> >>>
> >>> Have a good weekend!
> >>>
> >>> --
> >>> Joseph Allemandou
> >>> Data Engineer @ Wikimedia Foundation
> >>> IRC: joal
> >>>
> >>> _______________________________________________
> >>> Analytics mailing list
> >>> [email protected]
> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
> >>>
> >>
> >>
> >>
> >
> >
> >
> > --
> > Joseph Allemandou
> > Data Engineer @ Wikimedia Foundation
> > IRC: joal
> >
> >
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
>



-- 
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal
