Thanks, Oliver! Is there a way to handle this in HQL? E.g.
if(exists(is_pageview), is_pageview, null)? Finding out whether a field
exists by watching the query crash seems wrong ))
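
One client-side way to get the effect asked about above is to check the
column list before building the query and substitute NULL for anything
missing. This is only a rough sketch in Python: the helper names, and the
assumption that the set of available columns has already been fetched
(for example from a DESCRIBE of the table), are hypothetical and not
something established in this thread.

```python
def select_expr(column, available_columns):
    """Return the column name if it exists, else a NULL placeholder with the same alias."""
    if column in available_columns:
        return column
    # Hypothetical fallback: keep the output shape stable by aliasing NULL.
    return "NULL AS {}".format(column)


def build_query(table, wanted, available_columns):
    """Build a SELECT that never references a column missing from the schema."""
    exprs = [select_expr(c, available_columns) for c in wanted]
    return "SELECT {} FROM {}".format(", ".join(exprs), table)


if __name__ == "__main__":
    # Assumed schema snapshot; a real one would come from the metastore.
    known = {"is_pageview", "x_analytics"}
    print(build_query("webrequest", ["is_pageview", "agent_type"], known))
```

Note that this only covers a column that is absent from the table
definition. It does not address how older partitions behave when the
column is defined but was never populated.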
On Apr 12, 2015 06:53, "Oliver Keyes" <[email protected]> wrote:

> (Duplicated from bug):
>
> That's not a bug. The complexity of regenerating ~60 days of data,
> where a day is 24*60*125000 rows, is extreme, and adding new fields
> means doing just that - regenerating the entire thing. As such, the
> decision was made to add to the field definition and only add actual
> values going forward from the point at which the patch was merged.
> This was true of the is_pageview calculation, the user agent data and
> the geolocation elements previously added, and is still true now.
>
> On 11 April 2015 at 03:33, Yuri Astrakhan <[email protected]>
> wrote:
> > I tried to move Zero analytics to the new table, and decided to test the new
> > wonderful fields like agent_type ... and it only works on the most recent
> > hours of data ((
> >
> > https://phabricator.wikimedia.org/T95806
> >
> >
> > On Fri, Apr 10, 2015 at 8:51 PM, Yuri Astrakhan <[email protected]>
> > wrote:
> >>
> >> Please clarify why the field "is_zero" is needed, as it is nothing more
> >> than a test for ("zero=" in x_analytics). Does having this field
> >> significantly improve performance for zero queries, e.g. "select count(*)
> >> from requests where is_zero = true"? Because otherwise it simply identifies
> >> "zero partner" traffic, not "was that request actually zero rated or not".
> >>
> >> Thanks!
> >>
> >> On Fri, Apr 10, 2015 at 5:16 PM, Oliver Keyes <[email protected]>
> >> wrote:
> >>>
> >>> Cool!
> >>>
> >>> On 10 April 2015 at 17:12, Joseph Allemandou <[email protected]>
> >>> wrote:
> >>> > Yes Oliver, the agent_type = spider includes IsCrawler UDF.
> >>> >
> >>> > On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes <[email protected]>
> >>> > wrote:
> >>> >>
> >>> >> What does agent_type add? In the sense that if we're pre-parsing the
> >>> >> user agent, surely the difference is between "WHERE agent_type !=
> >>> >> 'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
> >>> >> Does agent_type include the isCrawler UDF results?
> >>> >>
> >>> >> On 10 April 2015 at 16:47, Joseph Allemandou
> >>> >> <[email protected]>
> >>> >> wrote:
> >>> >> > And I forgot one field :
> >>> >> >
> >>> >> > is_zero - True if a request is made on a zero provider.
> >>> >> >
> >>> >> >
> >>> >> > On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia <[email protected]>
> >>> >> > wrote:
> >>> >> >>
> >>> >> >> Hi Joseph,
> >>> >> >>
> >>> >> >>    Thanks for the update, and for doing this. These three items make the
> >>> >> >> analysis of the data much easier on our end. We've had many requests in the
> >>> >> >> past that required agent_type and access_method information and having them
> >>> >> >> readily available is awesome! :-)
> >>> >> >>
> >>> >> >> Have a great weekend!
> >>> >> >>
> >>> >> >> Leila
> >>> >> >>
> >>> >> >> On Fri, Apr 10, 2015 at 1:21 PM, Joseph Allemandou
> >>> >> >> <[email protected]> wrote:
> >>> >> >>>
> >>> >> >>> Hi Analytics people,
> >>> >> >>>
> >>> >> >>> Today another batch of additions was made to the refined webrequest table
> >>> >> >>> in Hive.
> >>> >> >>> Now the table contains:
> >>> >> >>>
> >>> >> >>> ts - The unix timestamp (milliseconds) version of the dt date
> >>> >> >>> access_method - The method used to access the site, being one of the
> >>> >> >>> three [mobile app | mobile web | desktop]
> >>> >> >>> agent_type - To differentiate easily between spiders and users (more
> >>> >> >>> values may be added later).
> >>> >> >>>
> >>> >> >>> These additions are based on the "tags", as defined here:
> >>> >> >>> https://meta.wikimedia.org/wiki/Research:Page_view
> >>> >> >>>
> >>> >> >>> Have a good weekend !
> >>> >> >>>
> >>> >> >>> --
> >>> >> >>> Joseph Allemandou
> >>> >> >>> Data Engineer @ Wikimedia Foundation
> >>> >> >>> IRC: joal
> >>> >> >>>
> >>> >> >>> _______________________________________________
> >>> >> >>> Analytics mailing list
> >>> >> >>> [email protected]
> >>> >> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
> >>> >> >>>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Oliver Keyes
> >>> >> Research Analyst
> >>> >> Wikimedia Foundation
> >>> >>