Can we compare the monthly totals?
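
One rough way to make that comparison is to roll the daily series up to calendar months and set the result beside the wikistats monthly figure. The sketch below is only an illustration: `monthly_summary`, `drop_day` and the two daily levels (1.15B and 0.65B, picked from inside the ranges quoted further down) are assumptions, not real data, and the output is not expected to reproduce the 20B figure.

```python
from datetime import date, timedelta


def monthly_summary(daily):
    """Collapse {date: views} into {(year, month): (total, mean_per_day)}."""
    totals = {}
    days = {}
    for day, views in daily.items():
        key = (day.year, day.month)
        totals[key] = totals.get(key, 0) + views
        days[key] = days.get(key, 0) + 1
    return {key: (totals[key], totals[key] / days[key]) for key in totals}


# Synthetic February 2015 series (hypothetical values): flat at 1.15B views
# per day, then flat at 0.65B per day from drop_day onwards.
drop_day = date(2015, 2, 15)  # assumed date of the step change
daily = {}
d = date(2015, 2, 1)
while d.month == 2:
    daily[d] = 1_150_000_000 if d < drop_day else 650_000_000
    d += timedelta(days=1)

total, mean_per_day = monthly_summary(daily)[(2015, 2)]
print(f"monthly total: {total / 1e9:.1f}B, mean per day: {mean_per_day / 1e9:.2f}B")
```

With these made-up inputs the monthly mean lands between the two daily levels, so a single monthly number cannot show whether a step change happened inside the month. Comparing totals is still a useful sanity check on whether both series are counting the same thing.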

On Thu, Mar 12, 2015 at 3:29 PM, Oliver Keyes <oke...@wikimedia.org> wrote:

> Well, again: the wikistats data that Erik refers to doesn't have any
> granularity within the period this dataset covers. Monthly data misses
> sub-monthly changes, like a massive transition that only shows up in
> the day-by-day numbers.
>
> On 12 March 2015 at 18:21, Toby Negrin <tneg...@wikimedia.org> wrote:
> > I'm also confused. As I understand it, stats.wikimedia.org is consuming the
> > data that is represented by the green line in your graph. Therefore we would
> > see this drop in the wikistats data that Erik referred to, but we don't. I
> > think we need to understand why this is so.
> >
> > -Toby
> >
> > On Thu, Mar 12, 2015 at 3:10 PM, Oliver Keyes <oke...@wikimedia.org> wrote:
> >>
> >> Well, I'm no longer our resident anything expert, merely /a/ anything
> >> expert :).
> >>
> >> The "concoction", as you put it, comes from the webrequest_all_sites
> >> data that is consumed by stats.wikimedia.org's primary report - I
> >> can't speak for how the dashboard you're linking to is constructed.
> >> Perhaps you could? I doubt this is a "concoction" problem given that,
> >> as you will note if you've studied the visualisations, both the UDF
> >> and the hive query implementation (which were written by two different
> >> people, and code reviewed by two /more/ people) agree that this
> >> dramatic, unexplained and untracked drop happened. And, since we've
> >> been using the hive query implementation for all our high-level
> >> numbers for about six months, a bug of this magnitude in the
> >> /implementation/ of the definition would be....worrying.
> >>
> >> Indeed, your report says 20B per month (again, is it drawing from the
> >> same data source as the aggregate, high-level number?) - I never
> >> claimed 1.1B a day, you did. Instead, it started off as approximately
> >> 1.1-1.2Bn, before dropping down to between 600m and 700m, where it has
> >> resided ever since. That sounds, averaged, like approximately 0.75B,
> >> no? That is the disadvantage of comparing a single monthly number
> >> against a more granular dataset.
> >>
> >> On 12 March 2015 at 17:55, Erik Zachte <ezac...@wikimedia.org> wrote:
> >> > I'd rather see you explain this, Oliver, as our incumbent page views
> >> > expert.
> >> > Your concoction of legacy PV seems to suggest 'Old definition, UDF' was
> >> > about 1.1B per day.
> >> >
> >> > Yet http://stats.wikimedia.org/EN/TablesPageViewsMonthlyAllProjects.htm
> >> > shows 20B per month, 0.75B per day
> >> >
> >> > Erik
> >> >
> >> > -----Original Message-----
> >> > From: analytics-boun...@lists.wikimedia.org
> >> > [mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of Oliver Keyes
> >> > Sent: Thursday, March 12, 2015 19:38
> >> > To: A mailing list for the Analytics Team at WMF and everybody who has
> >> > an interest in Wikipedia and analytics.
> >> > Subject: [Analytics] [Technical] final pageviews QA
> >> >
> >> > Hey all,
> >> >
> >> > After the patches to the definition following the previous hand-coding
> >> > run (see older threads) I've run a second set of tests. These can be seen at
> >> > https://commons.wikimedia.org/wiki/File:Pageviews_QA_2.png and
> >> > https://commons.wikimedia.org/wiki/File:Pageviews_QA_jittered_2.png
> >> >
> >> > There's nothing particularly shocking in the new definition; it follows
> >> > the seasonal pattern that we're used to. I think we can call the new
> >> > definition done, with these tweaks! It's also not as unstable as the legacy
> >> > definition (good luck to whoever now has the responsibility of explaining
> >> > why pageviews abruptly halved in the middle of February).
> >> >
> >> >
> >> > Have fun,
> >> > --
> >> > Oliver Keyes
> >> > Research Analyst
> >> > Wikimedia Foundation
> >> >
> >> > _______________________________________________
> >> > Analytics mailing list
> >> > Analytics@lists.wikimedia.org
> >> > https://lists.wikimedia.org/mailman/listinfo/analytics
> >> >
> >>
> >>
> >>
> >> --
> >> Oliver Keyes
> >> Research Analyst
> >> Wikimedia Foundation
> >
> >
> >
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation