Certainly; running now.

On 12 March 2015 at 18:33, Toby Negrin <tneg...@wikimedia.org> wrote:
> Can we compare the monthly totals?
>
> On Thu, Mar 12, 2015 at 3:29 PM, Oliver Keyes <oke...@wikimedia.org> wrote:
>>
>> Well, again: the wikistats data that Erik refers to doesn't have any
>> granularity within the period this dataset covers. Monthly data misses
>> sub-monthly changes, like a massive transition that only shows up at
>> day-by-day resolution.
>>
>> On 12 March 2015 at 18:21, Toby Negrin <tneg...@wikimedia.org> wrote:
>> > I'm also confused. As I understand it, stats.wikimedia.org is consuming
>> > the data that is represented by the green line in your graph. Therefore
>> > we would see this drop in the wikistats data that Erik referred to, but
>> > we don't. I think we need to understand why this is so.
>> >
>> > -Toby
>> >
>> > On Thu, Mar 12, 2015 at 3:10 PM, Oliver Keyes <oke...@wikimedia.org> wrote:
>> >>
>> >> Well, I'm no longer our resident anything expert, merely /a/ anything
>> >> expert :).
>> >>
>> >> The "concoction", as you put it, comes from the webrequest_all_sites
>> >> data that is consumed by stats.wikimedia.org's primary report - I
>> >> can't speak for how the dashboard you're linking to is constructed.
>> >> Perhaps you could? I doubt this is a "concoction" problem given that,
>> >> as you will note if you've studied the visualisations, both the UDF
>> >> and the hive query implementation (which were written by two different
>> >> people, and code reviewed by two /more/ people) agree that this
>> >> dramatic, unexplained and untracked drop happened. And, since we've
>> >> been using the hive query implementation for all our high-level
>> >> numbers for about six months, a bug of this magnitude in the
>> >> /implementation/ of the definition would be... worrying.
>> >>
>> >> Indeed, your report says 20B per month (again, is it drawing from the
>> >> same data source as the aggregate, high-level number?) - I never
>> >> claimed 1.1B a day, you did. Instead, it started off at approximately
>> >> 1.1-1.2B, before dropping down to between 600M and 700M, where it has
>> >> stayed ever since. That sounds, averaged, like approximately 0.75B,
>> >> no? That is the disadvantage of comparing a single monthly number
>> >> against a more granular dataset.
>> >>
>> >> On 12 March 2015 at 17:55, Erik Zachte <ezac...@wikimedia.org> wrote:
>> >> > I'd rather see you explain this, Oliver, as our incumbent page views
>> >> > expert. Your concoction of legacy PV seems to suggest 'Old definition,
>> >> > UDF' was about 1.1B per day.
>> >> >
>> >> > Yet http://stats.wikimedia.org/EN/TablesPageViewsMonthlyAllProjects.htm
>> >> > shows 20B per month, 0.75B per day
>> >> >
>> >> > Erik
>> >> >
>> >> > -----Original Message-----
>> >> > From: analytics-boun...@lists.wikimedia.org
>> >> > [mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of Oliver Keyes
>> >> > Sent: Thursday, March 12, 2015 19:38
>> >> > To: A mailing list for the Analytics Team at WMF and everybody who has
>> >> > an interest in Wikipedia and analytics.
>> >> > Subject: [Analytics] [Technical] final pageviews QA
>> >> >
>> >> > Hey all,
>> >> >
>> >> > After the patches to the definition following the previous hand-coding
>> >> > run (see older threads) I've run a second set of tests. These can be
>> >> > seen at
>> >> > https://commons.wikimedia.org/wiki/File:Pageviews_QA_2.png and
>> >> > https://commons.wikimedia.org/wiki/File:Pageviews_QA_jittered_2.png
>> >> >
>> >> > There's nothing particularly shocking in the new definition; it follows
>> >> > the seasonal pattern that we're used to. I think we can call the new
>> >> > definition done, with these tweaks! It's also not as unstable as the
>> >> > legacy definition (good luck to whoever now has the responsibility of
>> >> > explaining why pageviews abruptly halved in the middle of February).
>> >> >
>> >> >
>> >> > Have fun,
>> >> > --
>> >> > Oliver Keyes
>> >> > Research Analyst
>> >> > Wikimedia Foundation
>> >> >
>> >> > _______________________________________________
>> >> > Analytics mailing list
>> >> > Analytics@lists.wikimedia.org
>> >> > https://lists.wikimedia.org/mailman/listinfo/analytics
>> >> >
>> >
>>
>



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
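
The monthly-versus-daily point argued in this thread can be illustrated with a minimal Python sketch: a single monthly total cannot distinguish a series with a mid-month step change from a perfectly flat one. Everything numeric here is an illustrative assumption, loosely based on the ranges quoted in the thread (about 1.1-1.2B per day before the drop, 600-700M after): the values 1150 and 650 (millions per day), the 28-day month, and the drop on day 14 are not taken from webrequest_all_sites or wikistats. `step_series` and `summarise` are hypothetical helper names.

```python
def step_series(before, after, drop_day, days):
    """Daily counts: `before` for the first `drop_day` days, then `after`."""
    return [before if day < drop_day else after for day in range(days)]


def summarise(series):
    """Return the total, the daily mean and the largest day-to-day change."""
    total = sum(series)
    changes = [abs(b - a) for a, b in zip(series, series[1:])]
    return {
        "total": total,
        "mean": total / len(series),
        "max_daily_change": max(changes),
    }


# Illustrative values in millions of pageviews per day (assumed, not real data).
DAYS = 28
stepped = step_series(before=1150, after=650, drop_day=14, days=DAYS)

# A flat series with the same monthly total as the stepped one.
flat = [sum(stepped) // DAYS] * DAYS

for name, series in (("stepped", stepped), ("flat", flat)):
    print(name, summarise(series))
```

Both series report the same total and the same daily mean, so a monthly table shows them as identical; only the day-to-day change exposes the step. The mean also depends on when in the period the drop occurs, which is one reason a monthly average need not match either the "before" or the "after" daily level.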
