Thanks Oliver!

We would like this data for as broad a time period as you can muster.
The more days, months, and years represented in the dataset, the better.
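
For concreteness, the per-second aggregation described in the quoted
message below could look roughly like the following Python sketch. The
input format (ISO 8601 timestamp strings), the function names, and the
sample timestamps are all assumptions for illustration; this is not the
actual pageview log schema or the script that was run.

```python
from collections import Counter
from datetime import datetime


def pageviews_per_second(timestamps):
    """Count requests per one-second bucket.

    `timestamps` is an iterable of ISO 8601 strings (an assumed format,
    not the real log schema).
    """
    counts = Counter()
    for ts in timestamps:
        # Truncate to whole seconds so sub-second requests share a bucket.
        second = datetime.fromisoformat(ts).replace(microsecond=0)
        counts[second] += 1
    return counts


def least_busy_second(counts):
    """Return (second, count) for the bucket with the fewest views."""
    return min(counts.items(), key=lambda item: item[1])


# Made-up sample data: three requests spread over two seconds.
sample = [
    "2015-04-12T06:00:00.120",
    "2015-04-12T06:00:00.870",
    "2015-04-12T06:00:01.005",
]
counts = pageviews_per_second(sample)
quietest_second, views = least_busy_second(counts)
print(quietest_second, views)
```

Note that a second with no requests at all never appears in the counter,
so the minimum here is taken over observed seconds only.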


> Okay, so:
>
> I took an hour from the pageviews logs,[0] and aggregated pageviews to
> enwiki (mobile and desktop both) by timestamp, down to one-second
> resolution. The lowest number of pageviews to enwiki in any one second
> was 2,981.
>
> So, I don't personally have a problem with generating a release of:
>
> 1. Pageviews per second;
> 2. To enwiki;
> 3. Over $TIME_PERIOD;
> 4. grouping the mobile and desktop site
>
> But Dario or someone should chip in before I touch anything ;p
>
> [0] 6am yesterday. 6am because it should be low-traffic, right? At least
> given our biases towards North America and Europe.
>
> On 13 April 2015 at 11:54, Oliver Keyes <[email protected]> wrote:
> > Then that sounds much more viable. I'll run a quick test now to see
> > how much clustering we'd see at, say, the one-second resolution level,
> > and throw it out here so we can make more informed decisions about a
> > data release on this.
> >
> > On 13 April 2015 at 08:08, Hirav Gandhi <[email protected]> wrote:
> >> Hi Oliver,
> >>
> >> Re: Hirav: would you be looking for temporally /and/ contextually
> >> granular pageviews, i.e. "a view to X page at Y time", or just
> >> temporally granular, so "a view to a page on enwiki at X time"? If the
> >> latter you've got more of a shot, I suspect.
> >>
> >> I only want the latter - I am not concerned with the context so much as
> >> just “a view to a page on enwiki at X time.”
> >>
> >> Hirav
> >>
> >>
> >> On Apr 13, 2015, at 5:00 AM, [email protected] wrote:
> >>
> >>
> >> Today's Topics:
> >>
> >>   1. Re: Page views on a more frequent than hourly basis (Pine W)
> >>   2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
> >>
> >>
> >> ----------------------------------------------------------------------
> >>
> >> Message: 1
> >> Date: Mon, 13 Apr 2015 00:47:31 -0700
> >> From: Pine W <[email protected]>
> >> To: "A mailing list for the Analytics Team at WMF and everybody who
> >> has an interest in Wikipedia and analytics."
> >> <[email protected]>
> >> Cc: Bharath Sitaraman <[email protected]>
> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
> >> basis
> >> Message-ID:
> >> <CAF=dyjgnut+t6n6mujq16duyiwp7et6ruht3_-tzdnsep+2...@mail.gmail.com>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >>
> >> Hi,
> >>
> >> This issue of pageview data granularity has been discussed before, and
> >> the answer has been that hourly is the smallest increment allowed to be
> >> revealed publicly, for privacy reasons.
> >>
> >> I believe that the person you will want to discuss your request with is
> >> Toby, who I have cc'd here.
> >>
> >> Pine
> >> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <[email protected]> wrote:
> >>
> >> Hi Wikimedia Analytics Team,
> >>
> >> My colleague Bharath and I are doing research on dynamic server
> >> allocation algorithms and we were looking for a suitable dataset to
> >> test our predictive algorithm on. We noticed that Wikimedia has an
> >> amazing data set of hourly page views, but we were looking for
> >> something a bit more granular, such as aggregated page requests to
> >> English Wikipedia on a minute-by-minute or second-by-second basis
> >> if possible.
> >>
> >> We are more than happy to pore through any raw data you might have that
> >> would help us calculate page requests at this granular level. Please
> >> let us know if it would be possible to get such data and if so how.
> >> Thank you in advance for your help.
> >>
> >> Best,
> >>
> >> Hirav Gandhi
> >> _______________________________________________
> >> Analytics mailing list
> >> [email protected]
> >> https://lists.wikimedia.org/mailman/listinfo/analytics
> >>
> >>
> >> ------------------------------
> >>
> >> Message: 2
> >> Date: Mon, 13 Apr 2015 06:39:45 -0400
> >> From: Oliver Keyes <[email protected]>
> >> To: "A mailing list for the Analytics Team at WMF and everybody who
> >> has an interest in Wikipedia and analytics."
> >> <[email protected]>
> >> Cc: Bharath Sitaraman <[email protected]>
> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
> >> basis
> >> Message-ID:
> >> <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=h...@mail.gmail.com>
> >> Content-Type: text/plain; charset=UTF-8
> >>
> >>
> >> Preeetty sure that Toby is on the analytics list, Pine. He's the
> >> director of analytics.
> >>
> >> Hirav: would you be looking for temporally /and/ contextually granular
> >> pageviews, i.e. "a view to X page at Y time", or just temporally
> >> granular, so "a view to a page on enwiki at X time"? If the latter
> >> you've got more of a shot, I suspect.
> >>
> >> --
> >> Oliver Keyes
> >> Research Analyst
> >> Wikimedia Foundation
> >>
> >>
> >>
> >> ------------------------------
> >>
> >> End of Analytics Digest, Vol 38, Issue 21
> >> *****************************************
> >
> >
> >
> > --
> > Oliver Keyes
> > Research Analyst
> > Wikimedia Foundation
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
