Okay, so: I took an hour from the pageviews logs[0] and aggregated pageviews to enwiki (mobile and desktop both) by timestamp, down to one-second resolution. The lowest number of pageviews to enwiki in any one second was 2,981.
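For anyone wanting to try the same kind of check, here is a minimal sketch in Python. It assumes the log timestamps have already been filtered to enwiki and parsed into datetime objects; the function names and the sample timestamps are made up for illustration, and this is not the query that was actually run against the logs.

```python
from collections import Counter
from datetime import datetime


def views_per_second(timestamps):
    """Count pageviews per one-second bucket.

    `timestamps` is any iterable of datetime objects, one per pageview
    (hypothetical input; the real log format is not shown in this thread).
    """
    # Dropping the microseconds truncates each timestamp to its second.
    return Counter(ts.replace(microsecond=0) for ts in timestamps)


def quietest_second(counts):
    """Return (second, views) for the bucket with the fewest views.

    Seconds with no views at all never appear in the Counter, so this is
    the minimum over seconds that saw at least one pageview.
    """
    return min(counts.items(), key=lambda item: item[1])


# Invented sample data, purely to show the shape of the calculation.
base = datetime(2015, 4, 12, 6, 0, 0)
sample = [
    base,
    base.replace(microsecond=500000),
    base.replace(second=1),
    base.replace(second=1, microsecond=10),
    base.replace(second=1, microsecond=20),
    base.replace(second=2),
]
counts = views_per_second(sample)
second, views = quietest_second(counts)
print(second, views)
```

Because empty seconds are not represented, a real check on a quieter wiki would also want to confirm that every second in the hour shows up in the counts at all.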
So, I don't personally have a problem with generating a release of:

1. Pageviews per second;
2. To enwiki;
3. Over $TIME_PERIOD;
4. Grouping the mobile and desktop sites.

But Dario or someone should chip in before I touch anything ;p

[0] 6am yesterday. 6am because it should be low-traffic, right? At least given our biases towards North America and Europe.

On 13 April 2015 at 11:54, Oliver Keyes <[email protected]> wrote:
> Then that sounds much more viable. I'll run a quick test now to see
> how much clustering we'd see at, say, the one-second resolution level,
> and throw it out here so we can make more informed decisions about a
> data release on this.
>
> On 13 April 2015 at 08:08, Hirav Gandhi <[email protected]> wrote:
>> Hi Oliver,
>>
>> Re: Hirav: would you be looking for temporally /and/ contextually granular
>> pageviews, i.e. "a view to X page at Y time", or just temporally granular,
>> so "a view to a page on enwiki at X time"? If the latter you've got more of
>> a shot, I suspect.
>>
>> I only want the latter - I am not concerned with the context so much as just
>> “a view to a page on enwiki at X time.”
>>
>> Hirav
>>
>> On Apr 13, 2015, at 5:00 AM, [email protected] wrote:
>>
>> Today's Topics:
>>
>> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>> From: Pine W <[email protected]>
>> Cc: Bharath Sitaraman <[email protected]>
>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> basis
>>
>> Hi,
>>
>> This issue of pageview data granularity has been discussed before, and the
>> answer has been that hourly is the smallest increment allowed to be
>> revealed publicly, for privacy reasons.
>>
>> I believe that the person you will want to discuss your request with is
>> Toby, who I have cc'd here.
>>
>> Pine
>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <[email protected]> wrote:
>>
>> Hi Wikimedia Analytics Team,
>>
>> My colleague Bharath and I are doing research on dynamic server allocation
>> algorithms and we were looking for a suitable datasets to test our
>> predictive algorithm on. We noticed that Wikimedia has an amazing data set
>> of hourly page views, but we were looking for something a bit more
>> granular, such as aggregated page requests to English Wikipedia on a minute
>> by minute basis or second by second basis if possible.
>>
>> We are more than happy to pour through any raw data you might have that
>> would help us calculate page requests at this granular level. Please let us
>> know if it would be possible to get such data and if so how. Thank you in
>> advance for your help.
>>
>> Best,
>>
>> Hirav Gandhi
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>> From: Oliver Keyes <[email protected]>
>> Cc: Bharath Sitaraman <[email protected]>
>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> basis
>>
>> Preeetty sure that Toby is on the analytics list, Pine. He's the
>> director of analytics.
>>
>> Hirav: would you be looking for temporally /and/ contextually granular
>> pageviews, i.e. "a view to X page at Y time", or just temporally
>> granular, so "a view to a page on enwiki at X time"? If the latter
>> you've got more of a shot, I suspect.
>>
>> ------------------------------
>>
>> End of Analytics Digest, Vol 38, Issue 21
>> *****************************************

--
Oliver Keyes
Research Analyst
Wikimedia Foundation

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
