Thanks Oliver! We would like this data for as broad a time period as you can muster. The more days, months, and years represented in the dataset, the better.
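For reference, the per-second aggregation described in the quoted message below could be sketched roughly as follows. This is only a minimal sketch under assumptions: the function names `pageviews_per_second` and `quietest_second` are hypothetical, and it assumes each pageview is available as an ISO 8601 timestamp string, which may not match the actual layout of the pageviews logs.

```python
from collections import Counter
from datetime import datetime


def pageviews_per_second(timestamps):
    """Count pageviews in one-second buckets.

    `timestamps` is assumed to be an iterable of ISO 8601 strings,
    one per pageview (hypothetical input format).
    """
    counts = Counter()
    for ts in timestamps:
        # Truncate to whole seconds so sub-second precision
        # does not split a single second into several buckets.
        second = datetime.fromisoformat(ts).replace(microsecond=0)
        counts[second] += 1
    return counts


def quietest_second(counts):
    """Return (second, views) for the bucket with the fewest views.

    Seconds with zero views never appear in `counts`, so this is the
    minimum over seconds that had at least one view.
    """
    return min(counts.items(), key=lambda item: item[1])
```

The minimum per-second count is the number that matters for a privacy check like the one discussed below, since the sparsest bucket is the one most likely to single out an individual request.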
> Okay, so:
>
> I took an hour from the pageviews logs,[0] and aggregated pageviews to
> enwiki (mobile and desktop both) by timestamp, down to one-second
> resolution levels. The lowest number of pageviews to enwiki per second
> was 2,981
>
> So, I don't personally have a problem with generating a release of:
>
> 1. Pageviews per second;
> 2. To enwiki;
> 3. Over $TIME_PERIOD;
> 4. grouping the mobile and desktop site
>
> But Dario or someone should chip in before I touch anything ;p
>
> 6am yesterday. 6am because it should be low-traffic, right? At least
> given our biases towards north america and europe
>
> On 13 April 2015 at 11:54, Oliver Keyes <[email protected]> wrote:
> > Then that sounds much more viable. I'll run a quick test now to see
> > how much clustering we'd see at, say, the one-second resolution level,
> > and throw it out here so we can make more informed decisions about a
> > data release on this.
> >
> > On 13 April 2015 at 08:08, Hirav Gandhi <[email protected]> wrote:
> >> Hi Oliver,
> >>
> >> Re: Hirav: would you be looking for temporally /and/ contextually
> >> granular pageviews, i.e. "a view to X page at Y time", or just
> >> temporally granular, so "a view to a page on enwiki at X time"? If
> >> the latter you've got more of a shot, I suspect.
> >>
> >> I only want the latter - I am not concerned with the context so much
> >> as just “a view to a page on enwiki at X time.”
> >>
> >> Hirav
> >>
> >>
> >> On Apr 13, 2015, at 5:00 AM, [email protected] wrote:
> >>
> >> Today's Topics:
> >>
> >> 1. Re: Page views on a more frequent than hourly basis (Pine W)
> >> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
> >>
> >>
> >> ----------------------------------------------------------------------
> >>
> >> Message: 1
> >> Date: Mon, 13 Apr 2015 00:47:31 -0700
> >> From: Pine W <[email protected]>
> >> To: "A mailing list for the Analytics Team at WMF and everybody who
> >> has an interest in Wikipedia and analytics."
> >> <[email protected]>
> >> Cc: Bharath Sitaraman <[email protected]>
> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
> >> basis
> >> Message-ID:
> >> <CAF=dyjgnut+t6n6mujq16duyiwp7et6ruht3_-tzdnsep+2...@mail.gmail.com>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >>
> >> Hi,
> >>
> >> This issue of pageview data granularity has been discussed before, and
> >> the answer has been that hourly is the smallest increment allowed to be
> >> revealed publicly, for privacy reasons.
> >>
> >> I believe that the person you will want to discuss your request with is
> >> Toby, who I have cc'd here.
> >>
> >> Pine
> >>
> >> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <[email protected]> wrote:
> >>
> >> Hi Wikimedia Analytics Team,
> >>
> >> My colleague Bharath and I are doing research on dynamic server
> >> allocation algorithms and we were looking for a suitable datasets to
> >> test our predictive algorithm on. We noticed that Wikimedia has an
> >> amazing data set of hourly page views, but we were looking for
> >> something a bit more granular, such as aggregated page requests to
> >> English Wikipedia on a minute by minute basis or second by second
> >> basis if possible.
> >>
> >> We are more than happy to pour through any raw data you might have that
> >> would help us calculate page requests at this granular level. Please
> >> let us know if it would be possible to get such data and if so how.
> >> Thank you in advance for your help.
> >>
> >> Best,
> >>
> >> Hirav Gandhi
> >> _______________________________________________
> >> Analytics mailing list
> >> [email protected]
> >> https://lists.wikimedia.org/mailman/listinfo/analytics
> >>
> >> ------------------------------
> >>
> >> Message: 2
> >> Date: Mon, 13 Apr 2015 06:39:45 -0400
> >> From: Oliver Keyes <[email protected]>
> >> To: "A mailing list for the Analytics Team at WMF and everybody who
> >> has an interest in Wikipedia and analytics."
> >> <[email protected]>
> >> Cc: Bharath Sitaraman <[email protected]>
> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
> >> basis
> >> Message-ID:
> >> <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=h...@mail.gmail.com>
> >> Content-Type: text/plain; charset=UTF-8
> >>
> >>
> >> Preeetty sure that Toby is on the analytics list, Pine. He's the
> >> director of analytics.
> >>
> >> Hirav: would you be looking for temporally /and/ contextually granular
> >> pageviews, i.e. "a view to X page at Y time", or just temporally
> >> granular, so "a view to a page on enwiki at X time"? If the latter
> >> you've got more of a shot, I suspect.
> >>
> >> On 13 April 2015 at 03:47, Pine W <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> This issue of pageview data granularity has been discussed before, and
> >> the answer has been that hourly is the smallest increment allowed to be
> >> revealed publicly, for privacy reasons.
> >>
> >> I believe that the person you will want to discuss your request with is
> >> Toby, who I have cc'd here.
> >>
> >> Pine
> >>
> >> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <[email protected]> wrote:
> >>
> >>
> >> Hi Wikimedia Analytics Team,
> >>
> >> My colleague Bharath and I are doing research on dynamic server
> >> allocation algorithms and we were looking for a suitable datasets to
> >> test our predictive algorithm on. We noticed that Wikimedia has an
> >> amazing data set of hourly page views, but we were looking for
> >> something a bit more granular, such as aggregated page requests to
> >> English Wikipedia on a minute by minute basis or second by second
> >> basis if possible.
> >>
> >> We are more than happy to pour through any raw data you might have that
> >> would help us calculate page requests at this granular level. Please
> >> let us know if it would be possible to get such data and if so how.
> >> Thank you in advance for your help.
> >>
> >> Best,
> >>
> >> Hirav Gandhi
> >>
> >> --
> >> Oliver Keyes
> >> Research Analyst
> >> Wikimedia Foundation
> >>
> >> ------------------------------
> >>
> >> End of Analytics Digest, Vol 38, Issue 21
> >> *****************************************
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
