And thanks for doing all of this for us! We do greatly appreciate it!

Cheers,
Bharath
On Wed, Apr 15, 2015 at 10:40 AM, Bharath Sitaraman <[email protected]> wrote:
> Interested in an Erlang book? :P Pretty sure I have one of those laying around here...
>
> Cheers,
> Bharath
>
> On Wed, Apr 15, 2015 at 10:38 AM, Oliver Keyes <[email protected]> wrote:
>
>> I accept payment in books, pull requests and speaking invitations ;p.
>>
>> (Updated check-the-minimum query running now!)
>>
>> On 15 April 2015 at 13:35, Hirav Gandhi <[email protected]> wrote:
>> > Sorry Oliver. Let me know where I can send the beer/coffee money to compensate you for the hard work :)
>> >
>> > On Wed, Apr 15, 2015 at 10:34 AM, Oliver Keyes <[email protected]> wrote:
>> >>
>> >> /This/ you say 2.5 seconds after I've launched the query ;p. Yes, it is possible, but I'll have to recalculate the likely minimum and check that it's still okay.
>> >>
>> >> On 15 April 2015 at 13:32, Hirav Gandhi <[email protected]> wrote:
>> >> > Hi Dario,
>> >> >
>> >> > One last question - would it be possible to break it out into mobile vs desktop? We are also concerned there might be seasonality effects in there as well. Please let us know.
>> >> >
>> >> > Best,
>> >> >
>> >> > Hirav
>> >> >
>> >> > On Wed, Apr 15, 2015 at 10:27 AM, Dario Taraborelli <[email protected]> wrote:
>> >> >>
>> >> >> thanks, both. Let's go ahead with English only and no spiders filtered or mobile/desktop breakdown, per Oliver.
>> >> >>
>> >> >> Michelle – given the aggregation level I am fine moving forward with this release, but let me know off-thread if you have any questions.
>> >> >>
>> >> >> Dario
>> >> >>
>> >> >> On Wed, Apr 15, 2015 at 9:53 AM, Oliver Keyes <[email protected]> wrote:
>> >> >>>
>> >> >>> Dario,
>> >> >>>
>> >> >>> No spider filtering, and no split between mobile and desktop; mobile and desktop are grouped.
>> >> >>>
>> >> >>> On 15 April 2015 at 12:46, Hirav Gandhi <[email protected]> wrote:
>> >> >>> > e.g. German*
>> >> >>> >
>> >> >>> > I need more coffee.
>> >> >>> >
>> >> >>> > On Wed, Apr 15, 2015 at 9:35 AM, Hirav Gandhi <[email protected]> wrote:
>> >> >>> >>
>> >> >>> >> Dario - we just want a representative sample of traffic for a popular site like Wikipedia. We thought limiting to the English Wikipedia would be easier.
>> >> >>> >>
>> >> >>> >> If we get aggregated data across all language Wikipedia sites, we would need some way to tease out which language is being queried when. Some languages (for e.g. German) we would hypothesize would have more daily seasonality than languages like English.
>> >> >>> >>
>> >> >>> >> On Wed, Apr 15, 2015 at 9:32 AM, Dario Taraborelli <[email protected]> wrote:
>> >> >>> >>>
>> >> >>> >>> Hirav, Bharath – I also want to hear from you if there's a specific reason to ask for English Wikipedia only or if a dataset encompassing aggregate pageviews across all Wikimedia properties would do the job.
>> >> >>> >>>
>> >> >>> >>> Dario
>> >> >>> >>>
>> >> >>> >>> On Wed, Apr 15, 2015 at 9:09 AM, Dario Taraborelli <[email protected]> wrote:
>> >> >>> >>>>
>> >> >>> >>>> Oliver -- thanks for running a preliminary check, I'm fine releasing this data in aggregate under CC0, I believe it would be valuable for this and other research projects (copying Michelle from Legal).
>> >> >>> >>>>
>> >> >>> >>>> Before we do so, though, I want to confirm the specs: aggregate pageviews per second to English Wikipedia, excluding bot traffic, broken down by access method (mobile web vs desktop site, not apps) for a 60-day period. Oliver – are these the filters you used to identify the data point with the smallest number of observations?
>> >> >>> >>>>
>> >> >>> >>>> Obviously, we will need to take into account this release when we start working on projects such as
>> >> >>> >>>> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_edits
>> >> >>> >>>> and
>> >> >>> >>>> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_pageviews
>> >> >>> >>>>
>> >> >>> >>>> Dario
>> >> >>> >>>>
>> >> >>> >>>> On Mon, Apr 13, 2015 at 9:37 PM, Oliver Keyes <[email protected]> wrote:
>> >> >>> >>>>>
>> >> >>> >>>>> Bumping for Dario, per Pine's excellent example :)
>> >> >>> >>>>>
>> >> >>> >>>>> On 13 April 2015 at 22:18, Hirav Gandhi <[email protected]> wrote:
>> >> >>> >>>>> > Oliver: Two months is fine. Thank you so much for your help!
>> >> >>> >>>>> >
>> >> >>> >>>>> >> On Apr 13, 2015, at 4:40 PM, [email protected] wrote:
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Send Analytics mailing list submissions to
>> >> >>> >>>>> >> [email protected]
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> To subscribe or unsubscribe via the World Wide Web, visit
>> >> >>> >>>>> >> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >> >>> >>>>> >> or, via email, send a message with subject or body 'help' to
>> >> >>> >>>>> >> [email protected]
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> You can reach the person managing the list at
>> >> >>> >>>>> >> [email protected]
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> When replying, please edit your Subject line so it is more specific than "Re: Contents of Analytics digest..."
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Today's Topics:
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> >> >>> >>>>> >> 2. Re: Page views on a more frequent than hourly basis (Hirav Gandhi)
>> >> >>> >>>>> >> 3. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> ----------------------------------------------------------------------
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Message: 1
>> >> >>> >>>>> >> Date: Mon, 13 Apr 2015 13:34:23 -0700
>> >> >>> >>>>> >> From: Pine W <[email protected]>
>> >> >>> >>>>> >> To: "A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics." <[email protected]>
>> >> >>> >>>>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>> >> >>> >>>>> >> Message-ID: <CAF= [email protected]>
>> >> >>> >>>>> >> Content-Type: text/plain; charset="utf-8"
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Hi Oliver, re ccing people who are on list, this is the protocol we followed in IEGCom to ping people who are subscribed and mentioned in certain emails but, like many of us, may automatically move emails from lists directly to folders where they may be unread for days. So there is a reason to do this.
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Thanks,
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Pine
>> >> >>> >>>>> >> -------------- next part --------------
>> >> >>> >>>>> >> An HTML attachment was scrubbed...
>> >> >>> >>>>> >> URL: <https://lists.wikimedia.org/pipermail/analytics/attachments/20150413/aac0ef89/attachment-0001.html>
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> ------------------------------
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Message: 2
>> >> >>> >>>>> >> Date: Mon, 13 Apr 2015 16:30:43 -0700
>> >> >>> >>>>> >> From: Hirav Gandhi <[email protected]>
>> >> >>> >>>>> >> To: [email protected]
>> >> >>> >>>>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>> >> >>> >>>>> >> Message-ID: <CANzC_EOvi4MP7G_SsxvW= [email protected]>
>> >> >>> >>>>> >> Content-Type: text/plain; charset="utf-8"
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Thanks Oliver!
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> We would like this data for as broad of a time period as you can muster. The more days, months and year represented in the dataset, the better.
>> >> >>> >>>>> >>
>> >> >>> >>>>> >>> Okay, so:
>> >> >>> >>>>> >>>
>> >> >>> >>>>> >>> I took an hour from the pageviews logs,[0] and aggregated pageviews to enwiki (mobile and desktop both) by timestamp, down to one-second resolution levels. The lowest number of pageviews to enwiki per second was 2,981
>> >> >>> >>>>> >>>
>> >> >>> >>>>> >>> So, I don't personally have a problem with generating a release of:
>> >> >>> >>>>> >>>
>> >> >>> >>>>> >>> 1. Pageviews per second;
>> >> >>> >>>>> >>> 2. To enwiki;
>> >> >>> >>>>> >>> 3. Over $TIME_PERIOD;
>> >> >>> >>>>> >>> 4. grouping the mobile and desktop site
>> >> >>> >>>>> >>>
>> >> >>> >>>>> >>> But Dario or someone should chip in before I touch anything ;p
>> >> >>> >>>>> >>>
>> >> >>> >>>>> >>> 6am yesterday. 6am because it should be low-traffic, right? At least given our biases towards north america and europe
>> >> >>> >>>>> >>>
>> >> >>> >>>>> >>> On 13 April 2015 at 11:54, Oliver Keyes <[email protected]> wrote:
>> >> >>> >>>>> >>>> Then that sounds much more viable. I'll run a quick test now to see how much clustering we'd see at, say, the one-second resolution level, and throw it out here so we can make more informed decisions about a data release on this.
>> >> >>> >>>>> >>>>
>> >> >>> >>>>> >>>> On 13 April 2015 at 08:08, Hirav Gandhi <[email protected]> wrote:
>> >> >>> >>>>> >>>>> Hi Oliver,
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Re: Hirav: would you be looking for temporally /and/ contextually granular pageviews, i.e. "a view to X page at Y time", or just temporally granular, so "a view to a page on enwiki at X time"? If the latter you've got more of a shot, I suspect.
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> I only want the latter - I am not concerned with the context so much as just “a view to a page on enwiki at X time.”
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Hirav
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> On Apr 13, 2015, at 5:00 AM, [email protected] wrote:
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Today's Topics:
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> >> >>> >>>>> >>>>> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> ----------------------------------------------------------------------
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Message: 1
>> >> >>> >>>>> >>>>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>> >> >>> >>>>> >>>>> From: Pine W <[email protected]>
>> >> >>> >>>>> >>>>> To: "A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics." <[email protected]>
>> >> >>> >>>>> >>>>> Cc: Bharath Sitaraman <[email protected]>
>> >> >>> >>>>> >>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>> >> >>> >>>>> >>>>> Message-ID: <CAF= [email protected]>
>> >> >>> >>>>> >>>>> Content-Type: text/plain; charset="utf-8"
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Hi,
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> This issue of pageview data granularity has been discussed before, and the answer has been that hourly is the smallest increment allowed to be revealed publicly, for privacy reasons.
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> I believe that the person you will want to discuss your request with is Toby, who I have cc'd here.
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Pine
>> >> >>> >>>>> >>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <[email protected]> wrote:
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Hi Wikimedia Analytics Team,
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> My colleague Bharath and I are doing research on dynamic server allocation algorithms and we were looking for a suitable dataset to test our predictive algorithm on. We noticed that Wikimedia has an amazing data set of hourly page views, but we were looking for something a bit more granular, such as aggregated page requests to English Wikipedia on a minute by minute basis or second by second basis if possible.
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> We are more than happy to pore through any raw data you might have that would help us calculate page requests at this granular level. Please let us know if it would be possible to get such data and if so how. Thank you in advance for your help.
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Best,
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Hirav Gandhi
>> >> >>> >>>>> >>>>> _______________________________________________
>> >> >>> >>>>> >>>>> Analytics mailing list
>> >> >>> >>>>> >>>>> [email protected]
>> >> >>> >>>>> >>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> -------------- next part --------------
>> >> >>> >>>>> >>>>> An HTML attachment was scrubbed...
>> >> >>> >>>>> >>>>> URL: <https://lists.wikimedia.org/pipermail/analytics/attachments/20150413/a88287b6/attachment-0001.html>
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> ------------------------------
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Message: 2
>> >> >>> >>>>> >>>>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>> >> >>> >>>>> >>>>> From: Oliver Keyes <[email protected]>
>> >> >>> >>>>> >>>>> To: "A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics." <[email protected]>
>> >> >>> >>>>> >>>>> Cc: Bharath Sitaraman <[email protected]>
>> >> >>> >>>>> >>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>> >> >>> >>>>> >>>>> Message-ID: <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-= [email protected]>
>> >> >>> >>>>> >>>>> Content-Type: text/plain; charset=UTF-8
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Preeetty sure that Toby is on the analytics list, Pine. He's the director of analytics.
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> Hirav: would you be looking for temporally /and/ contextually granular pageviews, i.e. "a view to X page at Y time", or just temporally granular, so "a view to a page on enwiki at X time"? If the latter you've got more of a shot, I suspect.
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> --
>> >> >>> >>>>> >>>>> Oliver Keyes
>> >> >>> >>>>> >>>>> Research Analyst
>> >> >>> >>>>> >>>>> Wikimedia Foundation
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> ------------------------------
>> >> >>> >>>>> >>>>>
>> >> >>> >>>>> >>>>> End of Analytics Digest, Vol 38, Issue 21
>> >> >>> >>>>> >>>>> *****************************************
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> -------------- next part --------------
>> >> >>> >>>>> >> An HTML attachment was scrubbed...
>> >> >>> >>>>> >> URL: <https://lists.wikimedia.org/pipermail/analytics/attachments/20150413/3a5df491/attachment-0001.html>
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> ------------------------------
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> Message: 3
>> >> >>> >>>>> >> Date: Mon, 13 Apr 2015 19:40:04 -0400
>> >> >>> >>>>> >> From: Oliver Keyes <[email protected]>
>> >> >>> >>>>> >> To: "A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics." <[email protected]>
>> >> >>> >>>>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly basis
>> >> >>> >>>>> >> Message-ID: < caauqgdd6z5ussu11vw49fdmbsrhyejxku9yopyserib79j-...@mail.gmail.com>
>> >> >>> >>>>> >> Content-Type: text/plain; charset=UTF-8
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> ....
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> ...years?
>> >> >>> >>>>> >>
>> >> >>> >>>>> >> We have unsampled logs for, ah. 2 months.
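
For readers following the thread: the check Oliver describes (aggregate enwiki pageviews by timestamp down to one-second resolution, then look at the smallest per-second count) might look roughly like the sketch below. This is only an illustration, not the query that was actually run. The function names, the ISO-8601 timestamp format, and the sample timestamps are all assumptions made up for the example; the real log schema is not shown anywhere in the thread.

```python
from collections import Counter
from datetime import datetime


def pageviews_per_second(timestamps):
    """Count requests per one-second bucket.

    `timestamps` is assumed to be an iterable of ISO-8601 strings;
    this is a hypothetical input format, not the real log schema.
    """
    counts = Counter()
    for ts in timestamps:
        # Truncate to whole seconds so sub-second requests share a bucket.
        second = datetime.fromisoformat(ts).replace(microsecond=0)
        counts[second] += 1
    return counts


def smallest_bucket(counts):
    """Return (second, count) for the least-populated observed second."""
    return min(counts.items(), key=lambda item: item[1])


# Invented sample data, standing in for one hour of request logs.
sample = [
    "2015-04-12T06:00:00.120",
    "2015-04-12T06:00:00.870",
    "2015-04-12T06:00:01.005",
    "2015-04-12T06:00:02.300",
    "2015-04-12T06:00:02.310",
    "2015-04-12T06:00:02.990",
]

counts = pageviews_per_second(sample)
second, views = smallest_bucket(counts)
print(second.isoformat(), views)
```

Note that a second with no requests at all never appears in `counts`, so this only finds the minimum over observed seconds. At the traffic levels quoted in the thread (a minimum of 2,981 views per second) that distinction should not matter, but it would for a lower-traffic wiki.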
>> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> Hirav: would you be looking for temporally /and/ >> >> >>> >>>>> >>>>>> contextually >> >> >>> >>>>> >>>>>> granular >> >> >>> >>>>> >>>>>> pageviews, i.e. "a view to X page at Y time", or just >> >> >>> >>>>> >>>>>> temporally >> >> >>> >>>>> >>>>>> granular, so "a view to a page on enwiki at X time"? >> If >> >> >>> >>>>> >>>>>> the >> >> >>> >>>>> >>>>>> latter >> >> >>> >>>>> >>>>>> you've got more of a shot, I suspect. >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> On 13 April 2015 at 03:47, Pine W < >> [email protected]> >> >> >>> >>>>> >>>>>> wrote: >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> Hi, >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> This issue of pageview data granularity has been >> >> >>> >>>>> >>>>>> discussed >> >> >>> >>>>> >>>>>> before, and >> >> >>> >>>>> >>>>>> the >> >> >>> >>>>> >>>>>> answer has been that hourly is the smallest increment >> >> >>> >>>>> >>>>>> allowed to >> >> >>> >>>>> >>>>>> be >> >> >>> >>>>> >>>>>> revealed >> >> >>> >>>>> >>>>>> publicly, for privacy reasons. >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> I believe that the person you will want to discuss >> your >> >> >>> >>>>> >>>>>> request >> >> >>> >>>>> >>>>>> with is >> >> >>> >>>>> >>>>>> Toby, who I have cc'd here. >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> Pine >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" >> >> >>> >>>>> >>>>>> <[email protected]> >> >> >>> >>>>> >>>>>> wrote: >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> Hi Wikimedia Analytics Team, >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> My colleague Bharath and I are doing research on >> dynamic >> >> >>> >>>>> >>>>>> server >> >> >>> >>>>> >>>>>> allocation >> >> >>> >>>>> >>>>>> algorithms and we were looking for a suitable >> datasets to >> >> >>> >>>>> >>>>>> test >> >> >>> >>>>> >>>>>> our >> >> >>> >>>>> >>>>>> predictive algorithm on. 
We noticed that Wikimedia >> has an >> >> >>> >>>>> >>>>>> amazing data >> >> >>> >>>>> >>>>>> set >> >> >>> >>>>> >>>>>> of hourly page views, but we were looking for >> something a >> >> >>> >>>>> >>>>>> bit >> >> >>> >>>>> >>>>>> more >> >> >>> >>>>> >>>>>> granular, >> >> >>> >>>>> >>>>>> such as aggregated page requests to English Wikipedia >> on >> >> >>> >>>>> >>>>>> a >> >> >>> >>>>> >>>>>> minute by >> >> >>> >>>>> >>>>>> minute >> >> >>> >>>>> >>>>>> basis or second by second basis if possible. >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> We are more than happy to pour through any raw data >> you >> >> >>> >>>>> >>>>>> might >> >> >>> >>>>> >>>>>> have that >> >> >>> >>>>> >>>>>> would help us calculate page requests at this granular >> >> >>> >>>>> >>>>>> level. >> >> >>> >>>>> >>>>>> Please >> >> >>> >>>>> >>>>>> let us >> >> >>> >>>>> >>>>>> know if it would be possible to get such data and if >> so >> >> >>> >>>>> >>>>>> how. >> >> >>> >>>>> >>>>>> Thank you >> >> >>> >>>>> >>>>>> in >> >> >>> >>>>> >>>>>> advance for your help. 
>> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> Best, >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> Hirav Gandhi >> >> >>> >>>>> >>>>>> _______________________________________________ >> >> >>> >>>>> >>>>>> Analytics mailing list >> >> >>> >>>>> >>>>>> [email protected] >> >> >>> >>>>> >>>>>> >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> _______________________________________________ >> >> >>> >>>>> >>>>>> Analytics mailing list >> >> >>> >>>>> >>>>>> [email protected] >> >> >>> >>>>> >>>>>> >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> -- >> >> >>> >>>>> >>>>>> Oliver Keyes >> >> >>> >>>>> >>>>>> Research Analyst >> >> >>> >>>>> >>>>>> Wikimedia Foundation >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> ------------------------------ >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> _______________________________________________ >> >> >>> >>>>> >>>>>> Analytics mailing list >> >> >>> >>>>> >>>>>> [email protected] >> >> >>> >>>>> >>>>>> >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> End of Analytics Digest, Vol 38, Issue 21 >> >> >>> >>>>> >>>>>> ***************************************** >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>>> _______________________________________________ >> >> >>> >>>>> >>>>>> Analytics mailing list >> >> >>> >>>>> >>>>>> [email protected] >> >> >>> >>>>> >>>>>> >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >> >>> >>>>> >>>>>> >> >> >>> >>>>> >>>>> >> >> >>> >>>>> >>>>> >> >> >>> >>>>> >>>>> >> >> >>> >>>>> >>>>> -- >> >> >>> >>>>> >>>>> Oliver Keyes >> >> >>> >>>>> >>>>> Research Analyst >> >> >>> >>>>> >>>>> Wikimedia Foundation >> 
>> >> >>> >>>> --
>> >> >>> >>>> Dario Taraborelli
>> >> >>> >>>> Senior Research Scientist, Research and Data Lead
>> >> >>> >>>> Wikimedia Foundation
>> >> >>> >>>> http://wikimediafoundation.org
>> >> >>> >>>> http://nitens.org/taraborelli

--
Bharath Sitaraman
[email protected]
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
