Bumping for Dario, per Pine's excellent example :)

On 13 April 2015 at 22:18, Hirav Gandhi <[email protected]> wrote:
> Oliver: Two months is fine. Thank you so much for your help!
>
>> On Apr 13, 2015, at 4:40 PM, [email protected] wrote:
>>
>> Send Analytics mailing list submissions to
>> [email protected]
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> or, via email, send a message with subject or body 'help' to
>> [email protected]
>>
>> You can reach the person managing the list at
>> [email protected]
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Analytics digest..."
>>
>>
>> Today's Topics:
>>
>> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> 2. Re: Page views on a more frequent than hourly basis (Hirav Gandhi)
>> 3. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Mon, 13 Apr 2015 13:34:23 -0700
>> From: Pine W <[email protected]>
>> To: "A mailing list for the Analytics Team at WMF and everybody who
>> has an interest in Wikipedia and analytics."
>> <[email protected]>
>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> basis
>> Message-ID:
>> <CAF=dyjjzmdfthz+0+lwnhb9m8xuod4wetgcfuxyb9qyf7cy...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi Oliver, re ccing people who are on list, this is the protocol we
>> followed in IEGCom to ping people who are subscribed and mentioned in
>> certain emails but, like many of us, may automatically move emails from
>> lists directly to folders where they may be unread for days. So there is a
>> reason to do this.
>>
>> Thanks,
>>
>> Pine
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL:
>> <https://lists.wikimedia.org/pipermail/analytics/attachments/20150413/aac0ef89/attachment-0001.html>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Mon, 13 Apr 2015 16:30:43 -0700
>> From: Hirav Gandhi <[email protected]>
>> To: [email protected]
>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> basis
>> Message-ID:
>> <CANzC_EOvi4MP7G_SsxvW=uojpt2vxbnfmhcipqn1pumace-...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Thanks Oliver!
>>
>> We would like this data for as broad of a time period as you can muster.
>> The more days, months and years represented in the dataset, the better.
>>
>>> Okay, so:
>>>
>>> I took an hour from the pageviews logs,[0] and aggregated pageviews to
>>> enwiki (mobile and desktop both) by timestamp, down to one-second
>>> resolution levels. The lowest number of pageviews to enwiki per second
>>> was 2,981.
>>>
>>> So, I don't personally have a problem with generating a release of:
>>>
>>> 1. Pageviews per second;
>>> 2. To enwiki;
>>> 3. Over $TIME_PERIOD;
>>> 4. grouping the mobile and desktop sites
>>>
>>> But Dario or someone should chip in before I touch anything ;p
>>>
>>> [0] 6am yesterday. 6am because it should be low-traffic, right? At least
>>> given our biases towards North America and Europe
>>>
>>> On 13 April 2015 at 11:54, Oliver Keyes <[email protected]> wrote:
>>>> Then that sounds much more viable. I'll run a quick test now to see
>>>> how much clustering we'd see at, say, the one-second resolution level,
>>>> and throw it out here so we can make more informed decisions about a
>>>> data release on this.
>>>>
>>>> On 13 April 2015 at 08:08, Hirav Gandhi <[email protected]> wrote:
>>>>> Hi Oliver,
>>>>>
>>>>> Re: Hirav: would you be looking for temporally /and/ contextually granular
>>>>> pageviews, i.e. "a view to X page at Y time", or just temporally granular,
>>>>> so "a view to a page on enwiki at X time"? If the latter you've got more of
>>>>> a shot, I suspect.
>>>>>
>>>>> I only want the latter - I am not concerned with the context so much as just
>>>>> “a view to a page on enwiki at X time.”
>>>>>
>>>>> Hirav
>>>>>
>>>>>
>>>>> On Apr 13, 2015, at 5:00 AM, [email protected] wrote:
>>>>>
>>>>> Today's Topics:
>>>>>
>>>>> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>>>>> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>>>>
>>>>>
>>>>> ----------------------------------------------------------------------
>>>>>
>>>>> Message: 1
>>>>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>>>>> From: Pine W <[email protected]>
>>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>>> has an interest in Wikipedia and analytics."
>>>>> <[email protected]>
>>>>> Cc: Bharath Sitaraman <[email protected]>
>>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>>>>> basis
>>>>> Message-ID:
>>>>> <CAF=dyjgnut+t6n6mujq16duyiwp7et6ruht3_-tzdnsep+2...@mail.gmail.com>
>>>>> Content-Type: text/plain; charset="utf-8"
>>>>>
>>>>> Hi,
>>>>>
>>>>> This issue of pageview data granularity has been discussed before, and the
>>>>> answer has been that hourly is the smallest increment allowed to be
>>>>> revealed publicly, for privacy reasons.
>>>>>
>>>>> I believe that the person you will want to discuss your request with is
>>>>> Toby, who I have cc'd here.
>>>>>
>>>>> Pine
>>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <[email protected]> wrote:
>>>>>
>>>>> Hi Wikimedia Analytics Team,
>>>>>
>>>>> My colleague Bharath and I are doing research on dynamic server allocation
>>>>> algorithms and we were looking for a suitable dataset to test our
>>>>> predictive algorithm on. We noticed that Wikimedia has an amazing data set
>>>>> of hourly page views, but we were looking for something a bit more
>>>>> granular, such as aggregated page requests to English Wikipedia on a minute
>>>>> by minute basis or second by second basis if possible.
>>>>>
>>>>> We are more than happy to pore through any raw data you might have that
>>>>> would help us calculate page requests at this granular level. Please let us
>>>>> know if it would be possible to get such data and if so how. Thank you in
>>>>> advance for your help.
>>>>>
>>>>> Best,
>>>>>
>>>>> Hirav Gandhi
>>>>>
>>>>> -------------- next part --------------
>>>>> An HTML attachment was scrubbed...
>>>>> URL:
>>>>> <https://lists.wikimedia.org/pipermail/analytics/attachments/20150413/a88287b6/attachment-0001.html>
>>>>>
>>>>> ------------------------------
>>>>>
>>>>> Message: 2
>>>>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>>>>> From: Oliver Keyes <[email protected]>
>>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>>> has an interest in Wikipedia and analytics."
>>>>> <[email protected]>
>>>>> Cc: Bharath Sitaraman <[email protected]>
>>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>>>>> basis
>>>>> Message-ID:
>>>>> <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=h...@mail.gmail.com>
>>>>> Content-Type: text/plain; charset=UTF-8
>>>>>
>>>>> Preeetty sure that Toby is on the analytics list, Pine. He's the
>>>>> director of analytics.
>>>>>
>>>>> Hirav: would you be looking for temporally /and/ contextually granular
>>>>> pageviews, i.e. "a view to X page at Y time", or just temporally
>>>>> granular, so "a view to a page on enwiki at X time"? If the latter
>>>>> you've got more of a shot, I suspect.
>>>>>
>>>>> --
>>>>> Oliver Keyes
>>>>> Research Analyst
>>>>> Wikimedia Foundation
>>>>>
>>>>> ------------------------------
>>>>>
>>>>> End of Analytics Digest, Vol 38, Issue 21
>>>>> *****************************************
>>
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL:
>> <https://lists.wikimedia.org/pipermail/analytics/attachments/20150413/3a5df491/attachment-0001.html>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Mon, 13 Apr 2015 19:40:04 -0400
>> From: Oliver Keyes <[email protected]>
>> To: "A mailing list for the Analytics Team at WMF and everybody who
>> has an interest in Wikipedia and analytics."
>> <[email protected]>
>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> basis
>> Message-ID:
>> <caauqgdd6z5ussu11vw49fdmbsrhyejxku9yopyserib79j-...@mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> ....
>>
>>
>> ...years?
>>
>> We have unsampled logs for, ah. 2 months.
>>
>> On 13 April 2015 at 19:30, Hirav Gandhi <[email protected]> wrote:
>>> Thanks Oliver!
>>>
>>> We would like this data for as broad of a time period as you can muster. The
>>> more days, months and years represented in the dataset, the better.
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>>
>> ------------------------------
>>
>> End of Analytics Digest, Vol 38, Issue 24
>> *****************************************

--
Oliver Keyes
Research Analyst
Wikimedia Foundation

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
