e.g. German* 



I need more coffee.

On Wed, Apr 15, 2015 at 9:35 AM, Hirav Gandhi <[email protected]>
wrote:

> Dario - we just want a representative sample of traffic for a popular site
> like Wikipedia. We thought limiting to the English Wikipedia would be easier.
> If we get aggregated data across all language Wikipedia sites, we would need
> some way to tease out which language is being queried when. Some languages
> (for e.g. German) we would hypothesize would have more daily seasonality than
> languages like English.
> On Wed, Apr 15, 2015 at 9:32 AM, Dario Taraborelli
> <[email protected]> wrote:
>> Hirav, Bharath – I also want to hear from you if there's a specific reason
>> to ask for English Wikipedia only or if a dataset encompassing aggregate
>> pageviews across all Wikimedia properties would do the job.
>> Dario
>> On Wed, Apr 15, 2015 at 9:09 AM, Dario Taraborelli <
>> [email protected]> wrote:
>>> Oliver -- thanks for running a preliminary check, I'm fine releasing this
>>> data in aggregate under CC0, I believe it would be valuable for this and
>>> other research projects (copying Michelle from Legal).
>>>
>>> Before we do so, though, I want to confirm the specs: aggregate pageviews
>>> per second to English Wikipedia, excluding bot traffic, broken down by
>>> access method (mobile web vs desktop site, not apps) for a 60-day period.
>>> Oliver – are these the filters you used to identify the data point with the
>>> smallest number of observations?
>>>
>>> Obviously, we will need to take into account this release when we start
>>> working on projects such as
>>> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_edits
>>> and
>>> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_pageviews
>>>
>>> Dario
>>>
>>> On Mon, Apr 13, 2015 at 9:37 PM, Oliver Keyes <[email protected]>
>>> wrote:
>>>
>>>> Bumping for Dario, per Pine's excellent example :)
>>>>
>>>> On 13 April 2015 at 22:18, Hirav Gandhi <[email protected]> wrote:
>>>> > Oliver: Two months is fine. Thank you so much for your help!
>>>> >
>>>> >> On Apr 13, 2015, at 4:40 PM, [email protected] wrote:
>>>> >>
>>>> >> Send Analytics mailing list submissions to
>>>> >>       [email protected]
>>>> >>
>>>> >> To subscribe or unsubscribe via the World Wide Web, visit
>>>> >>       https://lists.wikimedia.org/mailman/listinfo/analytics
>>>> >> or, via email, send a message with subject or body 'help' to
>>>> >>       [email protected]
>>>> >>
>>>> >> You can reach the person managing the list at
>>>> >>       [email protected]
>>>> >>
>>>> >> When replying, please edit your Subject line so it is more specific
>>>> >> than "Re: Contents of Analytics digest..."
>>>> >>
>>>> >>
>>>> >> Today's Topics:
>>>> >>
>>>> >>   1. Re: Page views on a more frequent than hourly basis (Pine W)
>>>> >>   2. Re: Page views on a more frequent than hourly basis (Hirav Gandhi)
>>>> >>   3. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>>> >>
>>>> >>
>>>> >> ----------------------------------------------------------------------
>>>> >>
>>>> >> Message: 1
>>>> >> Date: Mon, 13 Apr 2015 13:34:23 -0700
>>>> >> From: Pine W <[email protected]>
>>>> >> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>> >>       has an  interest in Wikipedia and analytics."
>>>> >>       <[email protected]>
>>>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
>>>> >>       basis
>>>> >> Message-ID:
>>>> >>       <CAF=
>>>> [email protected]>
>>>> >> Content-Type: text/plain; charset="utf-8"
>>>> >>
>>>> >> Hi Oliver, re ccing people who are on list, this is the protocol we
>>>> >> followed in IEGCom to ping people who are subscribed and mentioned in
>>>> >> certain emails but, like many of us, may automatically move emails from
>>>> >> lists directly to folders where they may be unread for days. So there is a
>>>> >> reason to do this.
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> Pine
>>>> >>
>>>> >> ------------------------------
>>>> >>
>>>> >> Message: 2
>>>> >> Date: Mon, 13 Apr 2015 16:30:43 -0700
>>>> >> From: Hirav Gandhi <[email protected]>
>>>> >> To: [email protected]
>>>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
>>>> >>       basis
>>>> >> Message-ID:
>>>> >>       <CANzC_EOvi4MP7G_SsxvW=
>>>> [email protected]>
>>>> >> Content-Type: text/plain; charset="utf-8"
>>>> >>
>>>> >> Thanks Oliver!
>>>> >>
>>>> >> We would like this data for as broad a time period as you can muster.
>>>> >> The more days, months, and years represented in the dataset, the better.
>>>> >>
>>>> >>
>>>> >>> Okay, so:
>>>> >>>
>>>> >>> I took an hour from the pageviews logs,[0] and aggregated pageviews to
>>>> >>> enwiki (mobile and desktop both) by timestamp, down to one-second
>>>> >>> resolution levels. The lowest number of pageviews to enwiki per second
>>>> >>> was 2,981
>>>> >>>
>>>> >>> So, I don't personally have a problem with generating a release of:
>>>> >>>
>>>> >>> 1. Pageviews per second;
>>>> >>> 2. To enwiki;
>>>> >>> 3. Over $TIME_PERIOD;
>>>> >>> 4. grouping the mobile and desktop site
>>>> >>>
>>>> >>> But Dario or someone should chip in before I touch anything ;p
>>>> >>>
>>>> >>> [0] 6am yesterday. 6am because it should be low-traffic, right? At least
>>>> >>> given our biases towards North America and Europe.
>>>> >>>
>>>> >>> On 13 April 2015 at 11:54, Oliver Keyes <[email protected]> wrote:
>>>> >>>> Then that sounds much more viable. I'll run a quick test now to see
>>>> >>>> how much clustering we'd see at, say, the one-second resolution level,
>>>> >>>> and throw it out here so we can make more informed decisions about a
>>>> >>>> data release on this.
>>>> >>>>
>>>> >>>> On 13 April 2015 at 08:08, Hirav Gandhi <[email protected]> wrote:
>>>> >>>>> Hi Oliver,
>>>> >>>>>
>>>> >>>>> Re: Hirav: would you be looking for temporally /and/ contextually granular
>>>> >>>>> pageviews, i.e. "a view to X page at Y time", or just temporally granular,
>>>> >>>>> so "a view to a page on enwiki at X time"? If the latter you've got more of
>>>> >>>>> a shot, I suspect.
>>>> >>>>>
>>>> >>>>> I only want the latter - I am not concerned with the context so much as just
>>>> >>>>> “a view to a page on enwiki at X time.”
>>>> >>>>>
>>>> >>>>> Hirav
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Apr 13, 2015, at 5:00 AM, [email protected] wrote:
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> Today's Topics:
>>>> >>>>>
>>>> >>>>>  1. Re: Page views on a more frequent than hourly basis (Pine W)
>>>> >>>>>  2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> ----------------------------------------------------------------------
>>>> >>>>>
>>>> >>>>> Message: 1
>>>> >>>>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>>>> >>>>> From: Pine W <[email protected]>
>>>> >>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>> >>>>> has an interest in Wikipedia and analytics."
>>>> >>>>> <[email protected]>
>>>> >>>>> Cc: Bharath Sitaraman <[email protected]>
>>>> >>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>>>> >>>>> basis
>>>> >>>>> Message-ID:
>>>> >>>>> <CAF=dyjgnut+t6n6mujq16duyiwp7et6ruht3_-tzdnsep+2...@mail.gmail.com
>>>> >
>>>> >>>>> Content-Type: text/plain; charset="utf-8"
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> Hi,
>>>> >>>>>
>>>> >>>>> This issue of pageview data granularity has been discussed before, and the
>>>> >>>>> answer has been that hourly is the smallest increment allowed to be
>>>> >>>>> revealed publicly, for privacy reasons.
>>>> >>>>>
>>>> >>>>> I believe that the person you will want to discuss your request with is
>>>> >>>>> Toby, who I have cc'd here.
>>>> >>>>>
>>>> >>>>> Pine
>>>> >>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <[email protected]> wrote:
>>>> >>>>>
>>>> >>>>> Hi Wikimedia Analytics Team,
>>>> >>>>>
>>>> >>>>> My colleague Bharath and I are doing research on dynamic server allocation
>>>> >>>>> algorithms and we were looking for a suitable dataset to test our
>>>> >>>>> predictive algorithm on. We noticed that Wikimedia has an amazing data set
>>>> >>>>> of hourly page views, but we were looking for something a bit more
>>>> >>>>> granular, such as aggregated page requests to English Wikipedia on a
>>>> >>>>> minute-by-minute or second-by-second basis if possible.
>>>> >>>>>
>>>> >>>>> We are more than happy to pore through any raw data you might have that
>>>> >>>>> would help us calculate page requests at this granular level. Please let us
>>>> >>>>> know if it would be possible to get such data and if so how. Thank you in
>>>> >>>>> advance for your help.
>>>> >>>>>
>>>> >>>>> Best,
>>>> >>>>>
>>>> >>>>> Hirav Gandhi
>>>> >>>>> _______________________________________________
>>>> >>>>> Analytics mailing list
>>>> >>>>> [email protected]
>>>> >>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>> >>>>>
>>>> >>>>> ------------------------------
>>>> >>>>>
>>>> >>>>> Message: 2
>>>> >>>>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>>>> >>>>> From: Oliver Keyes <[email protected]>
>>>> >>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>> >>>>> has an interest in Wikipedia and analytics."
>>>> >>>>> <[email protected]>
>>>> >>>>> Cc: Bharath Sitaraman <[email protected]>
>>>> >>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>>>> >>>>> basis
>>>> >>>>> Message-ID:
>>>> >>>>> <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=h...@mail.gmail.com
>>>> >
>>>> >>>>> Content-Type: text/plain; charset=UTF-8
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> Preeetty sure that Toby is on the analytics list, Pine. He's the
>>>> >>>>> director of analytics.
>>>> >>>>>
>>>> >>>>> Hirav: would you be looking for temporally /and/ contextually granular
>>>> >>>>> pageviews, i.e. "a view to X page at Y time", or just temporally
>>>> >>>>> granular, so "a view to a page on enwiki at X time"? If the latter
>>>> >>>>> you've got more of a shot, I suspect.
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> --
>>>> >>>>> Oliver Keyes
>>>> >>>>> Research Analyst
>>>> >>>>> Wikimedia Foundation
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> ------------------------------
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> End of Analytics Digest, Vol 38, Issue 21
>>>> >>>>> *****************************************
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> --
>>>> >>>> Oliver Keyes
>>>> >>>> Research Analyst
>>>> >>>> Wikimedia Foundation
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Oliver Keyes
>>>> >>> Research Analyst
>>>> >>> Wikimedia Foundation
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> ------------------------------
>>>> >>>
>>>> >>>
>>>> >>
>>>> >> ------------------------------
>>>> >>
>>>> >> Message: 3
>>>> >> Date: Mon, 13 Apr 2015 19:40:04 -0400
>>>> >> From: Oliver Keyes <[email protected]>
>>>> >> To: "A mailing list for the Analytics Team at WMF and everybody who
>>>> >>       has an  interest in Wikipedia and analytics."
>>>> >>       <[email protected]>
>>>> >> Subject: Re: [Analytics] Page views on a more frequent than hourly
>>>> >>       basis
>>>> >> Message-ID:
>>>> >>       <
>>>> caauqgdd6z5ussu11vw49fdmbsrhyejxku9yopyserib79j-...@mail.gmail.com>
>>>> >> Content-Type: text/plain; charset=UTF-8
>>>> >>
>>>> >> ....
>>>> >>
>>>> >>
>>>> >> ...years?
>>>> >>
>>>> >> We have unsampled logs for, ah. 2 months.
>>>> >>
>>>> >> On 13 April 2015 at 19:30, Hirav Gandhi <[email protected]> wrote:
>>>> >>> Thanks Oliver!
>>>> >>>
>>>> >>> We would like this data for as broad a time period as you can muster. The
>>>> >>> more days, months, and years represented in the dataset, the better.
>>>> >>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Oliver Keyes
>>>> >> Research Analyst
>>>> >> Wikimedia Foundation
>>>> >>
>>>> >>
>>>> >>
>>>> >> ------------------------------
>>>> >>
>>>> >>
>>>> >>
>>>> >> End of Analytics Digest, Vol 38, Issue 24
>>>> >> *****************************************
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Oliver Keyes
>>>> Research Analyst
>>>> Wikimedia Foundation
>>>>
>>>
>>>
>>>
>>> --
>>> Dario Taraborelli
>>> Senior Research Scientist, Research and Data Lead
>>> Wikimedia Foundation
>>> http://wikimediafoundation.org
>>> http://nitens.org/taraborelli
>>>
>> -- 
>> Dario Taraborelli
>> Senior Research Scientist, Research and Data Lead
>> Wikimedia Foundation
>> http://wikimediafoundation.org
>> http://nitens.org/taraborelli
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics