If you want it going back that far, I'm afraid the stats.grok.se style
data is all there is :(. The new API only covers the last few months
thus far.
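
For the raw dumps, a rough sketch of how the counting and standardising
could be scripted is below. It is not an official tool. It assumes each
line of an hourly pagecounts-raw file has the form
"project page_title view_count bytes", that titles use underscores
instead of spaces, and that the files are gzipped. The function names
and the sample lines are made up for illustration.

```python
import gzip
from collections import Counter


def normalise_title(title):
    """Convert a human-readable title to the underscore form assumed in the dumps."""
    title = title.strip().replace(" ", "_")
    # Wikipedia treats the first letter of a title as case-insensitive.
    return title[:1].upper() + title[1:]


def count_views(lines, titles, project="en"):
    """Return (views per wanted title, total views for the project).

    `lines` is any iterable of 'project page_title view_count bytes' strings.
    """
    wanted = {normalise_title(t) for t in titles}
    views = Counter()
    total = 0
    for line in lines:
        parts = line.split()
        if len(parts) != 4 or parts[0] != project:
            continue  # skip other projects and malformed lines
        try:
            count = int(parts[2])
        except ValueError:
            continue
        total += count
        if parts[1] in wanted:
            views[parts[1]] += count
    return views, total


def count_views_in_file(path, titles, project="en"):
    """Same as count_views, but reads one gzipped hourly dump file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_views(handle, titles, project)


# Invented sample lines, only to show the arithmetic.
sample = [
    "en Acarus_siro 3 45000",
    "en Thrips 10 120000",
    "en Main_Page 987 5000000",
    "de Thrips 4 30000",
    "en Thrips 2 24000",
]
views, total = count_views(sample, ["Acarus siro", "Thrips", "rhopalosiphum maidis"])
for title in sorted(views):
    # Share of all English Wikipedia views in the sample.
    print(title, views[title], views[title] / total)
```

Summing the results over every hourly file in a month gives monthly
counts, and dividing by the project total standardises them against
overall traffic. Titles in the real files may be percent-encoded, so
non-ASCII names would need extra handling.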

On 14 December 2015 at 17:31,  <[email protected]> wrote:
> Hi all,
>
> Specifically what I am looking for is page view data for these pages, 
> preferably for all months on http://dumps.wikimedia.org/other/pagecounts-raw/ 
> (appeared as named 4 Dec):
> Abacarus hystrix
> Acarus siro
> Aceria tosichella
> Acyrthosiphon pisum
> Ahasverus advena
> Anthrenus flavipes
> Aphis craccivora
> Arhopalus
> Balaustium medicagoense
> Bemisia tabaci
> Brevicoryne brassicae
> Bruchus
> Ceratitis capitata
> Cicadulina
> Cryptolestes
> Daktulosphaira vitifoliae
> Delia
> Ephestia elutella
> Ephestia kuehniella
> Etiella behrii
> Frankliniella occidentalis
> Frankliniella
> Henosepilachna vigintioctopunctata
> Heteronychus arator
> Lachesilla quercus
> Lasioderma serricorne
> Liposcelis bostrychophila
> Macrosiphum euphorbiae
> Marchalina hellenica
> Myzus persicae
> Naupactus
> Nezara viridula
> Oligonychus ununguis
> Oryzaephilus surinamensis
> Panonychus ulmi
> Penthaleus
> Pieris rapae
> Piezodorus
> Plodia interpunctella
> Plutella xylostella
> Rhopalosiphon
> Rhopalosiphum maidis
> Rhopalosiphum padi
> Rhyzopertha dominica
> Sirex noctilio
> Sitophilus granarius
> Sitophilus oryzae
> Sitotroga cerealella
> Sminthurus viridis
> Spodoptera exempta
> Stegobium paniceum
> Tetranychus
> Thrips palmi
> Thrips
> Tribolium castaneum
> Tribolium confusum
> Trogoderma granarium
> Trogoderma
>
> I then also want a total number of page views to standardise the individual 
> page views.
>
> I have looked at stats.grok.se and wikitrends and I have two issues:
> 1. The data is only month by month and I want as many years of data as 
> possible.
> 2. Some pages have too few page views for wikitrends.
>
>
> Thanks for your help!
>
>
>
> -----Original Message-----
> From: Analytics [mailto:[email protected]] On Behalf Of 
> [email protected]
> Sent: Tuesday, 15 December 2015 4:11 AM
> To: [email protected]
> Subject: Analytics Digest, Vol 46, Issue 23
>
>
> Today's Topics:
>
>    1. Re: Readership metrics for the fortnight until December 6,
>       2015 (Federico Leva (Nemo))
>    2. Re: Data collection (Erik Zachte)
>    3. Re: Data collection (Federico Leva (Nemo))
>    4. Re: Python client for the new pageview API (Dan Andreescu)
>    5. Re: mobile and zero legacy tsvs on stat1002 (Oliver Keyes)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 14 Dec 2015 13:08:11 +0100
> From: "Federico Leva (Nemo)" <[email protected]>
> To: A mailing list for the Analytics Team at WMF and everybody who has
>         an interest in Wikipedia and "analytics."
>         <[email protected]>
> Subject: Re: [Analytics] Readership metrics for the fortnight until
>         December 6, 2015
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Interesting country breakdown!
>
> Tilman Bayer, 14/12/2015 12:32:
>>
>> For the top three, I looked at how pageviews developed on a daily
>> basis during the last three months including the week after this large
>> change (until Dec 6):
>>
>>
>> In Greece, the +21.6% rise was the result of an isolated spike from
>> November 23-25. This can be traced to a single page on the Greek
>> Wiktionary which on most days before and after only saw a single-digit
>> number of pageviews, but on these three days received more than 2.8
>> million: τάλε κουάλε
>> <https://el.wiktionary.org/wiki/%CF%84%CE%AC%CE%BB%CE%B5_%CE%BA%CE%BF%CF%85%CE%AC%CE%BB%CE%B5>.
>> It’s about an expression that apparently comes from Latin via Italian
>> (“tale quale”) <https://en.wiktionary.org/wiki/tale_e_quale> and means
>> something like “exactly the same” or “spitting image”. From the form
>> of the spike, it was likely not the result of actual human interest,
>> rather an undetected bot trying to learn exactly the same about exactly the 
>> same.
>>
>>
>>
>> In Ireland, the -20.6% drop marked the end of a plateau whose start
>> had actually shown up in the report for the week until November 1
>> <https://lists.wikimedia.org/pipermail/mobile-l/2015-November/009919.html>
>> already, where the country was the top changer with a 40.2% rise.
>>
>>
>> For South Africa, the -20.6% drop does not form part of a clear pattern.
>>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 14 Dec 2015 14:14:17 +0100
> From: "Erik Zachte" <[email protected]>
> To: "'A mailing list for the Analytics Team at WMF and everybody who
>         has an interest in Wikipedia and analytics.'"
>         <[email protected]>
> Subject: Re: [Analytics] Data collection
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Caitlin,
>
>
>
> Here is a breakdown of categories within Phytopathology on English wikipedia: 
> http://ow.ly/VQNVL
>
> and the articles within those categories ranked by page view for Oct 2015 : 
> http://ow.ly/VQNCv
>
>
>
> I can run similar reports for earlier months.
>
>
>
> Cheers,
>
> Erik
>
>
>
>
>
> From: Analytics [mailto:[email protected]] On Behalf Of 
> Alex Druk
> Sent: Monday, December 14, 2015 10:44
> To: A mailing list for the Analytics Team at WMF and everybody who has an 
> interest in Wikipedia and analytics.
> Subject: Re: [Analytics] Data collection
>
>
>
> Hi Caitlin,
>
>
>
> If you have a list of relevant articles and an understanding of what time 
> period you would like to research, contact me off the list and I can 
> probably help you.
>
> Also, my advice: have a look at wikipediatrends.com or stats.grok.se and try 
> some of your queries to get a better understanding of possible results.
>
> Best wishes,
>
>
>
> On Mon, Dec 14, 2015 at 12:04 AM, <[email protected]> wrote:
>
> Hi All,
>
>
>
> I am a summer research intern with the Commonwealth Scientific and Industrial 
> Research Organisation (CSIRO) in Australia. I am studying a statistics degree 
> and so I don’t really have skills in the type of data collection required to 
> access the Wiki data for my research. I was wondering if someone might be 
> able to give me a hand (by pointing me in the right direction)?
>
>
>
> I have a list of pest species that I wish to find the total number of page 
> views via stats.grok.se or https://dumps.wikimedia.org/other/pagecounts-raw/ 
> . There must be a good method to go through and pick out page views by name 
> rather than by hand (which obviously isn’t feasible)? I’d also need to be 
> able to find the total number of page views for each period in order to 
> standardize the response to account for the increase in traffic over the 
> years.
>
>
>
> We are in the process of gathering similar data through a Plant Pest database 
> but due to privacy concerns, the organisation is arranging to reconcile the 
> data on our behalf and so I do not have a part in that.
>
>
>
> Any help would be really appreciated!
>
>
>
> Kind regards,
>
> Caitlin Gardner
>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>
>
>
>
>
>
> --
>
> Thank you.
>
> Alex Druk, PhD
>
> wikipediatrends.com
> [email protected]
> (775) 237-8550 Google voice
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 14 Dec 2015 15:25:03 +0100
> From: "Federico Leva (Nemo)" <[email protected]>
> To: A mailing list for the Analytics Team at WMF and everybody who has
>         an interest in Wikipedia and "analytics."
>         <[email protected]>
> Subject: Re: [Analytics] Data collection
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Erik Zachte, 14/12/2015 14:14:
>> I can run similar reports for earlier months.
>
> Thanks for publishing that code too!
> https://github.com/wikimedia/analytics-wikistats/tree/master/dammit.lt/bash
>
> Nemo
>
>
>
> ------------------------------
>
> Message: 4
> Date: Mon, 14 Dec 2015 09:32:24 -0500
> From: Dan Andreescu <[email protected]>
> To: Analytics List <[email protected]>,  Research into
>         Wikimedia content and communities
>         <[email protected]>
> Subject: Re: [Analytics] Python client for the new pageview API
> Message-ID:
>         <ca+aepcs4n-z4qd-wzw7v_j5aipb01ncwzrluhtbiwwy_ofc...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> I wasn't aware of some conventions that came before me, so I moved the 
> project from milimetric/wmf to mediawiki-utilities/python-mwviews.  I promise 
> it'll stay there, sorry for the inconvenience.  Updated links:
>
> PyPI: https://pypi.python.org/pypi/mwviews/0.0.2
> code: https://github.com/mediawiki-utilities/python-mwviews (PRs still 
> welcome, thanks for the 2 you already helped with!)
>
> On Fri, Dec 11, 2015 at 10:36 PM, Dan Andreescu <[email protected]>
> wrote:
>
>> Along the same lines as Oliver's great R client [1], I just started
>> work on a python version:
>>
>> PyPI: https://pypi.python.org/pypi/wmf/0.1
>> code: https://github.com/milimetric/wmf (PRs welcome)
>>
>> And if you're trying to skip past all the setup repository cruft, the
>> meat:
>> https://github.com/milimetric/wmf/blob/master/wmf/analytics/api/pageviews.py
>>
>>
>> [1] https://github.com/Ironholds/pageviews
>>
>
> ------------------------------
>
> Message: 5
> Date: Mon, 14 Dec 2015 12:10:50 -0500
> From: Oliver Keyes <[email protected]>
> To: "A mailing list for the Analytics Team at WMF and everybody who
>         has an interest in Wikipedia and analytics."
>         <[email protected]>
> Subject: Re: [Analytics] mobile and zero legacy tsvs on stat1002
> Message-ID:
>         <CAAUQgdADvcJdt_6+PgELg0P6nxM3C=6uydv3pboxjcpffc3...@mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Gotcha! Long as it's set for every request, perfect :)
>
> On 14 December 2015 at 04:50, Joseph Allemandou <[email protected]> 
> wrote:
>> @Oliver: I think the closest we'll have is the access-method field,
>> that can take values desktop, mobile-web, mobile-app.
>>
>> On Sun, Dec 13, 2015 at 8:37 PM, Oliver Keyes <[email protected]> wrote:
>>>
>>> Not an answer to the question, but a question of my own; will the
>>> nature of the content being served still be present as /some/ field?
>>> FWIW I've found it very helpful to be able to use webrequest_source
>>> to trivially distinguish mobile and desktop requests.
>>>
>>> On 11 December 2015 at 12:40, Andrew Otto <[email protected]> wrote:
>>> > Hi all,
>>> >
>>> > Soon, we will be merging the mobile web cache requests with the
>>> > text cache requests.  text caches will now serve requests for
>>> > mobile web[1].
>>> >
>>> > This means that the webrequest_source=‘mobile’ partition in the
>>> > webrequest table in Hive will soon be empty, and all data that was
>>> > previously in it will be found in the webrequest_source=‘text’
>>> > partition.
>>> >
>>> > There are only 3 datasets that currently only use the
>>> > webrequest_source=‘mobile’ partition:
>>> >
>>> > - /a/log/webrequest/archive/mobile
>>> > - /a/log/webrequest/archive/5xx-mobile
>>> > - /a/log/webrequest/archive/zero
>>> >
>>> > (These are paths on stat1002, but they also exist in HDFS.)
>>> >
>>> > These datasets originally came from udp2log, but since early last
>>> > year they have been generated from Hadoop.  With the upcoming cache
>>> > merge, these jobs will have to parse through all text requests,
>>> > which will make Hadoop busier.
>>> >
>>> > Do we know if these are being used?  Would anyone be upset if we no
>>> > longer generated these datasets?
>>> >
>>> > Thanks!
>>> > -Andrew
>>> >
>>> > [1] https://phabricator.wikimedia.org/T109286
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Oliver Keyes
>>> Count Logula
>>> Wikimedia Foundation
>>
>>
>>
>>
>> --
>> Joseph Allemandou
>> Data Engineer @ Wikimedia Foundation
>> IRC: joal
>>
>
>
>
> --
> Oliver Keyes
> Count Logula
> Wikimedia Foundation
>
>
>
> ------------------------------
>
> End of Analytics Digest, Vol 46, Issue 23
> *****************************************



-- 
Oliver Keyes
Count Logula
Wikimedia Foundation

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
