Hi all,

Specifically, what I am looking for is page view data for the pages below 
(titles as named on 4 Dec), preferably for all months available on 
http://dumps.wikimedia.org/other/pagecounts-raw/ :
Abacarus hystrix
Acarus siro
Aceria tosichella
Acyrthosiphon pisum
Ahasverus advena
Anthrenus flavipes
Aphis craccivora
Arhopalus
Balaustium medicagoense
Bemisia tabaci
Brevicoryne brassicae
Bruchus
Ceratitis capitata 
Cicadulina
Cryptolestes
Daktulosphaira vitifoliae
Delia
Ephestia elutella
Ephestia kuehniella
Etiella behrii
Frankliniella occidentalis
Frankliniella
Henosepilachna vigintioctopunctata
Heteronychus arator
Lachesilla quercus
Lasioderma serricorne
Liposcelis bostrychophila
Macrosiphum euphorbiae
Marchalina hellenica
Myzus persicae
Naupactus
Nezara viridula
Oligonychus ununguis
Oryzaephilus surinamensis
Panonychus ulmi 
Penthaleus
Pieris rapae
Piezodorus
Plodia interpunctella
Plutella xylostella
Rhopalosiphon
Rhopalosiphum maidis
Rhopalosiphum padi
Rhyzopertha dominica
Sirex noctilio 
Sitophilus granarius
Sitophilus oryzae
Sitotroga cerealella
Sminthurus viridis
Spodoptera exempta
Stegobium paniceum
Tetranychus
Thrips palmi
Thrips
Tribolium castaneum
Tribolium confusum
Trogoderma granarium
Trogoderma
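
A minimal sketch of how picking these titles out of one hourly file might 
look in Python. It assumes each line of a pagecounts-raw file has the form 
"project title views bytes" separated by single spaces, that English 
Wikipedia uses the project code "en", and that titles use underscores and 
may be percent-encoded. The function names and the file name in the usage 
note are made up for illustration. The "total" here is simply every "en" 
line summed, so it also includes non-article pages.

```python
import gzip
from collections import Counter
from urllib.parse import unquote


def normalise_title(title):
    """Decode percent-escapes, use spaces, and upper-case the first letter."""
    title = unquote(title).replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def count_views(lines, wanted, project="en"):
    """Return (views per wanted title, total views for the project).

    Assumes lines look like: "<project> <title> <views> <bytes>".
    Lines that do not match this shape are skipped.
    """
    wanted = {normalise_title(t) for t in wanted}
    views = Counter()
    total = 0
    for line in lines:
        parts = line.split(" ")
        if len(parts) != 4 or parts[0] != project:
            continue
        try:
            n = int(parts[2])
        except ValueError:
            continue
        total += n
        title = normalise_title(parts[1])
        if title in wanted:
            views[title] += n
    return views, total


def count_views_in_file(path, wanted, project="en"):
    """Same as count_views, reading one gzipped hourly file from disk."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return count_views(fh, wanted, project)
```

With a downloaded file this could be called as 
count_views_in_file("pagecounts-20151204-000000.gz", ["Acarus siro", "Thrips"]), 
where the file name is only an example. Summing the results over all hourly 
files in a month would give monthly figures.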

I also want the total number of page views for each period, so that I can 
standardise the individual page views against it. 
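
One way this standardisation might be done, as a sketch only: divide each 
page's views for a period by the total views for the same period and 
express the result per million. The function name and the scaling factor 
are my own choices for illustration, not an established convention.

```python
def views_per_million(page_views, total_views):
    """Express each page's views as views per million total views.

    page_views: dict mapping title -> views in one period
    total_views: total views for the same project and period
    """
    if total_views <= 0:
        raise ValueError("total_views must be positive")
    return {
        title: 1_000_000 * n / total_views
        for title, n in page_views.items()
    }
```

Applying this period by period would make the series comparable across 
years even as overall traffic grows.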

I have looked at stats.grok.se and wikipediatrends.com and I have two issues: 
1. The data is only month by month and I want as many years of data as 
possible. 
2. Some pages have too few page views for wikipediatrends.com. 


Thanks for your help!



-----Original Message-----
From: Analytics [mailto:[email protected]] On Behalf Of 
[email protected]
Sent: Tuesday, 15 December 2015 4:11 AM
To: [email protected]
Subject: Analytics Digest, Vol 46, Issue 23

Today's Topics:

   1. Re: Readership metrics for the fortnight until December 6,
      2015 (Federico Leva (Nemo))
   2. Re: Data collection (Erik Zachte)
   3. Re: Data collection (Federico Leva (Nemo))
   4. Re: Python client for the new pageview API (Dan Andreescu)
   5. Re: mobile and zero legacy tsvs on stat1002 (Oliver Keyes)


----------------------------------------------------------------------

Message: 1
Date: Mon, 14 Dec 2015 13:08:11 +0100
From: "Federico Leva (Nemo)" <[email protected]>
To: A mailing list for the Analytics Team at WMF and everybody who has
        an interest in Wikipedia and "analytics."
        <[email protected]>
Subject: Re: [Analytics] Readership metrics for the fortnight until
        December 6, 2015
Message-ID: <[email protected]>
Content-Type: text/plain; charset=utf-8; format=flowed

Interesting country breakdown!

Tilman Bayer, 14/12/2015 12:32:
>
> For the top three, I looked at how pageviews developed on a daily 
> basis during the last three months, including the week after this large 
> change (until Dec 6):
>
>
> In Greece, the +21.6% rise was the result of an isolated spike from 
> November 23-25. This can be traced to a single page on the Greek 
> Wiktionary which on most days before and after only saw a single-digit 
> number of pageviews, but on these three days received more than 2.8
> million: τάλε κουάλε
> <https://el.wiktionary.org/wiki/%CF%84%CE%AC%CE%BB%CE%B5_%CE%BA%CE%BF%CF%85%CE%AC%CE%BB%CE%B5>.
> It’s about an expression that apparently comes from Latin via Italian 
> (“tale quale”) <https://en.wiktionary.org/wiki/tale_e_quale> and means 
> something like “exactly the same” or “spitting image”. From the form 
> of the spike, it was likely not the result of actual human interest, 
> rather an undetected bot trying to learn exactly the same about exactly the 
> same.
>
>
>
> In Ireland, the -20.6% drop marked the end of a plateau whose start 
> had actually shown up in the report for the week until November 1 
> <https://lists.wikimedia.org/pipermail/mobile-l/2015-November/009919.html>
> already, where the country was the top changer with a 40.2% rise.
>
>
> For South Africa, the -20.6% drop does not form part of a clear pattern.
>



------------------------------

Message: 2
Date: Mon, 14 Dec 2015 14:14:17 +0100
From: "Erik Zachte" <[email protected]>
To: "'A mailing list for the Analytics Team at WMF and everybody who
        has an interest in Wikipedia and analytics.'"
        <[email protected]>
Subject: Re: [Analytics] Data collection
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"

Hi Caitlin,

 

Here is a breakdown of categories within Phytopathology on English wikipedia: 
http://ow.ly/VQNVL

and the articles within those categories ranked by page view for Oct 2015 : 
http://ow.ly/VQNCv

 

I can run similar reports for earlier months.

 

Cheers,

Erik

 

 

From: Analytics [mailto:[email protected]] On Behalf Of 
Alex Druk
Sent: Monday, December 14, 2015 10:44
To: A mailing list for the Analytics Team at WMF and everybody who has an 
interest in Wikipedia and analytics.
Subject: Re: [Analytics] Data collection

 

Hi Caitlin,

 

If you have a list of relevant articles and an understanding of what time 
period you would like to research, contact me off the list and I can probably 
help you. 

Also, my advice: have a look at wikipediatrends.com or stats.grok.se and try 
some of your queries to get a better understanding of possible results.

Best wishes,

 

On Mon, Dec 14, 2015 at 12:04 AM, <[email protected]> wrote:

Hi All,

 

I am a summer research intern with the Commonwealth Scientific and Industrial 
Research Organisation (CSIRO) in Australia. I am studying a statistics degree 
and so I don’t really have skills in the type of data collection required to 
access the Wiki data for my research. I was wondering if someone might be able 
to give me a hand (by pointing me in the right direction)?

 

I have a list of pest species for which I wish to find the total number of page 
views via stats.grok.se or https://dumps.wikimedia.org/other/pagecounts-raw/ . 
There must be a good method to go through and pick out page views by name 
rather than by hand (which obviously isn’t feasible)? I’d also need to be able 
to find the total number of page views for each period in order to standardize 
the response to account for the increase in traffic over the years. 

 

We are in the process of gathering similar data through a Plant Pest database 
but due to privacy concerns, the organisation is arranging to reconcile the 
data on our behalf and so I do not have a part in that.

 

Any help would be really appreciated!

 

Kind regards,

Caitlin Gardner


_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics





 

-- 

Thank you.

Alex Druk, PhD

wikipediatrends.com
[email protected]
(775) 237-8550 Google voice

------------------------------

Message: 3
Date: Mon, 14 Dec 2015 15:25:03 +0100
From: "Federico Leva (Nemo)" <[email protected]>
To: A mailing list for the Analytics Team at WMF and everybody who has
        an interest in Wikipedia and "analytics."
        <[email protected]>
Subject: Re: [Analytics] Data collection
Message-ID: <[email protected]>
Content-Type: text/plain; charset=utf-8; format=flowed

Erik Zachte, 14/12/2015 14:14:
> I can run similar reports for earlier months.

Thanks for publishing that code too! 
https://github.com/wikimedia/analytics-wikistats/tree/master/dammit.lt/bash

Nemo



------------------------------

Message: 4
Date: Mon, 14 Dec 2015 09:32:24 -0500
From: Dan Andreescu <[email protected]>
To: Analytics List <[email protected]>,  Research into
        Wikimedia content and communities
        <[email protected]>
Subject: Re: [Analytics] Python client for the new pageview API
Message-ID:
        <ca+aepcs4n-z4qd-wzw7v_j5aipb01ncwzrluhtbiwwy_ofc...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I wasn't aware of some conventions that came before me, so I moved the project 
from milimetric/wmf to mediawiki-utilities/python-mwviews.  I promise it'll 
stay there, sorry for the inconvenience.  Updated links:

PyPI: https://pypi.python.org/pypi/mwviews/0.0.2
code: https://github.com/mediawiki-utilities/python-mwviews (PRs still welcome, 
thanks for the 2 you already helped with!)

On Fri, Dec 11, 2015 at 10:36 PM, Dan Andreescu <[email protected]>
wrote:

> Along the same lines as Oliver's great R client [1], I just started 
> work on a python version:
>
> PyPI: https://pypi.python.org/pypi/wmf/0.1
> code: https://github.com/milimetric/wmf (PRs welcome)
>
> And if you're trying to skip past all the setup repository cruft, the
> meat:
> https://github.com/milimetric/wmf/blob/master/wmf/analytics/api/pageviews.py
>
>
> [1] https://github.com/Ironholds/pageviews
>

------------------------------

Message: 5
Date: Mon, 14 Dec 2015 12:10:50 -0500
From: Oliver Keyes <[email protected]>
To: "A mailing list for the Analytics Team at WMF and everybody who
        has an interest in Wikipedia and analytics."
        <[email protected]>
Subject: Re: [Analytics] mobile and zero legacy tsvs on stat1002
Message-ID:
        <CAAUQgdADvcJdt_6+PgELg0P6nxM3C=6uydv3pboxjcpffc3...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Gotcha! Long as it's set for every request, perfect :)

On 14 December 2015 at 04:50, Joseph Allemandou <[email protected]> 
wrote:
> @Oliver: I think the closest we'll have is the access-method field, 
> that can take values desktop, mobile-web, mobile-app.
>
> On Sun, Dec 13, 2015 at 8:37 PM, Oliver Keyes <[email protected]> wrote:
>>
>> Not an answer to the question, but a question of my own; will the 
>> nature of the content being served still be present as /some/ field?
>> FWIW I've found it very helpful to be able to use webrequest_source 
>> to trivially distinguish mobile and desktop requests.
>>
>> On 11 December 2015 at 12:40, Andrew Otto <[email protected]> wrote:
>> > Hi all,
>> >
>> > Soon, we will be merging the mobile web cache requests with the 
>> > text cache requests.  text caches will now serve requests for 
>> > mobile web[1].
>> >
>> > This means that the webrequest_source=‘mobile’ partition in the 
>> > webrequest table in Hive will soon be empty, and all data that was 
>> > previously in it will be found in the webrequest_source=‘text’ 
>> > partition.
>> >
>> > There are only 3 datasets that currently only use the 
>> > webrequest_source=‘mobile’ partition:
>> >
>> > - /a/log/webrequest/archive/mobile
>> > - /a/log/webrequest/archive/5xx-mobile
>> > - /a/log/webrequest/archive/zero
>> >
>> > (These are paths on stat1002, but they also exist in HDFS.)
>> >
>> > These datasets originally came from udp2log, but since early last 
>> > year they have been generated from Hadoop.  With the upcoming cache 
>> > merge, these jobs will have to parse through all text requests, 
>> > which will make Hadoop busier.
>> >
>> > Do we know if these are being used?  Would anyone be upset if we no 
>> > longer generated these datasets?
>> >
>> > Thanks!
>> > -Andrew
>> >
>> > [1] https://phabricator.wikimedia.org/T109286
>> >
>> >
>>
>>
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation
>
>
>
>
> --
> Joseph Allemandou
> Data Engineer @ Wikimedia Foundation
> IRC: joal
>



--
Oliver Keyes
Count Logula
Wikimedia Foundation



------------------------------

End of Analytics Digest, Vol 46, Issue 23
*****************************************