Hi Lars,

You have a couple of options:

1. Download the data in losslessly compressed form from
https://dumps.wikimedia.org/other/pagecounts-ez/. The format is clever and
doesn't lose any granularity, and it should be a lot quicker to fetch than
pagecounts-raw. (This is essentially what stats.grok.se did with the data
as well, so downloading it this way should be equivalent; see the sketch
after this list.)
2. Work on Toolforge, a cloud environment that sits on the same network as
the data, so getting the data is a lot faster and you can use our compute
resources (free, of course):
https://wikitech.wikimedia.org/wiki/Portal:Toolforge
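
If you go with the first option, here is a minimal sketch in R of pulling
one merged monthly file and keeping only the lines for a single article.
The exact file name, the "en.z" project code, and the line layout are
assumptions from memory, so check them against the format notes on the
pagecounts-ez page:

  # Sketch: download one merged monthly dump and filter it to one article.
  # File name, "en.z" project code, and line layout are assumptions -- see
  # the README under pagecounts-ez before relying on them.
  url  <- "https://dumps.wikimedia.org/other/pagecounts-ez/merged/pagecounts-2014-01-views-ge-5.bz2"
  dest <- "pagecounts-2014-01.bz2"
  download.file(url, dest, mode = "wb")

  con  <- bzfile(dest, open = "r")
  hits <- character(0)
  while (length(chunk <- readLines(con, n = 100000)) > 0) {
    # Each line looks roughly like: project title monthly_total encoded_hourly
    hits <- c(hits, chunk[startsWith(chunk, "en.z Main_Page ")])
  }
  close(con)
  hits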

If you decide to go with the second option, the IRC channel where folks
like you get support is #wikimedia-cloud, and you can always find me there
as milimetric.
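
(And to your PS: for the post-2015 data, the "pageviews" R package you
mention wraps the same API you found. Something like the below should
work; I'm writing the parameter names from memory, so check
?article_pageviews.)

  library(pageviews)

  # Daily views for one article from the REST API (data starts July 2015).
  # Parameter names are from memory; see ?article_pageviews for the real ones.
  views <- article_pageviews(project = "en.wikipedia",
                             article = "Main_Page",
                             start   = "2015070100",
                             end     = "2016010100")
  head(views)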


On Tue, Feb 20, 2018 at 12:51 PM, Lars Hillebrand <[email protected]> wrote:

> Dear Analytics Team,
>
> I am an M.Sc. student at Copenhagen Business School. For my Master's thesis
> I would like to use page view data for certain Wikipedia articles. I found
> out that in July 2015 a new API was created which delivers this data.
> However, for my project I need data from before 2015.
> In my further search I found that the old page view data still exists (
> https://dumps.wikimedia.org/other/pagecounts-raw/) and that until March
> 2017 it could be queried via stats.grok.se. Unfortunately, that site no
> longer exists, which is why I can no longer filter and query the raw .gz
> data through a web page.
>
> Is there any way to get page view data for certain articles from before
> July 2015?
>
> Thanks a lot and best regards,
>
> Lars Hillebrand
>
> PS: I am conducting my research in R, and for the post-2015 data the
> “pageviews” package works great.
>
>
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics