Emily, I believe the pagecount data was never collected in a structured way
before 2007.  See for example this discussion about some archive data that
took some pains to uncover: https://phabricator.wikimedia.org/T232563

If edits per article would work as a proxy for attention, either alone or
combined with views so that you can extrapolate somehow, we are in the
process of vetting and releasing a simple full history of editing on all wikis:
https://dumps.wikimedia.org/other/mediawiki_history/readme.html
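
As a rough illustration of the edits-as-proxy idea, here is a minimal Python
sketch that buckets edits per article per month. It assumes you have already
extracted (page title, revision timestamp) pairs from the dataset; that input
shape, the timestamp format, and the function name are hypothetical, and the
readme above is the place to check the real column layout.

```python
from collections import Counter
from datetime import datetime


def monthly_edit_counts(revisions):
    """Count edits per (page_title, 'YYYY-MM') bucket.

    `revisions` is an iterable of (page_title, timestamp) pairs, with the
    timestamp as a string like '2002-01-05 10:00:00'. This input shape is an
    assumption for illustration, not the dump's actual layout.
    """
    counts = Counter()
    for page_title, timestamp in revisions:
        parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        counts[(page_title, parsed.strftime("%Y-%m"))] += 1
    return counts


if __name__ == "__main__":
    # Toy rows, made up purely to show the call; not real Wikipedia data.
    sample = [
        ("A", "2002-01-05 10:00:00"),
        ("A", "2002-01-20 11:30:00"),
        ("A", "2002-02-01 00:00:00"),
        ("B", "2002-01-07 09:15:00"),
    ]
    for (title, month), n in sorted(monthly_edit_counts(sample).items()):
        print(title, month, n)
```

A monthly series like this could be compared against pageviews for the years
where both exist before using it to reason about the years before 2007.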

On Thu, Jan 16, 2020 at 7:52 AM Ryan Kaldari <[email protected]> wrote:

> Note that the definition of pageviews has changed several times over the
> years. Only the data from 2015 to present is strictly comparable. I'm sure
> some data analysts will chime in with more details. Good luck with your
> project!
>
> On Jan 15, 2020, at 6:59 PM, Emily Chen <[email protected]> wrote:
>
> Hi,
>
> My name is Emily Chen and I'm a Computer Science Ph.D. student at the
> University of Southern California. I tried sending this email earlier
> before I had joined the mailer, so apologies if this email was sent out
> twice! I'm currently conducting research on collective attention decay in
> Wikipedia articles that are more heavily cited by other Wikipedia articles
> within the Wikipedia ecosystem. This work builds upon the observations made
> in Candia et al.'s paper on "The universal decay of collective memory and
> attention
> <https://www.nature.com/articles/s41562-018-0474-5/>",
> and I have been using the number of page views articles receive as a proxy
> for attention.
>
> From what I can find, there is a maintained page view data set on
> dumps.wikimedia.org <https://dumps.wikimedia.org> that spans 2011 to the
> present, and statistics that Domas Mituzas began collecting in 2007, which
> run through 2016. This data seems to capture the gradual decay in an
> individual article's pageviews, but doesn't capture the initial growth of
> an article's page views. Would you happen to know if there are article page
> view statistics from the earlier years of Wikipedia (2001-2007) or if there
> are any general page view statistics from that time frame? Or would you
> happen to know who I could contact for such a dataset? It would be really
> interesting to study the temporal page view dynamics over Wikipedia's
> lifespan alongside my current work in collective attention.
>
> Thank you so much for your time!
>
> Best,
> Emily Chen
>
>
> --
> Emily Chen (echen920 [at] usc [dot] edu)
> Ph.D. Student | Computer Science
> Viterbi School of Engineering & Information Sciences Institute
> University of Southern California
>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
