Emily, I believe the pagecount data was never collected in a structured way before 2007. See for example this discussion about some archive data that took some pains to uncover: https://phabricator.wikimedia.org/T232563
If edits per article would work as a proxy for attention, or if you can somehow extrapolate from edits in combination with views, we are in the process of vetting and releasing a simple full history of editing on all wikis: https://dumps.wikimedia.org/other/mediawiki_history/readme.html

On Thu, Jan 16, 2020 at 7:52 AM Ryan Kaldari <[email protected]> wrote:

> Note that the definition of pageviews has changed several times over the
> years. Only the data from 2015 to present is strictly comparable. I'm sure
> some data analysts will chime in with more details. Good luck with your
> project!
>
> On Jan 15, 2020, at 6:59 PM, Emily Chen <[email protected]> wrote:
>
> Hi,
>
> My name is Emily Chen and I'm a Computer Science Ph.D. student at the
> University of Southern California. I tried sending this email earlier
> before I had joined the mailer, so apologies if this email was sent out
> twice! I'm currently conducting research on collective attention decay in
> Wikipedia articles that are more heavily cited by other Wikipedia articles
> within the Wikipedia ecosystem. This work builds upon the observations made
> in Candia et al.'s paper on "The universal decay of collective memory and
> attention
> <https://www.nature.com/articles/s41562-018-0474-5/>",
> and I have been using the number of page views articles receive as a proxy
> for attention.
>
> From what I can find, there is a maintained page view data set on
> dumps.wikimedia.org
> <http://dumps.wikimedia.org>
> that spans 2011-current, and statistics that Domas Mituzas began collecting
> from 2007 - 2016. This data seems to capture the gradual decay in an
> individual article's pageviews, but doesn't capture the initial growth of
> an article's page views. Would you happen to know if there are article page
> view statistics from the earlier years of Wikipedia (2001-2007) or if there
> are any general page view statistics from that time frame? Or would you
> happen to know who I could contact for such a dataset? It would be really
> interesting to study the temporal page view dynamics over Wikipedia's
> lifespan alongside my current work in collective attention.
>
> Thank you so much for your time!
>
> Best,
> Emily Chen
>
>
> --
> Emily Chen (echen920 [at] usc [dot] edu)
> Ph.D. Student | Computer Science
> Viterbi School of Engineering & Information Sciences Institute
> University of Southern California
>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
