Hello all,

Reminder that the Research Showcase will be this Wednesday. Details below.

On Fri, Nov 15, 2019 at 12:22 PM Janna Layton <jlay...@wikimedia.org> wrote:

> Hi all,
>
> The next Research Showcase will be live-streamed on Wednesday, November
> 20, 2019, at 9:30 AM PST/17:30 UTC. We’ll have a presentation from Martin
> Potthast of Leipzig University on text reuse in Wikipedia and another
> presentation from the Wikimedia Foundation’s Isaac Johnson on the
> demographics and interests of Wikipedia’s readers.
>
> YouTube stream: https://www.youtube.com/watch?v=tIko_V1k09s
>
> As usual, you can join the conversation on IRC at #wikimedia-research. You
> can also watch our past research showcases here:
> https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
>
> This month's presentations:
>
> Wikipedia Text Reuse: Within and Without
>
> By Martin Potthast, Leipzig University
>
> We study text reuse related to Wikipedia at scale by compiling the first
> corpus of text reuse cases within Wikipedia as well as without (i.e., reuse
> of Wikipedia text in a sample of the Common Crawl). To discover reuse
> beyond verbatim copy and paste, we employ state-of-the-art text reuse
> detection technology, scaling it for the first time to process the entire
> Wikipedia as part of a distributed retrieval pipeline. We further report on
> a pilot analysis of the 100 million reuse cases inside, and the 1.6 million
> reuse cases outside Wikipedia that we discovered. Text reuse inside
> Wikipedia gives rise to new tasks such as article template induction,
> fixing quality flaws, or complementing Wikipedia’s ontology. Text reuse
> outside Wikipedia yields a tangible metric for the emerging field of
> quantifying Wikipedia’s influence on the web. To foster future research
> into these tasks, and for reproducibility’s sake, the Wikipedia text reuse
> corpus and the retrieval pipeline are made freely available.
> Paper
> <https://webis.de/publications.html#?q=wikipedia%20ecir%202019>, Demo
> <https://demo.webis.de/wikipedia-text-reuse/>
>
>
> Characterizing Wikipedia Reader Demographics and Interests
>
> By Isaac Johnson, Wikimedia Foundation
>
> Building on two past surveys on the motivation and needs of Wikipedia
> readers (Why We Read Wikipedia
> <https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#November_2016>;
> Why the World Reads Wikipedia
> <https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#December_2018>),
> we examine the relationship between Wikipedia reader demographics and their
> interests and needs. Specifically, we run surveys in thirteen different
> languages that ask readers three questions about their motivation for
> reading Wikipedia (motivation, needs, and familiarity) and five questions
> about their demographics (age, gender, education, locale, and native
> language). We link these survey results with the respondents' reading
> sessions -- i.e. sequence of Wikipedia page views -- to gain a more
> fine-grained understanding of how a reader's context relates to their
> activity on Wikipedia. We find that readers have a diversity of backgrounds
> but that the high-level needs of readers do not correlate strongly with
> individual demographics. We also find, however, that there are
> relationships between demographics and specific topic interests that are
> consistent across many cultures and languages. This work provides insights
> into the reach of various Wikipedia language editions and the relationship
> between content or contributor gaps and reader gaps. See the meta page
> <https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour/Demographics_and_Wikipedia_use_cases#Reader_Surveys>
> for more details.
>
> --
> Janna Layton (she, her)
> Administrative Assistant - Product & Technology
> Wikimedia Foundation <https://wikimediafoundation.org/>

--
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>