Re: [Wikimedia-l] [Wikimedia Research Showcase] November 20, 2019 at 9:30 AM PST, 17:30 UTC

2019-11-20 Thread Janna Layton
If you have not already done so, please send us your feedback on the
Research Showcase with this survey:
https://docs.google.com/forms/d/e/1FAIpQLSecgn8cMu5IfTYRgn93bfOiJVEIL09RRf_WV0dVr6ZnJ8UU_w/viewform

The Showcase will be starting in about 30 minutes.



-- 
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation 
___
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l


Re: [Wikimedia-l] [Wikimedia Research Showcase] November 20, 2019 at 9:30 AM PST, 17:30 UTC

2019-11-18 Thread Janna Layton
Hello all,

Reminder that the Research Showcase will be this Wednesday. Details below.



-- 
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation 


[Wikimedia-l] [Wikimedia Research Showcase] November 20, 2019 at 9:30 AM PST, 17:30 UTC

2019-11-15 Thread Janna Layton
Hi all,

The next Research Showcase will be live-streamed on Wednesday, November 20,
2019, at 9:30 AM PST/17:30 UTC. We’ll have a presentation from Martin
Potthast of Leipzig University on text reuse in Wikipedia and another
presentation from the Wikimedia Foundation’s Isaac Johnson on the
demographics and interests of Wikipedia’s readers.

YouTube stream: https://www.youtube.com/watch?v=tIko_V1k09s

As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

This month's presentations:

Wikipedia Text Reuse: Within and Without

By Martin Potthast, Leipzig University

We study text reuse related to Wikipedia at scale by compiling the first
corpus of text reuse cases within Wikipedia as well as without (i.e., reuse
of Wikipedia text in a sample of the Common Crawl). To discover reuse
beyond verbatim copy and paste, we employ state-of-the-art text reuse
detection technology, scaling it for the first time to process the entire
Wikipedia as part of a distributed retrieval pipeline. We further report on
a pilot analysis of the 100 million reuse cases inside, and the 1.6 million
reuse cases outside Wikipedia that we discovered. Text reuse inside
Wikipedia gives rise to new tasks such as article template induction,
fixing quality flaws, or complementing Wikipedia’s ontology. Text reuse
outside Wikipedia yields a tangible metric for the emerging field of
quantifying Wikipedia’s influence on the web. To foster future research
into these tasks, and for reproducibility’s sake, the Wikipedia text reuse
corpus and the retrieval pipeline are made freely available. Paper, Demo


Characterizing Wikipedia Reader Demographics and Interests

By Isaac Johnson, Wikimedia Foundation

Building on two past surveys on the motivation and needs of Wikipedia
readers (Why We Read Wikipedia; Why the World Reads Wikipedia), we examine
the relationship between Wikipedia reader demographics and their
interests and needs. Specifically, we run surveys in thirteen different
languages that ask readers three questions about their motivation for
reading Wikipedia (motivation, needs, and familiarity) and five questions
about their demographics (age, gender, education, locale, and native
language). We link these survey results with the respondents' reading
sessions -- i.e., sequences of Wikipedia page views -- to gain a more
fine-grained understanding of how a reader's context relates to their
activity on Wikipedia. We find that readers have a diversity of backgrounds
but that the high-level needs of readers do not correlate strongly with
individual demographics. We also find, however, that there are
relationships between demographics and specific topic interests that are
consistent across many cultures and languages. This work provides insights
into the reach of various Wikipedia language editions and the relationship
between content or contributor gaps and reader gaps. See the meta page for
more details.

-- 
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation 