Re: [Wikimedia-l] Announcement of the Scoring Platform team

2017-07-25 Thread Aaron Halfaker
Hi Rogol,

In the blog post, I include a major section titled "Where we plan to go
next" which gives a high level overview of our plans for the next year.
There's a section right beneath that called "How to learn more and get
involved" with links to our team documentation, our technical blog, ORES'
documentation, our mailing list, and our IRC channel.

-Aaron

On Sun, Jul 23, 2017 at 3:17 AM, Rogol Domedonfors 
wrote:

> James
>
> In the light of what you say, I expect that Aaron will have no problem with
> being asked to follow his announcement on this list of "Democarizing axxess
> to AI" by posting in the same location as the original announcement a
> follow-up with a pointer to the plans and places for the community to be
> able to engage with his team in this democratic enterprise.  It seems that
> you yourself do not have this information, or if you do, that you do not
> wish to share it with the rest of us here.
>
> "Rogol"
>
> On Sun, Jul 23, 2017 at 8:46 AM, James Salsman  wrote:
>
> > Rogol, you might want to look at the history of Aaron's talk pages and
> > e.g. on Jimbotalk and various places on meta. He's been incredibly
> > receptive to suggestions and ideas from the community, moreso than
> > perhaps any other Foundation employee.
> >
> >
> > On Sun, Jul 23, 2017 at 12:59 AM, Rogol Domedonfors
> >  wrote:
> > > Aaron,
> > >
> > > You write of "Democratizing access to AI."   But it seems that what you
> > > > mean is publishing the results of your work more widely.  Do you have a
> > > > plan to democratize in the sense of involving a wider range of people in the
> > > decisions about how you work and what you work on – the wider Wikimedia
> > > Community, for example – and if so, how will you engage with that wider
> > > decision-making group?
> > >
> > > "Rogol"
> > >
> > > > On Wed, Jul 19, 2017 at 7:42 PM, Aaron Halfaker <ahalfa...@wikimedia.org>
> > > > wrote:
> > >
> > >> Hey folks,
> > >>
> > > >> This is a little overdue, but I wanted to work with comms to craft a blog
> > > >> post that would help us do a bit of outreach around the announcement of the
> > > >> team.  That just went live.
> > >>
> > >> See https://blog.wikimedia.org/2017/07/19/scoring-platform-team/
> > >>
> > >> -Aaron
> > >> Principal research scientist
> > >> Lead of the Scoring Platform team
> > >> Wikimedia Foundation
> > >> ___
> > >> Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/
> > >> wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/
> > >> wiki/Wikimedia-l
> > >> New messages to: Wikimedia-l@lists.wikimedia.org
> > >> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
> ,
> > >> <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>

[Wikimedia-l] Announcement of the Scoring Platform team

2017-07-19 Thread Aaron Halfaker
Hey folks,

This is a little overdue, but I wanted to work with comms to craft a blog
post that would help us do a bit of outreach around the announcement of the
team.  That just went live.

See https://blog.wikimedia.org/2017/07/19/scoring-platform-team/

-Aaron
Principal research scientist
Lead of the Scoring Platform team
Wikimedia Foundation


Re: [Wikimedia-l] Join my Reddit AMA about Wikipedia and ethical, transparent AI

2017-06-01 Thread Aaron Halfaker
The AMA is live.  Here's the link:
https://www.reddit.com/r/IAmA/comments/6epiid/im_the_principal_research_scientist_at_the/

On Wed, May 24, 2017 at 10:13 AM, Aaron Halfaker 
wrote:

> Hey everybody,
>
> TL;DR: I wanted to let you know about an upcoming experimental Reddit AMA
> ("ask me anything") chat we have planned. It will focus on artificial
> intelligence on Wikipedia and how we're working to counteract vandalism
> while also making life better for newcomers.
>
> We plan to hold this chat on June 1st at 21:00 UTC/14:00 PDT in the
> /r/IAmA subreddit[1]. I'd love to answer any questions you have about these
> topics, and I'll send a follow-up email to this thread shortly
> before the AMA begins.
>
> 
>
> For those who don't know who I am, I create artificial intelligences[2]
> that support the volunteers who edit Wikipedia[3]. I've been fascinated by
> the ways that crowds of volunteers build massive, high quality information
> resources like Wikipedia for over ten years.
>
> For more background, I research and then design technologies that make it
> easier to spot vandalism in Wikipedia—which helps support the hundreds of
> thousands of editors who make productive contributions. I also think a lot
> about the dynamics between communities and new users—and ways to make
> communities inviting and welcoming to both long-time community members and
> newcomers who may not be aware of community norms.  For a quick sampling of
> my work, check out my most impactful research paper about Wikipedia[3],
> some recent coverage of my work from *Wired*[4], or check out the master
> list of my projects on my WMF staff user page[5], the documentation for the
> technology team I run[9], or the home page for Wikimedia Research[8].
>
> This AMA, which I'm doing with the Foundation's Communications
> department, is somewhat of an experiment. The intended audience for this
> chat is people who might not currently be a part of our community but have
> questions about the way we work—as well as potential research collaborators
> who might want to work with our data or tools. Many may be familiar with
> Wikipedia but not the work we do as a community behind the scenes.
>
> I'll be talking about the work I'm doing with the ethics of AI and how we
> think about artificial intelligence on Wikipedia, and ways we’re working to
> counteract vandalism on the world’s largest crowdsourced source of
> knowledge—like the ORES extension[6], which you may have seen highlighting
> possibly problematic edits on your watchlist and in RecentChanges.
>
> I’d love for you to join this chat and ask questions.  If you do not or
> prefer not to use Reddit, we will also be taking questions on ORES'
> MediaWiki talk page[7] and posting answers to both threads.
>
> 1. https://www.reddit.com/r/IAmA/
> 2. https://en.wikipedia.org/wiki/Artificial_intelligence
> 2. https://www.mediawiki.org/wiki/ORES
> 3. http://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/halfaker13rise-preprint.pdf
> 4. https://www.wired.com/2015/12/wikipedia-is-using-ai-to-expand-the-ranks-of-human-editors/
> 5. https://en.wikipedia.org/wiki/User:Halfak_(WMF)
> 6. https://www.mediawiki.org/wiki/Extension:ORES
> 7. https://www.mediawiki.org/wiki/Talk:ORES
> 8. https://www.mediawiki.org/wiki/Wikimedia_Research
> 9. https://www.mediawiki.org/wiki/Wikimedia_Scoring_Platform_team
>
> -Aaron
> Principal Research Scientist @ WMF
> User:EpochFail / User:Halfak (WMF)
>

[Wikimedia-l] Join my Reddit AMA about Wikipedia and ethical, transparent AI

2017-05-24 Thread Aaron Halfaker
Hey everybody,

TL;DR: I wanted to let you know about an upcoming experimental Reddit AMA
("ask me anything") chat we have planned. It will focus on artificial
intelligence on Wikipedia and how we're working to counteract vandalism
while also making life better for newcomers.

We plan to hold this chat on June 1st at 21:00 UTC/14:00 PDT in the /r/IAmA
subreddit[1]. I'd love to answer any questions you have about these topics,
and I'll send a follow-up email to this thread shortly before
the AMA begins.



For those who don't know who I am, I create artificial intelligences[2]
that support the volunteers who edit Wikipedia[3]. I've been fascinated by
the ways that crowds of volunteers build massive, high quality information
resources like Wikipedia for over ten years.

For more background, I research and then design technologies that make it
easier to spot vandalism in Wikipedia—which helps support the hundreds of
thousands of editors who make productive contributions. I also think a lot
about the dynamics between communities and new users—and ways to make
communities inviting and welcoming to both long-time community members and
newcomers who may not be aware of community norms.  For a quick sampling of
my work, check out my most impactful research paper about Wikipedia[3],
some recent coverage of my work from *Wired*[4], or check out the master
list of my projects on my WMF staff user page[5], the documentation for the
technology team I run[9], or the home page for Wikimedia Research[8].

This AMA, which I'm doing with the Foundation's Communications
department, is somewhat of an experiment. The intended audience for this
chat is people who might not currently be a part of our community but have
questions about the way we work—as well as potential research collaborators
who might want to work with our data or tools. Many may be familiar with
Wikipedia but not the work we do as a community behind the scenes.

I'll be talking about the work I'm doing with the ethics of AI and how we
think about artificial intelligence on Wikipedia, and ways we’re working to
counteract vandalism on the world’s largest crowdsourced source of
knowledge—like the ORES extension[6], which you may have seen highlighting
possibly problematic edits on your watchlist and in RecentChanges.

I’d love for you to join this chat and ask questions.  If you do not or
prefer not to use Reddit, we will also be taking questions on ORES'
MediaWiki talk page[7] and posting answers to both threads.

1. https://www.reddit.com/r/IAmA/
2. https://en.wikipedia.org/wiki/Artificial_intelligence
2. https://www.mediawiki.org/wiki/ORES
3. http://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/halfaker13rise-preprint.pdf
4. https://www.wired.com/2015/12/wikipedia-is-using-ai-to-expand-the-ranks-of-human-editors/
5. https://en.wikipedia.org/wiki/User:Halfak_(WMF)
6. https://www.mediawiki.org/wiki/Extension:ORES
7. https://www.mediawiki.org/wiki/Talk:ORES
8. https://www.mediawiki.org/wiki/Wikimedia_Research
9. https://www.mediawiki.org/wiki/Wikimedia_Scoring_Platform_team

-Aaron
Principal Research Scientist @ WMF
User:EpochFail / User:Halfak (WMF)


Re: [Wikimedia-l] [Wiki-research-l] Research showcase: Evolution of privacy loss in Wikipedia

2016-03-19 Thread Aaron Halfaker
Reminder, this showcase is starting in 5 minutes.  See the stream here:
https://www.youtube.com/watch?v=Xle0oOFCNnk

Join us on Freenode at #wikimedia-research to ask Andrei questions.

-Aaron

On Tue, Mar 15, 2016 at 12:53 PM, Dario Taraborelli <
dtarabore...@wikimedia.org> wrote:

> This month, our research showcase hosts Andrei Rizoiu (Australian National
> University) to talk about his work on *how private traits of Wikipedia
> editors can be exposed from public data* (such as edit histories) using
> off-the-shelf machine learning techniques. (abstract below)
>
> If you're interested in learning what the combination of machine learning
> and public data means for privacy and surveillance, come and join us this
> *Wednesday March 16* at *1pm Pacific Time*.
>
> The event will be recorded and publicly streamed. As usual, we will be
> hosting the conversation with the speaker and Q&A on the
> #wikimedia-research channel on IRC.
>
> Looking forward to seeing you there,
>
> Dario
>
>
> Evolution of Privacy Loss in Wikipedia
>
> The cumulative effect of collective
> online participation has an important and adverse impact on individual
> privacy. As an online system evolves over time, new digital traces of
> individual behavior may uncover previously hidden statistical links between
> an individual’s past actions and her private traits. To quantify this
> effect, we analyze the evolution of individual privacy loss by studying
> the edit history of Wikipedia over 13 years, including more than 117,523
> different users performing 188,805,088 edits. We trace each Wikipedia
> contributor using apparently harmless features, such as the number of edits
> performed on predefined broad categories in a given time period (e.g.
> Mathematics, Culture or Nature). We show that even at this unspecific level
> of behavior description, it is possible to use off-the-shelf machine
> learning algorithms to uncover usually undisclosed personal traits, such as
> gender, religion or education. We provide empirical evidence that the
> prediction accuracy for almost all private traits consistently improves
> over time. Surprisingly, the prediction performance for users who stopped
> editing after a given time still improves. The activities performed by new
> users seem to have contributed more to this effect than additional
> activities from existing (but still active) users. Insights from this work
> should help users, system designers, and policy makers understand and make
> long-term design choices in online content creation systems.
>
>
> *Dario Taraborelli*
> Head of Research, Wikimedia Foundation
> wikimediafoundation.org • nitens.org • @readermeter
>
> ___
> Wiki-research-l mailing list
> wiki-researc...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
>


Re: [Wikimedia-l] [Wmfall] Wikimedia Foundation executive transition update

2016-03-11 Thread Aaron Halfaker
Congrats Katherine!  This is certainly welcome news. :)

On Fri, Mar 11, 2016 at 9:10 AM, Victor Grigas 
wrote:

> What a good message to wake up to!
>
> I'd like to politely suggest along with any decision making ahead, that
> the board please leave the door open to considering Katherine as permanent
> executive director of the Wikimedia Foundation.
>
>
> On Fri, Mar 11, 2016 at 5:36 AM, Tanweer Morshed 
> wrote:
>
>> Congratulations to Katherine for taking up the responsibility! And at the
>> same time, thanks to the Board and C-levels for coming up with a thoughtful
>> and acceptable decision.
>> It's great to see someone taking up the charge, who has been involved with
>> the movement for a long time.
>>
>>
>> Regards,
>> Tanweer
>>
>> On Fri, Mar 11, 2016 at 4:25 PM, Nicole Ebber 
>> wrote:
>>
>> > Congratulations, Katherine, and thank you for taking on this
>> > responsibility! I am glad to read all those congrats and wonderful
>> > emails from people involved.
>> >
>> > And thank you to the C-level team and Board for making this decision,
>> > and for making it quickly and transparently.
>> >
>> > I look forward to working with you, and to warmly welcoming you in
>> > Berlin in April!
>> >
>> > Cheers,
>> > Nicole
>> >
>> > On 11 March 2016 at 11:18, David Parreño Mont 
>> > wrote:
>> > > OMG! Many congratulations, Katherine!
>> > >
>> > > On Fri, 11 March 2016 at 4:00, Katherine Maher (<kma...@wikimedia.org>)
>> > > wrote:
>> > >
>> > >> Thank you, Patricio.
>> > >>
>> > >> I want to thank the Board for this opportunity, and for their
>> > confidence in
>> > >> the Foundation. I also want to thank community members and staff for
>> > >> continuing to be such committed advocates for our future -- your
>> passion
>> > >> and belief in our movement and purpose have been tremendous things to
>> > >> behold.
>> > >>
>> > >> As a movement, we’ve had some challenges lately. We’ve started on a
>> > process
>> > >> of change, but as Lydia Pintscher (User:Lydia Pintscher (WMDE))
>> recently
>> > >> reminded us,[1] “Change happens at the speed of trust.” We will need
>> to
>> > >> work together over these coming months to build that trust, and open
>> > >> critical lines of communication and accountability. I get the sense
>> from
>> > >> many people that that’s exactly what they’d like to do: absorb the
>> > lessons
>> > >> we’ve learned, re-engage with each other, and get back to advancing
>> our
>> > >> global movement.
>> > >>
>> > >> At the Foundation, we have an opportunity to center around our
>> values,
>> > and
>> > >> practice open and collaborative communication. During the interim
>> > period, I
>> > >> want to get things working well and improve transparency and
>> > communication,
>> > >> both internally and with the communities. We will work to create a
>> > >> supportive, fair environment where people can get things done, engage
>> > with
>> > >> their colleagues and community members, and understand how their work
>> > has
>> > >> an impact on our mission. This includes delivering on important
>> > deadlines
>> > >> for the Annual Plan and strategy,[2] filling key roles, and making
>> > progress
>> > >> on issues raised in our recent engagement survey.
>> > >>
>> > >> We are committed to delivering the first version of the 2016-2017
>> Annual
>> > >> Plan no later than April 1st for community and FDC review, and are on
>> > track
>> > >> to meet this deadline. The WMF 2016-2018 strategy development is also
>> > >> underway, with a draft version open for comments until March 18.[3]
>> Over
>> > >> the coming weeks, we’ll be moving forward with our Chief Technology
>> > Officer
>> > >> (CTO) search, and working with the Talent and Culture team to
>> reinvest
>> > in
>> > >> our culture. As new other emerge, we’ll work together to prioritize
>> > them.
>> > >>
>> > >> To accomplish all of this, we are going to need your help. I want to
>> > hear
>> > >> from you about what you would like to achieve in this interim period.
>> > This
>> > >> includes how we can collaborate together to prepare the organization
>> and
>> > >> movement to welcome our next Executive Director. The Foundation is
>> > prepared
>> > >> to actively support the Board in the search, and we will work closely
>> > with
>> > >> them to share important information and create opportunities to give
>> > >> feedback throughout the process.
>> > >>
>> > >> Just a few weeks ago, we marked the 15th birthday of the movement.[4]
>> > >> Millions of people around the world shared their love for Wikimedia.
>> It
>> > was
>> > >> a celebration of why we do what we do, and how much joy the movement
>> > brings
>> > >> people everywhere. That’s something I try to keep in mind every day.
>> > >>
>> > >> Yours sincerely,
>> > >> Katherine
>> > >>
>> > >> [1] https://twitter.com/nightrose/status/660043284841107457
>> > >> [2]
>> > >>
>> > >>
>> >
>> https://meta.wikimedia.org/wiki/2016_Strategy/Combined_strategy_and_annual_pl

Re: [Wikimedia-l] [Wmfall] Inspire Campaign on content curation & review launches today!

2016-03-07 Thread Aaron Halfaker
Just filed another. \o/

https://meta.wikimedia.org/wiki/Grants:IdeaLab/Automatic_article_topic_detection

Build a classifier that associates articles with WikiProjects. This could
be used to route new article drafts to people with the right subject matter
expertise for review. English Wikipedia, for instance, has different
guidelines depending on topic; see en:Wikipedia:Notability (academics)
<https://en.wikipedia.org/wiki/Wikipedia:Notability_%28academics%29>.
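
As a rough illustration of the idea above, here is a minimal sketch of a topic router. It is not the proposed implementation: the training examples, the WikiProject labels, and the function names (tokenize, train, route) are all made up for illustration, and a plain bag-of-words naive Bayes model stands in for whatever classifier the project would actually use.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labeled drafts: (draft text, WikiProject label).
TRAINING = [
    ("professor published research on algebra and number theory", "Mathematics"),
    ("theorem proof geometry topology lecture", "Mathematics"),
    ("striker scored goal in the league football match", "Football"),
    ("club won the cup final after penalty shootout", "Football"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label; returns (word_counts, doc_counts, vocab)."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def route(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            # Add-one smoothing so unseen words do not zero out a label.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(route("a new proof of a theorem in number theory", model))
```

A real system would train on existing WikiProject tags and would likely return several candidate projects with scores rather than a single label.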

-Aaron

On Wed, Mar 2, 2016 at 10:03 AM, Kacie Harold  wrote:

> Looking forward to all of the great ideas - thanks for your work on this,
> Chris, and for kicking off the campaign with a few submissions, Aaron.
>
> On Tue, Mar 1, 2016 at 2:14 PM, Anna Stillwell 
> wrote:
>
>> Thank you. Great work.
>> /a
>>
>> On Mon, Feb 29, 2016 at 3:59 PM, Aaron Halfaker 
>> wrote:
>>
>>> I just finished submitting two ideas that I'd like to advise.
>>>
>>>
>>> https://meta.wikimedia.org/wiki/Grants:IdeaLab/Automated_good-faith_newcomer_detection
>>> Build and deploy a machine learning model for flagging newcomers who are
>>> editing in good-faith. This has the potential to mitigate some of the
>>> secondary, demotivational effects when good-faith newcomers' work passes
>>> through curation/review processes.
>>>
>>>
>>> https://meta.wikimedia.org/wiki/Grants:IdeaLab/Fast_and_slow_new_article_review
>>> Concerns about the introduction of spam into Wikipedia have led
>>> Wikipedians towards implementing high speed new article review/curation
>>> processes. The speed at which editors tag articles for deletion via these
>>> processes is great for dealing with spam, but it might also be faster than
>>> good-faith new article creators can build their articles. We could build a
>>> machine learning classifier that is tuned to detect spammy article drafts.
>>> This would allow the new pages queue to be split into a high-speed spammy
>>> article review, and a low-speed article review that allows creators time to
>>> make a better first draft.
>>>
>>> I'll submit some more when I can.  :)
>>>
>>> On Sun, Feb 28, 2016 at 4:56 PM, Chris "Jethro" Schilling <
>>> cschill...@wikimedia.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I am pleased to announce the launch of the second Inspire Campaign for
>>>> IdeaLab.[1]  The theme of this campaign is focused on improving tasks
>>>> related to content curation & review in our projects:
>>>>
>>>> <https://meta.wikimedia.org/wiki/Grants:IdeaLab/Inspire>
>>>>
>>>> Reviewing and organizing tasks are fundamental to all Wikimedia
>>>> projects, and these efforts maintain and directly improve the quality of
>>>> our projects in addition to increasing the visibility of their content.  We
>>>> invite everyone to participate by sharing your ideas and proposals on how
>>>> to enhance these efforts. Constructive feedback and collaboration on ideas
>>>> is encouraged - your skills and advice can elevate a project into action.
>>>> The campaign runs until 29 March.
>>>>
>>>> All proposals are welcome - research projects, technical solutions,
>>>> community organizing and outreach, or something completely new! Grants are
>>>> available from the Wikimedia Foundation for projects developed during this
>>>> campaign that need financial support.[2]  Google Hangout sessions are
>>>> available in March if you'd like to have a conversation about your 
>>>> ideas.[3]
>>>>
>>>> Join the Inspire Campaign and let’s work together to improve review and
>>>> curation tasks so that we can make our content more meaningful and
>>>> accessible.
>>>>
>>>> With thanks,
>>>>
>>>> Jethro
>>>>
>>>> [1] You can learn more about the results of the first Inspire Campaign
>>>> here: <
>>>> https://meta.wikimedia.org/wiki/Research:Spring_2015_Inspire_campaign>
>>>> [2] <https://meta.wikimedia.org/wiki/Grants:Start>
>>>> [3] <https://meta.wikimedia.org/wiki/Grants:IdeaLab/Events>  (Note: If
>>>> another time would work better for you, feel free to e-mail me or ping me
>>>> on-wiki).
>>>>
>>>> ---
>>>> Chris "Jethro" Schilling
>>>> I JethroBT (WMF)
>>>> <https://meta.wikimedia.org/wiki/User:I

Re: [Wikimedia-l] [Wmfall] Inspire Campaign on content curation & review launches today!

2016-02-29 Thread Aaron Halfaker
I just finished submitting two ideas that I'd like to advise.

https://meta.wikimedia.org/wiki/Grants:IdeaLab/Automated_good-faith_newcomer_detection
Build and deploy a machine learning model for flagging newcomers who are
editing in good-faith. This has the potential to mitigate some of the
secondary, demotivational effects when good-faith newcomers' work passes
through curation/review processes.

https://meta.wikimedia.org/wiki/Grants:IdeaLab/Fast_and_slow_new_article_review
Concerns about the introduction of spam into Wikipedia have led Wikipedians
towards implementing high speed new article review/curation processes. The
speed at which editors tag articles for deletion via these processes is
great for dealing with spam, but it might also be faster than good-faith
new article creators can build their articles. We could build a machine
learning classifier that is tuned to detect spammy article drafts. This
would allow the new pages queue to be split into a high-speed spammy
article review, and a low-speed article review that allows creators time to
make a better first draft.
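
To make the queue split concrete, here is a small sketch. It assumes a spam classifier already exists and emits a probability per draft; the Draft class, the split_queue function, and the 0.8 threshold are hypothetical choices for illustration, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    title: str
    spam_probability: float  # assumed classifier output in [0, 1]


def split_queue(drafts, threshold=0.8):
    """Split drafts into a fast (likely spam) and a slow review queue.

    The threshold is an arbitrary illustrative value; in practice it
    would be tuned against labeled data.
    """
    fast, slow = [], []
    for draft in drafts:
        if draft.spam_probability >= threshold:
            fast.append(draft)
        else:
            slow.append(draft)
    # Review the most spam-like drafts first.
    fast.sort(key=lambda d: d.spam_probability, reverse=True)
    return fast, slow


drafts = [
    Draft("Buy cheap watches", 0.97),
    Draft("Local history society", 0.12),
    Draft("Best SEO agency", 0.85),
]
fast, slow = split_queue(drafts)
print([d.title for d in fast])
print([d.title for d in slow])
```

Drafts in the slow queue would simply wait longer before review, which is the point: their creators get time to improve them.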

I'll submit some more when I can.  :)

On Sun, Feb 28, 2016 at 4:56 PM, Chris "Jethro" Schilling <
cschill...@wikimedia.org> wrote:

> Hi everyone,
>
> I am pleased to announce the launch of the second Inspire Campaign for
> IdeaLab.[1]  The theme of this campaign is focused on improving tasks
> related to content curation & review in our projects:
>
> <https://meta.wikimedia.org/wiki/Grants:IdeaLab/Inspire>
>
> Reviewing and organizing tasks are fundamental to all Wikimedia projects,
> and these efforts maintain and directly improve the quality of our projects
> in addition to increasing the visibility of their content.  We invite
> everyone to participate by sharing your ideas and proposals on how to
> enhance these efforts. Constructive feedback and collaboration on ideas is
> encouraged - your skills and advice can elevate a project into action. The
> campaign runs until 29 March.
>
> All proposals are welcome - research projects, technical solutions,
> community organizing and outreach, or something completely new! Grants are
> available from the Wikimedia Foundation for projects developed during this
> campaign that need financial support.[2]  Google Hangout sessions are
> available in March if you'd like to have a conversation about your ideas.[3]
>
> Join the Inspire Campaign and let’s work together to improve review and
> curation tasks so that we can make our content more meaningful and
> accessible.
>
> With thanks,
>
> Jethro
>
> [1] You can learn more about the results of the first Inspire Campaign
> here: <
> https://meta.wikimedia.org/wiki/Research:Spring_2015_Inspire_campaign>
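> [2] <https://meta.wikimedia.org/wiki/Grants:Start>
> [3] <https://meta.wikimedia.org/wiki/Grants:IdeaLab/Events>  (Note: If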
> another time would work better for you, feel free to e-mail me or ping me
> on-wiki).
>
> ---
> Chris "Jethro" Schilling
> I JethroBT (WMF) 
> Community Organizer, Wikimedia Foundation
> 
>
> ___
> Wmfall mailing list
> wmf...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wmfall
>
>


Re: [Wikimedia-l] On toxic communities

2015-11-19 Thread Aaron Halfaker
I've started a thread on our "Revision scoring as a service" talk page
regarding labeled conversation datasets & modeling work we could do.

See
https://meta.wikimedia.org/wiki/Research_talk:Revision_scoring_as_a_service#Thread_on_Toxic_communities_from_wikimedia-l

On Sun, Nov 15, 2015 at 12:41 PM, Risker  wrote:

> I am going to quote Joseph Reagle, who responded to a similarly titled
> thread on Wiki-en-L:
>
>
> date:13 November 2015 at 13:48
>
> It's been great that Riot games has had someone like Lin (an experimental
> psychologist) to think about issues of community and abuse. And I
> appreciate that Lin has previously been so forthcoming about their
> experiences and findings.
>
> But the much trumpeted League of Legends Tribunal has been down "for
> maintenance" for months, even before this article was published, with much
> discussion by the community of how it was broken. On this, Riot and Lin
> have said nothing.
>
> Copying Joseph in case he wants to respond to some of the discussions here.
>
>
> Risker/Anne
>
> On 15 November 2015 at 10:36, Pharos  wrote:
>
> > The figure quoted is quite interesting, but do we have a comparable
> metric
> > for the Wikimedia projects? :
> >
> > "... incidences of homophobia, sexism and racism ... have fallen to a
> > combined
> > 2 percent of all games"
> >
> > 2% sounds "low", but do we indeed know if this is better or worse than
> us?
> > Would our comparable metric be the % of bigoted comments per article, per
> > talk page discussion, per time that an editor spends at the project?  I
> > would think that encountering bigoted comments on 1 in 50 discussions
> would
> > still be pretty significant.
> >
> > Thanks,
> > Pharos
> >
> > On Sun, Nov 15, 2015 at 1:21 PM, Ziko van Dijk 
> wrote:
> >
> > > Hello,
> > >
> > > Just yesterday I had a long talk with a researcher about how to define
> > > and detect trolls on Wikipedia. E.g., whether "unintentional trolling"
> > > should be included or not.
> > >
> > > In my opinion, it is not possible to detect by machine trollism,
> > > unkindness, harassment, mobbing etc., maybe with the exception of
> > > swear words. A lot of turntaking, deviation from the topic and other
> > > phenomena can be experienced by the participants as positive or as
> > > negative. You might need to ask them, and even then they might not be
> > > aware of a problem that works through in subtlety. Also, third persons
> > > not involved in the conversation can be affected negatively (look at
> > > ... page X... and you know why you don't like to contribute there).
> > >
> > > Kind regards
> > > Ziko
> > >
> > >
> > > 2015-11-15 17:40 GMT+01:00 Katherine Casey <
> fluffernutter.w...@gmail.com
> > >:
> > > > I'd be happy to offer my admin/oversighter experience and knowledge
> to
> > > help
> > > > you develop the labeling and such, Aaron! I just commented on
> Andreas's
> > > > proposal on the Community Wishlist, but to summarize here: I see a
> lot
> > of
> > > > potential pitfalls in trying to handle/generalize this with machine
> > > > learning, but I also see a lot of potential value, and I think it's
> > > > something we should be investigating.
> > > >
> > > > -Fluffernutter
> > > >
> > > > On Sun, Nov 15, 2015 at 11:32 AM, Aaron Halfaker <
> > > ahalfa...@wikimedia.org>
> > > > wrote:
> > > >
> > > >> >
> > > >> > The League of Legends team collaborated with outside scientists to
> > > >> > analyse their dataset. I would love to see the Wikimedia
> Foundation
> > > >> engage
> > > >> > in a similar research project.
> > > >>
> > > >>
> > > >> Oh!  We are!  :) When we have time. :\ One of the projects that I'd
> > > like to
> > > >> see done, but I've struggled to find the time for is a common talk
> > page
> > > >> parser[1] that could produce a dataset of talk page interactions.
> I'd
> > > like
> > > >> this dataset to be easy to join to editor outcome measures.  E.g.
> > there
> > > >> might be "aggressive" talk that we don't know is problematic until
> we
> > > see
> > > >> the kind of effect that it has o

Re: [Wikimedia-l] On toxic communities

2015-11-15 Thread Aaron Halfaker
>
> The League of Legends team collaborated with outside scientists to
> analyse their dataset. I would love to see the Wikimedia Foundation engage
> in a similar research project.


Oh!  We are!  :) When we have time. :\ One of the projects that I'd like to
see done, but have struggled to find the time for, is a common talk page
parser[1] that could produce a dataset of talk page interactions.  I'd like
this dataset to be easy to join to editor outcome measures.  E.g. there
might be "aggressive" talk that we don't know is problematic until we see
the kind of effect that it has on other conversation participants.
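
As a rough illustration of what "easy to join to editor outcome measures" could look like, here is a minimal sketch in Python.  The field names ("recipient", "label", "survived") and the shape of both datasets are assumptions made up for this example; they are not the output format of the talk page parser[1] or of any existing dataset.

```python
from collections import defaultdict


def join_interactions_to_outcomes(interactions, outcomes):
    """Attach an outcome measure to each talk page post, keyed by recipient.

    interactions: list of dicts with hypothetical keys "recipient" and "label".
    outcomes: dict mapping a user name to a dict of outcome measures,
              here just the hypothetical key "survived".
    Posts whose recipient has no recorded outcome are dropped.
    """
    joined = []
    for post in interactions:
        outcome = outcomes.get(post["recipient"])
        if outcome is None:
            continue
        joined.append({**post, **outcome})
    return joined


def survival_rate_by_label(joined):
    """Proportion of recipients who kept editing, grouped by post label."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in joined:
        totals[row["label"]] += 1
        survived[row["label"]] += int(row["survived"])
    return {label: survived[label] / totals[label] for label in totals}
```

With a join like this, a label such as "aggressive" can be compared against what actually happened to the people on the receiving end, which is the kind of question the dataset is meant to make easy to ask.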

Anyway, I want some powerful utilities and datasets out there to help
academics look into this problem more easily.  For revscoring, I'd like to
be able to take a set of talk page diffs, have them classified in Wiki
labels[2] as "aggressive" and then build a model for ORES[3] to be used
however people see fit.  You could then use ORES to do offline analysis of
discussions for research.  You could use ORES to interrupt a user
before saving a change.  I'm sure people have other clever ideas for what
to do with such a model, and I'm happy to enable them via the
service.  The hard part is getting a good dataset labeled.
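
To make the label-then-model workflow concrete, here is a toy sketch in Python.  It is a small naive Bayes text classifier with add-one smoothing, written with the standard library only.  It is a stand-in for illustration and is not the revscoring or ORES API; the function names and the (text, is_aggressive) input format are assumptions.  It also assumes the labeled set contains at least one example of each class.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately naive)."""
    return text.lower().split()


def train(labeled_diffs):
    """Build word counts per class from (text, is_aggressive) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_aggressive in labeled_diffs:
        doc_counts[is_aggressive] += 1
        word_counts[is_aggressive].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def score(model, text):
    """Return the estimated probability that the text is "aggressive"."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    log_probs = {}
    for label in (True, False):
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        log_prob = math.log(model["doc_counts"][label] / total_docs)
        for word in tokenize(text):
            if word not in model["vocab"]:
                continue  # words never seen in training carry no evidence
            log_prob += math.log((counts[word] + 1) / (total_words + vocab_size))
        log_probs[label] = log_prob
    # Normalize in a numerically stable way.
    highest = max(log_probs.values())
    weights = {label: math.exp(lp - highest) for label, lp in log_probs.items()}
    return weights[True] / (weights[True] + weights[False])
```

The same score function could serve both uses mentioned above: run over an archive of diffs for offline analysis, or called on a draft comment before it is saved.  Either way, the quality of the result depends almost entirely on the labeled data that goes into train.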

If someone wants to invest some time and energy into this, I'm happy to
work with you.  We'll need more than programming help.  We'll need a lot of
help to figure out what dimensions we'll label talk page postings by and to
do the actual labeling.

1. https://github.com/Ironholds/talk-parser
2. https://meta.wikimedia.org/wiki/Wiki_labels
3. https://meta.wikimedia.org/wiki/ORES

On Sun, Nov 15, 2015 at 6:56 AM, Andreas Kolbe  wrote:

> On Sat, Nov 14, 2015 at 9:13 PM, Benjamin Lees 
> wrote:
>
> > This article highlights the happier side of things, but it appears
> > that Lin's approach also involved completely removing bad actors:
> > "Some players have also asked why we've taken such an aggressive
> > stance when we've been focused on reform; well, the key here is that
> > for most players, reform approaches are quite effective. But, for a
> > number of players, reform attempts have been very unsuccessful which
> > forces us to remove some of these players from League entirely."[0]
> >
>
>
> Thanks for the added context, Benjamin. Of course, banning bad actors that
> they consider unreformable is something Wikipedia admins have always done
> as well.
>
> The League of Legends team began by building a dataset of interactions that
> the community considered unacceptable, and then applied machine-learning to
> that dataset.
>
> It occurs to me that the English Wikipedia has ready access to such a
> dataset: it's the totality of revision-deleted and oversighted talk page
> posts. The League of Legends team collaborated with outside scientists to
> analyse their dataset. I would love to see the Wikimedia Foundation engage
> in a similar research project.
>
> I've added this point to the community wishlist survey:
>
>
> https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey#Machine-learning_tool_to_reduce_toxic_talk_page_interactions
>
>
>
> > P.S. As Rupert noted, over 90% of LoL players are male (how much over
> > 90%?).[1] It would be interesting to know whether this percentage has
> > changed along with the improvements described in the article.
> >
>
>
> Indeed.
___
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 


Re: [Wikimedia-l] Unsolicieted email from "wikimedia research"

2015-06-27 Thread Aaron Halfaker
>
>  RCom, as far as I know has not been active in the past year or more (last
> meeting was on Dec. 22, 2011).


*RCom is not dead.   It changed into something less formal and less
hierarchical.  You can still email me and Dario to get support for your
research plans.  We'd still reconvene the committee if it looks like
that'll help. *

While RCom hasn't met in a long time, the process for subject recruitment
hasn't slowed.  We don't have a technical requirement that all recruitment
studies must follow The Process, but I have been helping researchers
document their studies and obtain feedback and sometimes consensus for more
than five years now.

Really, RCom has morphed slowly into the Research Team at the WMF + a few
interested volunteers that we can manage to pull in to help us with review
work (shout out to Daniel Mietchen, Nemo, Yaroslav & BluRasberry).  Within
the research team, we *do* have structured processes for supporting
researchers' access to data and engineering support, but subject recruitment
has been mostly left in my (volunteer time) hands.

Regretfully, I wasn't involved in the planning of this project or I would
have directed it towards best practices for minimizing disruption -- e.g. an
RFC.  I would have also pushed Leila to find a way to make posts on talk
pages work (since they are known to be generally preferable, police-able,
etc.), but I can understand why concerns around privacy might be worth
discussion.  I regret that this discussion only happened after-the-fact as
it could have informed the study design for the better.  FWIW, SuggestBot
posts recommendations on user talk pages and also does not filter for
offensive content (to my knowledge).

Finally, I think it is important to consider the source of this research
work.  Leila is not some random academic or industry researcher who is
planning to take advantage of Wikipedians for a study, but not give back.
Leila is working with a team at the WMF tasked with building better
translation tools.  She helped them design an experiment that would explore
the effectiveness of these tools so that when something is deployed, it's
actually better and we know it scientifically.  A lot of the work I do with
external researchers is to help make sure that their work has the potential
to benefit Wikipedia/Wikipedians/Wikimedia/Open knowledge.  In this case,
Leila's team is just helping the product teams engage in best practices
around empirical software change practice.   After all, every software
deployment is an experiment that is inflicted upon you without consent.  In
this case, Leila's job is making sure that we know the effect before we
deploy.

So, what I really mean to say is:

   1. You're right.  We should do this better.  We have a process and
   everyone should go through it.  It might have caught some of the issues
   that have been raised.
   2. Leila is WMF staff.  She's trying to help the WMF build better
   software for the purpose of benefiting Wikipedians.  Her team deserves some
   slack.  The alternative of not running the study is less desirable.

-Aaron

On Sat, Jun 27, 2015 at 12:56 PM, Michelle Paulson 
wrote:

> Hi All,
>
> Please see in-line below.
>
> -Michelle
>
> On Saturday, June 27, 2015, Leila Zia  wrote:
>
> > + Michelle Paulson
> >
> > On Sat, Jun 27, 2015 at 7:37 AM, Pine W wrote:
> >
> >> This issue is also being discussed on the Research mailing list.
> >>
> >> I have three questions:
> >>
> >> 1. Was this outreach method approved by RCom?
> >>
> >  No, and RCom, as far as I know has not been active in the past year or
> > more (last meeting was on Dec. 22, 2011). This is a research from the
> > Research team in the WMF.
> >
> >> 2. Email addresses are nonpublic information on-wiki unless they are
> >> proactively and publicly disclosed by users. Does the bulk collection of
> >> nonpublic email addresses in this manner and the bulk provision of those
> >> addresses to researchers for their use in this campaign violate the
> >> Wikimedia privacy policy? The policy states regarding email, "We use
> your
> >> email address to let you know about things that are happening with the
> >> Foundation, the Wikimedia Sites, or the Wikimedia movement, such as
> telling
> >> you important information about your account, letting you know if
> something
> >> is changing about the Wikimedia Sites or policies, and alerting you when
> >> there has been a change to an article that you have decided to follow."
> The
> >> bulk scraping of email addresses from account registrations for research
> >> and outreach purposes doesn't appear to be contemplated or authorized
> under
> >> the privacy policy.
> >>
> > Michelle can help with this one as this is related to Legal. Note that
> > it's weekend here and this may have to wait until Monday.
> >
>
> The research team did speak to me prior to beginning this project to ensure
> that they complied with the WMF

Re: [Wikimedia-l] Research Committee

2014-07-17 Thread Aaron Halfaker
Per directing the conversation here to wiki-research-l, I'd like to link to
a post I made in the relevant thread there that describes the history of my
work on subject recruitment support.  See
http://lists.wikimedia.org/pipermail/wiki-research-l/2014-July/003579.html

Per Lane's comments, I look forward to improving support and engagement
with researchers and removing any suggestion of WMF control from the
process.   To echo Lane's assertion: *Researchers are awesome and they need
support.*   I'd like to add that: *Wikipedians are awesome and need to be
empowered by the process.*

-Aaron


On Thu, Jul 17, 2014 at 8:05 AM, Lane Rasberry 
wrote:

> Hello,
>
> At Wikimania in London August 6-7 there is a research meetup. Some RCOM
> people will be there.
> <
>
> https://meta.wikimedia.org/wiki/Research:Labs2/Hackathons/August_6-7th,_2014
> >
> I will be there all Thursday 7 August. Research ethics oversight is not the
> priority for this group and statistics seems to be, but at least I want to
> visit this group and see what they think.
>
> I support Aaron and RCOM, and would prefer that no one blame either for
> anything. I think both are being held responsible for a lot of complicated
> issues that are beyond the scope of what they are empowered to cover. RCOM
> has some strengths and weaknesses. I wish to empower the Research Committee
> and make it known for its strengths, and to help it divest responsibilities
> for areas which it cannot manage as well and find other channels for
> dealing with whatever RCOM is unable to do.
>
> Nathan, I would be willing to talk with you by phone or video sometime if
> you like. It is not that I want to make this private, but just that text
> and email are not the same as conversations with voice. I have no
> solutions, but at least I might be able to describe the positions of
> stakeholders in research, list options, and say something about what kinds
> of actions would be conservative and what would be radical. I wish for a
> bit more community participation in research oversight, but overall, I want
> to reduce bureaucracy and gatekeeping, and I think others may wish for this
> as well. Researchers are awesome and they need support.
>
> yours,
>
>
> On Thu, Jul 17, 2014 at 11:03 AM, Aaron Halfaker 
> wrote:
>
> > Nathan,
> >
> > I plan to address those concerns on the appropriate list.  It's a public
> > list.  I'm drafting an email at the moment.  If you're interested in wiki
> > research, I encourage you to sign up to wiki-research-l.  It's relatively
> > low traffic for anyone used to wikimedia-l.
> >
> > -Aaron
> >
> >
> > On Thu, Jul 17, 2014 at 8:01 AM, Nathan  wrote:
> >
> > > Hi Aaron,
> > >
> > > Are you sure that you can't make any kind of substantive reply here on
> > this
> > > list, for the benefit of people who have been reading about it here but
> > > aren't subscribed to the wiki-research-l list? I note that you also
> have
> > > not addressed any of the concerns either on your talkpage or on the
> other
> > > list.
> > >
> > > Thanks,
> > > Nathan
> > >
> > >
> > > On Thu, Jul 17, 2014 at 10:59 AM, Aaron Halfaker <
> > ahalfa...@wikimedia.org>
> > > wrote:
> > >
> > > > Hey folks,
> > > >
> > > > I appreciate your discussion here.  However, you're unlikely to get
> any
> > > > participation from actual wiki researchers on wikimedia-l  See
> > > > wiki-research-l[1], the mailing list for discussions of research.
> > >  There's
> > > > a thread referencing this discussion here[2].  I encourage you to
> > > continue
> > > > the conversation there.
> > > >
> > > > 1. https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> > > > 2.
> > > >
> > >
> >
> http://lists.wikimedia.org/pipermail/wiki-research-l/2014-July/003570.html
> > > >
> > > > -Aaron
> > > >
> > > >
> > > > On Thu, Jul 17, 2014 at 4:52 AM, Dariusz Jemielniak <
> dar...@alk.edu.pl
> > >
> > > > wrote:
> > > >
> > > > > RCOM would perhaps be more active if there were clear terms for
> > > members?
> > > > >
> > > > > best,
> > > > >
> > > > > dj
> > > > >
> > > > >
> > > > > On Thu, Jul 17, 2014 at 12:03 PM, Craig Franklin <
> > > > > cfrank...@halonetwork.net>
> > >

Re: [Wikimedia-l] Research Committee

2014-07-17 Thread Aaron Halfaker
Nathan,

I plan to address those concerns on the appropriate list.  It's a public
list.  I'm drafting an email at the moment.  If you're interested in wiki
research, I encourage you to sign up to wiki-research-l.  It's relatively
low traffic for anyone used to wikimedia-l.

-Aaron


On Thu, Jul 17, 2014 at 8:01 AM, Nathan  wrote:

> Hi Aaron,
>
> Are you sure that you can't make any kind of substantive reply here on this
> list, for the benefit of people who have been reading about it here but
> aren't subscribed to the wiki-research-l list? I note that you also have
> not addressed any of the concerns either on your talkpage or on the other
> list.
>
> Thanks,
> Nathan
>
>
> On Thu, Jul 17, 2014 at 10:59 AM, Aaron Halfaker 
> wrote:
>
> > Hey folks,
> >
> > I appreciate your discussion here.  However, you're unlikely to get any
> > participation from actual wiki researchers on wikimedia-l  See
> > wiki-research-l[1], the mailing list for discussions of research.
>  There's
> > a thread referencing this discussion here[2].  I encourage you to
> continue
> > the conversation there.
> >
> > 1. https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> > 2.
> >
> http://lists.wikimedia.org/pipermail/wiki-research-l/2014-July/003570.html
> >
> > -Aaron
> >
> >
> > On Thu, Jul 17, 2014 at 4:52 AM, Dariusz Jemielniak 
> > wrote:
> >
> > > RCOM would perhaps be more active if there were clear terms for
> members?
> > >
> > > best,
> > >
> > > dj
> > >
> > >
> > > On Thu, Jul 17, 2014 at 12:03 PM, Craig Franklin <
> > > cfrank...@halonetwork.net>
> > > wrote:
> > >
> > > > I've spent a half hour or so going through this, and it looks like
> > Nathan
> > > > is on the money here.  If RCOM is as inactive as it seems (except
> where
> > > it
> > > > concerns the research of RCOM members) then it is no great surprise
> > that
> > > > external parties eventually try to do an end-run around it.  Unless
> an
> > > > explanation for this inactivity can be provided, I think that in its
> > > > current form RCOM should be disbanded or at least radically retooled,
> > > > because clearly it's not only ineffective, it's also preventing
> > > potentially
> > > > legitimate research from going ahead.
> > > >
> > > > Cheers,
> > > > Craig
> > > >
> > > >
> > > > On 17 July 2014 11:06, Nathan  wrote:
> > > >
> > > > > And... unsurprisingly, Aaron has reverted the changes I referred to
> > > > above.
> > > > > Not with any explanation, of course, other than "not true." Looking
> > at
> > > > the
> > > > > list of "reviewed" projects (where the review appears to
> constitute a
> > > > small
> > > > > handful of questions on the talkpage), the RCOM has reviewed a
> total
> > of
> > > > 10
> > > > > projects in its history. I'm excluding the one where Aaron himself
> > is a
> > > > > co-investigator.
> > > > >
> > > > > That might sound like a substantial amount, but in 2013 and 2014
> the
> > > rate
> > > > > so far is 1 (one) per *year*. Meanwhile, the AfD request languished
> > > for 7
> > > > > months without a peep from Aaron or someone on RCOM. Since we're on
> > the
> > > > > subject, let's look at the research index and see what we can see.
> > > > >
> > > > > # There is a "Gender Inequality Index" that has no comments from
> > RCOM,
> > > > > posted a month ago.
> > > > > # We have "Modeling monthly active editors" submitted by Aaron
> > himself.
> > > > > This is worth looking at[1] as evidently an example of what an RCOM
> > > > member
> > > > > considers sufficient description of a research project.
> Specifically,
> > > > > nothing at all.
> > > > > # "Number of books read by WikiWriters" a page written by a high
> > school
> > > > > student that should have been deleted but hasn't been, suggesting
> the
> > > > > submissions may not be closely monitored...
> > > > > # "Use of Wikipedia by doctors" submitted both to RCOM and to IEG
> in
> >

Re: [Wikimedia-l] Research Committee

2014-07-17 Thread Aaron Halfaker
Hey folks,

I appreciate your discussion here.  However, you're unlikely to get any
participation from actual wiki researchers on wikimedia-l.  See
wiki-research-l[1], the mailing list for discussions of research.  There's
a thread referencing this discussion here[2].  I encourage you to continue
the conversation there.

1. https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
2.
http://lists.wikimedia.org/pipermail/wiki-research-l/2014-July/003570.html

-Aaron


On Thu, Jul 17, 2014 at 4:52 AM, Dariusz Jemielniak 
wrote:

> RCOM would perhaps be more active if there were clear terms for members?
>
> best,
>
> dj
>
>
> On Thu, Jul 17, 2014 at 12:03 PM, Craig Franklin <
> cfrank...@halonetwork.net>
> wrote:
>
> > I've spent a half hour or so going through this, and it looks like Nathan
> > is on the money here.  If RCOM is as inactive as it seems (except where
> it
> > concerns the research of RCOM members) then it is no great surprise that
> > external parties eventually try to do an end-run around it.  Unless an
> > explanation for this inactivity can be provided, I think that in its
> > current form RCOM should be disbanded or at least radically retooled,
> > because clearly it's not only ineffective, it's also preventing
> potentially
> > legitimate research from going ahead.
> >
> > Cheers,
> > Craig
> >
> >
> > On 17 July 2014 11:06, Nathan  wrote:
> >
> > > And... unsurprisingly, Aaron has reverted the changes I referred to
> > above.
> > > Not with any explanation, of course, other than "not true." Looking at
> > the
> > > list of "reviewed" projects (where the review appears to constitute a
> > small
> > > handful of questions on the talkpage), the RCOM has reviewed a total of
> > 10
> > > projects in its history. I'm excluding the one where Aaron himself is a
> > > co-investigator.
> > >
> > > That might sound like a substantial amount, but in 2013 and 2014 the
> rate
> > > so far is 1 (one) per *year*. Meanwhile, the AfD request languished
> for 7
> > > months without a peep from Aaron or someone on RCOM. Since we're on the
> > > subject, let's look at the research index and see what we can see.
> > >
> > > # There is a "Gender Inequality Index" that has no comments from RCOM,
> > > posted a month ago.
> > > # We have "Modeling monthly active editors" submitted by Aaron himself.
> > > This is worth looking at[1] as evidently an example of what an RCOM
> > member
> > > considers sufficient description of a research project. Specifically,
> > > nothing at all.
> > > # "Number of books read by WikiWriters" a page written by a high school
> > > student that should have been deleted but hasn't been, suggesting the
> > > submissions may not be closely monitored...
> > > # "Use of Wikipedia by doctors" submitted both to RCOM and to IEG in
> > March,
> > > no comment by RCOM.
> > > # Chinese Wikivoyage, created in January, no comment by RCOM.
> > > # SSAJRP program - extensively documented, posted in October 2013, no
> > > comment from RCOM and no RCOM liaison. This research is ongoing.
> > > # Gender assymetry, posted in September 2013, no comment from RCOM.
> > > # Dynamics of inclusion and exclusion, August 2013, no comment or
> > > participation from RCOM.
> > >
> > > I'm sure the list could go on, because the pattern is perfect -
> virtually
> > > the only projects to get participation from either Dario or Aaron are
> > those
> > > managed by WMF staff members (and most often, Aaron himself is the
> > > investigator). But the inactivity of RCOM is not news to the WMF. In
> > > December of last year, Dario posted to rcom-l [2] that "The Research
> > > Committee as a group with a fixed membership and a regular meeting
> > schedule
> > > has been inactive for a very long time." He then stated that "...the
> > > existence of a fixed-membership group with a recognized authority on
> any
> > > possible matter related to Wikimedia research and associated policies
> has
> > > ceased to be a priority." Another member of RCOM, WMF employee Jonathan
> > > Morgan, said in June on meta "I'm not sure what RCOM's mandate is these
> > > days." When asked in March how many projects RCOM had actually
> approved,
> > it
> > > took Aaron four months to reply.[3]
> > >
> > > So it is factually incorrect to suggest in documentation that RCOM
> > approval
> > > is required for anything; it's clear that RCOM as a body does not
> > actually
> > > exist. It may be argued that the approval of one of the two involved
> WMF
> > > employees is required. If that's the case, then at least based on
> public
> > > evidence they have been doing an absolutely woeful job of keeping up
> with
> > > this labor. I'll admit it's possible that all of the communication has
> > been
> > > via e-mail, and in actuality Aaron and Dario have been very busy
> > providing
> > > feedback to non-WMF researchers. If that's the case, or if I'm missing
> > some
> > > other function that RCOM fulfills, I'd love to hear about it. Otherwise
> > it
> 

Re: [Wikimedia-l] The first three weeks.

2014-05-30 Thread Aaron Halfaker
Gerard, I think that the work on Commons and WikiData is freaking awesome.
 If I could clone myself I'd be digging into it immediately.  Right now,
I'm working on measurement of Wikipedias and large cross-wiki analyses.  FWIW,
I think that the wikidata games are some of the most exciting things to
happen in Wikimedia wikis in a long time.

Rui, re. the survival graphs.  Those are proportions.  Multiply by 100 to
get percentages, i.e. the line starts at about 24% and declines to about 7%.
 I'd really like to revisit this work since we've standardized some of the
measures I was using and the new, standard definitions will result in some
differences.  See
https://meta.wikimedia.org/wiki/Research:Surviving_new_editor for the
updated definition.  I'll try to schedule some time to get an updated
figure for ptwiki that goes back before 2006.
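
For anyone reading the graph values, the arithmetic is just the following.  This is a small Python sketch with made-up cohort data; the boolean-per-editor input is an assumption for illustration and is not the standardized definition linked above.

```python
def survival_proportion(cohort):
    """Share of a new-editor cohort still active later, as a value in [0, 1].

    cohort: list of booleans, one per new editor (hypothetical input format),
            True if that editor was still active in the later period.
    """
    return sum(cohort) / len(cohort)


def as_percentage(proportion):
    """Convert a proportion such as 0.25 into a percentage such as 25.0."""
    return proportion * 100
```

So a plotted value below 1 is a share of the cohort, and multiplying it by 100 gives the percentage of new editors who survived.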

-Aaron


On Fri, May 30, 2014 at 5:30 AM, Rui Correia  wrote:

> Hi Aaron
>
> This is really a treasure trove of information. I am looking forward to
> savouring it in detail. Many thanks.
>
> One question for now on Point 5: the 3rd graph with values <1 - are those
> percentages? Is the decimal notation correct?
>
> Regards,
>
> Rui
>
>
> 2014-05-30 1:52 GMT+02:00 Aaron Halfaker :
>
> > Hi Rui,
> >
> > You raised a lot of questions that I think I might be able to help
> address.
> >  I'm a research scientist working for the WMF.  My research focuses on
> the
> > nature of newcomer participation, editor motivation and value production
> in
> > Wikipedia.  See [1] and [2] (if you have the time) for my most seminal
> work
> > on the subject.
> >
> > As you'll see in the study I referenced, my work directly addresses a
> > substantial portion of the questions you've raised.  See also my team's
> > work with standardizing metrics[3] including survival measures[4] and my
> > work exploring retention trends in ptwiki[5].  See [6] for an example of
> a
> > recent, cross-language study of newcomer article creation patterns.
>  Also,
> > you might be interested in [7] since it confirms your general concerns
> > about the speed of speedy deletions.
> >
> > A lot of the work of /really understanding Wikipedia/ is only half-way
> done
> > since it takes a long time build understanding about previously
> > undocumented phenomena.  The academic community, other researchers at the
> > WMF and myself are in the middle of developing a whole field around how
> > open collaboration systems like Wikipedia work, common problems they have
> > and how they can be best supported.
> >
> > While we're developing this general knowledge about engagement,
> production
> > and retention in our communities, we (the research & data team) are also
> > working directly with product teams at the WMF to measure their impact on
> > key metrics (e.g. participation) with scientific rigor and to
> > challenge/develop/refine theory on which product strategies lead us
> toward
> > our goals and which ones do not.  See [8] and [9] for examples of such
> > studies.
> >
> > I welcome anyone who'd like to continue the conversation about what we do
> > and don't know about Wikipedia(s) to raise discussions at
> > wiki-research-l[10].  There are a lot more researchers on that list than
> > wikimedia-l.  FWIW, I tend to follow that list more closely.
> >
> > 1. Summary:
> > http://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/
> > 2. Full paper:
> >
> >
> http://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/halfaker13rise-preprint.pdf
> > 3.
> https://www.mediawiki.org/wiki/Analytics/Editor_Engagement_Vital_Signs
> > 4. https://meta.wikimedia.org/wiki/Research:Surviving_new_editor
> > 5.
> >
> >
> https://meta.wikimedia.org/wiki/Research:Ideas/Is_ptwiki_declining_like_enwiki%3F
> > 6. https://meta.wikimedia.org/wiki/Research:Wikipedia_article_creation
> > 7.
> https://meta.wikimedia.org/wiki/Research:The_Speed_of_Speedy_Deletions
> > 8.
> >
> https://meta.wikimedia.org/wiki/Research:Onboarding_new_Wikipedians/Rollout
> > 9.
> >
> >
> https://meta.wikimedia.org/wiki/Research:VisualEditor%27s_effect_on_newly_registered_editors/Results
> > 10. https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> >
> > -Aaron
> >
> >
> >
> > >
> > > > From: Rui Correia 
> > > > Subject: Re: [Wikimedia-l] The first three weeks.
> > > > Date: May 29, 2014 at 5:07:45 AM PDT
> > > > To: Wikimedia Mailing List 
> > > > Reply-To: Wikimedia 

Re: [Wikimedia-l] The first three weeks.

2014-05-29 Thread Aaron Halfaker
Hi Rui,

You raised a lot of questions that I think I might be able to help address.
 I'm a research scientist working for the WMF.  My research focuses on the
nature of newcomer participation, editor motivation and value production in
Wikipedia.  See [1] and [2] (if you have the time) for my most seminal work
on the subject.

As you'll see in the study I referenced, my work directly addresses a
substantial portion of the questions you've raised.  See also my team's
work with standardizing metrics[3] including survival measures[4] and my
work exploring retention trends in ptwiki[5].  See [6] for an example of a
recent, cross-language study of newcomer article creation patterns.  Also,
you might be interested in [7] since it confirms your general concerns
about the speed of speedy deletions.

A lot of the work of /really understanding Wikipedia/ is only half-way done
since it takes a long time to build understanding about previously
undocumented phenomena.  The academic community, other researchers at the
WMF and I are in the middle of developing a whole field around how
open collaboration systems like Wikipedia work, common problems they have
and how they can be best supported.

While we're developing this general knowledge about engagement, production
and retention in our communities, we (the research & data team) are also
working directly with product teams at the WMF to measure their impact on
key metrics (e.g. participation) with scientific rigor and to
challenge/develop/refine theory on which product strategies lead us toward
our goals and which ones do not.  See [8] and [9] for examples of such
studies.

I welcome anyone who'd like to continue the conversation about what we do
and don't know about Wikipedia(s) to raise discussions at
wiki-research-l[10].  There are a lot more researchers on that list than
wikimedia-l.  FWIW, I tend to follow that list more closely.

1. Summary:
http://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/
2. Full paper:
http://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/halfaker13rise-preprint.pdf
3. https://www.mediawiki.org/wiki/Analytics/Editor_Engagement_Vital_Signs
4. https://meta.wikimedia.org/wiki/Research:Surviving_new_editor
5.
https://meta.wikimedia.org/wiki/Research:Ideas/Is_ptwiki_declining_like_enwiki%3F
6. https://meta.wikimedia.org/wiki/Research:Wikipedia_article_creation
7. https://meta.wikimedia.org/wiki/Research:The_Speed_of_Speedy_Deletions
8.
https://meta.wikimedia.org/wiki/Research:Onboarding_new_Wikipedians/Rollout
9.
https://meta.wikimedia.org/wiki/Research:VisualEditor%27s_effect_on_newly_registered_editors/Results
10. https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

-Aaron



>
> > From: Rui Correia 
> > Subject: Re: [Wikimedia-l] The first three weeks.
> > Date: May 29, 2014 at 5:07:45 AM PDT
> > To: Wikimedia Mailing List 
> > Reply-To: Wikimedia Mailing List 
> >
> > Hi James
> >
> > Do we have any figures on retention of new editors? How long does the
> > average new editor stay? What percentage of new editors stays on for 6
> > months; one year; two years? Do we have these figures for all languages?
> >
> > New editors should be allowed space to grow. Wikipedia is so rich in
> > developing all kinds of scripts, templates etc, that it would be easy to
> > create something to inform others that someone is a new editor. Pages by
> > new editors should be left alone for a day or two. There is nothing more
> > disheartening than getting all excited about contributing only to find
> that
> > someone comes along and either deletes your first attempt or nominates it
> > for deletion. I've have seen this happen WITHIN MINUTES of the seminal
> > version being posted, followed up by 'warnings' on the editor's talk
> page.
> > I've seen edits reverted because the formatting of the source was wrong.
> It
> > should be a basic pillar that before reverting, we see if we can improve/
> > fix the problem. Undoing a newcomer's work and leaving something like
> > WP:MOS as an edit summary is not helpful - if you are going to cite a WP
> > policy, then do so by pointing directly to the specific page where the
> new
> > editor can read about it. I know it is time-consuming to fill in edit
> > summaries, especially if one is doing a series of identical edits to a
> > whole lot of pages. But we can use technology to speed this up - on a
> blank
> > edit summary, a prompt will suggest earlier text and you can select an
> > applicable one. On an edit summary with a reference to the section of the
> > page this does not work - so we need to find a way around this, like
> > splitting the field.
> >
> > No amount of ink about how welcoming WP is to new editors, IT IS NOT. For
> > reference, this section has some interesting facts,
> > https://en.wikipedia.org/wiki/Wikipedia#Contributors.
> >
> > We are also losing established editors, mostly because of edit warring.
> > There are blocks coalescing around all kinds of theme