[Wiki-research-l] Re: What's your favorite text about general research frameworks?

2022-02-06 Thread Tilman Bayer
Also consider the widely used textbook by Creswell & Creswell, "Research
Design: Qualitative, Quantitative, and Mixed Methods Approaches" (5th
Edition, ISBN 978-1506386706), as a general reference for the social
sciences with relevance to much of the research literature about Wikipedia.

It's pretty long and comprehensive (perhaps overly so for some purposes),
with e.g.
* entire chapters about how to do a literature review and on how to use
theory
* detailed checklists for various research designs (such as the two
reproduced here, for surveys and experiments),
* and "recipes" for writing research study proposals and papers in various
contexts.

The book emphasises the importance of identifying the particular
"philosophical worldview" guiding the choice of research approach
(qualitative, quantitative or mixed-methods) and other aspects of a
particular research project. In chapter 1 (available online for the 4th
edition), Creswell & Creswell describe four such worldviews in detail, which
I personally found quite useful in keeping track of the different beliefs
and assumptions underlying research publications about Wikipedia from
various fields:

*1. Postpositivism (aka the scientific method)*
characterized by an emphasis on causality, the reduction of ideas and
theories to research questions and testable hypotheses, etc.
"The postpositivist assumptions have represented the traditional form of
research, and these assumptions hold true more for quantitative research
than qualitative research. This worldview is sometimes called the
scientific method, or doing science research. It is also called
positivist/postpositivist research, empirical science, and postpositivism."

*2. (Social) Constructivism*
"typically seen as an approach to qualitative research" (such as
ethnography or case studies), emphasizing the social construction of meaning
"The goal of the research is to rely as much as possible on the
participants’ views of the situation being studied", with subjective
meanings "formed through interaction with others (hence social
constructivism) and through historical and cultural norms that operate in
individuals’ lives."

*3. "The Transformative Worldview"*
"This position arose during the 1980s and 1990s from individuals who felt
that the postpositivist assumptions imposed structural laws and theories
that did not fit marginalized individuals in our society or issues of power
and social justice, discrimination, and oppression that needed to be
addressed. There is no uniform body of literature characterizing this
worldview, but it includes groups of researchers that are critical
theorists; participatory action researchers; Marxists; feminists; racial
and ethnic minorities; persons with disabilities; indigenous and
postcolonial peoples; and members of the lesbian, gay, bisexual,
transsexual, and queer communities. [...] these inquirers felt that the
constructivist stance did not go far enough in advocating for an
action agenda to help marginalized peoples."
"Transformative research uses a program theory of beliefs about how a
program works and why the problems of oppression, domination, and power
relationships exist."

*4. "The Pragmatic Worldview"*
(kind of a pick-and-choose stance about worldviews, which the authors
appear to sympathize with)
"Pragmatism is not committed to any one system of philosophy and reality.
This applies to mixed methods research in that inquirers draw liberally
from both quantitative and qualitative assumptions when they engage in
their research." "Truth is what works at the time. It is not based in a
duality between reality independent of the mind or within the mind."

Regards, Tilman

On Thu, Feb 3, 2022 at 8:28 AM Andrew Green  wrote:

> Hi all,
>
> I hope this is the right place to ask this question!
>
> I was wondering if folks who are doing (or are interested in) research
> about Wikipedia might like to share texts that they feel best describe
> the general research frameworks they use (or might like to use).
>
> I'd love to hear about any texts you like, regardless of format
> (textbook, paper, general reference, blog post, etc.).
>
> It seems a lot of work about Wikipedia uses approaches from
> Computational Social Science. The main references I have for that are
> [1] and [2].
>
> I'm especially interested in links between Computational Social Science
> and frameworks from more traditional social sciences and cognitive science.
>
> Many thanks in advance! :) Cheers,
> Andrew
>
> [1] Cioffi-Revilla, C. (2017) /Introduction to Computational Social
> Science. Principles and Applications. Second Edition./ Cham,
> Switzerland: Springer.
>
> [2] Melnik, R. (ed.) (2015) /Mathematical and Computational Modeling.
> With Applications in Natural and Social Sciences, Engineering, and the
> Arts/. Hoboken, 

Re: [Wiki-research-l] Asperges, ADHD and editors

2020-04-08 Thread Tilman Bayer
The results of that Amichai-Hamburger et al. study were later questioned,
see
https://meta.wikimedia.org/wiki/Research:Newsletter/2013/March#Grumpiness_due_to_a_%22serious_typographical_error%22

Here is another one that studied Big Five personality traits:
https://meta.wikimedia.org/wiki/Research:Newsletter/2017/February#%22Relationship_between_personality_and_attitudes_to_Wikipedia%22
Our reviewer noted a lack of statistical power, however.

This (personal, non-scientific) essay may be worth reading:
https://guillaumepaumier.com/2015/07/29/autistic-wikipedian/

Lastly (for those reading along here), OP has since created
https://meta.wikimedia.org/wiki/Research:Editing_and_Neurotypes , and it
has been noted that in a Dutch survey, "One in eight editors (13%) say they
have an autism spectrum disorder" (
https://commons.wikimedia.org/wiki/File:EN_-_Report_survey_editors_Dutch_language_Wikipedia_2018.pdf
 ).

Regards, HaeB

On Thu, Apr 2, 2020 at 10:59 AM Finn Aarup Nielsen  wrote:

> For the reviews that summarized research some time ago:
>
> The people's encyclopedia under the gaze of the sages: A systematic
> review of scholarly research on Wikipedia (page 56)
> https://orbit.dtu.dk/ws/files/52914302/SSRN_id2021326.pdf
>
> Wikipedia research and tools: Review and comments (page 23)
> http://www.imm.dtu.dk/pubdb/views/edoc_download.php/6012/pdf/imm6012.pdf
>
> we only found the Amichai-Hamburger et al. study (or at least I only
> recall finding):
>
> Yair Amichai-Hamburger, Naama Lamdan, Rinat Madiel, and Tsahi Hayat.
> Personality characteristics of wikipedia members. CyberPsychology &
> Behavior, 11(6):679–681, December 2008.
>
> I summarized it with:
>
> Application of the Big Five Inventory and Real-Me personality
> questionnaires to 139 Wikipedia and non-Wikipedia users. The recruitment
> was based on targeted posting of links. Wikipedians scored lower on
> agreeableness and higher on openness. Differences in extroversion and
> conscientiousness depended on the sex of the subject.
>
>
> My guess is that Wikipedians are not disproportionately ADHD, perhaps
> the reverse.
>
>
> Finn Årup Nielsen
>
>
> On 4/2/20 5:49 PM, RhinosF1 - wrote:
> > Evening all,
> >
> > I hope everyone is doing well given the crazy world we’re living in.
> >
> > I was having a conversation with a few users on Discord today and we were
> > wondering whether Wikimedia users (or users of other similar sites would
> > be fine) disproportionately fall into the category of having Asperger's,
> > ADHD and other similar conditions.
> >
> > It would be even better if anyone knew what sort of areas these users
> > were more likely to work in.
> >
> > Following a chat with Issac in #wikimedia-research, I understand there
> > isn’t much support for this kind of research as users may not want to
> > reveal this information and there is no clear reason for collecting the
> > information but if anyone knows of past research or has any information,
> > that would be helpful.
> >
> > Stay Safe,
> > RhinosF1
> >
>
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>


Re: [Wiki-research-l] Statistics on reverted edits

2020-01-31 Thread Tilman Bayer
Concerning 1) and about analyzing reverts in general, see
https://meta.wikimedia.org/wiki/Research:Revert .

To explore 5), https://meta.wikimedia.org/wiki/AbuseFilter and
https://tools.wmflabs.org/ptwikis/Filters:enwiki may be of interest.
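
As a rough illustration of the "identity revert" idea that such analyses
commonly rely on (an edit counts as a revert if it restores a page to the
exact state of an earlier revision), here is a minimal Python sketch. It is
not taken from the page above: the function names, the (rev_id, sha1) input
format and the default lookback radius of 15 revisions are all assumptions
made for the example.

```python
def find_identity_reverts(revisions, radius=15):
    """Detect identity reverts in one page's history.

    `revisions` is a chronologically ordered list of (rev_id, sha1) pairs
    (a hypothetical input format chosen for this sketch). A revision counts
    as a revert if its checksum equals that of one of the previous `radius`
    revisions and at least one other revision lies in between.

    Returns a list of (reverting_id, reverted_to_id, reverted_ids) tuples.
    """
    reverts = []
    for i, (rev_id, sha1) in enumerate(revisions):
        # Look back from the most recent earlier revision, at most `radius` steps.
        for j in range(i - 1, max(i - radius, 0) - 1, -1):
            if revisions[j][1] == sha1:
                reverted_ids = [r[0] for r in revisions[j + 1:i]]
                if reverted_ids:  # identical neighbours are null edits, not reverts
                    reverts.append((rev_id, revisions[j][0], reverted_ids))
                break
    return reverts


def reverted_share(revisions, radius=15):
    """Fraction of revisions that were undone by a later identity revert."""
    reverted = set()
    for _, _, reverted_ids in find_identity_reverts(revisions, radius):
        reverted.update(reverted_ids)
    return len(reverted) / len(revisions) if revisions else 0.0


# Toy history: revision 3 restores the content of revision 1, undoing revision 2.
history = [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "c")]
print(find_identity_reverts(history))
print(reverted_share(history))
```

Dividing the number of reverted revisions by the total number of revisions
gives a reverted share for the sample; the figure this yields for real data
would depend on the sample and on the chosen radius.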

Regards, HaeB

On Wed, Jan 29, 2020 at 12:01 PM Su-Laine Brodsky 
wrote:

> Hi everyone,
>
> I’m looking for statistics about the edits that are reverted on the
> English Wikipedia. This is for purposes of explaining to the public what
> Wikipedia’s quality control processes are like. If hard numbers aren’t
> available, I’m also interested in educated guesstimates.
>
> 1) An often-quoted statistic is that 7% of edits are reverted. Is this
> still believed to be true?
>
> 2) According to
> https://blog.wikimedia.org/2017/07/19/scoring-platform-team/, 2.5% of
> edits are vandalism. There are other common reasons for reverting, and I’m
> wondering if anyone has studied their frequency. Does anyone know what
> percentage of all edits are reverted for being:
> a) Spam (as perceived by the reverter)
> b) Copyright violation
> c) Violations of the Biographies of Living Persons policy
>
> 3) Do statistics on the number of edits per day on the English Wikipedia
> (i.e. 164,000 edits per day) include edits that are blocked by the spam
> blacklists or by edit filters?
>
> 4) How many edits per day on the English Wikipedia are prevented (blocked)
> by the spam blacklists?
>
> 5) How many edits per day on the English Wikipedia are prevented by the
> edit filters?
>
> 6) What percentage of all reverts are made by users of Huggle and Stiki?
>
> 7) What proportion of vandalism is quickly reverted? A 2007 study
> (Priedhorsky et al) found that 42% of vandalistic contributions are
> repaired within one view and 70% within ten views - have any newer studies
> been done on this?
>
> Thanks in advance!
>
> Su-Laine
> Vancouver, BC
>
>


Re: [Wiki-research-l] [Wikimedia-l] Remember Wikipedia Zero.. Where is the research about the effects of its demise?

2019-12-08 Thread Tilman Bayer
It's a reasonable question, for which the Wiki-research-l mailing list
(CCed) might be a better venue.

There is some data at
https://commons.wikimedia.org/wiki/File:Wikimedia_Foundation_Audiences_Metrics_%26_Insights_Q1_2018-19.pdf
(not a full analysis, highlighting just two example countries)

Regards, HaeB

On Sun, Nov 24, 2019 at 11:19 PM Gerard Meijssen 
wrote:

> Hoi,
> The BBC shows how dramatically expensive internet is in Africa. For what
> are, in my opinion, local political reasons, Wikipedia Zero has been
> terminated. That is OK up to a point; the point being that we understand
> the consequences of this action.
>
> Given that our data is NOT local, people have to pay a premium. What are we
> going to do to compensate for the expensive Wikipedia access that replaced
> Wikipedia Zero? Did we study the effects or are we not interested in the
> consequences of our actions?
> Thanks,
> GerardM
>
> https://www.bbc.co.uk/news/world-africa-50516888
> ___
> Wikimedia-l mailing list, guidelines at:
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> https://meta.wikimedia.org/wiki/Wikimedia-l
> New messages to: wikimedi...@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> 


Re: [Wiki-research-l] [Wikimedia-l] Page views of male/female biographies?

2018-12-03 Thread Tilman Bayer
Hi Micru,

in general, there may be better venues to ask this kind of question, e.g.
the Wiki-research-l and Gendergap mailing lists (both CCed). But for a
partial answer, the paper by Marit Hinnosaar reviewed here looks at these
stats (if not their long-term trend):
https://meta.wikimedia.org/wiki/Research:Newsletter/2015/December#Does_advertising_the_gender_gap_help_or_hurt_Wikipedia?

E.g. "On a typical (median) day in September 2014, no one read 26 percent
of the biographies of men versus only 16 percent of the biographies of
women."

On Wed, Nov 28, 2018 at 3:35 AM David Cuenca Tudela 
wrote:

> Hi,
>
> Are there any statistics that track the evolution of page views of
> male/female biographies in the different Wikipedias?
>
> Regards,
> Micru



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] [Analytics] Beeline as Hive client

2018-10-02 Thread Tilman Bayer
There is an update
<https://wikitech.wikimedia.org/w/index.php?title=Analytics/Systems/Cluster/Hive/Queries=1804753=1804337>
about
this:

"...hive is officially deprecated in favor of beeline, but as of October
2018, the Analytics team does not recommend migrating to it. The hive client
still has significantly better error reporting and a few other advantages."

On Thu, Apr 21, 2016 at 5:06 PM, Tilman Bayer  wrote:

> Thanks for making it easier to use Beeline and for setting up the
> documentation! Curious and excited about stat1004 ;)
>
> On Thu, Apr 21, 2016 at 9:38 AM, Madhumitha Viswanathan
>  wrote:
> > Hi all,
> >
> > For all Hive users using stat1002/1004, you might have seen a deprecation
> > warning when you launch the hive client - that claims it's being replaced
> > with Beeline. The Beeline shell has always been available to use, but it
> > required supplying a database connection string every time, which was
> > pretty annoying. We now have a wrapper script set up to make this easier.
> > The old Hive CLI will continue to exist, but we encourage moving over to
> > Beeline.
> > You can use it by logging into the stat1002/1004 boxes as usual, and
> > launching `beeline`.
> >
> > There is some documentation on this here:
> > https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Beeline.
> >
> > If you run into any issues using this interface, please ping us on the
> > Analytics list or #wikimedia-analytics or file a bug on Phabricator.
> >
> > (If you are wondering stat1004 whaaat - there should be an announcement
> > coming up about it soon!)
> >
> > Best,
> >
> > --Madhu :)
> >
> > _______
> > Analytics mailing list
> > analyt...@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/analytics
> >
>
>
>





Re: [Wiki-research-l] Upcoming Research Newsletter: New Papers Open For Review

2018-09-22 Thread Tilman Bayer
Hi Masssly,

thanks! But that's actually the old Etherpad for the previous issue - as I
said earlier today/yesterday, the new pad should be up at
https://etherpad.wikimedia.org/p/WRN201809  shortly. Perhaps you could send
a followup notice with the corrected link then. For now I have put a note
on top of the old pad.

On Sat, Sep 22, 2018 at 4:28 PM, Mohammed Sadat Abdulai 
wrote:

> Hi everyone,
> We’re preparing for the August 2018 research newsletter and looking for
> contributors. Please take a look at
> https://etherpad.wikimedia.org/p/WRN201808 and add your name next to any
> paper you are interested in covering. Our target publication date is on
> September 28 UTC although actual publication might happen several days
> later. As usual, short notes and one-paragraph reviews are most welcome.
>
>
> Highlights from this month:
>
> - "Sharing Small Pieces of the World": Increasing and Broadening
>   Participation in Wikimedia Commons
> - A Wikia Census: Motives, Tools and Insights
> - Characterizing the Triggering Phenomenon in Wikipedia
> - Comparative Analysis of the Informativeness and Encyclopedic Style
>   of the Popular Web Information Sources
> - Do less active participants make active participants more active? An
>   examination of Chinese Wikipedia
> - Do We All Talk Before We Type?: Understanding Collaboration in
>   Wikipedia Language Editions
> - Evaluating Wikipedia as a Source of Information for Disease
>   Understanding
> - Neural Article Pair Modeling for Wikipedia Sub-article Matching
> - The Battle for Wikipedia: The New Age of ‘Lost Victories’?
> - The impact of news exposure on collective attention in the United
>   States during the 2016 Zika epidemic
> - University Students in the Educational Field and Wikipedia Vandalism
> - What is the Commons Worth? Estimating the Value of Wikimedia Imagery
>   by Observing Downstream Use
>
>
> Masssly, Tilman Bayer and Dario Taraborelli
>
> [1] http://meta.wikimedia.org/wiki/Research:Newsletter
>
>




Re: [Wiki-research-l] where did I read about predicting user conflicts?

2018-09-17 Thread Tilman Bayer
Maybe it was this research?
https://blog.wikimedia.org/2018/06/13/conversations-gone-awry/

Or perhaps you were recalling the talk page research summarized in this
year's "State of Wikimedia Research"
<https://wikimania2018.wikimedia.org/wiki/Program/State_of_Wikimedia_Research_2017-2018>
Wikimania presentation? https://mako.cc/talks/201807-wikimania_research.pdf

On Sun, Sep 16, 2018 at 2:27 AM, Kerry Raymond 
wrote:

> Some time in the last few months (possibly at Wikimania) someone pointed me
> at some research about predicting the outcome of Wikipedia consensus
> building from the language participants were using in Talk. I think it was
> either research in progress or recently completed.
>
> As I recall, the main "take home" message was that discussions where "you"
> started to be used tended to end up in conflict and that discussions that
> avoided "you" were more likely to resolve amicably.
>
> If this rings any bells for you, can you please point me at it?
>
> Thanks
>
> Kerry
>





Re: [Wiki-research-l] "State of Wikimedia Research" presentation at Wikimania 2018

2018-07-27 Thread Tilman Bayer
Thanks, Pine! The slide deck (with notes) for this presentation is at
https://mako.cc/talks/201807-wikimania_research.pdf .
And a general reminder that for monthly and daily (instead of yearly)
research updates, you are welcome to subscribe to our newsletter and
Twitter feed:  https://meta.wikimedia.org/wiki/Research:Newsletter /
https://twitter.com/WikiResearch

On Fri, Jul 27, 2018 at 1:26 PM, Pine W  wrote:

> In case people are interested: https://www.youtube.com/watch?v=pE2UQu3r6vE
>
>
> The topics covered include:
> * Media and images
> * Talk page debates
> * Comparisons of Wikipedia language editions
> * Who is not participating?
> * Wikipedia as a source of data
>
>
> Pine
> ( https://meta.wikimedia.org/wiki/User:Pine )





[Wiki-research-l] Upcoming research newsletter: new papers open for review

2018-01-21 Thread Tilman Bayer
Resending the email below, as it does not seem to have made it to the
inboxes of several recipients - including myself - even though it is recorded
in the list's archives
<https://lists.wikimedia.org/pipermail/wiki-research-l/2018-January/date.html>.

---
From: masssly at ymail.com
Subject: [Wiki-research-l] Upcoming research newsletter: new papers open
for review
Date: Fri Jan 19 20:12:18 UTC 2018

Hi everyone,
We’re preparing for the January 2018 research newsletter and looking for
contributors. Please take a look at:
https://etherpad.wikimedia.org/p/WRN201801 and add your name next to any
paper you are interested in covering. Our target publication date is on
January 26 UTC. As usual, short notes and one-paragraph reviews are most
welcome.
Highlights from this month:
• Can conference papers have information value through Wikipedia? An
investigation of four engineering fields
• Collaborative Approach to Developing a Multilingual Ontology: A Case
Study of Wikidata
• Determining Quality of Articles in Polish Wikipedia Based on Linguistic
Features
• Emo, Love, and God: Making Sense of Urban Dictionary, a Crowd-Sourced
Online Dictionary
• Fostering Public Good Contributions with Symbolic Awards: A Large-Scale
Natural Field Experiment at Wikipedia
• Knowledge categorization affects popularity and quality of Wikipedia
articles
• The Conceptual Correspondence between the Encyclopaedia and Wikipedia
• The Wisdom of Polarized Crowds
• Use of Louisiana's Digital Cultural Heritage by Wikipedians
• What Makes Wikipedia's Volunteer Editors Volunteer?
• Wikipedia-integrated publishing: a comparison of successful models
If you have any question about the format or process feel free to get in
touch off-list.
Masssly, Tilman Bayer and Dario Taraborelli
[1] http://meta.wikimedia.org/wiki/Research:Newsletter




Re: [Wiki-research-l] stat1002 and stat1003 deprecated. Please use new stat boxes

2017-07-19 Thread Tilman Bayer
Thanks Andrew! Having wiki documentation pages for each of the new servers
would be great (like we currently have
https://wikitech.wikimedia.org/wiki/Stat1003 and
https://wikitech.wikimedia.org/wiki/Stat1002 ). Also, I'm not sure if/where
their SSH fingerprints have been published already (cf.
https://phabricator.wikimedia.org/T162972 ).

On Tue, Jul 18, 2017 at 10:31 AM, Andrew Otto <o...@wikimedia.org> wrote:

> Hi all!
>
> tl;dr: Stop using stat100[23] by September 1st.
>
> We’re finally replacing stat1002 and stat1003.  These boxes are out of
> warranty, and are running Ubuntu Trusty, while most of the production fleet
> is already on Debian Jessie or even Debian Stretch.
>
> stat1005 is the new stat1002 replacement.  If you have access to stat1002,
> you also have access to stat1005.  I’ve copied over home directories from
> stat1002.
>
> stat1006 is the new stat1003 replacement.  If you have access to stat1003,
> you also have access to stat1006.  I’ve copied over home directories from
> stat1003.
>
> I have not migrated any personal cron jobs running on stat1002 or
> stat1003.  I need your help for this!
>
> Both of these boxes are running Debian Stretch.  As such, packages that
> your work depends on may have upgraded.  Please log into the new boxes and
> try stuff out!  If you find anything that doesn’t work, please let me know
> by commenting on https://phabricator.wikimedia.org/T152712.
>
> Please be fully migrated to the new nodes by September 1st.  This will
> give us enough time to fully decommission stat1002 and stat1003 by the end
> of this quarter.
>
> I’ve only done a single rsync of home directories.  If there is new data
> on stat1002 or stat1003 that you want rsynced over, let me know on the
> ticket.
>
> A few notes:
> - stat1002 used to have /a.  This has been removed in favor of /srv.  /a
> no longer exists.
> - Home directories are now much larger.  You no longer need to create
> personal directories in /srv.
> - /tmp is still small, so please be careful.  If you are running long jobs
> that generate temporary data, please have those jobs write into your home
> directory, rather than /tmp.
> - We might implement user home directory quotas in the future.
>
> Thanks all!  I’ll send another email in about a month’s time to remind you
> of the impending deadline of Sept 1.
>
> -Andrew Otto
>
>
>




Re: [Wiki-research-l] [Analytics] Kaggle competition to forecast Wikipedia article traffic

2017-07-18 Thread Tilman Bayer
Note that the contest is invitation only and the training dataset (based on
our PV data) is only accessible if you have a Kaggle account (and possibly
you need to have an invite for that too).


On Tue, Jul 18, 2017 at 10:34 AM, Dario Taraborelli <
dtarabore...@wikimedia.org> wrote:

> Wanted to make sure everyone saw this challenge announced by Kaggle:
>
> https://www.kaggle.com/c/web-traffic-time-series-forecasting
> https://twitter.com/kaggle/status/887093338117201923
>
> The timeline:
>
>
> - September 1st, 2017 - Deadline to accept competition rules.
> - September 1st, 2017 - Team Merger deadline. This is the last day
>   participants may join or merge teams.
> - September 1st, 2017 - Final dataset is released.
> - September 10th, 2017 - Final submission deadline.
>
> Competition winners will be revealed after November 10, 2017.
>
> Dario
>
> --
>
> *Dario Taraborelli*
> Director, Head of Research, Wikimedia Foundation
> wikimediafoundation.org • nitens.org • @readermeter
> <http://twitter.com/readermeter>
>




[Wiki-research-l] Fwd: [Wikimedia-l] Wikimedia and other 60+ organizations launch the Initiative for Open Citations (I4OC)

2017-04-06 Thread Tilman Bayer
Forwarding exciting news that might be of interest to some on this list as
well.

-- Forwarded message --
From: Dario Taraborelli <dtarabore...@wikimedia.org>
Date: Thu, Apr 6, 2017 at 11:11 AM
Subject: [Wikimedia-l] Wikimedia and other 60+ organizations launch the
Initiative for Open Citations (I4OC)
To: Wikimedia Mailing List <wikimedi...@lists.wikimedia.org>, Open Access
discussions <openacc...@lists.wikimedia.org>, wikicite-discuss <
wikicite-disc...@wikimedia.org>


Hey all,

I wanted to let you know that we launched an initiative this morning
called: Initiative for Open Citations <https://i4oc.org/> (I4OC).

Prior to the launch of I4OC, only 1% of scholarly papers made citation data
available in the open. Today, that number has jumped to 40%. We're proud to
make a growing piece of fundamental data for open knowledge available to
everyone, with no copyright restriction whatsoever.

The I4OC has been in the making for the past 6 months, with lots of
individual discussions with scholarly publishers, asking them to flip the
switch and release this data. Wikimedia Deutschland, Wikimedia UK, the Wiki
Edu Foundation, the Internet Archive, Mozilla, PLOS and many other open
knowledge and open data organizations are among the official endorsers of
the initiative.

You can read more about this initiative on a post
<https://blog.wikimedia.org/2017/04/06/initiative-for-open-citations/> we
published this morning on the Wikimedia Blog, on the joint press release
<https://i4oc.org/press.html>, or follow @i4oc_org
<https://twitter.com/i4oc_org> for more updates.

Best,
Dario



--

*Dario Taraborelli*
Director, Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
<http://twitter.com/readermeter>





Re: [Wiki-research-l] [Wikimedia-l] Termodynamics and social capital

2017-03-19 Thread Tilman Bayer
Hi John,

the Wiki-research-l mailing list (CCed) is usually a better place to
ask such questions than Wikimedia-l.

Without having taken a look at the book you mention, here are two
pointers to research that might be related:

https://meta.wikimedia.org/wiki/Research:Newsletter/2012/April#cite_ref-27
("Wikipedia as a thermodynamic system - becoming more efficient over
time")
https://meta.wikimedia.org/wiki/Research:Newsletter/2015/September#More_newbies_mean_more_conflict.2C_but_extreme_tolerance_can_still_achieve_eternal_peace

On Thu, Mar 16, 2017 at 9:29 AM, John Erling Blad <jeb...@gmail.com> wrote:
> Has anyone tried to use thermodynamics on social capital within Wikipedia?
> Overinvestment in social capital and negative specific heat might create
> unstable systems; that is, people will leave the community.
>
> There is a book on the topic; A Dynamic Balance: Social Capital and
> Sustainable Community Development





Re: [Wiki-research-l] Request: Studies of external impacts of Wikipedia

2017-01-26 Thread Tilman Bayer
For a concrete quantitative estimate of the economic benefit
(technically, consumer
surplus <https://en.wikipedia.org/wiki/Economic_surplus>), albeit outdated,
probably too low, and not peer-reviewed, see
https://meta.wikimedia.org/wiki/Research:Newsletter/2013/March#Estimate_for_economic_benefit_of_Wikipedia:_.2450_million_by_2006_already
(The Economist cited the aforementioned Shane Greenstein, who "thinks
Wikipedia accounted for up to $50m of that surplus" as of 2006 - in other
words, Wikipedia provides a good that otherwise people would be willing to
buy, spending $50m on it that instead they get to spend on something else.)

Tangentially, the methodology of this research is also interesting, as it
tried to put price tags on the benefit provided by a small, specific slice
of Wikipedia content (images of bestseller authors on enwiki):
https://meta.wikimedia.org/wiki/Research:Newsletter/2015/April#Excessive_copyright_terms_proven_to_be_a_cost_for_society.2C_via_English_Wikipedia_images

On Tue, Jan 24, 2017 at 2:19 PM, Aaron Halfaker <ahalfa...@wikimedia.org>
wrote:

> Wikipedia has probably had some substantial external impacts.  Are there
> any studies quantifying them?  Maybe increased scientific literacy?  Or
> maybe GDP rises with access to Wikipedia?
>
> Are there any studies that have explored how Wikipedia has affected
> economic or social issues?
>
> I'm looking for any references you've got.
>
> -Aaron
>


-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] Chapters

2017-01-10 Thread Tilman Bayer
This 2010 conference paper by Leonhard Dobusch and Sigrid Quack compared
the global affiliate network of the Wikimedia Foundation and Creative
Commons, based on many interviews with (on the Wikimedia side) chapter
members:
http://wikis.fu-berlin.de/download/attachments/59080767/Dobusch-Quack-Paper.pdf

On Sun, Jan 8, 2017 at 5:41 PM, Aisha Brady <aishabr...@gmail.com> wrote:

> Hi!
>
> Could anyone point me towards any papers relevant to Wikimedia chapters
> (how they function, the work they do, whether they have been successful or
> otherwise)?
>
> Thank you! :)
>
> Aisha
>


-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] [Analytics] Question about data mining of the "Articles for Deletion" queues

2016-11-22 Thread Tilman Bayer
These papers by Jodi Schneider et al. might be of interest too:
https://meta.wikimedia.org/wiki/Research:Newsletter/2013/May#In_brief
https://meta.wikimedia.org/wiki/Research:Newsletter/2012/September#cite_ref-11
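
For the acronym-counting part of Jane's question quoted below, a rough sketch in Python may help as a starting point. It assumes the deletion discussion comments have already been extracted as plain strings; the regular expression (all-caps tokens, optionally prefixed with "WP:") and the function name are illustrative assumptions, not something taken from the papers above, and the sample comments are made up.

```python
import re
from collections import Counter

# Assumed definition of an "acronym": two or more capital letters,
# optionally written as a policy shortcut such as "WP:GNG".
SHORTCUT = re.compile(r"\b(?:WP:)?[A-Z]{2,}\b")


def count_acronyms(comments):
    """Count shortcut-like tokens across an iterable of comment strings."""
    counts = Counter()
    for text in comments:
        for token in SHORTCUT.findall(text):
            # Fold "WP:GNG" and "GNG" into the same key.
            counts[token.split(":")[-1]] += 1
    return counts


# Made-up sample comments, for illustration only.
sample = [
    "Delete per WP:GNG and WP:NOT.",
    "Keep, meets GNG; see also IAR.",
]
acronym_counts = count_acronyms(sample)
print(acronym_counts.most_common())
```

On real AfD text this pattern would also pick up ordinary all-caps words, so the resulting list would need manual filtering.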

On Tue, Nov 22, 2016 at 10:25 AM, Jonathan Morgan <jmor...@wikimedia.org>
wrote:

> +research-l because this is more of a research than an analytics question.
>
> Hi Jane,
>
> What do you mean by acronyms in deletion queues here? Are you talking
> about policy links used to justify !votes in deletion discussions, or
> acronyms used in deletion comments of AfD'd articles? Or something else
> entirely.
>
> If #1, this paper
> <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.835.9466&rep=rep1&type=pdf>
> examines the use of a single policy (IAR) in AfD's over time.
> If #2, I did a similar (quick and dirty) analysis with AfC recently, here:
> https://quarry.wmflabs.org/query/13341
>
> Others may be aware of additional resources or analyses.
>
> Best,
> Jonathan
>
> On Tue, Nov 22, 2016 at 7:10 AM, Jane Darnell <jane...@gmail.com> wrote:
>
>> Hi all,
>> Has anyone tried to find the frequency of acronyms used in AfD queues?
>> Any information about the deletion queue in language is welcome, thanks.
>>
>> This came up during a discussion about "encyclopedia worthiness" and how
>> to explain this concept to newbies.
>> Jane
>>
>> ___
>> Analytics mailing list
>> analyt...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>
>
> --
> Jonathan T. Morgan
> Senior Design Researcher
> Wikimedia Foundation
> User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
>
>


-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] Make Software: Change the world

2016-11-10 Thread Tilman Bayer
PS, the meetup page about that 2012 session with some more
information: 
https://en.wikipedia.org/wiki/Wikipedia:Meetup/Computer_History_Museum

On Thu, Nov 10, 2016 at 1:57 AM, Tilman Bayer <tba...@wikimedia.org> wrote:
> Hi Alexandre,
>
> yes, Andrew Lih (CCed) worked with them on this, and preparations
> included a brainstorming session held by CHM in Mountain View in
> December 2012, which about 15 local Wikipedians (including myself)
> attended.
>
> On Wed, Nov 9, 2016 at 5:19 AM, Alexandre Hocquet
> <alexandre.hocq...@univ-lorraine.fr> wrote:
>> Dear Wiki-Research members,
>>
>> Apologies if this has been debated before : I came across that The Computer
>> History Museum in Mountain View, CA will be presenting a new exhibition from
>> January on called "Make Software: Change the world" that focuses on seven
>> "game changing applications" and among them : Wikipedia
>>
>> http://www.computerhistory.org/exhibits/makesoftware/
>>
>>
>> Has somebody on the list worked with the project or is aware of how
>> Wikipedia is presented in the exhibit ?
>>
>> Yours,
>>
>> --
>> ***
>> Alexandre Hocquet
>>
>> Université de Lorraine & Archives Henri Poincaré
>> alexandre.hocq...@univ-lorraine.fr
>> http://poincare.univ-lorraine.fr/fr/membre-titulaire/alexandre-hocquet
>> ***
>>
> --
> Tilman Bayer
> Senior Analyst
> Wikimedia Foundation
> IRC (Freenode): HaeB



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Make Software: Change the world

2016-11-10 Thread Tilman Bayer
Hi Alexandre,

yes, Andrew Lih (CCed) worked with them on this, and preparations
included a brainstorming session held by CHM in Mountain View in
December 2012, which about 15 local Wikipedians (including myself)
attended.

On Wed, Nov 9, 2016 at 5:19 AM, Alexandre Hocquet
<alexandre.hocq...@univ-lorraine.fr> wrote:
> Dear Wiki-Research members,
>
> Apologies if this has been debated before : I came across that The Computer
> History Museum in Mountain View, CA will be presenting a new exhibition from
> January on called "Make Software: Change the world" that focuses on seven
> "game changing applications" and among them : Wikipedia
>
> http://www.computerhistory.org/exhibits/makesoftware/
>
>
> Has somebody on the list worked with the project or is aware of how
> Wikipedia is presented in the exhibit ?
>
> Yours,
>
> --
> ***
> Alexandre Hocquet
>
> Université de Lorraine & Archives Henri Poincaré
> alexandre.hocq...@univ-lorraine.fr
> http://poincare.univ-lorraine.fr/fr/membre-titulaire/alexandre-hocquet
> ***
>



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Wikipedia video stats ?

2016-11-03 Thread Tilman Bayer
Hi Trilce,

some data exists about video views, although AFAIK it's not available
in the form of a nice online tool. See
https://wikitech.wikimedia.org/wiki/Analytics/Data/Mediacounts

On Mon, Oct 31, 2016 at 5:34 AM, Trilce Navarrete
<trilce.navarr...@gmail.com> wrote:
> Dear all,
>
> I'm doing some research on the use of image and video in Wikipedia and would
> like to know if there is any way to track # of video views in Wikipedia
> articles ?
>
> Image view per page I use the GLAM tools, but for video, I'm not sure if
> there is a tool or general Wikipedia stat on # of videos currently used in
> all languages, # of Wikipedia articles containing video and # of views to
> this pages.
>
> I understand use of video online is exploding, and wondered if the wiki had
> stats on this as well.
>
> your feedback will be most appreciated !
> thanks much in advance
> Trilce
>
> --
> :..::...::..::...::..:
> Trilce Navarrete
>
> m: +31 (0)6 244 84998 | s: trilcen | t: @trilcenavarrete
> w: trilcenavarrete.com
>



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Research on automatically created articles

2016-09-13 Thread Tilman Bayer
The new issue of the Wikimedia Research Newsletter contains a review
of the paper by Denny, also mentioning the debate on this list and on ANI:
https://blog.wikimedia.org/2016/09/12/research-newsletter-august-2016/
(there is some additional discussion in the comments there and on the
talk page of the Signpost version)

On Tue, Aug 23, 2016 at 8:20 PM, Stuart A. Yeates <syea...@gmail.com> wrote:
> For the sake of completeness, the archival URL for the thread at ANI is
>
> https://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/IncidentArchive931#Moving_discussion_from_wikimedia_research_mailing_list
>
> cheers
> stuart
>
> --
> ...let us be heard from red core to black sky
>
> On Tue, Aug 16, 2016 at 7:04 AM, Samuel Klein <meta...@gmail.com> wrote:
>>
>> Thanks Sidd for responding actively in this thread.
>>
>> The biggest problem here: the algorithms used in this research were bad.
>> They produced nonsense that wasn't remotely grammatical.  You should have
>> caught most of these problems.  (The early version of the bot (for just
>> plays) had a poor success rate as well, but it seemed plausible that a
>> template for tiny play articles could be effectively filled out with
>> automation.)
>>
>> Two interesting results IMO:
>>  + A nonsensical article with a decent first sentence & sections, and refs
>> (however random), can serve as encouragement to write a real article.
>> Possibly more of an encouragement than just the first sentence alone.  I
>> believe there's some related research into how people respond to cold emails
>> that include mistakes & nonsense.  (Surely there's a more effective /
>> non-offensive way to produce similar results)
>>  + We could use even a naive measure of the coverage & consistency of new
>> article review.  (If it drops below a certain threshold, we could do
>> something like change the background color & search-engine metadata for
>> pages that haven't been properly reviewed yet)
>>
>> For future researchers:
>> If we encourage people to spend more time making tools work – rather than
>> doing something simple (even counterproductive) and writing a paper about it
>> – everyone will benefit.  The main namespace is full of bots, both fully
>> automatic and requiring a human to run them. Anyone considering or
>> implementing wiki automation should look at them and talk to the community
>> of bot maintainers.
>>
>> Sam
>>
>> On Mon, Aug 15, 2016 at 1:28 PM, siddhartha banerjee <sidd2...@gmail.com>
>> wrote:
>>>
>>> Ziko,
>>>
>>> Thanks for your detailed email. Agree on all the comments.
>>>
>>> Some earlier comments might have been harsh, but I understand that there
>>> is a valid reason behind it and also the dedication of so many people
>>> involved to help reach Wikipedia where it is today.
>>>
>>> We should have been more diligent in finding out policies and rules
>>> (including IRB) before entering content on Wikipedia. We promise not to
>>> repeat anything of this sort in the future and also I am trying to summarize
>>> all that has been discussed here to prevent such unpleasant experiences from
>>> other researchers in this area.
>>>
>>> -- Sidd
>>>
>>
>>
>>
>> --
>> Samuel Klein  @metasj   w:user:sj  +1 617 529 4266
>>



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Looking for Wikipedia search queries

2016-08-17 Thread Tilman Bayer
CCing the WMF Search and Discovery mailing list
(https://lists.wikimedia.org/mailman/listinfo/discovery )

On Wed, Aug 17, 2016 at 6:00 AM, Felix Engelmann
<fengelm...@uni-koblenz.de> wrote:
> Hi everybody,
>
> I’m currently writing my bachelor thesis at University Koblenz, Germany. The
> goal is to improve Wikipedia search by exploiting the text structure of
> Wikipedia articles. To conduct unbiased user studies I need real-world
> queries so I can compare the novel algorithms against the currently used ones.
> Are there any existing query logs which I can use for this purpose?
>
> Thanks for your help!
> Felix Engelmann



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] [Analytics] Beeline as Hive client

2016-04-21 Thread Tilman Bayer
Thanks for making it easier to use Beeline and for setting up the
documentation! Curious and excited about stat1004 ;)

On Thu, Apr 21, 2016 at 9:38 AM, Madhumitha Viswanathan
<mviswanat...@wikimedia.org> wrote:
> Hi all,
>
> For all Hive users using stat1002/1004, you might have seen a deprecation
> warning when you launch the hive client - that claims it's being replaced
> with Beeline. The Beeline shell has always been available to use, but it
> required supplying a database connection string every time, which was pretty
> annoying. We now have a wrapper script setup to make this easier. The old
> Hive CLI will continue to exist, but we encourage moving over to Beeline.
> You can use it by logging into the stat1002/1004 boxes as usual, and
> launching `beeline`.
>
> There is some documentation on this here:
> https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Beeline.
>
> If you run into any issues using this interface, please ping us on the
> Analytics list or #wikimedia-analytics or file a bug on Phabricator.
>
> (If you are wondering stat1004 whaaat - there should be an announcement
> coming up about it soon!)
>
> Best,
>
> --Madhu :)
>



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] [Wikimedia-l] Gender gap on "classical" encyclopedias

2016-04-20 Thread Tilman Bayer
On Wed, Apr 20, 2016 at 12:39 AM,  <alexhin...@gmail.com> wrote:
> Hi, as some of you may know, the Wikipedia gender indicator [1] tells us how 
> many articles are biographies about women x language/country/culture.
>
> In order to compare these numbers... Does anyone know if there is an existing
> comparison with gender balance in classical encyclopedias? (Britannica, 
> Larousse...) or, if not, could someone prepare a WD query about it?
>
> I think it could be a good argument for us to use: e.g "at cawiki 12% of bios 
> are about women, compared to 5% in GEC, Our most famous encyclopedia".
>
> We could compare it also for thematic encyclopedias or other databases
> existing in projects like Mix and match.
>
> Can someone help? thanks in advance
>
>
> [1]http://wigi.wmflabs.org/
>
>
> Àlex Hinojo
> User:Kippelboy
> Amical Wikimedia Programme manager

Interesting question. There may be more suitable venues for it, e.g.
the research mailing list (CCed). Anyway, to start with two examples:

http://reagle.org/joseph/pelican/social/gender-bias-in-wikipedia-and-britannica.html

https://meta.wikimedia.org/wiki/Research:Newsletter/2015/May#Notable_women_.22slightly_overrepresented.22_.28not_underrepresented.29_on_Wikipedia.2C_but_the_Smurfette_principle_still_holds
Comparison of Wikipedia with, among other sources, "Human
Accomplishment", a 2003 "ranking of geniuses throughout the ages and
around the world based on their prominence in contemporary
encyclopedias" (NYT)


-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Fwd: [Wikitech-l] statistics about frequent section titles

2016-03-02 Thread Tilman Bayer



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Upcoming research newsletter (December 2015): new papers open for review

2016-01-02 Thread Tilman Bayer
PS: Note that publication of the newsletter's December issue had to be
postponed by one week to Wednesday January 6 or a bit later. (It's
tied to the Signpost's publication schedule, whose December 23 issue
was skipped with the December 30 issue going out ahead of time
instead, too early for the research newsletter.)

We could still use reviewers for this issue - e.g. I would love it if
someone could provide an informed view of the "Conflict and
Computation on Wikipedia" paper (which intriguingly "suggests that
policy-makers may be limited in their ability to manage conflict, and
that bad actors and exogenous shocks are less effective in causing
conflict than is generally believed"). In any case, apologies to our
readers for the delay.

On Sat, Dec 26, 2015 at 2:12 AM,  <mass...@ymail.com> wrote:
> Hi everybody,
>
> We’re preparing for the December 2015 research newsletter and looking for
> contributors. Please take a look at:
> https://etherpad.wikimedia.org/p/WRN201512 and add your name next to any
> paper you are interested in covering. Our target publication date is
> Wednesday December 30 UTC although actual publication might happen several
> days later. As usual, short notes and one-paragraph reviews are most
> welcome.
>
> Highlights from this month:
>
>
> * Accidental Technologist: How Can Libraries Improve Wikipedia?
> * Artificial intelligence service gives Wikipedians ‘X-ray specs’ to see
>   through bad edits
> * Conflict and Computation on Wikipedia: a Finite-State Machine Analysis of
>   Editor Interactions
> * Evolution of Privacy Loss in Wikipedia
> * Extracting Semantics from Unconstrained Navigation on Wikipedia
> * Information-seeking behaviour for epilepsy: an infodemiological study of
>   searches for Wikipedia articles
> * Integrated Parallel Sentence and Fragment Extraction from Comparable
>   Corpora: A Case Study on Chinese--Japanese Wikipedia
> * Les discussions Wikipedia : un corpus pour caractériser le genre "(wiki)
>   discussion"
> * Mapping bilateral information interests using the activity of Wikipedia
>   editors
> * Microtext Normalization using Probably-Phonetically-Similar Word Discovery
> * Mining Wikipedia to Rank Rock Guitarist
> * Only 2-4% of UK 12-15 year olds use Wikipedia as first stop for information
> * Open Collaboration Systems Research Workshop 2015 Report
> * Teachers' use of Wikipedia with their Students
> * The implications of Wikipedia for contemporary science education: Using
>   Social Network Analysis Techniques for Automatic Organisation of Knowledge
> * Understanding the Role of Participative Web within Collaborative Culture:
>   The Case of Wikipedia
> * Untangling Performance from Success
> * Wikidata: A platform for data integration and dissemination for the life
>   sciences and beyond
> * Wikipedia Ranking of World Universities
> * Wikipedia, sociology, and the promise and pitfalls of Big Data
> * Wikipedia: The difference between information acquisition and learning
>   knowledge
> * Wikis and Collaborative Systems for Large Formal Mathematics
>
>
> If you have any question about the format or process feel free to get in
> touch off-list.
> Masssly, Tilman Bayer and Dario Taraborelli
>
> [1] http://meta.wikimedia.org/wiki/Research:Newsletter



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Research-related activities at Wikimania 2016

2015-11-28 Thread Tilman Bayer
Hi Daniel,

thanks for your work on this! Could it be clarified whether "research"
in this context means

1. research about Wikipedia (and other Wikimedia projects) as
published in the academic literature or by Wikimedia volunteers, WMF
staff etc., or

2. research in general (about everything), as covered on Wikimedia
projects -in particular, Wikipedia as an outreach medium for science,
researchers as expert contributors, etc.?

From the proposed guest speaker list and the "What we would like to
learn/teach" sections, I assume that 2. is meant. There's an
intersection area with 1. of course, but it's relatively small in my
experience. Both are very important topics of course, so it might be
worth creating a separate track for 1.

(I also posted this on the talk page, so feel free to reply there instead.)

On Sat, Nov 21, 2015 at 8:20 PM, Daniel Mietchen
<daniel.mietc...@googlemail.com> wrote:
>
> Dear all,
>
> we have started to collect ideas on how to cover research-related
> topics at the next Wikimania:
> https://meta.wikimedia.org/wiki/Wikimania_2016_bids/Esino_Lario/Program/Liaisons#Research
> .
>
> Your contributions there would be most welcome.
>
> Thanks and cheers,
>
> Daniel
>




-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] [Analytics] Does StackExchange have more monthly active users than Wikipedia?

2015-11-13 Thread Tilman Bayer
Joel Spolsky explained his comparison - which was already mentioned on
this list (Analytics-l) on September 17 - a bit more here:
https://www.youtube.com/watch?v=bvEAuSHJOBU&t=2216
TLDL: it's indeed about the entire Stack Exchange network vs. the
English Wikipedia (i.e. not about the number from Nemo's query), and
they chose this metric for the closest possible comparison - but still
maintain that posting a question or answer is a larger unit of work
than the average WP edit.

On Fri, Nov 13, 2015 at 11:04 AM, Jonathan Morgan <jmor...@wikimedia.org> wrote:
> +research
>
> Fascinating. Thanks for sharing this, Nemo. And for setting those arrogant
> Stackers straight ;)
>
> For anyone else interested: Nemo was able to answer this question because
> StackExchange has a Quarry-like public query interface of their own. You
> should go play with it right now: http://data.stackexchange.com/
>
> Jonathan
>
>
>
> On Fri, Nov 13, 2015 at 10:56 AM, Federico Leva (Nemo) <nemow...@gmail.com>
> wrote:
>>
>> Some information at
>> https://meta.stackexchange.com/questions/269334/how-many-active-users-contributors-does-stack-overflow-stack-exchange-have/
>>
>> TL;DR: not really, and definitely not StackOverflow alone (~14k). But
>> perhaps the whole StackExchange has more than the English Wikipedia alone.
>>
>> Nemo
>>
>
>
>
>
> --
> Jonathan T. Morgan
> Senior Design Researcher
> Wikimedia Foundation
> User:Jmorgan (WMF)
>
>



-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] analyzing 50 billion Wikipedia pageviews in 5 seconds (w/BigQuery)

2015-07-16 Thread Tilman Bayer
Thanks Felipe! Yes, I think this is a really interesting tool to explore.

Another quick example:
List articles attract between 2% and 3% of pageviews on the English Wikipedia:

SELECT SUM(requests)
FROM [fh-bigquery:wikipedia.pagecounts_20150715_14]
WHERE LEFT(TITLE, 8) = 'List_of_'
AND language = 'en'

244091

SELECT SUM(requests)
FROM [fh-bigquery:wikipedia.pagecounts_20150715_14]
WHERE language = 'en'

8870277

(Caveats: during one hour this Wednesday, using the old pageview
definition, i.e. not excluding spiders and bots, and relying on the
article name instead of categories.)
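
The share follows directly from the two sums above. As a minimal sketch in Python, with both numbers copied from the query results and the variable names chosen purely for illustration:

```python
# Sums copied from the two query results above (one hour of pagecounts data).
list_requests = 244091    # requests to English titles starting with "List_of_"
total_requests = 8870277  # all requests with language = 'en'

share = list_requests / total_requests
print(f"List articles: {share:.2%} of English Wikipedia pageviews")
# prints "List articles: 2.75% of English Wikipedia pageviews"
```

The same caveats apply: this is a single hour of data under the old pageview definition, so the figure is only a rough indication.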

I understand Felipe has already been talking to the WMF Analytics
team, who are making major progress on
https://phabricator.wikimedia.org/T44259 currently.

On Thu, Jul 16, 2015 at 11:32 AM, Felipe Hoffa <felipe.ho...@gmail.com> wrote:
> Hi! I'm currently attending Wikimania (I have a session on Friday at
> 4:30pm).
>
> Tilman Bayer suggested sharing this tool and techniques here, so I am
> following his advice :).
>
> I've been using Google BigQuery for a while to analyze Wikipedia's publicly
> available data. Its main advantages:
>
> - It's unbelievably fast (try it - operations that you might expect to run
> in minutes or hours run in seconds).
> - It's secure, but you can also instantly share data (no need to download
> and set up locally before being able to analyze - BigQuery is always on).
> - Everyone can use BigQuery with a free quota of 1TB of monthly analysis.
>
> Interesting links:
>
> - Quick getting started:
> https://www.reddit.com/r/bigquery/comments/3dg9le/analyzing_50_billion_wikipedia_pageviews_in_5/
> - Analyzing the gender gap in Wikipedia (Freebase, and joining it with
> pageviews): https://www.youtube.com/watch?v=lV5vk3higvA
> - Massive Geo-Ip geolocation from the changelog:
> https://www.reddit.com/r/bigquery/comments/1zh7ty/massive_geoip_geolocation_with_google_bigquery/
> - Just for fun, the most popular numbers:
> https://www.reddit.com/r/bigquery/comments/2p0vz4/query_of_the_day_the_most_popular_numbers_in/
> - Top Wikipedia Entries Which Are Most-Edited by Members of the U.S.
> Congress http://minimaxir.com/2014/07/caucus-needed/
> - Music recommendations:
> http://apassant.net/2014/07/11/music-recommendations-300m-data-points-sql/
>
> I have a couple of other interesting examples I haven't written about, but the
> invitation here is for you to try your own :).
>
> My main challenge today: How to get more publicly available data into
> BigQuery. Let's work together :). I'm sitting around the big data analytics
> team today at the Wikimedia hackathon - and as said earlier, I'll do a
> session on this topic on Friday at 4:30pm.
>
> Thanks!





-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] a cautious note on gender stats Re: Fwd: [Gendergap] Wikipedia readers

2015-03-05 Thread Tilman Bayer
users have to set that user preference if they want the word user next to
their nick show up in female instead of male grammatical gender form (e.g.
Benutzerin vs. Benutzer in German) - male users do not have that
incentive.


-- 
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] this month's research newsletter

2014-07-29 Thread Tilman Bayer
On Wed, Jul 2, 2014 at 7:39 AM, Edward Saperia e...@wikimanialondon.org wrote:

 On 2 July 2014 15:37, Oliver Keyes oke...@wikimedia.org wrote:

 I feel like that might be a bit short-notice - papers need to be
 submitted, reviewed or voted on, so on and so forth. But it could be lovely
 to have a 'best presentation' award for WM itself!


 Well, we could pick from things featured in the research newsletter, for
 example? How do you imagine the winner to be chosen? We can always do
 something more structured for next year. But this might be a good way to
 launch the idea of a research award.

 Ed

Not an award, but it seems worth mentioning
https://wikimania2014.wikimedia.org/wiki/Submissions/The_State_of_Wikimedia_Scholarship_2013-2014
here ...

(Anyone who is going to be in London and has ideas or feedback about
the newsletter: don't hesitate to say hi ;)


 On 2 July 2014 10:33, Edward Saperia e...@wikimanialondon.org wrote:


 I really like the idea of some kind of annual award.


 If someone puts it together before Wikimania, I can put it into the
 closing ceremony?

 Edward Saperia
 Conference Director Wikimania London
 email • facebook • twitter • 07796955572
 133-135 Bethnal Green Road, E2 7DG



 On 2 July 2014 10:15, Aaron Halfaker aaron.halfa...@gmail.com wrote:

 Given that it seems we agree with Piotr's desire for research about
 Wikipedia to lead to useful tools and insights that can be directly applied
 to making Wikipedia and other wikis better, what might be a more effective
 strategy for encouraging researchers to engage with us or at least release
 their work in forms that we can more easily work with?

 Here's a couple of half-baked ideas:

 * Wiki research impact task force -- contacts authors to encourage them
   to release code/datasets/etc. and praise them publicly when they do --
   could be part of the work of newsletter reviewers.  There are many
   researchers on this list who work directly with Wikimedians to make sure
   that their research has direct impact and their awesomeness is worth our
   appreciation and public recognition.
 * Yearly research award -- for the most directly impactful research
   projects/researchers similar to
   https://meta.wikimedia.org/wiki/Research:Wikimedia_France_Research_Award.
   One of the focuses of the judging could be the direct impact that the work
   has had.

 -Aaron


 On Wed, Jul 2, 2014 at 7:05 AM, Heather Ford hfor...@gmail.com wrote:

 Apologies. You're right, Han-Teng. The reviewer looks to be Piotr
 Konieczny who I think is on this mailing list?

 Heather Ford
 Oxford Internet Institute Doctoral Programme
 EthnographyMatters | Oxford Digital Ethnography Group
 http://hblog.org | @hfordsa




 On 2 July 2014 12:58, h hant...@gmail.com wrote:

 Heather, I am not sure who contributed that. Probably not Nemo. If
 this issue of the newsletter is correctly attributed, the contributors
 include: Taha Yasseri, Maximilian Klein, Piotr Konieczny, Kim Osman, and
 Tilman Bayer. My suggestion is only a personal one, and I am not sure if
 it is against policies to make a few edits once the newsletter is out.

 Thanks again to the contributors of the newsletter, my life is a bit
 easier and more interesting because of your work.



 2014-07-02 15:35 GMT+07:00 Heather Ford hfor...@gmail.com:

 +1 Thanks for your really thoughtful comments, Joe, Han-Teng.

 Nemo, would you be willing to add a note to the review and/or
 contact the researcher?

 Best,
 Heather.

 Heather Ford
 Oxford Internet Institute Doctoral Programme
 EthnographyMatters | Oxford Digital Ethnography Group
 http://hblog.org | @hfordsa




 On 2 July 2014 05:17, h hant...@gmail.com wrote:

 The tone of the sentence in question

 'it is disappointing that the main purpose appears to be
 completing a thesis, with little thought to actually improving
 Wikipedia'

 could have been written as

 'It would be more useful for the Wikipedia community of
 practice if the author discussed or even spelled out the implications
 of the research for improving Wikipedia.'

 This suggestion is based on my own impression that
 [Wiki-research-l] has mainly two groups of readers: community of
 practice and community of knowledge. It is okay to have some group
 tensions for creative/critical inputs. Still, a neutral tone is better
 for assessment, and an encouraging tone might work a bit better to
 encourage others to fill the *gaps* (both practice and knowledge ones).

 Also, factors such as the originally intended audience and word
 limits may determine how much a writer can do for *due weight*
 (similar to [[WP:due]]). If the original (academic) author failed to
 address the implications for practice satisfactorily, a research
 newsletter contributor can point out what s/he thinks the
 potential/actual implications are. (My thanks to the research
 newsletter's voluntary contributors for their unpaid work!)

 While I understand that the monthly research newsletter has its
 own

[Wiki-research-l] The Wikimedia Research Newsletter 4(3) is out

2014-04-01 Thread Tilman Bayer
The March 2014 issue of the Wikimedia Research Newsletter is out:

https://meta.wikimedia.org/wiki/Research:Newsletter/2014/March

Contents:

1 Cross-language study of conflict on Wikipedia
2 The social construction of knowledge on English Wikipedia
3 User hierarchy map: Building Wikipedia's Org Chart
4 Briefly
4.1 Extracting machine-readable data from Wiktionary
4.2 Wikipedia as a source of proper names in various languages
4.3 Wikipedia and Machine Translation: killing two birds with one stone
4.4 Knowledge Construction in Wikipedia: A Systemic-Constructivist Analysis
4.5 Younger librarians more supportive of Wikipedia
4.6 Preparing and publishing Wikipedia articles are a good tool to
train project management, teamwork and peer reviewed publishing
processes in life sciences
4.7 Networked Grounded Theory analysis of views on the use of
Wikipedia in education
4.8 Risk factors and control of hospital acquired infections: a
comparison between Wikipedia and scientific literature
4.9 How a country's broadband connectivity and Wikipedia coverage are related

*** 12 publications were covered in this issue ***
Thanks to Federico Leva, Scott Hale, Kim Osman, Jonathan Morgan, Piotr
Konieczny, Niklas Laxström and James Heilman for contributing.

Tilman Bayer and Dario Taraborelli

--
Wikimedia Research Newsletter
https://meta.wikimedia.org/wiki/Research:Newsletter/

* Follow us on Twitter: @WikiResearch
* Receive this newsletter by mail:
https://lists.wikimedia.org/mailman/listinfo/research-newsletter
* Subscribe to the RSS feed:
http://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/

-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] published articles about Wikipedia translation

2014-03-19 Thread Tilman Bayer
Hi Amir,

not quite a tagged bibliography either, but searching translation in
the Wikimedia Research Newsletter archives at
https://meta.wikimedia.org/wiki/Research:Newsletter/Archives#Search_the_WRN_archives

will find you e.g. these two papers:

https://meta.wikimedia.org/wiki/Research:Newsletter/2012/May#Identifying_software_needs_from_Wikipedia_translation_discussions
https://meta.wikimedia.org/wiki/Research:Newsletter/2014/January#Translation_students_embrace_Wikipedia_assignments.2C_but_find_user_interface_frustrating

On Wed, Mar 19, 2014 at 6:16 AM, Amir E. Aharoni
amir.ahar...@mail.huji.ac.il wrote:
 Hi,

 Is there any list of academic studies of Wikimedia projects sorted or tagged
 by topic? In particular I'm interested in anything to do with translation,
 but it is useful for other topics as well.

 The best thing that I could think of now is going to
 https://en.wikipedia.org/wiki/Wikipedia:Academic_studies_of_Wikipedia
 and searching the page for translation.

 Is there a more structured way?

 Thanks!

 --
 Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
 http://aharoni.wordpress.com
 ‪“We're living in pieces,
 I want to live in peace.” – T. Moore‬





-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Upcoming research newsletter: new papers open for review

2014-02-25 Thread Tilman Bayer
Hi Heather,

that's a cool idea, and we have actually been considering something
like this already. While the names of the reviewers are prominently
displayed in the byline on top (and also, many readers of the Signpost
and the newsletter are of course experienced in reading version
histories), showing them next to each review might make attribution
easier. We just haven't found the time to implement it yet, like with
many other things for the newsletter. You are welcome to figure out a
suitable format and add these attributions in the upcoming issue,
let's follow up offlist if more information is needed.

On Tue, Feb 25, 2014 at 1:11 AM, Heather Ford hfor...@gmail.com wrote:
 Thanks, Dario, Tilman!

 I was wondering whether it would be helpful to add reviewer names/usernames
 to individual signpost reviews. I was struck while reading a review of a
 paper on Signpost recently that I felt like the reviewer was inserting some
 very opinionated statements about the article rather than the regular
 summaries. While I don't think that this is a problem necessarily (although
 I wish that they were a bit more informed about the topic and social science
 research in general), I do think it can be problematic to have these
 comments unattributed. Would be interested to hear what others think...

 Best,
 Heather.

 Heather Ford
 Oxford Internet Institute Doctoral Programme
 EthnographyMatters | Oxford Digital Ethnography Group
 http://hblog.org | @hfordsa




 On 25 February 2014 05:26, Tilman Bayer tba...@wikimedia.org wrote:

 Hi Max,

 yes, we're co-publishing with the Signpost, so the ultimate deadline
 is the Signpost's actual publication time. Its formal publication date
 is this Wednesday (the 26th) UTC, although actual publication might
 take place several hours or even a few days later. Thanks for signing
 up to review the Editor's Biases paper, I'm looking forward to
 reading your summary!

 On Mon, Feb 24, 2014 at 3:39 PM, Klein,Max kle...@oclc.org wrote:
  Dario, what's the timeframe for writing reviews so they can get into the
  Signpost in time? The 25th?
 
  Maximilian Klein
  Wikipedian in Residence, OCLC
  +17074787023
 
  
  From: wiki-research-l-boun...@lists.wikimedia.org
  wiki-research-l-boun...@lists.wikimedia.org on behalf of Dario
  Taraborelli dtarabore...@wikimedia.org
  Sent: Monday, February 24, 2014 8:11 AM
  To: A mailing list for the Analytics Team at WMF and everybody who has
  an interest in Wikipedia and analytics.; Research into Wikimedia
  content and communities
  Subject: [Wiki-research-l] Upcoming research newsletter: new papers open
  for review
 
  Hi everybody,
 
  with CSCW just concluded and conferences like CHI and WWW coming up we
  have a good set of papers to review for the February issue of the Research
  Newsletter [1]
 
  Please take a look at: https://etherpad.wikimedia.org/p/WRN201402 and
  add your name next to any paper you are interested in reviewing. As usual,
  short notes and one-paragraph reviews are most welcome.
 
  Instead of contacting past contributors only, this month we're
  experimenting with a public call for reviews cross-posted to analytics-l
  and wiki-research-l. If you have any questions about the format or
  process, feel free to get in touch off-list.
 
  Dario Taraborelli and Tilman Bayer
 
  [1] http://meta.wikimedia.org/wiki/Research:Newsletter
 
 



 --
 Tilman Bayer
 Senior Operations Analyst (Movement Communications)
 Wikimedia Foundation
 IRC (Freenode): HaeB








-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Fwd: Upcoming talk at the Berkman Center on the gender gap and Internet use skills

2014-01-17 Thread Tilman Bayer
Looks very interesting! Somewhat related:
https://meta.wikimedia.org/wiki/Research:Newsletter/2013/November#Non-participation_of_female_students_on_Wikipedia_influenced_by_school.2C_peers_and_lack_of_community_awareness


On Fri, Jan 17, 2014 at 7:16 AM, Dario Taraborelli 
dtarabore...@wikimedia.org wrote:

 Begin forwarded message:

 *From: *aaron shaw aarons...@northwestern.edu

 Date: Fri, 17 Jan 2014 08:52:02 -0600
 Subject: Upcoming talk at the Berkman Center on the gender gap and
 Internet use skills


 I wanted to pass along the details of an upcoming talk that Eszter
 Hargittai and I will be doing at the Berkman Center on Tuesday 1/21. We
 will present preliminary findings of work-in-progress on the relationship
 between the Wikipedia gender gap and people's internet skills. You can
 stream the talk online or attend in-person (if you happen to be in the
 Boston area). More details and an RSVP form are available on the Berkman
 Center website:
 http://cyber.law.harvard.edu/events/luncheon/2014/01/hargittai-shaw

 All the best,
 Aaron


 [January 21] Internet Skills and Wikipedia's Gender Inequality with
 Eszter Hargittai and Aaron Shaw, Northwestern University


 *January 21, 2014 at 12:30pm ET Berkman Center for Internet & Society, 23
 Everett St, 2nd Floor*
 *RSVP required for those attending in person via the form
 http://cyber.law.harvard.edu/events/luncheon/2014/01/hargittai-shaw#RSVP*
 *This event will be webcast live (on this page) at 12:30pm ET.*

 Although women are just as likely as men to read Wikipedia, they only
 represent an estimated 16% of global Wikipedia editors and 23% of U.S.
 adult Wikipedia editors. Previous research has focused on analyzing aspects
 of current contributors and aspects of the existing Wikipedia community to
 explain this gender gap in contributions. Instead, we analyze data about
 both Wikipedia contributors and non-contributors. We also focus on a
 previously ignored factor: people’s Internet skills. Our data set includes
 a diverse group of American young adults with detailed information about
 their background attributes, Internet experiences and skills. We find that
 the gender gap in editing is exacerbated by a similarly important Internet
 skills gap. By far the most likely people to contribute to Wikipedia are
 males with high Internet skills. Our findings suggest that efforts to
 overcome the gender gap in Wikipedia contributions must address the Web-use
 skills gap. Future research needs to look at why high-skilled women do not
 contribute at comparable rates to highly-skilled men.









-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB


[Wiki-research-l] The Wikimedia Research Newsletter 3(12) is out

2013-12-30 Thread Tilman Bayer
The December 2013 issue of the Wikimedia Research Newsletter is out:

https://meta.wikimedia.org/wiki/Research:Newsletter/2013/December

Contents:

1 Cohort of cross-language Wikipedia editors analyzed
2 Attempt to use Wikipedia pageviews to predict election results in
Iran, Germany and the UK
3 Integrity of Wikipedia and Wikipedia research
4 Briefly
4.1 How we found a million style and grammar errors in the English Wikipedia
4.2 Evaluation of gastroenterology and hepatology articles on Wikipedia
4.3 Overview of research on FLOSS and Wikipedia
4.4 In battle over Walt Whitman's sexuality, Wikipedia policies tamed
the mass into producing a good encyclopedia entry
4.5 Elinor Ostrom's theories applied to Wikipedia
4.6 New dissertation on Wiktionary

••• 9 publications were covered in this issue •••
Thanks to Daniel Mietchen, Maximilian Klein and Piotr Konieczny for
contributing.

Tilman Bayer and Dario Taraborelli

--
Wikimedia Research Newsletter
https://meta.wikimedia.org/wiki/Research:Newsletter/

* Follow us on Twitter/Identi.ca: @WikiResearch
* Receive this newsletter by mail:
https://lists.wikimedia.org/mailman/listinfo/research-newsletter
* Subscribe to the RSS feed:
http://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/

-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Counting contributions for review/tenure

2013-07-23 Thread Tilman Bayer
I guess you are already aware of
https://blog.wikimedia.org/2011/04/06/tenure-awarded-based-in-part-on-wikipedia-contributions/
?

On Tue, Jul 23, 2013 at 1:04 PM, phoebe ayers phoebe.w...@gmail.com wrote:
 Dearest research list!

 Two things:

 1) I am looking for anything and everything about counting Wikipedia
 contributions for attribution & tenure/promotion purposes and/or C.V.
 enhancement, especially for academic faculty. This includes blog posts,
 anecdotes, research, case studies...

 2) I'm just starting a review on the subject, which is also going to involve
 interviewing academics involved in Wikipedia about their thoughts, hopes and
 dreams on the subject of getting 'credit' for their contributions: so let me
 know if you're interested in being interviewed.

 If there's interest maybe we can get together a little informal discussion
 at WikiSym/Wikimania as well.

 thanks!
 Phoebe


 --
 * I use this address for lists; send personal messages to phoebe.ayers at
 gmail.com *





-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] The Wikimedia Research Newsletter 3(6) is out

2013-06-30 Thread Tilman Bayer
Thanks, it works now.

On Sun, Jun 30, 2013 at 6:48 PM, Stuart A. Yeates syea...@gmail.com wrote:
 The link https://meta.wikimedia.org/wiki/Research:Newsletter/ in your
 signature doesn't work.

 cheers
 stuart



 On Sat, Jun 29, 2013 at 6:42 AM, Tilman Bayer tba...@wikimedia.org wrote:

 The June 2013 issue of the Wikimedia Research Newsletter is out:

 https://meta.wikimedia.org/wiki/Research:Newsletter/2013/June

 In this issue:

 1 The most controversial topics in Wikipedia: a multilingual and
 geographical analysis
 2 Sockpuppet evidence from automated writing style analysis
 3 Adjusting automatic quality flaw predictors by topic areas
 4 Briefly
 5 References

 ••• 9 publications were covered in this issue •••
 Thanks to Giovanni Luca Ciampaglia and Taha Yasseri for contributing

 Tilman Bayer and Dario Taraborelli

 --
 Wikimedia Research Newsletter
 https://meta.wikimedia.org/wiki/Research:Newsletter/

 * Follow us on Twitter/Identi.ca: @WikiResearch
 * Receive this newsletter by mail:
 https://lists.wikimedia.org/mailman/listinfo/research-newsletter
 * Subscribe to the RSS feed:
 http://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/

 --
 Tilman Bayer
 Senior Operations Analyst (Movement Communications)
 Wikimedia Foundation
 IRC (Freenode): HaeB








-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



[Wiki-research-l] The Wikimedia Research Newsletter 3(6) is out

2013-06-28 Thread Tilman Bayer
The June 2013 issue of the Wikimedia Research Newsletter is out:

https://meta.wikimedia.org/wiki/Research:Newsletter/2013/June

In this issue:

1 The most controversial topics in Wikipedia: a multilingual and
geographical analysis
2 Sockpuppet evidence from automated writing style analysis
3 Adjusting automatic quality flaw predictors by topic areas
4 Briefly
5 References

••• 9 publications were covered in this issue •••
Thanks to Giovanni Luca Ciampaglia and Taha Yasseri for contributing

Tilman Bayer and Dario Taraborelli

--
Wikimedia Research Newsletter
https://meta.wikimedia.org/wiki/Research:Newsletter/

* Follow us on Twitter/Identi.ca: @WikiResearch
* Receive this newsletter by mail:
https://lists.wikimedia.org/mailman/listinfo/research-newsletter
* Subscribe to the RSS feed:
http://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/

-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] Modeling Wikipedia admin elections using multidimensional behavioral social networks

2013-02-18 Thread Tilman Bayer
There have been quite a few papers analyzing RfAs (mostly) on the
English Wikipedia, see e.g.:

https://meta.wikimedia.org/wiki/Research:Newsletter/2012/March#How_editors_evaluate_each_other:_effects_of_status_and_similarity
https://meta.wikimedia.org/wiki/Research:Newsletter/2012/January#Students_predict_connections_between_Wikipedians
https://meta.wikimedia.org/wiki/Research:Newsletter/2011/September#How_social_ties_influence_admin_votes
- this also contains citations of earlier research on the topic.

And the authors of the present paper already published another one
about Polish Wikipedia RfAs in 2011:
https://meta.wikimedia.org/wiki/Research:Newsletter/2011/October#What_it_takes_to_become_an_admin:_Insights_from_the_Polish_Wikipedia

On Mon, Feb 18, 2013 at 9:30 AM, Everton Zanella Alvarenga
t...@wikimedia.org wrote:

 Abstract:

 Wikipedia admins are editors entrusted with special privileges and
 duties, responsible for the community management of Wikipedia. They
 are elected using a special procedure defined by the Wikipedia
 community, called Request for Adminship (RfA). Because of the growing
 amount of management work (quality control, coordination, maintenance)
 on the Wikipedia, the importance of admins is growing. At the same
 time, there exists evidence that the admin community is growing more
 slowly than expected. We present an analysis of the RfA procedure in
 the Polish-language Wikipedia, since the procedure’s introduction in
 2005. With the goal of discovering good candidates for new admins that
 could be accepted by the community, we model the admin elections using
 multidimensional behavioral social networks derived from the Wikipedia
 edit history. We find that we can classify the votes in the RfA
 procedures using this model with an accuracy level that should be
 sufficient to recommend candidates. We also propose and verify
 interpretations of the dimensions of the social network. We find that
 one of the dimensions, based on discussion on Wikipedia talk pages,
 can be validly interpreted as acquaintance among editors, and discuss
 the relevance of this dimension to the admin elections.

 Link: http://link.springer.com/article/10.1007/s13278-012-0092-6

 From the conclusion:

 [...] We have noticed the decreasing amount of successful admin
 elections and have formulated two hypotheses that could explain this
 phenomenon. Hypothesis A stated that new admins are elected on the
 basis of acquaintance of the voter and candidate. If this would be a
 valid explanation, we could conclude that the community of admins is
 becoming increasingly closed, which would be detrimental to the
 sustainable development of the Wikipedia.

 Hypothesis B stated that new admins are elected on the basis of
 similarity of experience in editing various topics of the voter and
 candidate. Since voters are other active admins whose experience
 increases with time, their thresholds of accepting a candidate are
 likely to increase (as has been observed from the simple statistics of
 RfA votings).

 I would love to see this research on other Wikipedias.

 Tom

 --
 Everton Zanella Alvarenga (also Tom)
 A life spent making mistakes is not only more honorable, but more
 useful than a life spent doing nothing.





--
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



Re: [Wiki-research-l] 2012 top pageview list

2012-12-28 Thread Tilman Bayer
On Fri, Dec 28, 2012 at 10:24 AM, John Vandenberg jay...@gmail.com wrote:

 Is favicon only in the Chinese Wikipedia top 100?

 It seems so, and is odd if the problem is a web browser bug.

 John Vandenberg.
 sent from Galaxy Note
 On Dec 28, 2012 4:07 PM, Johan Gunnarsson johan.gunnars...@gmail.com
 wrote:

  On Fri, Dec 28, 2012 at 5:33 AM, John Vandenberg jay...@gmail.com
 wrote:
  Hi Johan,
 
  Thank you for the lovely data at
 
  https://toolserver.org/~johang/2012.html
 
  I posted that link to my facebook (below if you want to join in
  there), and a few language specific facebook groups, and there have
  been some concerns raised about the results, which I'll list below.
 
  These lists are getting some traction in the press so it would be good
  to understand it better.
 
  http://guardian.co.uk/technology/blog/2012/dec/27/wikipedia-most-viewed

 Cool, cool.


 
  Why is [[zh:Favicon]] #2?
 
  The data doesn't appear to support that
 
  http://stats.grok.se/zh/201201/Favicon
  http://stats.grok.se/zh/latest90/Favicon

 My post-processing filtering follows redirects to find the true
 title. In this case the page Favicon.ico redirects to Favicon. This is
 probably due to broken browsers trying to load the icon.
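
 The redirect-following step described above can be sketched roughly as
 follows. This is only an illustration, not the actual post-processing
 code: the function names, the redirect map and the view counts are all
 made up, and the real pipeline may differ.

```python
from collections import Counter


def resolve(title, redirects):
    """Follow redirect hops until a non-redirect title is reached."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)  # guard against redirect loops
        title = redirects[title]
    return title


def aggregate(raw_counts, redirects):
    """Sum raw per-title view counts under each resolved title."""
    totals = Counter()
    for title, views in raw_counts.items():
        totals[resolve(title, redirects)] += views
    return totals


# Hypothetical numbers, for illustration only.
redirects = {"Favicon.ico": "Favicon"}
raw_counts = {"Favicon.ico": 9000, "Favicon": 120, "Zipper": 300}
totals = aggregate(raw_counts, redirects)
```

 With counts merged this way, requests for a redirect such as
 Favicon.ico are credited to the target page, which would explain why
 the per-title stats for Favicon alone look much lower.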


 
  Number 1 in French is a plant native to Asia.  The stats for December
  disagree
  https://en.wikipedia.org/wiki/Ilex_crenata
  http://stats.grok.se/fr/201212/Houx_cr%C3%A9nel%C3%A9

 French's Ilex_crenata redirects to Houx_crénelé.

 Ilex_crenata had huge traffic in April:
 http://stats.grok.se/fr/201204/Ilex_crenata

 There are a bunch of spikes like this. I can't really explain it. I
 talked to Domas Mituzas (the maintainer of the original dumps I use)
 yesterday and he suggested it might be bots going crazy for whatever
 reason. I'd love to filter all these false positives, but haven't been
 able to come up with an easy way to do it.

 Might be possible with access to logs with the user-agent string, but
 that would probably inflate the dataset size even more. It's already
 past the terabyte. However that could probably be solved by sampling
 (for example) 1/100 of the entries.
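
 A minimal sketch of the 1/100 sampling idea, assuming the log can be
 read as one entry per line. The log format is not specified here, so
 the functions below are hypothetical and treat each entry as an opaque
 value; counts taken from the sample are scaled back up by the sampling
 rate.

```python
import random


def sample_lines(lines, rate=0.01, seed=0):
    """Yield each log line independently with probability `rate`."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    for line in lines:
        if rng.random() < rate:
            yield line


def scale_up(sampled_count, rate=0.01):
    """Estimate the full-log count from a count observed in the sample."""
    return sampled_count / rate


# Stand-in for a real log: 100,000 dummy entries.
kept = list(sample_lines(range(100000)))
```

 Random sampling rather than keeping every 100th line avoids lining up
 with any periodic pattern in the log, at the cost of some noise for
 pages with few views.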

 Comments and ideas are welcome!


 
  Number 1 in German is Cul de sac. This is odd, but matches the stats
  http://stats.grok.se/de/201207/Sackgasse

 Right. This one is funny. It has huge traffic on weekdays only.
 Deserted on weekends.

This has been noted on the dewiki village pump before. The most
interesting guess there
https://de.wikipedia.org/wiki/Wikipedia:Fragen_zur_Wikipedia#Sackgasse_als_Top_Artikel_.3F.21
(by Benutzer:YMS): There might be a web filtering software installed on
workplace PCs in companies which redirects all prohibited URLs to the
German Wikipedia on cul-de-sac. This would explain the weekly pattern, and
also http://stats.grok.se/de/201112/Sackgasse (December 25-26 are holidays
in Germany, and many employees take the rest of the year off).
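
One rough way to flag pages with this kind of workplace-driven pattern is
to compare average weekday and weekend traffic. The sketch below is only
an illustration with made-up daily counts; the function name is
hypothetical and no real pageview data is used.

```python
from datetime import date


def weekday_weekend_ratio(daily_views):
    """Mean views on Mon-Fri divided by mean views on Sat-Sun."""
    weekday = [v for d, v in daily_views.items() if d.weekday() < 5]
    weekend = [v for d, v in daily_views.items() if d.weekday() >= 5]
    if not weekday or not weekend or sum(weekend) == 0:
        return None  # not enough data to form a ratio
    return (sum(weekday) / len(weekday)) / (sum(weekend) / len(weekend))


# Made-up counts for one week in July 2012 (2 July 2012 was a Monday).
views = {date(2012, 7, day): 50000 for day in range(2, 7)}
views.update({date(2012, 7, 7): 500, date(2012, 7, 8): 500})
ratio = weekday_weekend_ratio(views)
```

A ratio far above what ordinary articles show would be consistent with
the filtering-software guess, though it would not prove it.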




 
  Number 1 in Dutch is a Chinese mountain.  The stats for December
  disagree
  http://stats.grok.se/nl/201212/Hua_Shan

 July/August agree: http://stats.grok.se/nl/201208/Hua_Shan


 
  Number 4 in Hebrew is zipper.  The stats for December disagree
  http://stats.grok.se/he/201212/%D7%A8%D7%95%D7%9B%D7%A1%D7%9F

 April agrees:
 http://stats.grok.se/he/201204/%D7%A8%D7%95%D7%9B%D7%A1%D7%9F


 
  Number 2 in Spanish is '@'.  This is odd, but matches the stats
  http://stats.grok.se/es/201212/Arroba_%28s%C3%ADmbolo%29
 
  --
  John Vandenberg
  https://www.facebook.com/johnmark.vandenberg






-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] # of searches where WP is the first hit?

2012-11-19 Thread Tilman Bayer
Not much left to add after Finn's list, but those may be interesting as
well:

https://meta.wikimedia.org/wiki/Research:Newsletter/2011/October#High_search_engine_rankings_of_Wikipedia_articles_found_to_be_justified_by_quality
 (In 1000 queries, Yahoo showed the most Wikipedia results within the top
10 lists (446), followed by MSN/Live (387), Google (328), and Ask.com
(255).)

https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2011-03-07/In_the_news#Google_algorithm_update
(caused Wikipedia to rise from 7578 to 8050 (+6.2%) presences in the first
search result page, in a sample of around 60,000 keywords.)

https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2006-11-06/Search_and_Wikipedia
 (Wikipedia appeared in the top 10, thus putting it on the first page of
results, on 81% of searches using Google and 77% for Yahoo.)

http://stats.wikimedia.org/wikimedia/squids/SquidReportGoogle.htm (Google
referred to our sites, through its services including search, maps, and
Google Earth, 212,902,650 page views per day, representing 41.1% of our
external page requests. )


On Thu, Nov 15, 2012 at 2:59 AM, Finn Årup Nielsen f...@imm.dtu.dk wrote:

 Hi Phoebe (and others on the list),



 On 13-11-2012 21:47, phoebe ayers wrote:

  Are there any solid estimates out there of how many Google [or other]
 searches have a Wikipedia article as the first [or second or third...]
 hit? Any language breakdowns of this would be super cool as well.


 If you look in my "Wikipedia research and tools: Review and comments"
 http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6012/pdf/imm6012.pdf
 on page 15 ("Popularity") you see a couple of studies using a sample of
 pages:

 Seeking health information online: does Wikipedia matter?
 http://jamia.bmj.com/content/16/4/471.long

 http://www.conductor.com/blog/2012/03/wikipedia-in-the-serps-appears-on-page-1-for-60-of-informational-34-transactional-queries/

 http://www.intelligentpositioning.com/blog/2012/02/wikipedia-page-one-of-google-uk-for-99-of-searches/

 The first one reports around 35% of health-related queries having
 Wikipedia on top of the Google result list.
 http://jamia.bmj.com/content/16/4/471/T1.expansion.html


  I've seen offhand references to this phenomenon in many papers, but
 I'm wondering if someone on this list knows of a particularly good
 estimate or reliable information.



 Google has become 'bubbled'. You could try DuckDuckGo instead, e.g.,

 http://duckduckgo.com/?q=Alzheimer+region%3Anone

 See also: http://dontbubble.us/


 /Finn






-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] Wikipedia pages revisions and incremental dumps

2012-11-09 Thread Tilman Bayer
On Fri, Nov 9, 2012 at 1:05 AM, Finn Aarup Nielsen f...@imm.dtu.dk wrote:



 Den 09-11-2012 04:38, Rami Al-Rfou' skrev:


  I am interested in counting the number of revisions every page went
 through. I was wondering if it is possible to count that without using
 the whole history dump. I mean is it available in the schema directly?
 Is it computable without having the revisions text downloaded?


 If you have toolserver access you can readily do it. Embarrassingly I
 cannot find a tool on the toolserver that already does that.

 There is the Wikichecker that shows a count:

 http://en.wikichecker.com/article/?a=Denmark

Just be aware that the site is still in beta, and that e.g.
http://en.wikichecker.com/article/?a=Barack+Obama claims that the English
Wikipedia's article on Barack Obama was started in July 2012 and has
received 485 non-bot edits (the real number is likely over 20,000).
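
For counting revisions without the revision text, one option is the
"stub-meta-history" XML dumps, which carry revision metadata only. The
following is a minimal sketch, not a tested tool: it streams a MediaWiki
XML export and counts the <revision> elements per <page>. The inline
SAMPLE document and its revision counts are made up for illustration, and
the function name count_revisions is hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up export in the MediaWiki XML layout (illustrative data only).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Denmark</title>
    <revision><id>1</id></revision>
    <revision><id>2</id></revision>
    <revision><id>3</id></revision>
  </page>
  <page>
    <title>Barack Obama</title>
    <revision><id>4</id></revision>
    <revision><id>5</id></revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_revisions(stream):
    """Return a Counter mapping page title -> number of revisions."""
    counts = Counter()
    title = None
    revisions = 0
    # iterparse reads incrementally, so the whole dump is never held in memory.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            revisions += 1
            elem.clear()  # drop the revision's children once counted
        elif name == "page":
            counts[title] = revisions
            title, revisions = None, 0
            elem.clear()
    return counts


counts = count_revisions(io.BytesIO(SAMPLE.encode("utf-8")))
for page_title, n in counts.items():
    print(page_title, n)
```

For a real dump, the stream could come from gzip.open(path, "rb"); the
path is left out here because it depends on the wiki and the dump date.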



  Moreover, many of my future projects would benefit a lot if Wikipedia had
 incremental dumps of its database. Is anyone aware of something relevant
 or close?


 It is possible that this paper can help you:

 Wikipedia Revision Toolkit: Efficiently Accessing Wikipedia's Edit
 History

 https://code.google.com/p/jwpl/


 /Finn






-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB


Re: [Wiki-research-l] Social Network Analysis of Wikipedia

2012-09-05 Thread Tilman Bayer
Hi Jeremy,

there are quite a few papers which have done social network analysis
of (mostly the English) Wikipedia; e.g. in the Wikimedia Research
Newsletter we covered two which looked at centrality in different
contexts:

https://meta.wikimedia.org/wiki/Research:Newsletter/2012-06-25#Briefly
('Central' users produce higher quality)

https://meta.wikimedia.org/wiki/Research:Newsletter/2011-09-26#How_social_ties_influence_admin_votes
(Closeness, PageRank, and eigenvector centrality were found to have
the largest regression coefficients in predicting the outcome of an
RfA)
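
As a rough illustration of one of the node-level measures mentioned
above, here is a minimal sketch of closeness centrality on a small
undirected graph, using only the standard library. The EDGES list is a
hypothetical toy network, not Wikipedia data, and this is not the method
used in either of the papers; how to derive edges from Wikipedia (talk
page replies, co-editing, votes) is a separate modelling choice.

```python
from collections import deque

# Hypothetical toy network: each pair is an undirected tie between two users.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]


def build_graph(edges):
    """Build an adjacency map {node: set(neighbours)} from edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def closeness(graph, source):
    """Closeness of `source`: reachable nodes / sum of shortest-path lengths."""
    dist = {source: 0}
    queue = deque([source])
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


graph = build_graph(EDGES)
scores = {node: closeness(graph, node) for node in graph}
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

In this toy graph, A reaches every other node in at most two steps and
so gets the highest score (4/5 = 0.8).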

On Wed, Sep 5, 2012 at 12:43 PM, Jeremy Foote foo...@purdue.edu wrote:
 I am a brand new Master's student at Purdue. For my Social Network Analysis
 class, I'm thinking about doing a project about whether a Wikipedian's
 centrality in a network can be used as a predictor of future participation.
 I've spent the afternoon looking for relevant literature. I found the very
 interesting

 Validity Issues in the Use of Social Network Analysis with Digital Trace
 Data by Howison, Wiggins, and Crowston
 and
 Network analysis of collaboration structure in Wikipedia by Brandes et al.

 I'm wondering if there are other papers about how to translate Wikipedia
 into a network structure, or even more specifically relating to node-level
 centrality measures and participation measures.

 Very many thanks,
 Jeremy Foote





-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



[Wiki-research-l] The Wikimedia Research Newsletter 2(6) is out

2012-06-26 Thread Tilman Bayer
The new Wikimedia Research Newsletter is out:

https://meta.wikimedia.org/wiki/Research:Newsletter/2012-06-25

In this issue:

1 Dynamics of edit wars
2 Who deletes Wikipedia
3 Evaluating and predicting interlingual links in Wikipedia
4 Wikipedia Academy preview
5 Special issue of Digithum on Wikipedia research
6 Briefly

••• 26 publications were covered in this issue •••
Thanks to Piotr Konieczny, Evan Rosen and Daniel Mietchen for their
contributions


There's more:
* Follow us on https://twitter.com/#!/WikiResearch or
https://identi.ca/wikiresearch
* Receive this newsletter by mail:
https://lists.wikimedia.org/mailman/listinfo/research-newsletter
* Subscribe to the RSS feed:
https://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/
* Download the full 45-page PDF of Volume 1 (2011) and a dataset of
all references covered in it: http://blog.wikimedia.org/?p=10655


Tilman Bayer and Dario Taraborelli

-- 
Tilman Bayer
Senior Operations Analyst (Movement Communications)
Wikimedia Foundation
IRC (Freenode): HaeB



[Wiki-research-l] The Wikimedia Research Newsletter 2(1) - Jan 2012 is out

2012-01-31 Thread Tilman Bayer
The latest issue (January 2012) of the monthly Wikimedia Research
Newsletter is out:

https://meta.wikimedia.org/wiki/Research:Newsletter/2012-01-30

In this issue:
1 Admins influence the language of non-admins
2 Can Wikipedia replace commercial biography databases?
3 Students predict connections between Wikipedians
4 Language analysis finds Wikipedia's political bias moving from left to right
5 Briefly
6 References

••• 11 items were covered in this issue •••

You can post suggestions and contributions for the next issue at:
https://meta.wikimedia.org/wiki/Research_talk:Newsletter

or by mail at researchn...@wikimedia.org

RSS feed for the newsletter:
https://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/

Regards, Tilman


-- 
Tilman Bayer
Movement Communications
Wikimedia Foundation



Re: [Wiki-research-l] Fraction of reverts

2011-08-15 Thread Tilman Bayer
I think Ed Chi's group at PARC did some of the earliest studies about revert rates:

http://asc-parc.blogspot.com/2009/08/part-2-more-details-of-changing-editor.html
Monthly ratio of reverted edits by editor class
http://asc-parc.blogspot.com/2009/07/part-1-slowing-growth-of-wikipedia-some.html
http://www.parc.com/content/attachments/singularity-is-not-near.pdf

On Tue, Aug 16, 2011 at 1:53 AM, Denny Vrandecic
vrande...@googlemail.com wrote:
 Hello,

 does anyone have a rough estimate of how many edits get reverted?
 Does anyone have a study handy?

 Cheers,
 Denny







-- 
T. Bayer
Movement Communications
Wikimedia Foundation
IRC (Freenode): HaeB
