Re: [Wikimedia-l] The most controversial topics in Wikipedia: A multilingual and geographical analysis

2013-07-21 Thread Balázs Viczián
You may contact them directly with your concerns, which I guess many did
after they published their study.

Here is their homepage: http://wwm.phy.bme.hu/

Cheers,
Balázs


2013/7/21 MZMcBride 

> Anders Wennersten wrote:
> >A most interesting study looking at findings from 10 different language
> >versions.
> >
> >Jesus and Middle east are the most controversial articles seen over the
> >world, but George Bush on en:wp and Chile on es:wp
> >
> >http://arxiv.org/ftp/arxiv/papers/1305/1305.5566.pdf
>
> Thanks for sharing this.
>
> I had a bit of free time last night waiting for trains and I skimmed
> through the study and its findings. Two points stuck out at me: a
> seemingly fatally flawed methodology and the age of data used.
>
> The methodology used in this study seems to be pretty inherently flawed.
> According to the paper, controversiality was measured by full page
> reverts, which are fairly trivial to identify and study in a database dump
> (using cryptographic hashes, as the study did), but I don't think full
> reverts give an accurate impression _at all_ of which articles are the
> most controversial.
>
> Pages with many full reverts are indicative of pages that are heavily
> vandalized. For example, the "George W. Bush" article is/was heavily
> vandalized for years on the English Wikipedia. Does blanking the article
> or replacing its contents with the word "penis" mean that it's a very
> controversial article? Of course not. Measuring only full reverts (as the
> study seems to have done, though it's certainly possible I've overlooked
> something) seems to be really misleading and inaccurate.
>
> In order to measure how controversial an article is, there are a number of
> metrics that could be used, though of course no metric is perfect and many
> metrics can be very difficult to accurately and rigorously measure:
>
> * amount of talk page discussion generated for each article;
> * number of page watchers;
> * number of page views (possibly);
> * number of arbitration cases or other dispute resolution procedures
> related to the article (perhaps a key metric in determining which articles
> are truly most controversial); and
> * edit frequency and time between certain edits and partial or full
> reverts of those edits.
>
> There are likely a number of other metrics that could be used as well to
> measure controversiality; these were simply off the top of my head.
>
> The second point that stuck out at me was that the study relied on a
> database dump from March 2010. While this may be unavoidable, the dump is
> now over three years old, which introduces obvious bias into the data and
> its findings. Put another way, for the English Wikipedia, started in 2001, this
> omits a quarter of the project's history(!). Again, given the length of
> time needed to draft and prepare a study, this gap may very well be
> unavoidable, but it certainly made me raise an eyebrow.
>
> One final comment I had from briefly reading the study was that in the
> past few years we've made good strides in making research like this
> easier. Not that computing cryptographic hashes is particularly intensive,
> but these days we now store such hashes directly in the database (though
> we store SHA-1 hashes, not MD5 hashes as the study used). Storing these
> hashes in the database saves researchers the need to compute the hashes
> themselves and allows MediaWiki and other software the ability to easily
> and quickly detect full reverts.
>
> MZMcBride
>
> P.S. Noting that this study is still a draft, I happened to notice a small
> typo on page nine: "We tried to a as diverse as possible sample including
> West European [...]". Hopefully this can be corrected before formal
> publication.
___
Wikimedia-l mailing list
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 


Re: [Wikimedia-l] The most controversial topics in Wikipedia: A multilingual and geographical analysis

2013-07-21 Thread Tilman Bayer
On Sun, Jul 21, 2013 at 2:32 PM, MZMcBride  wrote:
> Anders Wennersten wrote:
>>A most interesting study looking at findings from 10 different language
>>versions.
>>
>>Jesus and Middle east are the most controversial articles seen over the
>>world, but George Bush on en:wp and Chile on es:wp
>>
>>http://arxiv.org/ftp/arxiv/papers/1305/1305.5566.pdf
>
FWIW, here is the review by Giovanni Luca Ciampaglia in last month's
Wikimedia Research Newsletter:
https://blog.wikimedia.org/2013/06/28/wikimedia-research-newsletter-june-2013/#.22The_most_controversial_topics_in_Wikipedia:_a_multilingual_and_geographical_analysis.22
(also published in the Signpost, the weekly newsletter on the English
Wikipedia)

> Thanks for sharing this.
>
> I had a bit of free time last night waiting for trains and I skimmed
> through the study and its findings. Two points stuck out at me: a
> seemingly fatally flawed methodology and the age of data used.
>
> The methodology used in this study seems to be pretty inherently flawed.
> According to the paper, controversiality was measured by full page
> reverts, which are fairly trivial to identify and study in a database dump
> (using cryptographic hashes, as the study did), but I don't think full
> reverts give an accurate impression _at all_ of which articles are the
> most controversial.
>
> Pages with many full reverts are indicative of pages that are heavily
> vandalized. For example, the "George W. Bush" article is/was heavily
> vandalized for years on the English Wikipedia. Does blanking the article
> or replacing its contents with the word "penis" mean that it's a very
> controversial article? Of course not. Measuring only full reverts (as the
> study seems to have done, though it's certainly possible I've overlooked
> something) seems to be really misleading and inaccurate.
They didn't. You may have overlooked the description of the
methodology on p.5: It's based on "mutual reverts" where user A has
reverted user B and user B has reverted user A, and gives higher
weight to disputes between more experienced editors. This should
exclude most vandalism reverts of the sort you describe. As noted in
Giovanni's review, this method was proposed in an earlier paper, Sumi
et al.
(https://meta.wikimedia.org/wiki/Research:Newsletter/2011/July#Edit_wars_and_conflict_metrics).
That paper explains at length how this metric serves to distinguish
vandalism reverts from edit wars. Of course there are ample
possibilities to refine it, e.g. taking into account page protection
logs.
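
To make the idea concrete, here is a minimal sketch of a mutual-revert
score. It is not the exact formula from either paper: the function name,
the input format, and the choice of weighting each mutually reverting
pair by the smaller of the two editors' edit counts are assumptions made
purely for illustration.

```python
def mutual_revert_score(reverts, edit_counts):
    """Score a page from its revert log.

    reverts: iterable of (reverter, reverted) user names.
    edit_counts: dict mapping user name to total edits, used here as a
    stand-in for editor experience (an assumption of this sketch).
    """
    directed = set(reverts)
    # Keep only pairs where A reverted B *and* B reverted A.
    mutual_pairs = {
        frozenset((a, b))
        for (a, b) in directed
        if a != b and (b, a) in directed
    }
    # Weight each pair by its less experienced member, so a dispute
    # between two veterans counts far more than one involving a newcomer.
    return sum(
        min(edit_counts.get(user, 0) for user in pair)
        for pair in mutual_pairs
    )


reverts = [
    ("Alice", "Bob"), ("Bob", "Alice"), ("Alice", "Bob"),  # edit war
    ("Patroller", "Vandal1"), ("Patroller", "Vandal2"),    # one-way cleanup
]
edit_counts = {"Alice": 500, "Bob": 300, "Patroller": 10000,
               "Vandal1": 1, "Vandal2": 2}
print(mutual_revert_score(reverts, edit_counts))
```

One-way vandalism reverts contribute nothing in this sketch, and even a
vandal who reverts back adds only a weight of about one edit.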

Personally, I'm more concerned that the new paper totally fails to put
its subject into perspective by stating how frequent such
controversial articles are overall on Wikipedia. Thus it's no wonder
that the ample international media coverage that it generated mostly
transports the notion (or reinforces the preconception) of Wikipedia
as a huge battleground.

The 2011 Sumi et al. paper did a better job in that respect: "less
than 25k articles, i.e. less than 1% of the 3m articles available in
the November 2009 English WP dump, can be called controversial, and of
these, less than half are truly edit wars."


>
> In order to measure how controversial an article is, there are a number of
> metrics that could be used, though of course no metric is perfect and many
> metrics can be very difficult to accurately and rigorously measure:
>
> * amount of talk page discussion generated for each article;
> * number of page watchers;
> * number of page views (possibly);
> * number of arbitration cases or other dispute resolution procedures
> related to the article (perhaps a key metric in determining which articles
> are truly most controversial); and
> * edit frequency and time between certain edits and partial or full
> reverts of those edits.
>
> There are likely a number of other metrics that could be used as well to
> measure controversiality; these were simply off the top of my head.
Perhaps you are interested in this 2012 paper comparing such metrics,
which the authors of the present paper cite to justify their choice of
metric:
Sepehri Rad, H., Barbosa, D.: Identifying controversial articles in
Wikipedia: A comparative study.
http://www.wikisym.org/ws2012/p18wikisym2012.pdf

Regarding detection of (partial or full) reverts, see also
https://meta.wikimedia.org/wiki/Research:Revert_detection

>
> The second point that stuck out at me was that the study relied on a
> database dump from March 2010. While this may be unavoidable, the dump is
> now over three years old, which introduces obvious bias into the data and
> its findings. Put another way, for the English Wikipedia, started in 2001, this
> omits a quarter of the project's history(!). Again, given the length of
> time needed to draft and prepare a study, this gap may very well be
> unavoidable, but it certainly made me raise an eyebrow.
>
> One final comment I had from briefly reading the study was that in the
> past few years we've made good strides in making research like this
>

[Wikimedia-l] [Wikimedia Announcements] Tech News summary #30 is out

2013-07-21 Thread Tomasz W. Kozlowski

Hi community,
the latest issue of the Tech News summary has been published and is now 
being delivered to its subscribers across the wikis: 
https://meta.wikimedia.org/wiki/Special:MyLanguage/Tech/News/2013/30


The newsletter aims to help Wikimedians stay informed about recent and 
future technical changes that are likely to impact their work. Thanks to 
our amazing community of volunteer translators, the current issue is
available in 13 languages (yay!).


Since subscribers of this list are likely to be interested in reading 
the newsletter, we'll be announcing its release every week from now on,
until we hear from you otherwise :-) Feedback about this decision (and 
the newsletter) is welcome at https://meta.wikimedia.org/wiki/Talk:Tech/News


Please note that you can also subscribe to get the newsletter directly 
on your talk page: 
https://meta.wikimedia.org/wiki/Global_message_delivery/Targets/Tech_ambassadors


Please let me know if you have any questions, comments or concerns. I 
appreciate your help, feedback and involvement.


  Tomasz

___
Please note: all replies sent to this mailing list will be immediately directed 
to Wikimedia-l, the public mailing list of the Wikimedia community. For more 
information about Wikimedia-l:
https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
___
WikimediaAnnounce-l mailing list
wikimediaannounc...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimediaannounce-l


Re: [Wikimedia-l] The most controversial topics in Wikipedia: A multilingual and geographical analysis

2013-07-21 Thread MZMcBride
Anders Wennersten wrote:
>A most interesting study looking at findings from 10 different language
>versions.
>
>Jesus and Middle east are the most controversial articles seen over the
>world, but George Bush on en:wp and Chile on es:wp
>
>http://arxiv.org/ftp/arxiv/papers/1305/1305.5566.pdf

Thanks for sharing this.

I had a bit of free time last night waiting for trains and I skimmed
through the study and its findings. Two points stuck out at me: a
seemingly fatally flawed methodology and the age of data used.

The methodology used in this study seems to be pretty inherently flawed.
According to the paper, controversiality was measured by full page
reverts, which are fairly trivial to identify and study in a database dump
(using cryptographic hashes, as the study did), but I don't think full
reverts give an accurate impression _at all_ of which articles are the
most controversial.

Pages with many full reverts are indicative of pages that are heavily
vandalized. For example, the "George W. Bush" article is/was heavily
vandalized for years on the English Wikipedia. Does blanking the article
or replacing its contents with the word "penis" mean that it's a very
controversial article? Of course not. Measuring only full reverts (as the
study seems to have done, though it's certainly possible I've overlooked
something) seems to be really misleading and inaccurate.

In order to measure how controversial an article is, there are a number of
metrics that could be used, though of course no metric is perfect and many
metrics can be very difficult to accurately and rigorously measure:

* amount of talk page discussion generated for each article;
* number of page watchers;
* number of page views (possibly);
* number of arbitration cases or other dispute resolution procedures
related to the article (perhaps a key metric in determining which articles
are truly most controversial); and
* edit frequency and time between certain edits and partial or full
reverts of those edits.

There are likely a number of other metrics that could be used as well to
measure controversiality; these were simply off the top of my head.

The second point that stuck out at me was that the study relied on a
database dump from March 2010. While this may be unavoidable, the dump is
now over three years old, which introduces obvious bias into the data and
its findings. Put another way, for the English Wikipedia, started in 2001, this
omits a quarter of the project's history(!). Again, given the length of
time needed to draft and prepare a study, this gap may very well be
unavoidable, but it certainly made me raise an eyebrow.
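
As a rough check of the "quarter" figure, here is a small sketch. The
launch date of 15 January 2001 is the usual date given for the English
Wikipedia; the exact day of the March 2010 dump is not stated in the
study as far as I could see, so the first of the month is assumed.

```python
from datetime import date

launch = date(2001, 1, 15)  # English Wikipedia launch
dump = date(2010, 3, 1)     # assumed day within March 2010
today = date(2013, 7, 21)   # date of this message

# Share of the project's lifetime that falls after the dump.
missing_fraction = (today - dump).days / (today - launch).days
print(f"{missing_fraction:.1%} of the project's history is not in the dump")
```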

One final comment I had from briefly reading the study was that in the
past few years we've made good strides in making research like this
easier. Not that computing cryptographic hashes is particularly intensive,
but these days we now store such hashes directly in the database (though
we store SHA-1 hashes, not MD5 hashes as the study used). Storing these
hashes in the database saves researchers the need to compute the hashes
themselves and allows MediaWiki and other software the ability to easily
and quickly detect full reverts.
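
For anyone curious what hash-based detection looks like, here is a
minimal sketch. It works on a plain list of revision texts rather than a
real dump or the database, and the function name and the rule of
ignoring back-to-back identical revisions are my own assumptions, not
something taken from the study or from MediaWiki.

```python
import hashlib


def find_full_reverts(revision_texts):
    """Return (reverting_index, restored_index) pairs.

    A revision is treated as a full revert when its SHA-1 digest matches
    an earlier revision and at least one other revision lies in between.
    """
    last_seen = {}  # digest -> index of the latest revision with that text
    reverts = []
    for i, text in enumerate(revision_texts):
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        j = last_seen.get(digest)
        if j is not None and i - j > 1:
            reverts.append((i, j))
        last_seen[digest] = i
    return reverts


history = [
    "Intro.",
    "Intro. More detail.",
    "",                      # page blanked
    "Intro. More detail.",   # blanking reverted
    "vandalism",             # contents replaced
    "Intro. More detail.",   # replacement reverted
]
print(find_full_reverts(history))
```

Note that both reverts found here are plain vandalism cleanup, which is
exactly why counting full reverts alone seems misleading to me.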

MZMcBride

P.S. Noting that this study is still a draft, I happened to notice a small
typo on page nine: "We tried to a as diverse as possible sample including
West European [...]". Hopefully this can be corrected before formal
publication.





Re: [Wikimedia-l] 3rd Global Congress on IP & OpenAir Conference on Innovation & IP in Africa

2013-07-21 Thread Peter Southwood

David Richfield (User:Slashme) might be interested.
Cheers,
Peter
- Original Message - 
From: "Oona Castro" 

To: "Wikimedia Mailing List" 
Sent: Sunday, July 21, 2013 9:18 PM
Subject: Re: [Wikimedia-l] 3rd Global Congress on IP & OpenAir Conference on 
Innovation & IP in Africa




Hi Lodewijk,
I haven't. I actually don't remember knowing anyone from South Africa
here. But, yes, I think it would be awesome to have someone from there,
as I suggested in the email - I just don't know whom I should send it
to. Would you suggest anyone in particular?

Oona



On Sun, Jul 21, 2013 at 2:23 PM, Lodewijk 
wrote:



Lodewijk






--
Oona Castro
Consultant for the Brazilian Catalyst Program at Wikimedia Foundation
+ 55 21 81812505

Imagine a world in which every single human being can freely share in
the sum of all knowledge.  Help us make it a reality!

http://wikimediafoundation.org 




Re: [Wikimedia-l] 3rd Global Congress on IP & OpenAir Conference on Innovation & IP in Africa

2013-07-21 Thread Jan Ainali
The email address wikimediasouthafr...@gmail.com is listed on their chapter
page on Meta; I would start with that one.

http://meta.wikimedia.org/wiki/Wikimedia_South_Africa

*Kind regards,
Jan Ainali*

Executive Director, Wikimedia Sverige
0729 - 67 29 48






2013/7/21 Oona Castro 

> Hi Lodewijk,
> I haven't. I actually don't remember knowing anyone from South Africa
> here. But, yes, I think it would be awesome to have someone from there,
> as I suggested in the email - I just don't know whom I should send it
> to. Would you suggest anyone in particular?
>
> Oona
>
>
>
> On Sun, Jul 21, 2013 at 2:23 PM, Lodewijk wrote:
>
> > Lodewijk
>
>
>
>
>
> --
> Oona Castro
> Consultant for the Brazilian Catalyst Program at Wikimedia Foundation
> + 55 21 81812505
>
> Imagine a world in which every single human being can freely share in
> the sum of all knowledge.  Help us make it a reality!
>
> http://wikimediafoundation.org


Re: [Wikimedia-l] 3rd Global Congress on IP & OpenAir Conference on Innovation & IP in Africa

2013-07-21 Thread Oona Castro
Hi Lodewijk,
I haven't. I actually don't remember knowing anyone from South Africa
here. But, yes, I think it would be awesome to have someone from there,
as I suggested in the email - I just don't know whom I should send it
to. Would you suggest anyone in particular?

Oona



On Sun, Jul 21, 2013 at 2:23 PM, Lodewijk wrote:

> Lodewijk





-- 
Oona Castro
Consultant for the Brazilian Catalyst Program at Wikimedia Foundation
+ 55 21 81812505

Imagine a world in which every single human being can freely share in
the sum of all knowledge.  Help us make it a reality!

http://wikimediafoundation.org 


Re: [Wikimedia-l] 3rd Global Congress on IP & OpenAir Conference on Innovation & IP in Africa

2013-07-21 Thread Lodewijk
Hi Oona,

did you already try to contact Wikimedia South Africa about this?

Best,
Lodewijk


2013/7/21 Oona Castro 

> Hi all,
>
> I've received an email from the organizers of the 3rd Global Congress on IP
> & OpenAir Conference on Innovation & IP in Africa
> (http://www.openair.org.za/capetown2013), which will take place from 9 to 13
> December in Cape Town, calling for participation from the Wikimedia movement.
>
> I've attended the Global Congress on IP in the past, when I was leading
> research on the subject in Brazil, and those meetings were an opportunity to
> engage with other researchers, better understand the global picture and
> problems of IP, and discuss global initiatives and threats from both global
> north and global south perspectives.
>
> Would anyone be interested in participating in this meeting this year?
>
> I strongly recommend that people engaged in IP and international treaty
> debates join.
>
> Perhaps someone from South Africa?
>
> Best regards
> Oona
>
> --
> Oona Castro
> Consultant for the Brazilian Catalyst Program at Wikimedia Foundation
> + 55 21 81812505
>
> Imagine a world in which every single human being can freely share in
> the sum of all knowledge.  Help us make it a reality!
>
> http://wikimediafoundation.org


[Wikimedia-l] 3rd Global Congress on IP & OpenAir Conference on Innovation & IP in Africa

2013-07-21 Thread Oona Castro
Hi all,

I've received an email from the organizers of the 3rd Global Congress on IP
& OpenAir Conference on Innovation & IP in Africa
(http://www.openair.org.za/capetown2013), which will take place from 9 to 13
December in Cape Town, calling for participation from the Wikimedia movement.

I've attended the Global Congress on IP in the past, when I was leading
research on the subject in Brazil, and those meetings were an opportunity to
engage with other researchers, better understand the global picture and
problems of IP, and discuss global initiatives and threats from both global
north and global south perspectives.

Would anyone be interested in participating in this meeting this year?

I strongly recommend that people engaged in IP and international treaty
debates join.

Perhaps someone from South Africa?

Best regards
Oona

-- 
Oona Castro
Consultant for the Brazilian Catalyst Program at Wikimedia Foundation
+ 55 21 81812505

Imagine a world in which every single human being can freely share in
the sum of all knowledge.  Help us make it a reality!

http://wikimediafoundation.org 


Re: [Wikimedia-l] Approval of WCUG Greece as Wikimedia User Group

2013-07-21 Thread Balázs Viczián
Congrats from WMHU!

Balázs

2013/7/20 Ivan Martínez 

> Welcome, congrats!
>
> On Saturday, 20 July 2013, Sophie Österberg wrote:
>
> > Congratulations, this is lovely!!
> >
> > *Be Bold!
> > Sophie Österberg
> > sosterb...@wikimedia.org *
> >
> >
> > *Every single contribution to Wikipedia is a
> > gift of free knowledge to humanity. *
> >
> >
> > 2013/7/20 Mile Kiš
> > >
> > > 2013/7/20 Asaf Bartov
> > >
> > > > Congratulations
> > >
> > >
> > > Congratulations
> > > Wikimedia Serbia is looking forward to some regional cooperation with you
> > >
> > > Mile Kis
> > > Wikimedia Serbia
> > >
> >
> >
> >
>
>
>
> --
> *Sincerely,
>
> Iván Martínez
> President
> Wikimedia México A.C.
> wikimedia.mx
>
> Imagine a world in which every person on the planet has free access
> to the sum of all human knowledge.
> That is what we are doing.*