[Wikimedia-l] [Reminder] Language Engineering IRC Office Hour on November 13, 2013 at 1500 UTC

2013-11-13 Thread Runa Bhattacharjee
Hello, A quick reminder that the Wikimedia Language Engineering team will be hosting an IRC office hour from 1500 to 1600 UTC later today on #wikimedia-office (FreeNode). Please see below for the event details. Thanks Runa -- Forwarded message -- From: Runa Bhattacharjee

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Gerard Meijssen
Hoi, Seriously we should never ever be ruled by panic. What you see is bad, no doubt but the notion that we should dump everything because of the latest issue to come along is way overboard. - by stopping the flow on projects like Visual Editor you break dependencies for the work of many

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Matthew Flaschen
On 11/13/2013 02:40 AM, James Heilman wrote: The Wikimedia Foundation needs to wake up and deal with the real tech elephant in the room. Our primary issue is not a lack of FLOW, a lack of a visual editor, or a lack of a rapidly expanding education program. Our biggest issue is copyright

[Wikimedia-l] Recovering wikipedia.it: top 1 trademark priority per it.wiki poll

2013-11-13 Thread Federico Leva (Nemo)
Hello all (cc Yana, Michelle, Geoff, legal, board). In a formal poll[1] proposed by two admins, the it.wiki community has decided the following: «The Italian Wikipedia community considers that, among all the actions in defense of its name (as in public image and trademark) pertaining to the

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Steven Walling
On Tue, Nov 12, 2013 at 11:40 PM, James Heilman jmh...@gmail.com wrote: The Wikimedia Foundation needs to wake up and deal with the real tech elephant in the room. Our primary issue is not a lack of FLOW, a lack of a visual editor, or a lack of a rapidly expanding education program. Our

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Marco Chiesa
On Wed, Nov 13, 2013 at 8:40 AM, James Heilman jmh...@gmail.com wrote: Our biggest issue is copyright infringement. We have had the Indian program, we have had issues with the Education program, and I have today come across a user who has made nearly 20,000 edits to 1,742 articles since 2006

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Lodewijk
Marco: I agree, we had also issues on the Dutch Wikipedia - these have been around for ages, the English Wikipedia is just less aware of them. Often, copypasting in the same language is caught easily - between different languages is much harder and persistent. There are many people, including

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Federico Leva (Nemo)
Marco Chiesa, 13/11/2013 10:21: There are bots that go and look whether a newly inserted block of text is already present somewhere else, [...] Rectius: there *used* to be a bot (RevertBot, Lusumbot). The program https://www.mediawiki.org/wiki/Manual:Pywikibot/copyright.py has been stopped

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Philippe Beaudette
On Wed, Nov 13, 2013 at 2:37 AM, Matthew Flaschen matthew.flasc...@gatech.edu wrote: A significant problem with TurnItIn is that it is proprietary, and cannot be customized by anyone in the movement. The fact that it is proprietary also means it can never be part of the main infrastructure,

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Matthew Flaschen
On 11/13/2013 05:16 AM, Philippe Beaudette wrote: On Wed, Nov 13, 2013 at 2:37 AM, Matthew Flaschen matthew.flasc...@gatech.edu wrote: A significant problem with TurnItIn is that it is proprietary, and cannot be customized by anyone in the movement. The fact that it is proprietary also means

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Gerard Meijssen
Hoi I know several authors who publish and use their original text to publish on Wikipedia as well. This is another source of false positives because they have the copyright to the original source... To recognise this you have to be even more sophisticated. The point I want to make is that

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Marco Chiesa
On Wed, Nov 13, 2013 at 11:44 AM, Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi I know several authors who publish and use their original text to publish on Wikipedia as well. This is another source of false positives because they have the copyright to the original source... To

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Chris McKenna
On Wed, 13 Nov 2013, Marco Chiesa wrote: On Wed, Nov 13, 2013 at 11:44 AM, Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi I know several authors who publish and use their original text to publish on Wikipedia as well. This is another source of false positives because they have the

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Chris McKenna
On Wed, 13 Nov 2013, Gerard Meijssen wrote: The point I want to make is that having a tool that is KNOWN to be deficient in specific ways can still be a huge advantage over not having a tool at all. So PLEASE let's not make perfection the enemy of the good. The problem isn't that we're waiting

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Marco Chiesa
On Wed, Nov 13, 2013 at 12:36 PM, Chris McKenna cmcke...@sucs.org wrote: But an automated tool can not know whether OTRS verification has happened or not. We put something like {{OTRS verified}} in the article's talk page, something saying: Part of the text comes from website X, ticket

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Marco Chiesa
On Wed, Nov 13, 2013 at 12:39 PM, Chris McKenna cmcke...@sucs.org wrote: The problem isn't that we're waiting for perfection. We're waiting for the proportion of false positives and false negatives to fall to a level where they don't overwhelm the true positives. To avoid false positives from

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread
On 13 November 2013 07:40, James Heilman jmh...@gmail.com wrote: ... Our biggest issue is copyright infringement. ... Thanks for raising this James. Yes, this is an issue but if you are gunning for elephants this month, I really don't think the copyright elephant is the biggest one in the herd.

[Wikimedia-l] [Wikimedia Announcements] Wikimedia UK report, September 2013

2013-11-13 Thread Stevie Benton
Hello everyone, Please find below the Wikimedia UK monthly report <https://wikimedia.org.uk/wiki/Reports> for the period 1st to 30th September 2013. If you want to keep up with the chapter's activities as they happen, please subscribe to our blog <http://blog.wikimedia.org.uk/>, join a UK mailing

Re: [Wikimedia-l] next Wikidata office hour

2013-11-13 Thread Lydia Pintscher
On Sat, Nov 2, 2013 at 4:27 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Hi everyone, I'll be holding an office hour together with addshore on Wednesday, November 13 at 17:00 UTC. For your timezone see

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Quim Gil
On 11/13/2013 12:37 AM, Matthew Flaschen wrote: However, there may be room for enhancing MadmanBot (e.g. as a GSOC or OPW project). Any technical project able to identify small tasks and available mentors is welcome to join Wikimedia's Google Code-in team at

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Nathan
On Wed, Nov 13, 2013 at 4:53 AM, Lodewijk lodew...@effeietsanders.org wrote: Marco: I agree, we had also issues on the Dutch Wikipedia - these have been around for ages, the English Wikipedia is just less aware of them. Not sure if you meant this how it sounds, but the English Wikipedia

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread George Herbert
On Wed, Nov 13, 2013 at 3:48 AM, Fæ fae...@gmail.com wrote: ... PS with regard to OTRS verification, we could do with better standards for verification, We are not attempting to perform a complete and unassailable verification; imagining that we can is folly. The point is, we need someone

Re: [Wikimedia-l] next Wikidata office hour

2013-11-13 Thread Lydia Pintscher
On Sat, Nov 2, 2013 at 4:27 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Hi everyone, I'll be holding an office hour together with addshore on Wednesday, November 13 at 17:00 UTC. For your timezone see

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Michael Snow
On 11/13/2013 10:39 AM, Nathan wrote: On Wed, Nov 13, 2013 at 4:53 AM, Lodewijk lodew...@effeietsanders.org wrote: Marco: I agree, we had also issues on the Dutch Wikipedia - these have been around for ages, the English Wikipedia is just less aware of them. Not sure if you meant this how it

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Nathan
On Wed, Nov 13, 2013 at 1:48 PM, Michael Snow wikipe...@frontier.com wrote: On 11/13/2013 10:39 AM, Nathan wrote: On Wed, Nov 13, 2013 at 4:53 AM, Lodewijk lodew...@effeietsanders.org wrote: Marco: I agree, we had also issues on the Dutch Wikipedia - these have been around for ages, the

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Tobias
On 11/13/2013 08:40 AM, James Heilman wrote: Our biggest issue is copyright infringement. When it comes to copyright infringement, among all community sites on the Internet, Wikipedia is one of the best to handle it. Many websites don't even bother with copyright unless they get a DMCA

Re: [Wikimedia-l] Copyright infringement - The real elephant in the room

2013-11-13 Thread Martin Rulsch
Unquestionably, there are also many instances where the system fails and where lots of copyrighted material gets uploaded. Back in 2005, we had a case similar to the one you described in German Wikipedia, where various IPs copied content from old books. It is a big mess to clean up, but it