Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-23 Thread Maggie Dennis
Just a few points inline. :) On Tue, Jul 22, 2014 at 5:50 AM, James Heilman jmh...@gmail.com wrote: To clarify, the proposal is: 1) only looking at new edits that add blocks of text over a certain size 2) only tagging those edits on a workspace page for further follow-up by an experienced

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-23 Thread Kerry Raymond
On Behalf Of Maggie Dennis Sent: Thursday, 24 July 2014 12:42 AM To: Research into Wikimedia content and communities Subject: Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early Just a few points

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-23 Thread Oliver Keyes
On Behalf Of Maggie Dennis Sent: Thursday, 24 July 2014 12:42 AM To: Research into Wikimedia content and communities Subject: Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early Just a few points inline

[Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-22 Thread James Heilman
To clarify, the proposal is: 1) only looking at new edits that add blocks of text over a certain size 2) only tagging those edits on a workspace page for further follow-up by an experienced human editor 3) only running on articles of WikiProjects that want it and are willing to follow-up (thus
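
As a rough illustration of the three conditions in the proposal above, the selection step might be sketched as follows. This is only a sketch under stated assumptions, not anything from the thread: the Edit record, the edits_to_tag function, and the 500-character threshold are all hypothetical, and the actual comparison of text against external sources is not shown.

```python
from dataclasses import dataclass

# Hypothetical record for a single edit; the field names are assumptions.
@dataclass
class Edit:
    article: str
    wikiproject: str
    added_text: str

# Hypothetical threshold: the proposal only says "over a certain size".
MIN_ADDED_CHARS = 500

def edits_to_tag(edits, opted_in_projects, min_chars=MIN_ADDED_CHARS):
    """Return edits that should be listed for human follow-up.

    An edit qualifies only if it adds more than min_chars characters
    and its article belongs to a WikiProject that has opted in.
    Nothing is changed on the article itself; the result is just a list
    a human editor could review on a workspace page.
    """
    return [
        e for e in edits
        if e.wikiproject in opted_in_projects and len(e.added_text) > min_chars
    ]

sample_edits = [
    Edit("Article A", "Medicine", "x" * 800),   # large addition, opted-in project
    Edit("Article B", "Medicine", "x" * 50),    # too small to check
    Edit("Article C", "History", "x" * 800),    # project has not opted in
]
tagged = edits_to_tag(sample_edits, opted_in_projects={"Medicine"})
```

Keeping the output as a plain list for review mirrors the point that tagging happens on a workspace page for an experienced human editor, rather than on the article.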

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-21 Thread Pine W
It should be relatively easy to catch a significant percentage of those copyright violations with the assistance of automated search tools. The trick is to do it at a large scale in near-realtime, which might require some computationally intensive and bandwidth intensive work. James, can I suggest

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-21 Thread Jane Darnell
Isn't that what Corenbot does/did? I always found it very confusing though whenever I ran into it, and the false positives are huge (so many sites copy Wikimedia content these days). On Mon, Jul 21, 2014 at 9:11 AM, Pine W wiki.p...@gmail.com wrote: It should be relatively easy to catch a

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-21 Thread Nathan
On Mon, Jul 21, 2014 at 9:52 AM, Andrew G. West west.andre...@gmail.com wrote: Having dabbled in this initiative a couple years back when it first started to gain some traction, I'll make some comments. Yes, CorenSearchBot (CSB) did/does(?) operate in this space. It basically searched took

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-21 Thread Jane Darnell
It's been a while, but as I recall, my problem with the Corenbot is the text that was inserted on the page (some loud banner with a link to the original text on some website, which was often not at all related to the matter at hand). My confusion was the instructional text in the link, and I

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-21 Thread Sage Ross
Hey folks. As James noted, Wiki Education Foundation is planning to do some work on this problem. I'll be the project manager for it, and I'll be grateful for all the help and advice I can get. I'm in the process now of finding a development company to work with. Our current plan is to complete a

Re: [Wiki-research-l] [Wikimedia-l] Catching copy and pasting early

2014-07-21 Thread Kerry Raymond
In light of the editor retention problem, I suggest we have to be very careful with any kind of plagiarism detector software because we have real subject matter experts among our editors. I'm aware of members of local history societies who have had issues with copyright violation because they have