Thanks, James.
Just out of curiosity, the other day I found two articles with a long
section with identical wording, only names and numbers had been
changed. Example:
The town of ... has a population of .. . The town is known for
its challenges in fighting poverty. According to local
The new and improved version of the copy and paste detection bot that we at
[[WP:MED]] have been using for nearly a year
[https://en.wikipedia.org/wiki/User:EranBot/Copyright here] is nearly ready
to be expanded to other topic areas.
It can be found here [
Hi, James.
Is the source code available anywhere?
If you want to try your bot in other languages, I could help you with
testing on the Russian Wikipedia :)
Best regards.
rubin16
2015-04-03 12:07 GMT+03:00 James Heilman jmh...@gmail.com:
The new and improved version of the copy and paste detection bot
Hi James
I often suspect copy-paste and find exact matches of the text
elsewhere. However, whereas one can painstakingly (unless there is a
trick that I am not aware of) ascertain when text was entered into
an article, it is not always possible to know when the other text
first appeared on the
1) Yes, the source code is available. User:Eran has posted it here:
https://github.com/valhallasw/plagiabot
2) This bot ONLY works on new edits, within a couple of hours of them
occurring. This reduces the number of false positives. It DOES NOT look at
old edits.
3) This requires human follow-up.
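
To illustrate point 2, here is a minimal sketch of a recency filter. It is
not taken from the plagiabot source; the two-hour window, the name
`recent_edits`, and the `rev_id`/`timestamp` fields are all assumptions made
for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed window: the email only says "a couple of hours".
RECENT_WINDOW = timedelta(hours=2)


def recent_edits(edits, now, window=RECENT_WINDOW):
    """Return the edits made no more than `window` before `now`.

    `edits` is a list of dicts with a timezone-aware "timestamp" key
    (a hypothetical structure, not the bot's real data model).
    """
    return [
        e for e in edits
        if timedelta(0) <= now - e["timestamp"] <= window
    ]


# Usage: one fresh edit and one old edit; only the fresh one is kept.
now = datetime(2015, 4, 3, 12, 0, tzinfo=timezone.utc)
edits = [
    {"rev_id": 1, "timestamp": now - timedelta(minutes=30)},
    {"rev_id": 2, "timestamp": now - timedelta(days=400)},
]
for_review = recent_edits(edits, now)
```

Only the edits in `for_review` would then be compared against external text
and handed to a human (point 3). Skipping old edits avoids flagging text that
other sites may have copied from Wikipedia in the meantime.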
On 10/17/12 10:26 PM, James Heilman wrote:
We really need a plagiarism detection tool so that we can make sure our
sources are not simply copy and pastes of older versions of Wikipedia.
Today I was happily improving our article on pneumonia as I have a day off.
I came across a recommendation
How hard would it be to set up a tool like the software that, as far as I
know, MIT uses to automatically check for plagiarism among theses etc.
submitted to their digital library, checking the text of all Wikimedia
projects against e.g. newspaper websites and Google Books, and then
publishing