Re: [Commons-l] Duplicate removal?

James Heald Thu, 04 Dec 2014 01:09:55 -0800

We really need a better way to mark duplicates on Commons (and images that are details from a larger work). A structure to record this is something that probably ought to be on the radar for the new Structured Data project.

As well as exact duplicates, there may often also be different versions of the same painting with different lighting, or scans of slightly different reproductions of the same work. I don't know whether the algorithm is permissive enough to pick all of these up, but as many as can be picked up would be good to tag as "other versions" of the same underlying image.

In general, we probably wouldn't *remove* duplicate images, but we would want to identify them as versions of each other.


All best,

   James.


On 04/12/2014 08:25, Federico Leva (Nemo) wrote:

Jonas Öberg, 04/12/2014 08:31:

In our work with Elog.io[1], we've come across a number of duplicate
files in Commons.


Great!

Some of them are explainable, such as PNGs which
also have a thumbnail as JPG[2], but others seem to be more clear-cut
duplicated uploads, like [3] and [4], and yet others are the same work
but different sizes like [5] and [6].


Are most of the cases you find perfect duplicates like these?


Going through this is quite an effort, and likely requires a bit of
manual work. Is there any organised structure/group of people, that
deal with duplicate works? We'd love to contribute our findings to
such an effort once we clean up our data a bit.


Sure. You can edit the files and add
https://commons.wikimedia.org/wiki/Template:Duplicate
If you need to report many thousands of files, it may be better to use a
flagged bot account:
https://commons.wikimedia.org/wiki/Commons:Bots/Requests

Nemo

_______________________________________________
Commons-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/commons-l



