Hi everyone,

In our work with Elog.io[1], we've come across a number of duplicate
files in Commons. Some of them are explainable, such as PNGs that
also have a JPG thumbnail[2]. Others look like clear-cut duplicate
uploads, such as [3] and [4], and others still are the same work at
different sizes, such as [5] and [6].

Going through these is quite an effort and likely requires a fair
amount of manual work. Is there an organised structure or group of
people that deals with duplicate works? We'd love to contribute our
findings to such an effort once we've cleaned up our data a bit.

[1] http://elog.io/
[2] Like https://commons.wikimedia.org/wiki/File:Island_House,_Bellows_Falls,_by_P._W._Taft.png
[3] https://commons.wikimedia.org/wiki/File:Defense.gov_News_Photo_090910-N-8420M-038.jpg
[4] https://commons.wikimedia.org/wiki/File:US_Navy_090910-N-8420M-038_Students_in_Basic_Underwater_Demolition-SEAL_(BUD-S)_class_279_participate_in_a_surf_passage_exercise_during_the_first_phase_of_training_at_Naval_Amphibious_Base_Coronado.jpg
[5] https://commons.wikimedia.org/wiki/File:P0772931871(37827)(NRCS_Photo_Gallery).jpg
[6] https://commons.wikimedia.org/wiki/File:NRCSMT01082(18769)(NRCS_Photo_Gallery).jpg

--
Jonas Öberg, Founder & Shuttleworth Foundation Fellow
Commons Machinery | [email protected]
E-mail is the fastest way to my attention

_______________________________________________
Commons-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/commons-l
