> >
> > Message: 4
> > Date: Thu, 4 Dec 2014 14:58:37 -0500
> > From: "Sreejith K." <[email protected]>
> > To: Wikimedia Commons Discussion List <[email protected]>
> > Subject: Re: [Commons-l] Duplicate removal?
> > Message-ID:
> >         <CAN8yy7Mtte+FPJ5N=hq=[email protected]>
> > Content-Type: text/plain; charset="utf-8"
> >
> > I am using Wikimedia APIs to create a gallery of duplicates and routinely
> > clean them. You can see the results here.
> >
> > https://commons.wikimedia.org/wiki/User:Sreejithk2000/Duplicates
> >
> > The page also has a link to the script. If anyone is interested in using
> > this script, let me know and I can work with you to customize it.
> >
> > - Sreejith K.
> >
> >
>
See also https://commons.wikimedia.org/wiki/Special:ListDuplicatedFiles
which lists the files that have the most byte-for-byte duplicates (most of
the time, those should really be file redirects).
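
As a rough illustration of what byte-for-byte duplicate detection amounts to, here is a minimal sketch that groups files by the SHA-1 of their contents. The function names and the in-memory `files` mapping are hypothetical and only for illustration; this is not the code behind Special:ListDuplicatedFiles.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def sha1_hex(data: bytes) -> str:
    """Return the hex SHA-1 digest of a file's raw bytes."""
    return hashlib.sha1(data).hexdigest()


def find_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose contents are byte-for-byte identical.

    `files` is a hypothetical mapping of file name -> raw bytes.
    Groups with the most duplicates come first.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[sha1_hex(data)].append(name)
    # Keep only hashes shared by more than one file.
    dupes = [sorted(names) for names in groups.values() if len(names) > 1]
    # Largest groups first, then alphabetical for a stable order.
    dupes.sort(key=lambda g: (-len(g), g))
    return dupes


example_files = {
    "A.jpg": b"abc",
    "B.jpg": b"abc",
    "C.jpg": b"abd",
    "D.png": b"xyz",
    "E.png": b"xyz",
    "F.png": b"xyz",
}
duplicate_groups = find_exact_duplicates(example_files)
```

Any file that differs by even one byte (a re-save, a resize, changed metadata) ends up in a different group, which is why this approach only catches exact copies.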

--

Thanks, Jonas, for experimenting with this sort of thing. I have always
wished we did something with perceptual hashes internally, in addition to
the SHA-1 hashes we use currently.
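
To make the difference between the two kinds of hash concrete, here is a minimal sketch of one simple perceptual hash, the "average hash". It assumes the image has already been reduced to a small grayscale grid (real implementations typically downscale to something like 8x8 first), and every name in it is hypothetical. It is not what Jonas used and not anything MediaWiki does.

```python
import hashlib
from typing import List

Pixels = List[List[int]]  # grayscale values 0-255, already downscaled


def average_hash(pixels: Pixels) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance means similar images."""
    return bin(a ^ b).count("1")


def sha1_of_pixels(pixels: Pixels) -> str:
    """SHA-1 over the raw pixel bytes, for comparison."""
    return hashlib.sha1(bytes(p for row in pixels for p in row)).hexdigest()


original = [[10, 200], [220, 30]]
brighter = [[p + 5 for p in row] for row in original]  # slightly brightened copy
different = [[200, 10], [30, 220]]                      # unrelated pattern

same_perceptual = average_hash(original) == average_hash(brighter)
same_sha1 = sha1_of_pixels(original) == sha1_of_pixels(brighter)
distance_to_different = hamming_distance(
    average_hash(original), average_hash(different)
)
```

The brightened copy keeps the same average hash even though its SHA-1 changes, while the unrelated pattern differs in every bit. That is the property that would let near-duplicates be found where the SHA-1 comparison sees nothing in common.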

--bawolff
_______________________________________________
Commons-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/commons-l