David Jashi wrote:
Thanks, Andrzej.

In fact I meant DD/MM/YYYY.

Anyway, knowing that dedup keeps the latest version of a file makes my
life a bit easier.

For future reference - the algorithm that DeleteDuplicates uses is this:

* first, keep only the latest version of the content under any given url. The output of this step is the input to the second step.

* second, use a Signature implementation (MD5Signature, TextProfileSignature, ...) to find content duplicates under different urls. If duplicates are found, keep only the version under the shortest url.
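
The two steps above can be sketched roughly as follows. This is only an illustration in Python, not the actual DeleteDuplicates code: the Page record, the fetch_time field, the md5_signature helper and the delete_duplicates function are all hypothetical names made up for the example, and the tie-breaking (first seen wins when fetch times or url lengths are equal) is an assumption.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical record for one fetched version of a url."""
    url: str
    fetch_time: int  # larger means fetched more recently
    content: str


def md5_signature(content: str) -> str:
    """Stand-in for a Signature implementation: MD5 over the raw content."""
    return hashlib.md5(content.encode("utf-8")).hexdigest()


def delete_duplicates(pages, signature=md5_signature):
    # Step 1: for each url, keep only the most recently fetched version.
    latest_by_url = {}
    for page in pages:
        kept = latest_by_url.get(page.url)
        if kept is None or page.fetch_time > kept.fetch_time:
            latest_by_url[page.url] = page

    # Step 2: among the survivors of step 1, group pages by content
    # signature and keep only the one with the shortest url per group.
    shortest_by_sig = {}
    for page in latest_by_url.values():
        sig = signature(page.content)
        kept = shortest_by_sig.get(sig)
        if kept is None or len(page.url) < len(kept.url):
            shortest_by_sig[sig] = page

    # Sorted by url only to make the result order predictable.
    return sorted(shortest_by_sig.values(), key=lambda p: p.url)
```

Note that because step 2 only sees the output of step 1, an older version of a page never takes part in the content comparison. Passing a different signature function plays the role of swapping MD5Signature for something like TextProfileSignature.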


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
