simplsol wrote

>  Richard wrote:
>> I was thinking about the same problem here the other day,
>> and while I haven't written anything yet it occurred to me
>> that Rev's MD5 function could probably be useful: you'd
>> make a list of all the files, read each file into a
>> variable and run the variable through MD5, and then any
>> matching MD5 keys are likely duplicates.
>>
>>  Anyone here see a weakness to that crude approach?
>
> Richard,
> A potential danger I see to this approach is if one of the
> duplicates had been edited. The routine would have to check
> for modification date (probably safe to keep the one most
> recently changed) - just not delete any apparent duplicates
> that have different edit dates?

I think an MD5 approach would account for that, since any edits would change the MD5 signature.

The MD5digest function takes any chunk of data and returns a short (16-byte) binary "signature", derived in a way that makes it extremely improbable that two different sets of data will ever produce the same signature by accident.
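
To make the "16-byte signature" idea concrete, here is a minimal sketch in Python rather than Rev. It assumes Python's standard hashlib module as a stand-in for MD5digest; the helper name md5_signature is made up for illustration.

```python
import hashlib

def md5_signature(data: bytes) -> bytes:
    """Return the raw 16-byte MD5 digest of a chunk of data.

    Hypothetical helper, used here as a stand-in for Rev's MD5digest.
    """
    return hashlib.md5(data).digest()

sig = md5_signature(b"any chunk of data")
print(len(sig))    # the digest is always 16 bytes, whatever the input size
print(sig.hex())   # hex form is easier to read than the raw binary
```

The same input always gives the same signature, which is what makes it usable as a comparison key.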

So if you had two files and only one pixel was different between them, the MD5digest result would be different. Only true duplicates, where all data is completely identical, would deliver the same MD5 signature.
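
The whole approach could be sketched roughly as follows. This is Python, not Rev, and the function names file_md5 and find_duplicates are assumptions for illustration only, not anything from Rev's API. It walks a folder, computes an MD5 signature per file, and reports groups of files that share a signature.

```python
import hashlib
import os
from collections import defaultdict

def file_md5(path, chunk_size=65536):
    """Return the 16-byte MD5 digest of a file's contents."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files need not fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()

def find_duplicates(folder):
    """Group files under folder by MD5; return groups with 2+ members."""
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            groups[file_md5(path)].append(path)
    # Only signatures shared by more than one file indicate likely duplicates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]

if __name__ == "__main__":
    for group in find_duplicates("."):
        print("Likely duplicates:", ", ".join(group))
```

Reading in chunks instead of loading each file into a single variable is a choice made here to keep memory use low; the resulting signature is the same either way. An edited copy ends up in a different group automatically, so modification dates never need to be compared.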

--
 Richard Gaskin
 Managing Editor, revJournal
 _______________________________________________________
 Rev tips, tutorials and more: http://www.revJournal.com

_______________________________________________
use-revolution mailing list
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution