Philip Meyer;576311 Wrote: 
> >Don't worry about the performance optimizations yet, there are many
> >ways to solve that, I'm sure we can get decent performance on this in
> >one way or another. At the moment it's more important that we can
> >guarantee the uniqueness and that it works with all file formats.
> 
> There is no way to guarantee uniqueness with MD5.
> The more audio content that is checked, the lower the chance of false
> duplicates.  Even checking the whole audio content would not rule out
> false duplicates, and checking all the data would be really costly in
> performance terms.
> 
> I was exploring the idea of calculating the checksum on a block of
> data and then, if that is a duplicate, re-calculating the checksum
> over more data.  It's a nice idea in concept, checking only a small
> subset unless more is needed, but in practice I can't see how it
> would work.
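For what it's worth, the tiered idea quoted above seems workable in principle. A rough sketch (my own illustration, not anything from the scanner code; the pass sizes and helper names are assumptions): hash a small prefix of every file first, and only re-hash a much larger region for the few files whose prefix hashes collide.

```python
import hashlib

def partial_md5(path, num_bytes):
    """MD5 of the first num_bytes of a file (or the whole file if shorter)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        h.update(f.read(num_bytes))
    return h.hexdigest()

def find_duplicates(paths, first_pass=64 * 1024, second_pass=4 * 1024 * 1024):
    """Return groups of paths whose larger-region hashes still collide."""
    seen = {}  # small-prefix hash -> list of paths
    for p in paths:
        seen.setdefault(partial_md5(p, first_pass), []).append(p)
    suspects = []
    for group in seen.values():
        if len(group) > 1:
            # Escalate: re-hash a much bigger region, but only for the
            # handful of files whose small-prefix hashes matched.
            deeper = {}
            for p in group:
                deeper.setdefault(partial_md5(p, second_pass), []).append(p)
            suspects.extend(g for g in deeper.values() if len(g) > 1)
    return suspects
```

Most files never get the expensive second pass, so the common case stays cheap; the awkward part, as noted, is deciding when "more data" is enough to call two files duplicates.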

But in this whole discussion, has an actual collision been verified
yet, i.e. a case where different data produced identical MD5s? I know
that it is possible, but I assume it should be very, very rare.  (Like
"won't happen before the next ice age" rare.) I know you've found
non-identical files producing identical MD5s, but it hasn't been clear
to me whether the regions actually examined in those files were
themselves non-identical.
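To put a number on "very, very rare": if the examined regions behave like random inputs (i.e. nobody is deliberately crafting collisions, which MD5 famously does not resist), the birthday bound gives the chance of any accidental collision among n hashed files. A quick back-of-the-envelope sketch:

```python
import math

def collision_probability(n, bits=128):
    """Birthday-bound chance of at least one collision among n random
    values hashed to a bits-wide digest:
    P ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))

# Even a billion distinct audio regions gives a vanishingly small chance
# of an accidental 128-bit collision.
p = collision_probability(10**9)
```

For a billion files that works out to well under one in a billion billion, which is why an unverified "collision" in a library scan is far more likely to be identical (or identically-examined) data than a genuine MD5 accident.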


-- 
gharris999
------------------------------------------------------------------------
gharris999's Profile: http://forums.slimdevices.com/member.php?userid=115
View this thread: http://forums.slimdevices.com/showthread.php?t=81679

_______________________________________________
beta mailing list
[email protected]
http://lists.slimdevices.com/mailman/listinfo/beta
