Philip Meyer;576311 Wrote: 
> > Don't worry about the performance optimizations yet, there are many
> > ways to solve that, I'm sure we can get decent performance on this in
> > one way or another. At the moment it's more important that we can
> > guarantee the uniqueness and that it works with all file formats.
> 
> There is never a way to guarantee uniqueness with MD5. The more audio
> content that is checked, the lower the chance of duplicates. But even
> checksumming the whole audio content would not rule out false
> duplicates, and checking all of the data would be really costly on
> performance.
> 
> I was exploring the idea of calculating the checksum on a block of
> data, and then, if that is a duplicate, re-calculating the checksum by
> reading more data. It's a nice idea in concept (perform the check on a
> small subset unless a wider check becomes necessary), but in reality I
> can't see how it would work.
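Philip's two-pass idea can be sketched roughly as follows. This is a hypothetical illustration, not code from the Squeezebox scanner: hash a cheap fixed-size prefix of each file first, and fall back to a full-file hash only for files whose prefix hashes collide. The function names (`md5_of_prefix`, `find_duplicates`) and the 64 KB first-pass size are my own choices for the sketch.

```python
import hashlib
from collections import defaultdict

def md5_of_prefix(path, nbytes):
    """MD5 of the first nbytes of a file (the whole file if nbytes is None)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        remaining = nbytes
        while True:
            size = 65536 if remaining is None else min(65536, remaining)
            chunk = f.read(size)
            if not chunk:
                break
            h.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
                if remaining <= 0:
                    break
    return h.hexdigest()

def find_duplicates(paths, first_pass_bytes=64 * 1024):
    """Two-pass duplicate detection: group by a cheap prefix hash,
    then compute the full-file hash only within colliding groups."""
    by_prefix = defaultdict(list)
    for p in paths:
        by_prefix[md5_of_prefix(p, first_pass_bytes)].append(p)

    by_full = defaultdict(list)
    for group in by_prefix.values():
        if len(group) < 2:
            continue  # unique prefix hash: cannot have a duplicate
        for p in group:
            by_full[md5_of_prefix(p, None)].append(p)

    return [g for g in by_full.values() if len(g) > 1]
```

This avoids the worst of the I/O cost in the common case (most files are unique and are read only once, partially), while Philip's objection still stands: two distinct files that agree on the sampled region force the expensive second pass, and MD5 itself can never give an absolute guarantee.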
But in this whole discussion, has an actual case of collision been verified yet? That is, a case where different data produced identical MD5s? I know that it is possible, but I assume that it should be very, very rare (like "won't happen before the next ice age" rare). And I know you've found different, non-identical files producing identical MD5s, but it hasn't been clear to me in this discussion whether the regions examined in those non-identical files were, in fact, non-identical.

-- 
gharris999
------------------------------------------------------------------------
gharris999's Profile: http://forums.slimdevices.com/member.php?userid=115
View this thread: http://forums.slimdevices.com/showthread.php?t=81679
_______________________________________________
beta mailing list
[email protected]
http://lists.slimdevices.com/mailman/listinfo/beta
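The "ice-age rare" intuition can be checked with the birthday bound. Treating MD5 outputs as uniformly random 128-bit values (a reasonable model for non-adversarial inputs such as ripped audio, though deliberately constructed MD5 collisions are well known to exist), the approximate probability of any accidental collision among n items is:

```python
def collision_probability(n_items, hash_bits=128):
    """Birthday-bound approximation: P ~= n(n-1) / 2^(bits+1).
    Accurate while the result is much smaller than 1."""
    return n_items * (n_items - 1) / 2.0 ** (hash_bits + 1)

# Even a 10-million-track library has astronomically small odds
# of an accidental 128-bit collision:
p = collision_probability(10_000_000)  # ~1.5e-25
```

So if the thread really has found two non-identical regions with the same MD5, that would be extraordinary for random data, which is why verifying that the compared regions were genuinely different matters so much.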
