Brian Huddleston wrote :
|| > MD5 *must* duplicate. It may never duplicate in practice; it may never
|| > duplicate over the life of a single project. But if you are designing
|| > aircraft software, you must be able to say 'we need to check every byte
|| > for changes'.
|| 
|| Well, it's a little worse than that.  From the PGP O'Reilly book:
|| "So why does MD5 seem so secure?  Because 128 bits allows you to have
|| 2^128=340,282,366,920,938,463,463,374,607,431,768,211,456 different
|| possible MD5 codes.  That is a number that is billions of times larger
|| than the total number of documents that will ever be created by the
|| human race for the next thousands of years."
|| 
|| So while it is possible that MD5 could give you an erroneous result, it is
|| statistically so close to zero as to be almost impossible.
|| 
|| (You might check out http://www.rsasecurity.com/rsalabs/faq/3-6-6.html,
|| which is cool in that it has links to the actual papers in addition to
|| being a high-level overview.  The PGP O'Reilly book also has a pretty good
|| high-level overview, but I would recommend Bruce Scheiner's Applied
|| Cryptography, if anyone in the audience is interested in how they work and
|| how you use them.)

Schneier (note the spelling) has reservations about MD5 - there is an
attack that makes it possible to create string pairs that have the
same hash code - but that is for cryptographic usage and it is by no
means clear that an equally hashing value can be found for an
arbitrary text.

You're still talking universe lifetimes as the collision frequency when
there is no deliberate attempt to cause a collision.
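
To put a rough number on that, here is a minimal sketch. It assumes the
digest behaves like a uniformly random 128-bit value and that nobody is
crafting inputs. The function name and the billion-file project size are
made up for illustration; the estimate is the usual birthday (union)
bound, number-of-pairs / 2^bits.

```python
from fractions import Fraction


def accidental_collision_bound(n_files: int, digest_bits: int = 128) -> Fraction:
    """Upper bound on the chance that any two of n_files share a digest.

    Assumes digests are uniformly random; this says nothing about
    deliberately constructed collisions.
    """
    pairs = n_files * (n_files - 1) // 2      # distinct pairs of files
    return Fraction(pairs, 2 ** digest_bits)  # union bound over all pairs


if __name__ == "__main__":
    # Hypothetical project with a billion distinct file versions.
    print(float(accidental_collision_bound(10 ** 9)))        # MD5-sized digest
    print(float(accidental_collision_bound(10 ** 9, 160)))   # SHA-1-sized digest
```

Even at a billion files the 128-bit figure comes out around 10^-21, which
is the sense in which "universe lifetimes" is meant.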

|| Of course, if you're just ultra-paranoid you could use SHA-1 as your digest
|| algorithm.  It uses 160 bits and a better algorithm.

That would be safer, but it is probably of the "more infinitesimal"
flavour rather than critical.
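
For anyone who wants to try both digests side by side, here is a small
sketch using Python's standard hashlib module (Python is my choice for
illustration; nothing in this thread depends on it). The helper names
are hypothetical.

```python
import hashlib


def bytes_digest(data: bytes, algorithm: str = "md5") -> str:
    """Hex digest of an in-memory byte string."""
    return hashlib.new(algorithm, data).hexdigest()


def file_digest(path: str, algorithm: str = "md5", chunk_size: int = 65536) -> str:
    """Hex digest of a file, read in chunks so large files need little memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def content_changed(path: str, recorded_digest: str, algorithm: str = "md5") -> bool:
    """True if the file's current digest differs from a previously recorded one."""
    return file_digest(path, algorithm) != recorded_digest
```

Swapping "md5" for "sha1" changes the digest from 32 to 40 hex characters
(128 to 160 bits) and nothing else in the calling code.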

|| Compare the above to a timestamp which can fail if:
|| 
|| 1) The edits are within the granularity of the time stamp.
|| 2) The sys-admin (or any bozo with sudo shell access) diddles the system
||    clock.
|| 3) Daylight savings switchover (in most parts of the US).

Only on non-Unix systems.  Unix time-stamps are seconds since 00:00:00
UTC, Jan 1 1970.  It is only when that value is converted to display format
that timezone and daylight saving have any effect, but that doesn't happen
for timestamp comparisons.
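
A small sketch of that point, using fixed UTC offsets to stand in for US
Eastern daylight and standard time. The date (the autumn 2000 switchover)
and the variable names are chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for US Eastern time on either side of the
# autumn switchover; illustrative only.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Two instants one real hour apart that show the same wall-clock time.
before_switch = datetime(2000, 10, 29, 1, 30, tzinfo=EDT)
after_switch = datetime(2000, 10, 29, 1, 30, tzinfo=EST)

# The raw values a Unix timestamp comparison actually sees.
t_before = int(before_switch.timestamp())
t_after = int(after_switch.timestamp())


def is_newer(mtime_a: int, mtime_b: int) -> bool:
    """Compare two epoch timestamps; no timezone conversion is involved."""
    return mtime_a > mtime_b


if __name__ == "__main__":
    print(before_switch.strftime("%H:%M"), after_switch.strftime("%H:%M"))
    print(t_after - t_before, is_newer(t_after, t_before))
```

The displayed times are identical, but the underlying seconds counts
differ by 3600, so the comparison still orders the two correctly.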

|| 4) Automatic NTP correction of the system time (pretty common in the Unix
||    server world).
||    (Under Windows 2000 it is possible for all the machines in a given
||    domain to periodically sync their clocks with the PDC).

This is usually done by stretching/contracting the clock (updating
the seconds counter every 99 or 101 ticks of the hundredths-of-a-second
interrupt for a while, until the desired shift has occurred).
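
As a toy model of that stretching (not how any particular kernel or NTP
daemon implements it; the function names and the 100-ticks-per-second
figure are assumptions for the sketch):

```python
def slew_schedule(offset_ticks: int, ticks_per_second: int = 100) -> list:
    """Ticks counted for each clock second while absorbing an offset.

    offset_ticks > 0: the clock is behind, so seconds are shortened by one
    tick each. offset_ticks < 0: the clock is ahead, so seconds are
    lengthened by one tick each. One tick is absorbed per clock second.
    """
    if offset_ticks == 0:
        return []
    length = ticks_per_second - 1 if offset_ticks > 0 else ticks_per_second + 1
    return [length] * abs(offset_ticks)


def clock_readings(start_seconds: int, schedule: list) -> list:
    """Seconds-counter values seen during the slew; it only ever counts up."""
    return [start_seconds + i + 1 for i in range(len(schedule))]


if __name__ == "__main__":
    sched = slew_schedule(50)            # clock is half a second behind
    print(len(sched), sum(sched) / 100)  # clock seconds vs real seconds elapsed
```

The point is that the seconds counter never jumps or runs backwards, so
the ordering of timestamps taken during the correction is preserved.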

|| 5) touch -r (although that's a bit of a pathological case)
|| 6) Getting completely scrambled by a misbehaving Samba server.  (Heh...no
||    flames please.  Just something I've seen happen.)

   7) Getting the file content changed by a disk hardware error.

|| How often do these happen?  I'd be willing to bet $50 that it is more often
|| than a 128-bit or 160-bit digest routine duplicates. ;-)

Even without items 3 and 4, you're on very safe ground here.  You
could even provide odds.  Anyone worried about hash collisions had
better be desperately concerned about 7.

-- 
Anyone who can't laugh at himself is not    | John Macdonald
taking life seriously enough -- Larry Wall  |   [EMAIL PROTECTED]
