I'm back home. I hashed all 3-character strings drawn from a pool of 63
characters (63^3 = 250047 in total) into a 4-byte hash with 2^32 possible
values. This means I covered at most about 0.0058% of the output space.

I assumed that the MD5 and SHA1 hash rounds were so effective that they
acted like a really good randomiser and that the hashes would be well
distributed. So well distributed, in fact, that I expected no collisions at
all. It looks like the "avalanche effect", where small bit changes in the
input are supposed to significantly alter the output, isn't as strong as I
expected. I was quite shocked to see whole blocks of identical output bytes
for different inputs.
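
As a sanity check on the "no collisions" expectation, here is a rough
birthday-bound estimate. It is only a sketch and assumes an ideal,
uniformly random 32-bit hash, which is not a claim about how MD5 or SHA1
behave when truncated. The names (POOL_SIZE, expected_pairs and so on) are
made up for illustration. Under that assumption the numbers come out at
roughly 7 expected colliding pairs, so a handful of collisions would not by
itself indicate weak mixing.

```python
import math

POOL_SIZE = 63   # characters in the pool (from the post)
LENGTH = 3       # characters per string (from the post)
HASH_BITS = 32   # 4-byte hash (from the post)

n = POOL_SIZE ** LENGTH    # number of inputs: 250047
space = 2 ** HASH_BITS     # number of possible hash values: 4294967296

# Share of the output space used if every input got a distinct value.
coverage_percent = 100 * n / space

# Expected number of colliding pairs for an ideal uniformly random hash:
# each of the n*(n-1)/2 pairs collides with probability 1/space.
expected_pairs = n * (n - 1) / (2 * space)

# Chance of at least one collision, using a Poisson approximation.
p_any = 1 - math.exp(-expected_pairs)

print(f"inputs:                    {n}")
print(f"coverage of output space:  {coverage_percent:.4f}%")
print(f"expected colliding pairs:  {expected_pairs:.2f}")
print(f"P(at least one collision): {p_any:.4f}")
```

This does not explain whole blocks of identical output bytes; that would
point at something else, and the estimate above only covers the collision
count.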

CRC32 checksums of the same 250047 inputs produce no collisions.
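
For anyone who wants to repeat the test, here is a minimal sketch of one
way to do it. It is not the code used for the results above. Two details
are assumptions because the post does not state them: the 63-character pool
is taken to be letters, digits and "_", and the 4-byte hash is taken to be
the first 4 bytes of the MD5 or SHA1 digest. The helper names (md5_32,
sha1_32, count_collisions) are hypothetical.

```python
import hashlib
import string
import zlib
from itertools import product

# Assumption: the post does not list the pool; this choice gives 63 characters.
POOL = string.ascii_letters + string.digits + "_"


def md5_32(data: bytes) -> int:
    # Assumption: keep the first 4 bytes of the 16-byte MD5 digest.
    return int.from_bytes(hashlib.md5(data).digest()[:4], "big")


def sha1_32(data: bytes) -> int:
    # Assumption: keep the first 4 bytes of the 20-byte SHA1 digest.
    return int.from_bytes(hashlib.sha1(data).digest()[:4], "big")


def count_collisions(hash_fn) -> int:
    """Count inputs whose 32-bit hash was already produced by an earlier input."""
    seen = set()
    collisions = 0
    for chars in product(POOL, repeat=3):
        value = hash_fn("".join(chars).encode("ascii"))
        if value in seen:
            collisions += 1
        else:
            seen.add(value)
    return collisions


if __name__ == "__main__":
    for name, fn in (("md5", md5_32), ("sha1", sha1_32), ("crc32", zlib.crc32)):
        print(f"{name:6s} collisions: {count_collisions(fn)}")
```

One note on reading the CRC32 line: CRC-32 maps distinct inputs of the same
length to distinct checksums whenever that length is 32 bits or less, so
zero collisions on 3-byte strings is guaranteed by construction and says
little about how well it mixes compared with MD5 or SHA1.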

I'll look into this more when I get some hobby time (maybe next Xmas!)

Greg 
