Greg Keogh wrote:

The strings produce quite different MD5 hashes as you would expect, but when I XOR “fold” the buffers down I get the same 4 byte result. This seems statistically infeasible. Using SHA1 you get collisions with "7Fw" and "X9z". I have a dozen other examples.


Let's say each character in the string "7Fw" is one byte (it could be ASCII or something else; the exact encoding doesn't matter here).

You are trying to turn 3 bytes (of which there are 2^24, or ~16.7 million, combinations) into a 4 byte hash (of which there are 2^32, or ~4.3 billion, combinations) using an algorithm which works for strings of arbitrary length.
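For reference, here is a rough Python sketch of the kind of XOR fold I assume you mean: each 4 byte chunk of the digest is XORed into one 4 byte result. The function names xor_fold and folded_md5 are made up for illustration, and your actual fold may order or group the bytes differently.

```python
import hashlib

def xor_fold(digest: bytes, width: int = 4) -> bytes:
    """XOR every width-byte chunk of digest together into one width-byte value."""
    folded = bytearray(width)
    for i, b in enumerate(digest):
        # Byte i of the digest lands in position i % width of the result.
        folded[i % width] ^= b
    return bytes(folded)

def folded_md5(text: str) -> bytes:
    # Assumption: the string is hashed as its ASCII bytes.
    return xor_fold(hashlib.md5(text.encode("ascii")).digest())

print(folded_md5("7Fw").hex())
```

Whatever the input length, the result is always 4 bytes, so there are only 2^32 possible outputs.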

That ratio is 2^24 / 2^32 = 1/256 (about 0.4%), so for any given 4 byte value there is roughly a 1 in 256 chance that some 3 byte string folds to it.

Considering you are only trying 3 characters, and that there could be 1, 2, 4 ... 10000 characters, a 1/256 chance for length 3 alone doesn't seem that bad.
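To spell out the arithmetic, here is a quick Python sketch. It assumes the folded hash behaves like a uniformly random 4 byte value, which is my assumption and not something I've checked against your code. The variable names are just for illustration.

```python
# Count the possible 3 byte inputs and 4 byte outputs.
inputs = 256 ** 3    # 2^24 = 16,777,216 possible 3 byte strings
outputs = 256 ** 4   # 2^32 = 4,294,967,296 possible 4 byte hashes

# Ratio of inputs to outputs: exactly 1/256.
ratio = inputs / outputs

# Expected number of colliding pairs among ALL 3 byte strings,
# under the uniform-random assumption: (number of pairs) / outputs.
expected_pairs = inputs * (inputs - 1) / 2 / outputs

print(inputs, outputs, ratio)
print(round(expected_pairs))  # roughly 32,768
```

So even within the 3 byte strings alone, you would expect tens of thousands of colliding pairs, which is why finding a dozen isn't a shock.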

Long story short: It's interesting, but not surprising.
--
Les Hughes
