Re: [OT] Hash collisions

Les Hughes Tue, 01 May 2012 16:44:18 -0700

Les Hughes wrote:

Greg Keogh wrote:
The strings produce quite different MD5 hashes as you would expect, but when I XOR “fold” the buffers down I get the same 4 byte result. This seems statistically infeasible. Using SHA1 you get collisions with "7Fw" and "X9z". I have a dozen other examples.
Take the string "7Fw" and say each character is one byte (it could be ASCII, but whatever...).
You are trying to turn 3 bytes (of which there are ~17 million combinations) into a 4 byte hash (of which there are ~4 billion combinations) using an algorithm which works for strings of near infinite length.
This means there is a 1/250 chance (0.4%) that strings of 3 bytes will have the same 4 byte hash.

Actually, who needs approximations? 3 bytes into 4 bytes is exactly 1 in 256 (a difference of 8 bits!)
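
For anyone who wants to check the counts, a small Python sketch follows. The names are illustrative and are not taken from Greg's code; it only reproduces the arithmetic for the 3-byte and 4-byte spaces.

```python
# Sizes of the two spaces discussed above (illustrative names, not from the thread).
BITS_PER_BYTE = 8


def combinations(num_bytes: int) -> int:
    """Number of distinct values that fit in num_bytes bytes."""
    return 2 ** (BITS_PER_BYTE * num_bytes)


inputs = combinations(3)    # every possible 3-byte string
outputs = combinations(4)   # every possible 4-byte hash value

print(inputs)               # 16777216   (~17 million)
print(outputs)              # 4294967296 (~4 billion)
print(outputs // inputs)    # 256, so the two spaces differ by exactly 8 bits
print(inputs / outputs)     # 0.00390625, about 0.39%
```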

Again: while you might want an even distribution, as the hash function is trying to do an even distribution over an unknown and unlimited string length, a bias such as this for 3 chars is expected.
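
To look for such collisions yourself, something like the Python sketch below might do. Greg's folding code is not shown in this thread, so this assumes the fold XORs consecutive 4-byte chunks of the digest together; `xor_fold`, `folded_md5` and `find_collisions` are hypothetical names, and a different fold (or a different string encoding) would give different results.

```python
import hashlib
from itertools import product


def xor_fold(digest: bytes, width: int = 4) -> bytes:
    """XOR consecutive width-byte chunks of digest into one width-byte value.

    Assumption: this is one plausible reading of "XOR fold", not the
    original poster's actual code.
    """
    if len(digest) % width != 0:
        raise ValueError("digest length must be a multiple of width")
    folded = bytearray(width)
    for start in range(0, len(digest), width):
        for i in range(width):
            folded[i] ^= digest[start + i]
    return bytes(folded)


def folded_md5(text: str) -> bytes:
    """MD5 of the ASCII-encoded text (16 bytes), folded down to 4 bytes."""
    return xor_fold(hashlib.md5(text.encode("ascii")).digest())


def find_collisions(alphabet: str, length: int = 3):
    """Return pairs of strings over alphabet whose folded MD5 values match."""
    seen = {}          # folded hash -> first string that produced it
    collisions = []
    for chars in product(alphabet, repeat=length):
        text = "".join(chars)
        key = folded_md5(text)
        if key in seen:
            collisions.append((seen[key], text))
        else:
            seen[key] = text
    return collisions
```

Calling `find_collisions` with letters and digits as the alphabet walks every 3 character alphanumeric string (62^3 = 238,328 of them). Swapping `hashlib.md5` for `hashlib.sha1` folds the 20 byte SHA1 digest the same way, since 20 is also a multiple of 4.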

--
Les Hughes
