Jeroen Demeyer <j.deme...@ugent.be> added the comment:

> Has anyone figured out the real source of the degeneration when mixing in 
> negative integers?

The underlying reason for the collisions is the following mathematical relation:

x ^ -x = -2 << i

where i is the index of the smallest set bit of x. In particular, x ^ -x = -2 
for odd x.

Now consider two consecutive hash iterations:

y = (x ^ a) * m1
z = (y ^ b) * m2

and suppose that x ^ a is odd. Now if we replace a by a ^ -2, then x ^ a will 
be replaced by -(x ^ a) and y will be replaced by -y. If we also replace b by b 
^ -2, then y ^ b will be replaced by y ^ b. In other words, we have a collision.

This kind of bad interaction between ^ and * only happens with the FNV-style 
hashing. The Bernstein hash using + instead of ^ does not suffer from this 
problem. That makes it a better choice in my opinion.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to