Tim Peters <[email protected]> added the comment:
@jdemeyer, please define exactly what you mean by "Bernstein hash". Bernstein
has authored many hashes, and none on his current hash page could possibly be
called "simple":
https://cr.yp.to/hash.html
If you're talking about the teensy
hash = hash * 33 + c;
thing from now-ancient C folklore, Bernstein is claimed to have replaced
addition with xor in that later:
http://www.cse.yorku.ca/~oz/hash.html
In any case, Python _used_ to do plain
x = (1000003 * x) ^ y;
but changed to the heart of the current scheme 14 years ago to address Source
Forge bug #942952:
https://github.com/python/cpython/commit/41bd02256f5a2348d2de3d6e5fdfcaeb2fcaaebc#diff-cde4fb3109c243b2c2badb10eeab7fcd
The Source Forge bug reports no longer appear to exist. If you care enough,
dig through the messages with subject:
[ python-Bugs-942952 ] Weakness in tuple hash
here:
https://mail.python.org/pipermail/python-bugs-list/2004-May/subject.html
The thrust was that the simpler approach caused, in real life, hashes of nested
tuples of the form
(X, (X, Y))
to return hash(Y) - the hash(X) parts across calls magically xor'ed out.
It was also systematically the case that
(X, (Y, Z))
and
(Y, (X, Z))
hashed to the same thing.
So the complications were added for a reason, to address the
showed-up-in-real-life pathological cases discussed in the bug report already
linked to. Since I don't believe we've had other reports of real-life
pathologies since, nobody is going to change this again without truly
compelling reasons. Which I haven't seen here, so I'm closing this report
again.
BTW, as also explained in that report, the goal of Python's hash functions is
to minimize collisions in dicts in real life. We generally couldn't care less
whether they "act random", or are hard to out-guess. For example,
assert all(hash(i) == i for i in range(1000000))
has always succeeded.
----------
resolution: -> rejected
status: open -> closed
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com