Tim Peters <[email protected]> added the comment:
And one more:
x = (x * mult) ^ t;
also appears to work equally well. So, way back when, it appears we _could_
have wormed around the disaster du jour just by applying a shift-xor
permutation to the raw hash results.
Note the implication: if we believed our tuple hash worked OK before then (&
we did), adding a shift-xor step would have changed nothing about it _except_
to permute the input space. That alone was enough to repair the nested tuple
problems.
Note that permutations are a bog standard way to improve algorithms with bad
behaviors in some cases. For example, at the end of the Mersenne Twister they
do 4 permutations to improve otherwise-marginal results from "the real"
pseudo-random generator:
y ^= (y >> 11);
y ^= (y << 7) & 0x9d2c5680U;
y ^= (y << 15) & 0xefc60000U;
y ^= (y >> 18);
return y;
At this point I see no "good reason" to prefer any of
x = (x * mult) ^ t; // like Bernstein 33X & FNV-1
x = (x * mult) + t; // like Bernstein 33A
x = (x ^ t) * mult; // like FNV-1a
to any other. In their _original_ contexts, they were applied to longish
sequences of unsigned bytes. But tuples used as keys and set elements are
typically much shorter than strings, so results about how they worked in their
original contexts are largely irrelevant for that reason too.
While lots of non-endcase testing is also needed, at this point I don't have
any real fear about any of them.
As a sanity check,
x = (x * mult) | t;
is disastrous for all the endcase tests. So I believe I really am compiling
and running the code ;-)
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com