Tim Peters <[email protected]> added the comment:
I should have spelled this out before: these are all permutations, so in
general permuting the result space of `x * mult + y` (or any other permutation
involving x and y) is exactly the same as not permuting it but applying a
different permutation to y instead.
Specifically, the sequence:
x = x * mult + y (mod 2**N)
x = P(x) # P is any permutation of N-bit integers
is the same as (and this isn't "deep" - it's trivial):
x = x * mult + Q(y, x) (mod 2**N)
where Q(y, x) = P(x * mult + y) - x * mult (mod 2**N)
Q is a "richer" class of permutation though, because it's a different
permutation for each fixed value of `x`, and uses multiplication to help
disperse bits.
While it would take a lot of work to quantify this, in my experiments the tuple
hash is significantly less touchy when permuting x after than when permuting y
before, presumably because Q is richer.
The two tuple hash tests have many inputs whose tuple component hashes have
very similar (even identical) bit patterns. There's only so much dirt-simple
permutations (like "y ^= y << 1") can do to break the patterns. So, e.g.,
change a shift count, change the multiplier, ..., and at least one of those two
tests fails depressingly often. Not spectacularly, but over the limit they
allow. Q(y, x) does a far better job of magnifying small differences.
Something I haven't tried: use a richer permutation of y that _doesn't_ depend
on x. For example, the frozenset hash has much the same underlying challenge,
and uses this permutation to "blow up" small input differences:
static Py_uhash_t
_shuffle_bits(Py_uhash_t h)
{
return ((h ^ 89869747UL) ^ (h << 16)) * 3644798167UL;
}
But that's a lot more expensive than a single shift-xor, and the speed of tuple
hashing is more important than of frozenset hashing.
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com