Tim Peters <t...@python.org> added the comment:

FYI, using this for the guts of the tuple hash works well on everything we've 
discussed.  In particular, no collisions in the current test_tuple hash test, 
and none either in the cases mixing negative and positive little ints.  This 
all remains so using the current multiplier, or "the standard" FNV-1a 
multipliers 16777619UL (32 bit) and 1099511628211UL (64 bit).  I've done no 
other tests:

while (--len >= 0) {
        y = PyObject_Hash(*p++);
        if (y == -1)
            return -1;
                Py_uhash_t t = (Py_uhash_t)y;
                t ^= t << 7;
                x = (x ^ t) * mult;
}

Note that the multiplier doesn't change anymore.  The twist is adding

                t ^= t << 7;

to _permute_ the native hash space (stuff like "t += t >> 16" is a many-to-one 
function, not a permutation).  Other than that, it's exactly FNV-1a.  I don't 
know that 7 is especially magical - it's just the first thing I tried.

In the absence of a real analysis, the intuition is simply that "t ^= t << 7" 
will clear masses of leading sign bits when hashing "small" negative integers.  
For whatever reason(s), they appear to be the cause of the troubles.

However, since that line of code permutes the space, exactly the same 
statistics can be provoked by some other inputs.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to