[issue34751] Hash collisions for tuples

Jeroen Demeyer Tue, 02 Oct 2018 05:38:50 -0700


Jeroen Demeyer <j.deme...@ugent.be> added the comment:


SeaHash seems to be designed for 64 bits. I'm guessing that replacing its 
shifts with

x ^= ((x >> 16) >> (x >> 29))

would be what you'd do for a 32-bit hash. Alternatively, we could always 
compute the hash with 64 bits (using uint64_t) and then truncate at the end if 
needed.
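
A minimal sketch of the "compute in 64 bits, then truncate" alternative. It 
assumes that the 64-bit SeaHash step uses the shift amounts 32 and 60 (not 
stated above) and that truncation keeps the low 32 bits; the names diffuse64 
and truncate_to_32 are made up for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF


def diffuse64(x):
    """Assumed 64-bit form of the shift step: x ^= (x >> 32) >> (x >> 60)."""
    x &= MASK64  # emulate uint64_t, since Python ints are unbounded
    return x ^ ((x >> 32) >> (x >> 60))


def truncate_to_32(x):
    """Keep the low 32 bits (assumption: which half to keep is not specified)."""
    return x & MASK32
```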

However, I tested the following hash function:

    for t in INPUT:
        x ^= hash(t)
        x *= MULTIPLIER
        x ^= ((x >> 16) >> (x >> 29))
        x *= MULTIPLIER

It fails horribly on both the original testsuite and my new one. My guess is 
that the line x ^= ((x >> 16) >> (x >> 29)) ignores the low-order bits of x, 
so the result is too close to pure FNV, which is known to have problems. When 
the first line of the loop above is replaced by x += hash(t) (DJB-style), it 
becomes too close to pure DJB and also fails horribly, because of nested 
tuples.

So it doesn't seem that the line x ^= ((x >> 16) >> (x >> 29)) (which is what 
makes SeaHash special) really helps much to solve the known problems with DJB 
or FNV.
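
For anyone who wants to experiment, here is a rough runnable sketch of the 
32-bit loop discussed above. MULTIPLIER and INITIAL are not given in this 
comment; the values below (the 32-bit FNV prime and an arbitrary start value) 
are placeholders, and masking with MASK32 emulates 32-bit unsigned arithmetic. 
The additive flag switches the first line of the loop to the DJB-style 
x += hash(t).

```python
MASK32 = 0xFFFFFFFF
MULTIPLIER = 0x01000193  # assumption: 32-bit FNV prime, not from the comment
INITIAL = 0x345678       # assumption: arbitrary starting value


def diffuse32(x):
    """The 32-bit shift step: x ^= (x >> 16) >> (x >> 29)."""
    x &= MASK32
    return x ^ ((x >> 16) >> (x >> 29))


def tuple_hash32(items, additive=False):
    """Hash a sequence with the loop from the comment, in 32-bit arithmetic."""
    x = INITIAL
    for t in items:
        h = hash(t) & MASK32
        # FNV-style XOR by default, DJB-style addition if additive=True
        x = (x + h) & MASK32 if additive else x ^ h
        x = (x * MULTIPLIER) & MASK32
        x = diffuse32(x)
        x = (x * MULTIPLIER) & MASK32
    return x
```

The value XORed in by diffuse32 depends only on the upper 16 bits of x: two 
inputs that differ only in their lower 16 bits get exactly the same 
correction, which is the "ignores low-order bits" behaviour mentioned above.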

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________