On Thu, Apr 28, 2016 at 4:26 PM, Thomas Gleixner <t...@linutronix.de> wrote: > > I'll try to dig up some time to analyze the hash_long failure unless someone > familiar with the problem is beating me to it.
I'm not sure you need to spend time analyzing failure: if you get bad hashing with hash_long(), then we know that is a bad hash without having to really try to figure out why. It's the hashes that _look_ like they might be good hashes, but there's not a lot of analysis behind it, that I would worry about. The simple prime modulus _should_ be fine, but at the same time I kind of suspect we can do better. Especially since it has two multiplications. Looking around, there's http://burtleburtle.net/bob/hash/integer.html and that 32-bit "full avalanche" hash in six shifts looks like it could be better. You wouldn't want to inline it, but the point of a full avalanche bit mixing _should_ be that you could avoid the whole "upper bits" part, and it should work independently of the target set size. So if that hash works better, it would be a pretty good replacement option for hash_int(). There is also https://gist.github.com/badboy/6267743 that has a 64 bit to 32 bit hash function that might be useful for "hash_long()". Most of the people who worry about hashes tend to have strings to hash, not just a single word like a pointer, but there's clearly people around who have tried to search for good hashes that really spread out the bits. Linus