Re: Review Request CR#7118743 : Alternative Hashing for String with Hash-based Maps

Ulf Zibis Wed, 23 May 2012 16:58:54 -0700

Hi,

What about making this approach a little bit more general?

See: Bug 6812862 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862> - provide customizable hash() algorithm in HashMap for speed tuning

     + all later comments.
Then you could additionally save:
    if ((0 != h) && (k instanceof String))

Looking at the codes of many charsets, the main variance seems to be in the lower 8 bits of a character, especially if the strings belong to the same language. So if we composed the initial 32-bit values from 4 chars instead of 2, the murmur3 algorithm could run almost twice as fast.
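To make the packing idea concrete, here is a minimal sketch of the two ways a 32-bit input block could be built. The class and method names (PackedBlocks, packTwo, packFour, blocks) are hypothetical and not part of the patch under review; the little-endian byte order is also just an assumption.

```java
// Sketch only: hypothetical helper, not JDK code.
final class PackedBlocks {
    // Two full 16-bit chars per 32-bit block (no information lost).
    static int packTwo(char a, char b) {
        return a | (b << 16);
    }

    // Low 8 bits of four chars per 32-bit block; the high bytes are dropped.
    static int packFour(char a, char b, char c, char d) {
        return (a & 0xff) | ((b & 0xff) << 8) | ((c & 0xff) << 16) | ((d & 0xff) << 24);
    }

    // Number of 32-bit blocks the mixing loop would process for a given string length.
    static int blocks(int length, int charsPerBlock) {
        return (length + charsPerBlock - 1) / charsPerBlock;
    }
}
```

An 8-char string needs 4 blocks with packTwo but only 2 with packFour, which is where the speedup would come from. The price is that chars differing only in their high byte (e.g. 'A' and '\u0141') produce the same block, so this would have to be weighed against collision behaviour.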

If you alter all hash maps in the JDK to use a new hash value, which noteworthy use cases remain for the legacy hashCode()? Do we really need 2 hash fields in String?

In Project Coin, we have set in stone the use of compile-time hashes for the Strings_in_switch extension. So it can never profit from the murmur3 optimization. IMO: what a pity!
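For readers who have not looked at how a String switch is compiled, the following sketch shows why the hash values are frozen. The class SwitchSketch is hypothetical, and desugared() is only a hand-written approximation of what javac emits, not its exact output.

```java
// Sketch only: hypothetical class illustrating the compile-time hash constants.
final class SwitchSketch {
    // What the programmer writes.
    static int viaSwitch(String s) {
        switch (s) {
            case "a": return 1;
            case "b": return 2;
            default:  return 0;
        }
    }

    // Approximately what ends up in the class file: the values of
    // "a".hashCode() (97) and "b".hashCode() (98) are baked in as constants.
    static int desugared(String s) {
        switch (s.hashCode()) {
            case 97: if (s.equals("a")) return 1; break;
            case 98: if (s.equals("b")) return 2; break;
        }
        return 0;
    }
}
```

Because the constants 97 and 98 live in already compiled class files, String.hashCode() has to keep returning them, whatever alternative hash the maps use internally.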

(Prominent people have said that it will never make sense to change String's
hash algorithm.)
See: http://markmail.org/message/ig3nzmfinfuvgbwz
     http://markmail.org/message/h3nlhhae5qlmf37a


On 23.05.2012 21:03, Mike Duigou wrote:

Also, this change

-        return h ^ (h>>>   7) ^ (h>>>   4);
+        h ^= (h>>>   7) ^ (h>>>   4);
+
+        return h;

will make the compiler generate an additional iload/istore pair.
While the JITted code will be the same, it may bother the inlining heuristic.

Wouldn't
    return (h ^= (h>>>  7) ^ (h>>>  4));
have the same effect?

Anyway, please add a comment for later readers.
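
For comparison, the three spellings discussed above can be put side by side. This is only a sketch with hypothetical names (HashVariants and its methods); it shows that they return the same value, and says nothing about which bytecode each one compiles to. That would have to be checked with javap -c on the actual build.

```java
// Sketch only: hypothetical class comparing the three variants of the final mixing step.
final class HashVariants {
    // Before the change: single expression, h is never reassigned.
    static int original(int h) {
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // After the change: compound assignment, then a separate return.
    static int patched(int h) {
        h ^= (h >>> 7) ^ (h >>> 4);

        return h;
    }

    // Proposed alternative: compound assignment used directly as the return value.
    static int assignInReturn(int h) {
        return (h ^= (h >>> 7) ^ (h >>> 4));
    }
}
```

All three agree because the right-hand side of ^= is evaluated with the old value of h before the assignment takes place.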

-Ulf
