Hi,
What about making this approach a little bit more general?
See: Bug 6812862 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862> -
provide customizable hash() algorithm in HashMap for speed tuning
+ all later comments.
Then you could additionally save:
if ((0 != h) && (k instanceof String))
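
To make the idea concrete, here is a minimal sketch of a table whose hash function is supplied by the caller, so no per-type check is needed inside the map. The names Hasher and PluggableHashTable are hypothetical and exist only for illustration; they are not part of the JDK or of the patch under review.

```java
// Hypothetical hook: the caller decides how keys are hashed.
interface Hasher<K> {
    int hash(K key);
}

// Illustrative only: just the bucket-index computation of a hash table.
final class PluggableHashTable<K> {
    private final Hasher<? super K> hasher;
    private final int mask;

    PluggableHashTable(int capacity, Hasher<? super K> hasher) {
        // The index mask below only works for power-of-two capacities.
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.hasher = hasher;
        this.mask = capacity - 1;
    }

    // No instanceof test here: the hashing strategy was chosen at construction.
    int indexFor(K key) {
        return hasher.hash(key) & mask;
    }
}
```

A table for String keys would then be created with a String-specific hasher, and a table for other keys with whatever fits them.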
Looking at the code points of many charsets, most of the variance seems to be in the lower 8 bits of a
character, especially if the strings belong to the same language. So if we composed the initial
32-bit values from the lower 8 bits of 4 chars, the murmur3 algorithm could run almost twice as fast.
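
A rough sketch of what I mean, assuming the standard murmur3 32-bit mixing constants and a little-endian packing order (both are my assumptions here, and the class name PackedCharHash is made up). Note the trade-off: chars that differ only in their upper 8 bits feed identical input to the hash.

```java
final class PackedCharHash {
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    // Packs the low 8 bits of four chars into one 32-bit block.
    static int pack4(char c0, char c1, char c2, char c3) {
        return (c0 & 0xff) | ((c1 & 0xff) << 8) | ((c2 & 0xff) << 16) | ((c3 & 0xff) << 24);
    }

    // murmur3-style per-block scramble.
    private static int scramble(int k) {
        k *= C1;
        k = Integer.rotateLeft(k, 15);
        return k * C2;
    }

    // murmur3-style 32-bit hash that consumes 4 chars per block.
    static int hash(int seed, CharSequence s) {
        int h = seed;
        int len = s.length();
        int i = 0;
        for (; i + 4 <= len; i += 4) {
            int k = pack4(s.charAt(i), s.charAt(i + 1), s.charAt(i + 2), s.charAt(i + 3));
            h ^= scramble(k);
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }
        // Tail: up to 3 remaining chars, low 8 bits each.
        int k = 0;
        int shift = 0;
        for (; i < len; i++, shift += 8) {
            k |= (s.charAt(i) & 0xff) << shift;
        }
        if (shift != 0) {
            h ^= scramble(k);
        }
        // Finalization (avalanche).
        h ^= len;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

With this packing, "A" (U+0041) and "\u0141" produce the same hash, so the scheme only pays off when the upper bits really carry little information.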
If you alter all hash maps in the JDK to use a new hash value, which noteworthy use cases remain for
the legacy hashCode()? Do we really need 2 hash fields in String?
In Project Coin, we have set in stone the use of compile-time hashes for the Strings_in_switch extension. So
it can never profit from the murmur3 optimization. IMO: what a pity!
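
For readers who have not looked at the generated code: the sketch below shows roughly how a switch on strings is lowered, with the String.hashCode() values baked in as constants at compile time. The second method is a hand-written approximation, not actual javac output, and the class name is invented for the example.

```java
final class StringSwitchSketch {
    // What the programmer writes.
    static int viaSwitch(String s) {
        switch (s) {
            case "foo": return 1;
            case "bar": return 2;
            default:    return 0;
        }
    }

    // Approximately what the compiler makes of it.
    static int desugared(String s) {
        switch (s.hashCode()) {
            case 101574: // "foo".hashCode(), fixed at compile time
                if (s.equals("foo")) return 1;
                break;
            case 97299:  // "bar".hashCode(), fixed at compile time
                if (s.equals("bar")) return 2;
                break;
        }
        return 0;
    }
}
```

Because those constants live in already compiled class files, the value returned by String.hashCode() cannot change without breaking them.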
(Prominent people have said, it will never make sense to change the String's
hash algorithm.)
See: http://markmail.org/message/ig3nzmfinfuvgbwz
http://markmail.org/message/h3nlhhae5qlmf37a
On 23.05.2012 21:03, Mike Duigou wrote:
Also, this change
- return h ^ (h>>> 7) ^ (h>>> 4);
+ h ^= (h>>> 7) ^ (h>>> 4);
+
+ return h;
will make the compiler generate an additional iload/istore pair.
While the JIT-compiled code will be the same, it may upset the inlining heuristic.
Wouldn't
return (h ^= (h>>> 7) ^ (h>>> 4));
have the same effect?
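
For comparison, here are the three forms side by side as standalone methods (the class and method names are mine, just for the example). They return the same value for every input; any difference would only show up in the bytecode, which can be inspected with javap -c.

```java
final class HashTailVariants {
    // Form before the change: a single return expression.
    static int original(int h) {
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Form in the patch: compound assignment, then a separate return.
    static int patched(int h) {
        h ^= (h >>> 7) ^ (h >>> 4);

        return h;
    }

    // Suggested compact form: compound assignment inside the return.
    static int compact(int h) {
        return (h ^= (h >>> 7) ^ (h >>> 4));
    }
}
```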
Anyway, please add a comment for later readers.
-Ulf