Did you run these tests with IBM's VME or DRLVM? IdendityHashMap's distribution of hash code is solely based on System.identityHashCode's results. We need to determine if the VM is doing the best thing first.
-Nathan On Fri, Apr 18, 2008 at 1:53 PM, Aleksey Shipilev < [EMAIL PROTECTED]> wrote: > Hi there, > > I had instrumented the current implementation of j.u.IdentityHashMap > and ran it through ThreadLocalBench and SerialBench. Surprisingly, the > collision rate there (frequency of sutiations when objects have > identical identityHashCode) is rather high - 8 collisions per one > get() on average with load factor of 0.75. > > That mean identityHashCode does not disperse elements well and the > map's performance is subject to clustering. Moreover, the collision > chains are sometimes so big that they're covering a half of storage > array and IHM looses access performance degrading to linear search. > And also the performance of essentially the same IHMs (in terms of > contents) may vary because of different clustering layouts. > > That's essentially the disadvantage of open-addressed hash tables and > there are known ways to overwhelm this problem [1]. But before > actually researching on this topic I need to answer some > "java-spec-legal" questions: > 1. Is it required for IdentityHashMap to use open addressing? > 2. Is it required for IdentityHashMap to use linear probing as > collision resolution scheme? > 3. What's the reason of having the separate implementation of > IdentityHashMap while it can be implemented with overriding of > "equality" operation in ordinary HashMap? > 4. Can identityHashCode be salted with some additional hash to break > clustering? > > I'm actually interested in performance optimization of IHM, so I will > appreciate anyone comments on this topic. > > Thanks, > Aleksey. > > [1] http://en.wikipedia.org/wiki/Hash_table >
