Hi,

I explored various strategies to minimize worst-case lookup performance for MethodType keys in LinearProbeHashtable. One idea comes from the "Hopscotch hashing" algorithm [1], which tries to optimize the placement of keys by moving them around at each insertion or deletion. While a concurrent Hopscotch hashtable is possible, it requires additional "metadata" about the buckets, which complicates it and makes it impractical to implement in Java until Java gets value types and arrays of them. The simplest idea until then is to optimize the placement of keys when the table is rehashed. Normally, when the table is rehashed, the old table is scanned and its entries are inserted into the new table. To achieve an effect similar to Hopscotch hashing, the order in which keys are taken from the old table and inserted into the new table is changed: keys are ordered by increasing bucket index, as it would be computed for each key in the new table. Inserting in this order minimizes the worst-case probe length. Doing this only when rehashing, rather than at every insertion or deletion, still guarantees that at least half of the keys are optimally placed.
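
To illustrate, here's a minimal sketch of such a rehash step (the table field and the insertNoResize() helper are just hypothetical names, not the webrev's code):

    // Sketch only: collect the surviving keys, sort them by their home bucket
    // in the *new* table, then reinsert them in that order.
    void rehash() {
        Object[] oldTable = table;
        int newLength = oldTable.length * 2;
        Object[] keys = java.util.Arrays.stream(oldTable)
                                        .filter(java.util.Objects::nonNull)
                                        .toArray();
        // home bucket of a key in the new table
        java.util.Comparator<Object> byNewHome = java.util.Comparator.comparingInt(
                k -> Math.floorMod(k.hashCode(), newLength));
        java.util.Arrays.sort(keys, byNewHome);
        table = new Object[newLength];
        for (Object k : keys) {
            insertNoResize(k);   // plain probe-and-place, no resize check
        }
    }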

Another strategy to minimize worst-case lookup performance is to use a quadratic probe sequence instead of a linear one. Normally, when looking up a key, slots in the table are probed in the following sequence (seq = 0, 1, 2, ...):

    index(seq) = (hashCode + seq) % tableLength

Quadratic probing uses the following probe sequence:

    index(seq) = (hashCode + seq * (seq + 1) / 2) % tableLength
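
Expressed in Java, the two sequences differ only in the offset added to the home index; a small sketch (using Math.floorMod just to keep the index non-negative):

    // Linear probing: offsets 0, 1, 2, 3, ...
    static int linearIndex(int hashCode, int seq, int tableLength) {
        return Math.floorMod(hashCode + seq, tableLength);
    }

    // Triangular ("quadratic") probing: offsets 0, 1, 3, 6, 10, ...
    static int quadraticIndex(int hashCode, int seq, int tableLength) {
        return Math.floorMod(hashCode + seq * (seq + 1) / 2, tableLength);
    }

A nice property of the triangular offsets is that they visit every slot when tableLength is a power of two, so probing never cycles through only a subset of the table.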

The two strategies can be combined. Here's a simulation of both applied to an open-addressing hashtable:

http://cr.openjdk.java.net/~plevart/misc/LinearProbeHashTable/lpht_MethodType_probe_sequence.txt

Using these strategies does not affect the average probe sequence length much (a length of 0 means the key was found at its home location, a length of 1 means one non-equal key was probed before finding the equal one, etc.), but the worst-case lookup length improves greatly. Combining both strategies minimizes the worst-case probe length.

Benchmarking with these strategies shows that the average lookup performance is consistently better than with CHM:

http://cr.openjdk.java.net/~plevart/misc/LinearProbeHashTable/lpht_MethodType_bench.pdf

The last trick to make this happen is stolen from CHM. The key is a WeakReference<MethodType> that caches the hashCode of the MethodType. By using the cached hashCode in the key's equals() implementation as a fast-path optimization, we achieve an effect similar to what CHM achieves by caching keys' hashCodes in its Entry objects.
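
Roughly, such a key could look like the sketch below (class and field names are mine, not the actual webrev code); the cached hashes are compared first, so most non-matching probes are rejected without even dereferencing the weak referent:

    import java.lang.invoke.MethodType;
    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.WeakReference;

    // Sketch only: a weak key that caches the referent's hash code and uses it
    // as a cheap pre-check in equals(), much like CHM caches hashes in its Entries.
    final class WeakMethodTypeKey extends WeakReference<MethodType> {
        final int hash;   // cached MethodType.hashCode()

        WeakMethodTypeKey(MethodType mt, ReferenceQueue<MethodType> queue) {
            super(mt, queue);
            this.hash = mt.hashCode();
        }

        @Override public int hashCode() {
            return hash;
        }

        @Override public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof WeakMethodTypeKey)) return false;
            WeakMethodTypeKey other = (WeakMethodTypeKey) obj;
            if (hash != other.hash) return false;       // fast path: different hashes
            MethodType mt = get(), omt = other.get();   // only now touch the referents
            return mt != null && mt.equals(omt);        // cleared references never match
        }
    }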

Here's the source of the above benchmark:

http://cr.openjdk.java.net/~plevart/misc/LinearProbeHashTable/lpht_MethodType_bench_src.tgz

3 variations of LinearProbeHashtable are compared with CHM:

    LinearProbeHashtable  - the plain one from webrev.04.4
    LinearProbeHashtable1 - sorts keys when rehashing to optimize their placement
    LinearProbeHashtable2 - combines key sorting with the quadratic probe sequence

I think LinearProbeHashtable2 could be used in MethodType interning without fear of degrading lookup performance.


Regards, Peter

[1] https://en.wikipedia.org/wiki/Hopscotch_hashing
