Hello, Windows 7 x64 on Core i7 Q720 (4C/8T/6MB L3, 1,6GHz) 8GB RAM
my numbers in the gist are from a notebook, but I plan to repeat my and your benachmark versions on a different machine tomorow (which has less memory pressure), I will report. I was testing single threaded, will see if this makes a difference. Gruss Bernd Am Tue, 13 Jan 2015 16:51:01 +0100 schrieb Peter Levart <peter.lev...@gmail.com>: > On 01/13/2015 12:36 PM, Doug Lea wrote: > > On 01/12/2015 05:12 PM, Peter Levart wrote: > >> Hi, > >> > >> I added results obtained with JDK 8 (FCS and u20) - same machine, > >> same VM > >> options, just different JDKs: > > > > Thanks very much. It remains a mystery why the original report > > by Bernd showed a 40% difference from jdk7. The approx 10% hit > > seen in these more careful measurements is about what we > > expected as the intended tradeoff for avoiding DOS attacks > > and better performance in more common cases. There are > > still some tweaky improvements we should keep exploring, > > but I don't think there's any need for fundamental changes. > > > > -Doug > > Perhaps it was because of different HW. I ran the tests on Intel® > Core™ i7-2600K CPU with 8 MB of Cache. I wonder what Bernd used. The > number of different keys is 250.000, they are accessed randomly in a > loop. Such HashMap consists of majority of TreeNode(s). TreeNode(s) > have 9 reference fields + int and boolean. I don't claim that such > HashMap fits intirely into Cache, but half of it perhaps does. Maybe > it is just that cache-hit ratio was larger on my HW than on Bernd's > and that Bernd's cache was thrashing more than mine. With JDK 7 > (normal Nodes have just 3 reference fields + int) such HashMap has > roughly half the memory footprint. > > Bernd, what did you run your benchmark on? > > > Peter > > > > >> > >> Original JDK 7 HashMap (and JVM): > >> > >> Benchmark (initialSize) Mode > >> Samples Score Score error Units > >> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >> 2839.458 157.299 ms > >> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >> 2673.924 187.063 ms > >> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >> 686.972 32.928 ms > >> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >> 631.001 6.574 ms > >> > >> Original JDK 8 HashMap (JDK 8 FCS JVM): > >> > >> Benchmark (initialSize) Mode > >> Samples Score Score error Units > >> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >> 3186.305 74.890 ms > >> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >> 2479.155 136.924 ms > >> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >> 673.819 13.236 ms > >> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >> 673.636 8.676 ms > >> > >> Original JDK 8 HashMap (JDK 8u20 JVM): > >> > >> Benchmark (initialSize) Mode > >> Samples Score Score error Units > >> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >> 3107.455 72.524 ms > >> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >> 2986.006 9.796 ms > >> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >> 631.295 7.281 ms > >> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >> 641.041 17.139 ms > >> > >> Original JDK 9 HashMap: > >> > >> Benchmark (initialSize) Mode > >> Samples Score Score error Units > >> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >> 3011.738 78.249 ms > >> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >> 2984.280 48.315 ms > >> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >> 682.060 52.341 ms > >> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >> 685.705 55.183 ms > >> > >> Original JDK 9 HashMap with TREEIFY_THRESHOLD = 1 << 20: > >> > >> Benchmark (initialSize) Mode > >> Samples Score Score error Units > >> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >> 2780.771 236.647 ms > >> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >> 2541.740 233.429 ms > >> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >> 757.364 67.869 ms > >> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >> 671.617 54.943 ms > >> > >> Caching of comparableClassFor (in ClassRepository - good for > >> heterogeneous keys > >> too): > >> > >> Benchmark (initialSize) Mode > >> Samples Score Score error Units > >> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >> 3014.888 71.778 ms > >> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >> 2279.757 54.159 ms > >> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >> 760.743 70.674 ms > >> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >> 725.188 67.853 ms > >> > >> Caching of comparableClassFor (internally - good for homogeneous > >> keys only): > >> > >> Benchmark (initialSize) Mode > >> Samples Score Score error Units > >> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >> 3026.707 84.571 ms > >> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >> 2137.296 66.140 ms > >> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >> 635.964 8.213 ms > >> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >> 685.129 46.783 ms > >> > >> > >> Peter > >> > >> On 01/12/2015 12:26 AM, Peter Levart wrote: > >>> > >>> On 01/11/2015 10:00 PM, Doug Lea wrote: > >>>> On 01/11/2015 02:26 PM, Peter Levart wrote: > >>>> > >>>>> Although majority of entries constitute the bins of size 13 or > >>>>> 14, there's only > >>>>> a single hashCode value per bin. > >>>>> > >>>>> So in this benchmark, treeifying with non-comparable keys is a > >>>>> waste of effort. > >>>> > >>>> On the other hand, the waste seems to only cost about 10% in > >>>> your runs. > >>>> I wonder why the original report using jdk7 vs jdk8 seemed > >>>> larger. > >>> > >>> I don't know. I ran the same benchmark with same VM options on > >>> JDK 7 too. Here > >>> are all results together: > >>> > >>> Original JDK 7 HashMap (and JVM): > >>> > >>> Benchmark (initialSize) Mode > >>> Samples Score Score error Units > >>> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >>> 2839.458 157.299 ms > >>> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >>> 2673.924 187.063 ms > >>> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >>> 686.972 32.928 ms > >>> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >>> 631.001 6.574 ms > >>> > >>> Original JDK 9 HashMap: > >>> > >>> Benchmark (initialSize) Mode > >>> Samples Score Score error Units > >>> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >>> 3011.738 78.249 ms > >>> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >>> 2984.280 48.315 ms > >>> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >>> 682.060 52.341 ms > >>> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >>> 685.705 55.183 ms > >>> > >>> Original JDK 9 HashMap with TREEIFY_THRESHOLD = 1 << 20: > >>> > >>> Benchmark (initialSize) Mode > >>> Samples Score Score error Units > >>> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >>> 2780.771 236.647 ms > >>> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >>> 2541.740 233.429 ms > >>> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >>> 757.364 67.869 ms > >>> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >>> 671.617 54.943 ms > >>> > >>> Caching of comparableClassFor (in ClassRepository - good for > >>> heterogeneous > >>> keys too): > >>> > >>> Benchmark (initialSize) Mode > >>> Samples Score Score error Units > >>> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >>> 3014.888 71.778 ms > >>> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >>> 2279.757 54.159 ms > >>> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >>> 760.743 70.674 ms > >>> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >>> 725.188 67.853 ms > >>> > >>> Caching of comparableClassFor (internally - good for homogeneous > >>> keys only): > >>> > >>> Benchmark (initialSize) Mode > >>> Samples Score Score error Units > >>> j.t.HashMapCollision.badDistNoComp 16 ss 60 > >>> 3026.707 84.571 ms > >>> j.t.HashMapCollision.badDistWithComp 16 ss 60 > >>> 2137.296 66.140 ms > >>> j.t.HashMapCollision.goodDistNoComp 16 ss 60 > >>> 635.964 8.213 ms > >>> j.t.HashMapCollision.goodDistWithComp 16 ss 60 > >>> 685.129 46.783 ms > >>> > >>>> > >>>>> > >>>>> Are there (non-forged) sets of non-comparable keys with > >>>>> hashCodes where > >>>>> treeifying makes sense? > >>>> > >>>> Try using a class like: > >>>> class FHC { float f; int hashCode() { return > >>>> Float.floatToIntBits(f); } } > >>>> and populate with instances with integral values for f. > >>>> > >>>> Similarly for doubles. > >>>> > >>>> Pre-jdk8, we devised a bit-smearing function that (among other > >>>> constraints) did OK for float/double keys with integral values, > >>>> that are not all that rare. With treeification, we don't need to > >>>> penalize classes with decent hashCodes by bit-smearing to still > >>>> get OK performance for these kinds of cases where the tree-based > >>>> hashCode comparison helps more than Comparability per se. > >>> > >>> I see. These keys actually have unique or near unique hashCodes > >>> but which are > >>> not good for power of two length tables without bit-smearing. > >>> With tree bins > >>> we don't need heavy bit-smearing to get decent performance in > >>> speed, but the > >>> table gets quite sparse anyway (although this is the smaller of > >>> the space > >>> overheads - tree nodes are bigger). For example, for 1M integral > >>> Floats, we > >>> get the following: > >>> > >>> >>> Float ... > >>> Capacity: 2097152 > >>> Load factor: 0.75 > >>> Size: 1000000 > >>> Bin sizes: 0*1966080 1*0 2*0 3*24288 4*41248 5*0 > >>> 6*0 7*0 8*0 > >>> 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000 > >>> Empty bins: 93.8 % > >>> Unique hash codes per bin: 0*1966080 1*0 2*0 3*24288 4*41248 5*0 > >>> 6*0 7*0 8*0 > >>> 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000 > >>> > >>> > >>>> > >>>> Also... > >>>> > >>>> It looks like the simplest path to a minor improvement is > >>>> just to cache internally (your fourth test below). But I now > >>>> recall not doing this because it adds to footprint and > >>>> the field could prevent class unloading, for only a small > >>>> benefit. > >>> > >>> Footprint, yes (one reference field in HM instance), while class > >>> unloading is > >>> taken care of using WeakReference: > >>> > >>> http://cr.openjdk.java.net/~plevart/jdk9-dev/HM.comparableClassFor/HomogeneousKeysCache/webrev.01/ > >>> > >>> > >>> > >>>> > >>>> (Every time HashMap has changed, there have been reports of > >>>> performance regressions even though typical performance > >>>> generally improves.) > >>>> > >>>> -Doug > >>> > >>> Regards, Peter > >>> > >>>> > >>>> > >>>>>> Original JDK9 HashMap: > >>>>>> > >>>>>> Benchmark (initialSize) Mode > >>>>>> Samples Score Score error Units > >>>>>> j.t.HashMapCollision.badDistNoComp 16 > >>>>>> ss 60 3011.738 78.249 ms > >>>>>> j.t.HashMapCollision.badDistWithComp 16 > >>>>>> ss 60 2984.280 48.315 ms > >>>>>> j.t.HashMapCollision.goodDistNoComp 16 > >>>>>> ss 60 682.060 52.341 ms > >>>>>> j.t.HashMapCollision.goodDistWithComp 16 > >>>>>> ss 60 685.705 55.183 ms > >>>>>> > >>>>>> Original JDK9 HashMap with TREEIFY_THRESHOLD = 1 << 20: > >>>>>> > >>>>>> Benchmark (initialSize) Mode > >>>>>> Samples Score Score error Units > >>>>>> j.t.HashMapCollision.badDistNoComp 16 > >>>>>> ss 60 2780.771 236.647 ms > >>>>>> j.t.HashMapCollision.badDistWithComp 16 > >>>>>> ss 60 2541.740 233.429 ms > >>>>>> j.t.HashMapCollision.goodDistNoComp 16 > >>>>>> ss 60 757.364 67.869 ms > >>>>>> j.t.HashMapCollision.goodDistWithComp 16 > >>>>>> ss 60 671.617 54.943 ms > >>>>>> > >>>>>> Caching of comparableClassFor (in ClassRepository - good for > >>>>>> heterogeneous > >>>>>> keys too): > >>>>>> > >>>>>> Benchmark (initialSize) Mode > >>>>>> Samples Score Score error Units > >>>>>> j.t.HashMapCollision.badDistNoComp 16 > >>>>>> ss 60 3014.888 71.778 ms > >>>>>> j.t.HashMapCollision.badDistWithComp 16 > >>>>>> ss 60 2279.757 54.159 ms > >>>>>> j.t.HashMapCollision.goodDistNoComp 16 > >>>>>> ss 60 760.743 70.674 ms > >>>>>> j.t.HashMapCollision.goodDistWithComp 16 > >>>>>> ss 60 725.188 67.853 ms > >>>>>> > >>>>>> Caching of comparableClassFor (internally - good for > >>>>>> homogeneous keys only): > >>>>>> > >>>>>> Benchmark (initialSize) Mode > >>>>>> Samples Score Score error Units > >>>>>> j.t.HashMapCollision.badDistNoComp 16 > >>>>>> ss 60 3026.707 84.571 ms > >>>>>> j.t.HashMapCollision.badDistWithComp 16 > >>>>>> ss 60 2137.296 66.140 ms > >>>>>> j.t.HashMapCollision.goodDistNoComp 16 > >>>>>> ss 60 635.964 8.213 ms > >>>>>> j.t.HashMapCollision.goodDistWithComp 16 > >>>>>> ss 60 685.129 46.783 ms > >>>>>> > >>>>>> > >>>> > >>> > >> > > >