Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Hi,

Took me some time to set up Maven/JMH and learn the basics. (Two tools in a day, phew, that's more than I usually do in a year! :)

JIT can sometimes optimize the code so aggressively [...]

I was trying to bench this aggressively optimized version of the code, with the idea that:
- If the JVM thinks this code is not to be optimized, it should not slow down the program too much, and having clean code should suffice.
- If it decides to optimize it, then the optimized version must be the fastest possible, so as not to slow down the whole program.

In particular, I once saw an optimization actually slow things down (a cast from double to int, that was intentionally duplicated, got factored into a common path, on the assumption that it was necessarily fast, but it wasn't... so I replaced one of the (int)x with -(int)-x and the awful perfs went away). But I agree I have no clue whether the optimization would be similar in another program.

I suggest you try to express your benchmark in a tried framework like JMH: http://openjdk.java.net/projects/code-tools/jmh/ In particular I suggest studying the included samples. Regards, Peter

Below is the JMH version of my neanderthalian bench :) followed by some results (with 1.7 bytecode, win7, core i7), then a bench similar to the one you posted (http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-July/027681.html), and results again.

import java.util.IdentityHashMap;
import java.util.Map;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@State(Scope.Thread)
public class JmhIHMBench {

    private static final int MAX_NBR_OF_MAPPINGS = 1*1000;

    final Map<Object, Object> map_original = new IdentityHashMap<Object, Object>();
    // etc.
    final Map<Object, Object> map_peter8noloop = new IdentityHashMapPeter8NoLoop<Object, Object>();

    Object[] keys;
    int i;

    private static Object[] newKeys(int size) {
        final Object[] keys = new Object[size];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = new Object();
        }
        return keys;
    }

    @Setup(Level.Iteration)
    public void prepare() {
        keys = newKeys(MAX_NBR_OF_MAPPINGS);
        i = 0;
    }

    @Benchmark
    public void bench_put_original(Blackhole bh) {
        bench_put(bh, this.map_original);
    }
    // etc.
    @Benchmark
    public void bench_put_peter8noloop(Blackhole bh) {
        bench_put(bh, this.map_peter8noloop);
    }

    private void bench_put(Blackhole bh, Map<Object, Object> map) {
        if (map.size() == MAX_NBR_OF_MAPPINGS) {
            map.clear();
            i = 0;
        }
        final Object kv = keys[i++];
        bh.consume(map.put(kv, kv));
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(".*" + JmhIHMBench.class.getSimpleName() + ".*")
                .warmupIterations(8)
                .measurementIterations(10)
                .threads(1)
                .forks(1)
                .shouldDoGC(false)
                .build();
        new Runner(opt).run();
    }
}

/*
 * jdk7u51 (run 1):
 * Benchmark                               Mode  Samples         Score  Score error  Units
 * r.JmhIHMBench.bench_put_original       thrpt       10  84087111,944  6033159,942  ops/s
 * r.JmhIHMBench.bench_put_peter7         thrpt       10  68893501,546  1375694,507  ops/s
 * r.JmhIHMBench.bench_put_peter7noloop   thrpt       10  79900659,706  6637125,685  ops/s
 * r.JmhIHMBench.bench_put_peter8         thrpt       10  79062559,130  1392057,456  ops/s
 * r.JmhIHMBench.bench_put_peter8noloop   thrpt       10  89184331,506  2380981,771  ops/s
 */

/*
 * jdk7u51 (run 2):
 * Benchmark                               Mode  Samples         Score  Score error  Units
 * r.JmhIHMBench.bench_put_original       thrpt       10  90430298,285  1749471,171  ops/s
 * r.JmhIHMBench.bench_put_peter7         thrpt       10  77895700,268  1118205,815  ops/s
 * r.JmhIHMBench.bench_put_peter7noloop   thrpt       10  87449460,422  2241899,330  ops/s
 * r.JmhIHMBench.bench_put_peter8         thrpt       10  79993955,944  1799749,550  ops/s
 * r.JmhIHMBench.bench_put_peter8noloop   thrpt       10  89818256,344   886977,083  ops/s
 */

/*
 * jdk8u20 (run 1):
 * Benchmark                               Mode  Samples          Score  Score error  Units
 * r.JmhIHMBench.bench_put_original       thrpt       10  108675548,411  7130886,988  ops/s
 * r.JmhIHMBench.bench_put_peter7         thrpt       10   94452557,383  4030609,636  ops/s
 * r.JmhIHMBench.bench_put_peter7noloop   thrpt       10   99070373,936  2204912,685  ops/s
 * r.JmhIHMBench.bench_put_peter8         thrpt       10  106865766,992  6248051,431  ops/s
 * r.JmhIHMBench.bench_put_peter8noloop   thrpt       10  114880028,516  1176180,414  ops/s
 */

/*
 * jdk8u20 (run 2):
 * Benchmark                               Mode  Samples          Score  Score error  Units
 * r.JmhIHMBench.bench_put_original       thrpt       10  117513077,228  2011080,121  ops/s
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Hi

Been watching this fascinating discussion - seeing Jeff's benchmark today, I was wondering if there isn't already at least one benchmark written with JMH? Wouldn't it make sense to make that part of the submission, as a standard practice in refactorings like this?

Regards, Patrick

On Mon, Jul 14, 2014 at 2:24 AM, Jeff Hain jeffh...@rocketmail.com wrote:

Can you post the benchmark source?

Before the source, here are the time ranges runs were stabilizing in for lucky executions (using 1.7 for compiler compliance level, and win7 / core i7):

[... quoted time-range table and IdentityHashMapPerf benchmark source snipped: identical to Jeff's message later in the thread ...]

-Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Hi Jeff,

Home-grown loops are not the best way of micro-benchmarking (I have done it myself and learned that lesson). JIT can sometimes optimize the code so aggressively that performance differences you obtain from such benchmarks just show that your concrete program can be optimized better, and not that an average program using some function you are trying to measure can run faster too.

I suggest you try to express your benchmark in a tried framework like JMH: http://openjdk.java.net/projects/code-tools/jmh/ In particular I suggest studying the included samples.

Regards, Peter

On 07/14/2014 02:24 AM, Jeff Hain wrote:

[... quoted time-range table and IdentityHashMapPerf benchmark source snipped: identical to Jeff's message later in the thread ...]

-Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 10:07 PM, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
(old webrev at http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/ )

The additional loop hurts in my benchmarks (10-15 percent slower). If I remove it, add a specific putAfterResize() method (without the item == k test), and replace the continue with return putAfterResize(k, value), it's much better.

NB: In one bench I clear the map each time it hits 1000 mappings, so that it always fits in some cache, which is about 4 times faster than going up to 1000*1000 mappings (clear() cost included), and makes variations in the cost of put's internals more obvious (hopefully).

-Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/13/2014 01:24 PM, Jeff Hain wrote:

[... quoted text snipped: identical to Jeff's message above ...]

Can you post the benchmark source?

Thanks, Peter
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/13/2014 12:41 AM, Peter Levart wrote:

On 07/12/2014 10:46 PM, Ivan Gerasimov wrote:
Peter, seems you need to fix capacity():

    int capacity = Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));

does not necessarily produce a negative number in the case of overflow.

Good catch. You're right. So here's a better variant:

    return Math.min(
        MAXIMUM_CAPACITY,
        Math.max(
            MINIMUM_CAPACITY,
            expectedMaxSize > Integer.MAX_VALUE / 3
                ? MAXIMUM_CAPACITY // 3 * expectedMaxSize would overflow
                : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))));

Incorporated in new webrev: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.08/

Also in this webrev: changing the condition by which putAll() decides to call resize() makes a boolean return from resize() unnecessary again, since it's only called when a resize is actually needed (from put() when 3*size reaches 2*capacity; from putAll() when the capacity calculated from the argument map's size is greater than the current capacity). The retry loop in put() can be simplified consequently.

Regards, Peter

Sincerely yours, Ivan

On 12.07.2014 22:12, Peter Levart wrote:

[... quoted webrev.06/webrev.07 announcement and (3*size >= length) rationale snipped: identical to Peter's message later in the thread ...]
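The "better variant" of capacity() above can be exercised standalone. A minimal sketch (not part of any webrev; the class name is mine), assuming the JDK's MINIMUM_CAPACITY of 4 and Peter's non-power-of-2 MAXIMUM_CAPACITY of 3 << 28:

```java
public class CapacitySketch {
    // Constants assumed from the discussion: the JDK's minimum capacity of 4,
    // and Peter's proposed non-power-of-2 maximum capacity 3 * 2^28.
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = (1 << 29) + (1 << 28);

    // The "better variant" of capacity() quoted in the message above.
    static int capacity(int expectedMaxSize) {
        return Math.min(
            MAXIMUM_CAPACITY,
            Math.max(
                MINIMUM_CAPACITY,
                expectedMaxSize > Integer.MAX_VALUE / 3
                    ? MAXIMUM_CAPACITY // 3 * expectedMaxSize would overflow
                    : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))));
    }

    public static void main(String[] args) {
        System.out.println(capacity(1));                 // clamped up to MINIMUM_CAPACITY = 4
        System.out.println(capacity((1 << 28) + 1));     // 2^29: only just past 50% of 3 * 2^28
        System.out.println(capacity(Integer.MAX_VALUE)); // clamped to MAXIMUM_CAPACITY, no overflow
    }
}
```

This reproduces the behavior Peter asks for later in the thread: for expectedMaxSize = (1 << 28) + 1 it returns 2^29 rather than jumping straight to MAXIMUM_CAPACITY.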
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Can you post the benchmark source?

Before the source, here are the time ranges runs were stabilizing in for lucky executions (using 1.7 for compiler compliance level, and win7 / core i7):

         | original | peter7   | peter7   | peter8   | peter8   |
         |          |          | no loop  |          | no loop  |
---------+----------+----------+----------+----------+----------+
jdk7u51  | 0.56-59s | 0.66-68s | 0.62-64s | 0.60-63s | 0.70-74s |
jdk8u20  | 0.54-58s | 0.64-66s | 0.58-60s | 0.58-61s | 0.73-76s |
jdk9     | 0.59-61s | 0.65-67s | 0.73-75s | 0.60-63s | 0.76-78s |

=> The no-loop version seems only better for jdk <= 8 and your webrev 07; for webrev 08, it seems actually (much) worse whatever the jdk.
=> jdk 8 looks faster than 7, but for some reason also faster than 9.

import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityHashMapPerf {

    private static final int MAX_NBR_OF_MAPPINGS = 1*1000;
    private static final int MAX_NBR_OF_CALLS = 100*1000*1000;
    private static final int NBR_OF_RUNS = 10;

    public static void main(String[] args) {
        System.out.println(IdentityHashMapPerf.class.getSimpleName() + "...");
        for (int k = 0; k < NBR_OF_RUNS; k++) {
            // Quick run for code discovery for k = 0.
            final int maxNbrOfCalls = (k == 0) ? MAX_NBR_OF_MAPPINGS : MAX_NBR_OF_CALLS;
            bench_put(new IdentityHashMap<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7NoLoop<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8NoLoop<Integer, Integer>(), maxNbrOfCalls);
        }
        System.out.println("..." + IdentityHashMapPerf.class.getSimpleName());
    }

    private static void bench_put(Map<Integer, Integer> map, int maxNbrOfCalls) {
        for (int k = 0; k < 2; k++) {
            final Integer[] keys = newKeys(MAX_NBR_OF_MAPPINGS);
            final int nbrOfClearAndPuts = maxNbrOfCalls / MAX_NBR_OF_MAPPINGS;
            long a = System.nanoTime();
            {
                for (int cap = 0; cap < nbrOfClearAndPuts; cap++) {
                    map.clear();
                    for (int i = 0; i < MAX_NBR_OF_MAPPINGS; i++) {
                        final Integer kv = keys[i];
                        map.put(kv, kv);
                    }
                    if (map.size() != MAX_NBR_OF_MAPPINGS) {
                        throw new AssertionError("anti optim");
                    }
                }
            }
            long b = System.nanoTime();
            System.out.println(nbrOfClearAndPuts + " * " + MAX_NBR_OF_MAPPINGS
                    + " " + map.getClass().getSimpleName() + ".put(,) took " + ((b-a)/1e9) + " s");
        }
    }

    private static Integer[] newKeys(int size) {
        final Integer[] keys = new Integer[size];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = i;
        }
        return keys;
    }
}

-Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/11/2014 09:08 PM, Doug Lea wrote:

[... quoted note on linear probing and identityHashCode bit width snipped: identical to Doug's message later in the thread ...]

-Doug

We have max. size = 2^29 - 1. This cannot change because of serialization backwards compatibility. It could be a little less, but not much. Experiment shows that it is possible to fill an IHM up to 99% with not too much CPU effort. If such an IHM is serialized, we must be able to deserialize it...

But why do we have to limit table length to 2^30? Because it has to be a power of 2 and 2^31 is already a negative number... What would happen if max. table length was 2^30 + 2^29? Here's what we get now:

// max. size = 2^29 - 1
// table length = 2^30
 10% max. size, probes:  0.1 avg.,      9 max.
 20% max. size, probes:  0.1 avg.,     15 max.
 30% max. size, probes:  0.2 avg.,     25 max.
 40% max. size, probes:  0.3 avg.,     38 max.
 50% max. size, probes:  0.5 avg.,     64 max.
 60% max. size, probes:  0.7 avg.,     93 max.
 65% max. size, probes:  0.9 avg.,    147 max.
 70% max. size, probes:  1.2 avg.,    224 max.
 75% max. size, probes:  1.5 avg.,    354 max.
 80% max. size, probes:  2.0 avg.,    441 max.
 85% max. size, probes:  2.8 avg.,    789 max.
 90% max. size, probes:  4.5 avg.,   1869 max.
 91% max. size, probes:  5.1 avg.,   2377 max.
 92% max. size, probes:  5.7 avg.,   3409 max.
 93% max. size, probes:  6.6 avg.,   3804 max.
 94% max. size, probes:  7.8 avg.,   5824 max.
 95% max. size, probes:  9.5 avg.,   7021 max.
 96% max. size, probes: 12.0 avg.,  12607 max.
 97% max. size, probes: 16.2 avg.,  17643 max.
 98% max. size, probes: 24.5 avg.,  34561 max.
 99% max. size, probes: 49.6 avg., 131371 max.
100% ... haven't waited long enough

If table length was 2^30 + 2^29, we would only fill it up to 66% like always and there would be no slow-down:

// max. size = 2^29 - 1
// table length = 2^30 + 2^29
 10% max. size, probes: 0.0 avg.,   7 max.
 20% max. size, probes: 0.1 avg.,  11 max.
 30% max. size, probes: 0.1 avg.,  13 max.
 40% max. size, probes: 0.2 avg.,  20 max.
 50% max. size, probes: 0.3 avg.,  28 max.
 60% max. size, probes: 0.4 avg.,  40 max.
 65% max. size, probes: 0.4 avg.,  42 max.
 70% max. size, probes: 0.5 avg.,  52 max.
 75% max. size, probes: 0.5 avg.,  65 max.
 80% max. size, probes: 0.6 avg.,  87 max.
 85% max. size, probes: 0.7 avg.,  87 max.
 90% max. size, probes: 0.8 avg., 128 max.
 91% max. size, probes: 0.8 avg., 128 max.
 92% max. size, probes: 0.8 avg., 128 max.
 93% max. size, probes: 0.8 avg., 129 max.
 94% max. size, probes: 0.9 avg., 129 max.
 95% max. size, probes: 0.9 avg., 131 max.
 96% max. size, probes: 0.9 avg., 150 max.
 97% max. size, probes: 0.9 avg., 150 max.
 98% max. size, probes: 1.0 avg., 159 max.
 99% max. size, probes: 1.0 avg., 159 max.
100% max. size, probes: 1.0 avg., 159 max.

If we're willing to pay the price of special-casing the non-power-of-2 MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts to approx. 4% of performance:

Original code:
Benchmark             Mode  Samples         Score  Score error  Units
j.t.IHMBench.ihm0Put thrpt       30  25217087.593    50117.867  ops/s
j.t.IHMBench.ihm1Get thrpt       30  43017677.457   230902.599  ops/s

Patched code:
Benchmark             Mode  Samples         Score  Score error  Units
j.t.IHMBench.ihm0Put thrpt       30  24213091.899   122980.408  ops/s
j.t.IHMBench.ihm1Get thrpt       30  40754537.027   936380.022  ops/s

Using JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

    static final Object[] keys = new Object[1000_000];
    static {
        for (int i = 0; i < keys.length; i++) {
            keys[i] = new Object();
        }
    }
    static final Object value = new Object();

    final Map<Object, Object> map = new IdentityHashMap<Object, Object>(keys.length);
    int i = 0;

    @Benchmark
    public void ihm0Put(Blackhole bh) {
        bh.consume(map.put(keys[i], value));
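The probe counts quoted above come from filling a real IHM; the shape of the degradation can be reproduced in miniature with a toy linear-probing table. A sketch under assumptions (scaled-down table, keys-only, probe step of 1 instead of IHM's 2, random ints standing in for identity hash codes, duplicate random keys simply inserted again), not the code that produced the numbers above:

```java
import java.util.Random;

public class ProbeStats {
    // Returns { avg probes at <= 50% load, avg probes at 80-90% load }.
    static double[] run() {
        final int slots = 1 << 20;          // scaled-down table (the thread discusses 2^30)
        final int[] table = new int[slots]; // 0 == empty slot
        final Random rnd = new Random(12345);
        long probesLow = 0, probesHigh = 0;
        int nLow = 0, nHigh = 0;
        final int fillTo = (int) (slots * 0.9);
        for (int n = 1; n <= fillTo; n++) {
            int key = 1 + rnd.nextInt(Integer.MAX_VALUE - 1); // nonzero stand-in for identityHashCode
            int i = (key * -127) & (slots - 1);               // same -127 multiplier as IHM's hash()
            int probes = 0;
            while (table[i] != 0) {                           // linear probing, step 1
                probes++;
                i = (i + 1) & (slots - 1);
            }
            table[i] = key;
            double load = n / (double) slots;
            if (load <= 0.5) { probesLow += probes; nLow++; }
            else if (load >= 0.8) { probesHigh += probes; nHigh++; }
        }
        return new double[] { probesLow / (double) nLow, probesHigh / (double) nHigh };
    }

    public static void main(String[] args) {
        double[] avg = run();
        System.out.printf("avg probes: %.2f at <=50%% load, %.2f at 80-90%% load%n", avg[0], avg[1]);
    }
}
```

The average probe count climbs steeply past ~80% load, which is exactly the regime the non-power-of-2 table length is meant to avoid.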
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/12/2014 05:47 PM, Peter Levart wrote:
If we're willing to pay the price of special-casing the non-power-of-2 MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts to approx. 4% of performance,

Then here's a possible solution: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.06/

Two fixes: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.07/

One is a fix for a possible overflow in resize() + rearrangement (which is specific to my proposal), and the other is a replacement of the condition that triggers resize in put() from (3*size > length) to (3*size >= length). The latter should be applied to Martin's latest version too, I think, if it is decided that my proposal is inadequate.

Why is (3*size >= length) a more appropriate condition to trigger resize? Because it is more aligned with the capacity(size) function, which is basically a clamped Integer.highestOneBit(3 * size). Map preallocation makes a table with length = 2 * capacity(expectedMaxSize). (3 * size < 2 * highestOneBit(3*size)) is true for any size, so IHM will never be resized if filled with size elements when the table was preallocated with length = 2 * highestOneBit(3*size), even if the condition for resizing is changed from (3*size > length) to (3*size >= length). The current condition sometimes resizes a little too late, when preallocation would already create a bigger table.

Now if we change that, my proposed webrev.07 does not need the MAXIMUM_SIZE constant any more. An attempt to insert the 2^29-th element (when size is about to become 2^29) triggers resize, since at that time length == 2 * MAXIMUM_CAPACITY, which is exactly 3 * 2^29, and this alone can be used as a trigger to throw OOME in resize()...

Regards, Peter
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/12/2014 08:12 PM, Peter Levart wrote:
If we're willing to pay the price of special-casing the non-power-of-2 MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts to approx. 4% of performance,

The cause of the performance drop is of course the conditional in:

 297     private static int hash(Object x, int length) {
 298         int h = System.identityHashCode(x);
 299         // multiply by -127
 300         h -= h << 7;
 301         if (length < MAXIMUM_CAPACITY * 2) { // length is 2^n
 302             // left-shift to use least bit as part of hash
 303             return (h << 1) & (length - 1);
 304         }
 305         // assert length == MAXIMUM_CAPACITY * 2
 306         // clamp
 307         h &= (MAXIMUM_CAPACITY / 3 * 4 - 1);
 308         // Multiply by 3/2 and reset 0th bit
 309         return (h + (h >> 1)) & ~1;
 310     }

If I change it to:

 297     private static int hash(Object x, int length) {
 298         int h = System.identityHashCode(x);
 299         // multiply by -127
 300         h -= h << 7;
 302         // left-shift to use least bit as part of hash
 303         return (h << 1) & (length - 1);
 310     }

...performance is restored, but then of course it only works until the table is resized to 2*MAX_CAPACITY. Does anybody have any idea how to make it faster?

Regards, Peter
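For the record, the clamped branch of hash() above can be checked standalone. A sketch (class and method names are mine; constants and the hash body mirror the snippet in the message) showing that the clamped index is always even and always inside a table of length 2 * MAXIMUM_CAPACITY = 3 * 2^29:

```java
import java.util.Random;

public class ClampedHashCheck {
    static final int MAXIMUM_CAPACITY = (1 << 29) + (1 << 28); // Peter's non-power-of-2 max, 3 * 2^28
    static final int TABLE_LENGTH = MAXIMUM_CAPACITY * 2;      // 3 * 2^29, still fits in an int

    // Mirrors the non-power-of-2 branch of hash() quoted above.
    static int clampedHash(int identityHash) {
        int h = identityHash;
        h -= h << 7;                         // multiply by -127
        h &= (MAXIMUM_CAPACITY / 3 * 4 - 1); // clamp to [0, 2^30)
        return (h + (h >> 1)) & ~1;          // multiply by 3/2, reset 0th bit
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int i = 0; i < 1_000_000; i++) {
            int idx = clampedHash(rnd.nextInt());
            // max possible value is (2^30 - 1) + (2^29 - 1) = 3*2^29 - 2, so no overflow
            if (idx < 0 || idx >= TABLE_LENGTH || (idx & 1) != 0) {
                throw new AssertionError("bad index " + idx);
            }
        }
        System.out.println("all indices even and in [0, " + TABLE_LENGTH + ")");
    }
}
```

The & ~1 keeps indices even because the table stores key/value pairs in adjacent slots.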
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/12/2014 08:12 PM, Peter Levart wrote:
(3 * size < 2 * highestOneBit(3*size)) is true for any size, so IHM will never be resized if filled with size elements when the table was preallocated with length = 2 * highestOneBit(3*size), even if the condition for resizing is changed from (3*size > length) to (3*size >= length). The current condition sometimes resizes a little too late, when preallocation would already create a bigger table.

My latest statement is of course false. If length is a power of two, then it can never be equal to 3*size. For power-of-two lengths, both conditions are equivalent. But for my purpose the >= is better...

Peter
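Peter's correction is easy to convince oneself of: for size > 0, 3*size = 2^k would require 3 to divide a power of two, which is impossible, so 3*size can never equal a power-of-two table length. A tiny standalone check (hypothetical, not part of any webrev):

```java
public class ThreeTimesSizeCheck {
    // True iff x is a positive power of two.
    static boolean isPowerOfTwo(int x) {
        return x > 0 && (x & (x - 1)) == 0;
    }

    public static void main(String[] args) {
        // 3*size has at least two set bits for every size > 0
        for (int size = 1; size <= 10_000_000; size++) {
            if (isPowerOfTwo(3 * size)) {
                throw new AssertionError("3*" + size + " is a power of two");
            }
        }
        System.out.println("no power-of-two value of 3*size found");
    }
}
```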
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Peter, seems you need to fix capacity():

    int capacity = Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));

does not necessarily produce a negative number in the case of overflow.

Why don't you want to use the variant from the latest Martin's webrev? It seems to work correctly with your proposal too.

Sincerely yours, Ivan

On 12.07.2014 22:12, Peter Levart wrote:

[... quoted webrev.06/webrev.07 announcement and (3*size >= length) rationale snipped: identical to Peter's message above ...]
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/12/2014 10:46 PM, Ivan Gerasimov wrote:
Peter, seems you need to fix capacity():
    int capacity = Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));
does not necessarily produce a negative number in the case of overflow.

Good catch. You're right.

Why don't you want to use the variant from the latest Martin's webrev? It seems to work correctly with your proposal too.

Not quite. Martin's version is:

    return (expectedMaxSize > MAXIMUM_CAPACITY / 3) ? MAXIMUM_CAPACITY :
           (expectedMaxSize <= 2 * MINIMUM_CAPACITY / 3) ? MINIMUM_CAPACITY :
           Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));

My MAXIMUM_CAPACITY is not a power of two, but 3 << 28. If, for example, expectedMaxSize = (1 << 28) + 1, then Martin's capacity() would return MAXIMUM_CAPACITY, but I would like it to return 2^29, since expectedMaxSize is only just past 50% of such a capacity. So here's a better variant:

    return Math.min(
        MAXIMUM_CAPACITY,
        Math.max(
            MINIMUM_CAPACITY,
            expectedMaxSize > Integer.MAX_VALUE / 3
                ? MAXIMUM_CAPACITY // 3 * expectedMaxSize would overflow
                : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))));

I'll put this and another simplification into a new webrev tomorrow.

Regards, Peter

Sincerely yours, Ivan

On 12.07.2014 22:12, Peter Levart wrote:

[... quoted webrev.06/webrev.07 announcement and (3*size >= length) rationale snipped: identical to Peter's message above ...]
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
I've been content to just observe Martin and Peter's nice efforts on this, but one note:

On 07/11/2014 03:00 PM, Martin Buchholz wrote:
On Wed, Jul 9, 2014 at 3:17 PM, Peter Levart peter.lev...@gmail.com wrote:
IHM resizing is arranged so that the table is always 33% ... 66% full (if nothing is ever removed from it) except when capacity reaches 2**29; then it can be filled up to the top. avg. # of probes can be taken as a rough estimation of average slow-down, max. # of probes can be taken as a rough estimation of worst-case-operation slow-down.

So where to draw the line?

Linear probing has long been known to be prone to long worst-case probes, but with recent computer architectures this is offset by its extreme cache friendliness. Bear in mind that the number of bits of identityHashCode is less than 32 on all JVMs I know. It can be as low as 23, which means that you are sure to see a lot of exact collisions in IHMs with only tens of millions of elements, and there's nothing much you can do that will help.

-Doug
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 11:30 PM, Martin Buchholz wrote: Benchmarks welcome. I have run your latest webrev with the following benchmark:

@State(Scope.Thread)
public class IHMBench {

    Map<Object, Object> map = new IdentityHashMap<Object, Object>();

    @Benchmark
    public void putNewObject(Blackhole bh) {
        Object o = new Object();
        bh.consume(map.put(o, o));
        if (map.size() > 10) {
            map = new IdentityHashMap<Object, Object>();
        }
    }
}

Using: java -Xmx4G -Xms4G -jar benchmarks.jar -f 0 -i 30 -wi 10 -t 1 -gc 1 ...and results are:

Original:
Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      30  13305370.384     80122.977  ops/s

Patched:
Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      30  13364374.454    124491.206  ops/s

Seems performance is the same. Regards, Peter
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/09/2014 02:46 AM, Martin Buchholz wrote: Let me understand - you're worried that when size is MAX_CAPACITY - 1, then a call to putAll that does not actually add any elements might throw? This is not possible, because resize() is only called from putAll(map) if argument map.size() > this.size. At least one element will be added to the map, and it's correct to throw if current size == MAX_CAPACITY - 1. Well, I'm having a hard time considering that corner case a bug, given how unusable the map is at this point. It seems even this corner case is not present. Your suggested fix of returning immediately in case of no resize would cause put to succeed and reach size == MAX_CAPACITY, which you were trying to prevent. That's not possible either, since resize() is always called from put() with current table.length, which makes newLength == 2 * oldLength, therefore (oldLength >= newLength) will never succeed in this case. Peter On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: On 09.07.2014 3:20, Martin Buchholz wrote: I agree with Peter that we do not need to increment modCount on resize. My latest webrev is again done. Ivan, feel free to take over. Yes, thanks! The fix is much much better now. I think I see yet another very minor glitch, though. If the table is already of size 2 * MAXIMUM_CAPACITY, and we're merging in another map, which has more elements, with putAll(), resize() will be called and it will throw, even though all the elements could fit into the map due to the key duplicates. To avoid this, the following check should be done first in resize():

469         if (oldLength >= newLength)
470             return false;

Sincerely yours, Ivan
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/09/2014 09:23 AM, Peter Levart wrote: On 07/09/2014 02:46 AM, Martin Buchholz wrote: Let me understand - you're worried that when size is MAX_CAPACITY - 1, then a call to putAll that does not actually add any elements might throw? This is not possible, because resize() is only called from putAll(map) if argument map.size() > this.size. At least one element will be added to the map, and it's correct to throw if current size == MAX_CAPACITY - 1. ...at least if the argument map obeys the rules of the Map contract. Even if it's a HashMap or another IdentityHashMap, it should not contain the same INSTANCE of the key in two or more mappings, should not overshoot reporting its size() and should be stable during the whole putAll() operation... So calling IHM.addAll() with a live changing ConcurrentHashMap is not advisable. Even then, addAll() could only throw, because at some point the argument's size indicated that IHM could reach its max. capacity. Peter Well, I'm having a hard time considering that corner case a bug, given how unusable the map is at this point. It seems even this corner case is not present. Your suggested fix of returning immediately in case of no resize would cause put to succeed and reach size == MAX_CAPACITY, which you were trying to prevent. That's not possible either, since resize() is always called from put() with current table.length, which makes newLength == 2 * oldLength, therefore (oldLength >= newLength) will never succeed in this case. Peter On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: On 09.07.2014 3:20, Martin Buchholz wrote: I agree with Peter that we do not need to increment modCount on resize. My latest webrev is again done. Ivan, feel free to take over. Yes, thanks! The fix is much much better now. I think I see yet another very minor glitch, though. If the table is already of size 2 * MAXIMUM_CAPACITY, and we're merging in another map, which has more elements, with putAll(), resize() will be called and it will throw, even though all the elements could fit into the map due to the key duplicates. To avoid this, the following check should be done first in resize():

469         if (oldLength >= newLength)
470             return false;

Sincerely yours, Ivan
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
OK, I think we're down to the smallest possible bug: if size == MAX_CAPACITY - 1 and we putAll a concurrent map with greater size, but the concurrent map shrinks while we're adding it, and no actual elements get added to the IHM (but each element put takes ~ 2**28 probes), then an ISE is thrown when it was not strictly necessary. Meanwhile, I'm actually wishing we could throw at something like 80% full... Relatedly: It occurs to me that it's traditional in java.util to throw OOME instead of ISE for capacity exceeded. On Wed, Jul 9, 2014 at 12:33 AM, Peter Levart peter.lev...@gmail.com wrote: On 07/09/2014 09:23 AM, Peter Levart wrote: On 07/09/2014 02:46 AM, Martin Buchholz wrote: Let me understand - you're worried that when size is MAX_CAPACITY - 1, then a call to putAll that does not actually add any elements might throw? This is not possible, because resize() is only called from putAll(map) if argument map.size() > this.size. At least one element will be added to the map, and it's correct to throw if current size == MAX_CAPACITY - 1. ...at least if the argument map obeys the rules of the Map contract. Even if it's a HashMap or another IdentityHashMap, it should not contain the same INSTANCE of the key in two or more mappings, should not overshoot reporting its size() and should be stable during the whole putAll() operation... So calling IHM.addAll() with a live changing ConcurrentHashMap is not advisable. Even then, addAll() could only throw, because at some point the argument's size indicated that IHM could reach its max. capacity. Peter Well, I'm having a hard time considering that corner case a bug, given how unusable the map is at this point. It seems even this corner case is not present. Your suggested fix of returning immediately in case of no resize would cause put to succeed and reach size == MAX_CAPACITY, which you were trying to prevent. That's not possible either, since resize() is always called from put() with current table.length, which makes newLength == 2 * oldLength, therefore (oldLength >= newLength) will never succeed in this case. Peter On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: On 09.07.2014 3:20, Martin Buchholz wrote: I agree with Peter that we do not need to increment modCount on resize. My latest webrev is again done. Ivan, feel free to take over. Yes, thanks! The fix is much much better now. I think I see yet another very minor glitch, though. If the table is already of size 2 * MAXIMUM_CAPACITY, and we're merging in another map, which has more elements, with putAll(), resize() will be called and it will throw, even though all the elements could fit into the map due to the key duplicates. To avoid this, the following check should be done first in resize():

469         if (oldLength >= newLength)
470             return false;

Sincerely yours, Ivan
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/09/2014 06:36 PM, Martin Buchholz wrote: OK, I think we're down to the smallest possible bug: I wouldn't call it a bug, since it's not present. if size == MAX_CAPACITY - 1 and we putAll a concurrent map with greater size, but the concurrent map shrinks while we're adding it, and no actual elements get added to the IHM (but each element put takes ~ 2**28 probes), then an ISE is thrown when it was not strictly necessary. We won't be adding any elements if resize() throws capacity exceeded, so this is just hypothetical thinking about what would happen if we didn't call resize() and just started adding elements from the argument map. We might hit the ceiling or we might not. Meanwhile, I'm actually wishing we could throw at something like 80% full... I created a little simulation:

public class IMHSimulation {

    static final int MAXIMUM_CAPACITY = 1 << 29;
    static final int MASK = MAXIMUM_CAPACITY - 1;

    static int hash(Object x) {
        int h = System.identityHashCode(x);
        // Multiply by -127 to use least bit as part of hash
        return (h - (h << 7)) & MASK;
    }

    static int add(BitSet bs, Object o) {
        int i = hash(o);
        int probes = 0;
        while (bs.get(i)) {
            i = (i + 1) & MASK;
            probes++;
        }
        bs.set(i);
        return probes;
    }

    public static void main(String[] args) {
        BitSet bs = new BitSet(MAXIMUM_CAPACITY);
        long totalProbes = 0;
        int maxProbes = 0;
        int percent = 10;
        int nextReportAt = (int) ((long) percent * MAXIMUM_CAPACITY / 100);
        for (int i = 0; i < MAXIMUM_CAPACITY; i++) {
            int probes = add(bs, new Object());
            totalProbes += probes;
            maxProbes = Math.max(maxProbes, probes);
            if (i == nextReportAt) {
                System.out.printf(
                    "%3d%% full, avg. %4.1f, max. %6d probes\n",
                    percent, (double) totalProbes / i, maxProbes
                );
                if (percent < 60) percent += 10;
                else if (percent < 90) percent += 5;
                else percent++;
                nextReportAt = (int) ((long) percent * MAXIMUM_CAPACITY / 100);
            }
        }
    }
}

Which produces the following:

 10% full, avg.  0.1, max.      9 probes
 20% full, avg.  0.1, max.     15 probes
 30% full, avg.  0.2, max.     25 probes
 40% full, avg.  0.3, max.     38 probes
 50% full, avg.  0.5, max.     64 probes
 60% full, avg.  0.7, max.     93 probes
 65% full, avg.  0.9, max.    147 probes
 70% full, avg.  1.2, max.    224 probes
 75% full, avg.  1.5, max.    354 probes
 80% full, avg.  2.0, max.    441 probes
 85% full, avg.  2.8, max.    789 probes
 90% full, avg.  4.5, max.   1869 probes
 91% full, avg.  5.1, max.   2377 probes
 92% full, avg.  5.7, max.   3409 probes
 93% full, avg.  6.6, max.   3804 probes
 94% full, avg.  7.8, max.   5824 probes
 95% full, avg.  9.5, max.   7021 probes
 96% full, avg. 12.0, max.  12607 probes
 97% full, avg. 16.2, max.  17643 probes
 98% full, avg. 24.5, max.  34561 probes
 99% full, avg. 49.6, max. 131371 probes

IHM resizing is arranged so that the table is always 33% ... 66% full (if nothing is ever removed from it) except when capacity reaches 2**29, then it can be filled up to the top. avg. # of probes can be taken as a rough estimation of average slow-down, max. # of probes can be taken as a rough estimation of worst-case-operation slow-down. So where to draw the line? Regards, Peter Relatedly: It occurs to me that it's traditional in java.util to throw OOME instead of ISE for capacity exceeded. On Wed, Jul 9, 2014 at 12:33 AM, Peter Levart peter.lev...@gmail.com wrote: On 07/09/2014 09:23 AM, Peter Levart wrote: On 07/09/2014 02:46 AM, Martin Buchholz wrote: Let me understand - you're worried that when size is MAX_CAPACITY - 1, then a call to putAll that does not actually add any elements might throw? This is not possible, because resize() is only called from putAll(map) if argument map.size() > this.size. At least one element will be added to the map, and it's correct to throw if current size == MAX_CAPACITY - 1. ...at least if the argument map obeys the rules of the Map contract. Even if it's a HashMap or another IdentityHashMap, it should not contain the same INSTANCE of the key in two or more mappings, should not overshoot reporting its size() and should be stable during the whole putAll() operation... So calling IHM.addAll() with a live changing ConcurrentHashMap is not advisable. Even then, addAll() could only throw, because at some point the argument's size indicated that IHM could reach its max. capacity. Peter Well, I'm having a hard time considering that corner case a bug, given how unusable the map is at
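The measured averages in the simulation above track the classical analysis of linear probing with uniform hashes: the average insertion cost while filling to load factor a is about (1/2)(1 + 1/(1-a)) probes, i.e. a/(2(1-a)) probes beyond the first. A quick cross-check (my own addition, not from the thread; it assumes the simulation's avg column counts probes beyond the first):

```java
public class ProbeEstimate {

    // Expected probes beyond the first for the average insertion cost in
    // linear probing at load factor a: a / (2 * (1 - a)).
    static double extraProbes(double a) {
        return a / (2 * (1 - a));
    }

    public static void main(String[] args) {
        // Same fill levels as the simulation: predictions are 0.5, 1.5, 2.0,
        // 4.5 and 49.5 vs the measured 0.5, 1.5, 2.0, 4.5 and 49.6.
        for (double a : new double[] { 0.50, 0.75, 0.80, 0.90, 0.99 }) {
            System.out.printf("%3.0f%% full, predicted avg. %4.1f extra probes%n",
                              a * 100, extraProbes(a));
        }
    }
}
```

The close agreement suggests the simulation behaves like an ideal uniform-hash table up to very high load factors, where the cost curve turns sharply upward.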
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 09:33 AM, Martin Buchholz wrote: I've updated the webrev http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ It now has all my TODOs done. The test case has been testng-ified. Hi Martin, What happened to the desire that when OOME is thrown on resizing, IHM is left unchanged? Regards, Peter On Mon, Jul 7, 2014 at 6:54 PM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Unfortunately, x + (x << 1) causes the same overflow bug as 3*x: x + (x << 1) is merely supposed to be possibly more efficient than 3*x. (int)(1431655766 + (1431655766 << 1)) == 2 OK, I think my latest version doesn't have any overflow bugs.
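Ivan's overflow example can be verified directly. A minimal check (the constant is from the message above; both forms wrap identically because two's-complement addition and shifts are arithmetic mod 2^32):

```java
public class OverflowDemo {
    public static void main(String[] args) {
        int x = 1431655766;  // just above Integer.MAX_VALUE / 3
        // 3 * 1431655766 = 4294967298 = 2^32 + 2, which wraps to 2 in 32-bit
        // two's-complement arithmetic; the shift-add form wraps the same way.
        System.out.println(3 * x);         // 2
        System.out.println(x + (x << 1));  // 2
    }
}
```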
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 12:12 PM, Peter Levart wrote: On 07/08/2014 09:33 AM, Martin Buchholz wrote: I've updated the webrev http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ It now has all my TODOs done. The test case has been testng-ified. Hi Martin, What happened to the desire that when OOME is thrown on resizing, IHM is left unchanged? Regards, Peter Hi Martin, I took your latest version of the patch and modified it a little: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/ Here's a diff to your version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java    2014-07-08 12:34:25.739767669 +0200
***************
*** 437,449 ****
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
!         modCount++;
!         tab[i] = k;
!         tab[i + 1] = value;
!         size++;
!         int x = size + (size << 1);  // Optimized form of 3 * size
!         if (x > len)
!             resize(len);             // len == 2 * current capacity.
          return null;
      }
--- 437,457 ----
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
!
!         int x = size + (size << 1) + 3;  // Optimized form of 3 * (size + 1)
!         if (x > len) {
!             if (resize(len)) {           // len == 2 * current capacity.
!                 tab = table;
!                 len = tab.length;
!                 i = hash(key, len);
!                 while (tab[i] != null)
!                     i = nextKeyIndex(i, len);
!             }
!             modCount++;
!             tab[i] = k;
!             tab[i + 1] = value;
!             size++;
!         }
          return null;
      }
***************
*** 452,468 ****
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private void resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return;
          }
          if (oldLength >= newLength)
!             return;
          Object[] newTable = new Object[newLength];
--- 460,476 ----
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private boolean resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return false;
          }
          if (oldLength >= newLength)
!             return false;
          Object[] newTable = new Object[newLength];
***************
*** 480,485 ****
--- 488,494 ----
              }
          }
          table = newTable;
+         return true;
      }

      /**

Regards, Peter On Mon, Jul 7, 2014 at 6:54 PM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Unfortunately, x + (x << 1) causes the same overflow bug as 3*x: x + (x << 1) is merely supposed to be possibly more efficient than 3*x. (int)(1431655766 + (1431655766 << 1)) == 2 OK, I think my latest version doesn't have any overflow bugs.
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Hi again, Here's a version with (3*size > len) replaced with (size >= len/3), as suggested by Ivan Gerasimov, to avoid overflow, and also a fixed if block in put() that enclosed too much code (in my changed version of Martin's latest webrev): http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.02/ I don't think there's a danger of overflow in capacity(), since (expectedMaxSize + (expectedMaxSize << 1)) is evaluated only if expectedMaxSize <= MAXIMUM_CAPACITY / 3. Here's a diff to latest Martin's version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java    2014-07-08 13:05:46.351810826 +0200
***************
*** 437,449 ****
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
-         int x = size + (size << 1);  // Optimized form of 3 * size
-         if (x > len)
-             resize(len);             // len == 2 * current capacity.
          return null;
      }
--- 437,454 ----
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
+
+         if (size >= len / 3 && resize(len)) {  // len == 2 * current capacity.
+             tab = table;
+             len = tab.length;
+             i = hash(key, len);
+             while (tab[i] != null)
+                 i = nextKeyIndex(i, len);
+         }
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
          return null;
      }
***************
*** 452,468 ****
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private void resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return;
          }
          if (oldLength >= newLength)
!             return;
          Object[] newTable = new Object[newLength];
--- 457,473 ----
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private boolean resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return false;
          }
          if (oldLength >= newLength)
!             return false;
          Object[] newTable = new Object[newLength];
***************
*** 480,485 ****
--- 485,491 ----
              }
          }
          table = newTable;
+         return true;
      }

      /**
***************
*** 494,500 ****
      int n = m.size();
      if (n == 0)
          return;
!     if (n + (n << 1) > table.length) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 500,506 ----
      int n = m.size();
      if (n == 0)
          return;
!     if (n > table.length / 3) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())

Regards, Peter On 07/08/2014 12:41 PM, Peter Levart wrote: On 07/08/2014 12:12 PM, Peter Levart wrote: On 07/08/2014 09:33 AM, Martin Buchholz wrote: I've updated the webrev http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ It now has all my TODOs done. The test case has been testng-ified. Hi Martin, What happened to the desire that when OOME is thrown on resizing, IHM is left unchanged? Regards, Peter Hi Martin, I took your latest version of the patch and modified it a little: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/ Here's a diff to your version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java    2014-07-08 12:34:25.739767669 +0200
***************
*** 437,449 ****
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
!         modCount++;
!         tab[i] = k;
!         tab[i + 1] = value;
!         size++;
!         int x = size + (size << 1);  // Optimized form of 3 * size
!         if (x > len)
!             resize(len);             // len == 2 * current capacity.
          return null;
      }
--- 437,457 ----
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
!
!         int x = size + (size << 1) + 3;  // Optimized form of 3 * (size + 1)
!         if (x > len) {
!
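A side note on the substitution itself (my own cross-check, not a claim from the thread): because the table length is always a power of two and therefore never divisible by 3, the overflow-free form (size >= len / 3) with integer division is almost equivalent to the exact (3*size > len); it merely fires one element earlier, precisely when size == len / 3:

```java
public class ResizeTriggerCheck {
    public static void main(String[] args) {
        // Compare the overflow-free trigger (size >= len/3) with the exact
        // one (3*size > len) for power-of-two table lengths.
        for (int len = 8; len <= 1 << 20; len <<= 1) {
            for (int size = 0; size <= len; size++) {
                boolean divForm = size >= len / 3;
                boolean mulForm = 3L * size > len;  // long: no overflow
                if (divForm != mulForm && size != len / 3)
                    throw new AssertionError("len=" + len + " size=" + size);
            }
        }
        System.out.println("forms differ only at size == len/3");
    }
}
```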
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 08.07.2014 15:25, Peter Levart wrote: Hi again, Here's a version with (3*size > len) replaced with (size >= len/3), as suggested by Ivan Gerasimov, to avoid overflow, and also a fixed if block in put() that enclosed too much code (in my changed version of Martin's latest webrev): It shouldn't be needed, since size < MAXIMUM_CAPACITY - 1 at this point. http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.02/ I don't think there's a danger of overflow in capacity(), since (expectedMaxSize + (expectedMaxSize << 1)) is evaluated only if expectedMaxSize <= MAXIMUM_CAPACITY / 3. Here's a diff to latest Martin's version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java    2014-07-08 13:05:46.351810826 +0200
***************
*** 437,449 ****
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
-         int x = size + (size << 1);  // Optimized form of 3 * size
-         if (x > len)
-             resize(len);             // len == 2 * current capacity.
          return null;
      }
--- 437,454 ----
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
+
+         if (size >= len / 3 && resize(len)) {  // len == 2 * current capacity.
+             tab = table;
+             len = tab.length;
+             i = hash(key, len);
+             while (tab[i] != null)
+                 i = nextKeyIndex(i, len);
+         }
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
          return null;
      }
***************
*** 452,468 ****
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private void resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return;
          }
          if (oldLength >= newLength)
!             return;
          Object[] newTable = new Object[newLength];
--- 457,473 ----
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private boolean resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return false;
          }
          if (oldLength >= newLength)
!             return false;
          Object[] newTable = new Object[newLength];
***************
*** 480,485 ****
--- 485,491 ----
              }
          }
          table = newTable;
+         return true;
      }

      /**
***************
*** 494,500 ****
      int n = m.size();
      if (n == 0)
          return;
!     if (n + (n << 1) > table.length) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 500,506 ----
      int n = m.size();
      if (n == 0)
          return;
!     if (n > table.length / 3) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())

Regards, Peter On 07/08/2014 12:41 PM, Peter Levart wrote: On 07/08/2014 12:12 PM, Peter Levart wrote: On 07/08/2014 09:33 AM, Martin Buchholz wrote: I've updated the webrev http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ It now has all my TODOs done. The test case has been testng-ified. Hi Martin, What happened to the desire that when OOME is thrown on resizing, IHM is left unchanged? Regards, Peter Hi Martin, I took your latest version of the patch and modified it a little: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/ Here's a diff to your version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java    2014-07-08 12:34:25.739767669 +0200
***************
*** 437,449 ****
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
!         modCount++;
!         tab[i] = k;
!         tab[i + 1] = value;
!         size++;
!         int x = size + (size << 1);  // Optimized form of 3 * size
!         if (x > len)
!             resize(len);             // len == 2 * current capacity.
          return null;
      }
--- 437,457 ----
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
!
!
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 01:53 PM, Ivan Gerasimov wrote: On 08.07.2014 15:25, Peter Levart wrote: Hi again, Here's a version with (3*size > len) replaced with (size >= len/3), as suggested by Ivan Gerasimov, to avoid overflow, and also a fixed if block in put() that enclosed too much code (in my changed version of Martin's latest webrev): It shouldn't be needed, since size < MAXIMUM_CAPACITY - 1 at this point. That's right. Not in put(). But in putAll() it can overflow, since the argument Map can be of any size that fits in an int... So here's my 3rd variation of Martin's latest version: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.03/ And a diff to Martin's latest version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java    2014-07-08 14:07:57.782843612 +0200
***************
*** 437,449 ****
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
-         int x = size + (size << 1);  // Optimized form of 3 * size
-         if (x > len)
-             resize(len);             // len == 2 * current capacity.
          return null;
      }
--- 437,455 ----
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
+
+         int x = size + (size << 1) + 3;  // Optimized form of 3 * (size + 1)
+         if (x > len && resize(len)) {    // len == 2 * current capacity.
+             tab = table;
+             len = tab.length;
+             i = hash(key, len);
+             while (tab[i] != null)
+                 i = nextKeyIndex(i, len);
+         }
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
          return null;
      }
***************
*** 452,468 ****
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private void resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
-         int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return;
          }
          if (oldLength >= newLength)
!             return;
          Object[] newTable = new Object[newLength];
--- 458,474 ----
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private boolean resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return false;
          }
+         int newLength = newCapacity * 2;
          if (oldLength >= newLength)
!             return false;
          Object[] newTable = new Object[newLength];
***************
*** 480,485 ****
--- 486,492 ----
              }
          }
          table = newTable;
+         return true;
      }

      /**
***************
*** 494,500 ****
      int n = m.size();
      if (n == 0)
          return;
!     if (n + (n << 1) > table.length) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 501,507 ----
      int n = m.size();
      if (n == 0)
          return;
!     if (n > table.length / 3) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 02:20 PM, Peter Levart wrote: That's right. Not in put(). But in putAll() it can overflow, since the argument Map can be of any size that fits in an int... So here's my 3rd variation of Martin's latest version: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.03/ Another bug on my part 8-( I should've used the null-masked 'k' instead of 'key' in the index re-compute block: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.04/ And diff to Martin's patch:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java    2014-07-08 14:23:25.141249319 +0200
***************
*** 437,449 ****
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
-         int x = size + (size << 1);  // Optimized form of 3 * size
-         if (x > len)
-             resize(len);             // len == 2 * current capacity.
          return null;
      }
--- 437,455 ----
          if (size == MAXIMUM_CAPACITY - 1)
              throw new IllegalStateException("Capacity exhausted.");
+
+         int x = size + (size << 1) + 3;  // Optimized form of 3 * (size + 1)
+         if (x > len && resize(len)) {    // len == 2 * current capacity.
+             tab = table;
+             len = tab.length;
+             i = hash(k, len);
+             while (tab[i] != null)
+                 i = nextKeyIndex(i, len);
+         }
          modCount++;
          tab[i] = k;
          tab[i + 1] = value;
          size++;
          return null;
      }
***************
*** 452,468 ****
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private void resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
-         int newLength = newCapacity * 2;
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return;
          }
          if (oldLength >= newLength)
!             return;
          Object[] newTable = new Object[newLength];
--- 458,474 ----
       *
       * @param newCapacity the new capacity, must be a power of two.
       */
!     private boolean resize(int newCapacity) {
          // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
          Object[] oldTable = table;
          int oldLength = oldTable.length;
          if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
!             return false;
          }
+         int newLength = newCapacity * 2;
          if (oldLength >= newLength)
!             return false;
          Object[] newTable = new Object[newLength];
***************
*** 480,485 ****
--- 486,492 ----
              }
          }
          table = newTable;
+         return true;
      }

      /**
***************
*** 494,500 ****
      int n = m.size();
      if (n == 0)
          return;
!     if (n + (n << 1) > table.length) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 501,507 ----
      int n = m.size();
      if (n == 0)
          return;
!     if (n > table.length / 3) // conservatively pre-expand
          resize(capacity(n));

      for (Entry<? extends K, ? extends V> e : m.entrySet())

Regards, Peter
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 08.07.2014 4:44, Martin Buchholz wrote: On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz marti...@google.com wrote: I'd like to offer an alternative version of this change. webrev coming soon. Here's the promised webrev. http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ - Fixes a typo in the javadoc. - removes the threshold field (WAT, a cache to avoid multiplying by 3?) - optimizes 3*x into x + (x << 1) My personal preference would be x + x + x, but I thought JIT can do this kind of optimizations itself. Out of curiosity I've created a microbenchmark:

Benchmark                          Mode  Samples  Score  Score error  Units
o.s.MyBenchmark.testMethod_01_X3   avgt      200  5.900        0.041  ns/op
o.s.MyBenchmark.testMethod_02_PPP  avgt      200  6.029        0.035  ns/op
o.s.MyBenchmark.testMethod_03_PSH  avgt      200  5.907        0.071  ns/op

On my machine 3*x and x + (x << 1) appear to be of the same speed (#1 and #3 above). x + x + x (case #2) happens to be ~2% slower. Given the optimization doesn't introduce any speedup, wouldn't it be better to use 3*x for its clarity? Sincerely yours, Ivan - adds more test assertions - removes the unusual 4/3 slack space for expansion on deserialization. TODO: the test should be testng-ified, I think.
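Whatever the micro-timings say, the three forms are interchangeable for correctness: two's-complement addition and left shift are both arithmetic mod 2^32, so they agree for every int, including under overflow. A quick sanity check of that equivalence (my own addition, not from the thread):

```java
import java.util.Random;

public class TimesThreeForms {
    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int n = 0; n < 1_000_000; n++) {
            int x = rnd.nextInt();
            // All three forms agree even when 3*x overflows, since int
            // addition and shifts both wrap mod 2^32.
            if (3 * x != x + (x << 1) || 3 * x != x + x + x)
                throw new AssertionError("mismatch at x = " + x);
        }
        System.out.println("3*x == x + (x << 1) == x + x + x for sampled ints");
    }
}
```

So the choice really is purely about readability vs a possible couple of saved cycles, as the thread concludes.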
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 03:00 PM, Ivan Gerasimov wrote: I took your latest version of the patch and modified it a little: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/ But isn't it the post-insert-resize vs pre-insert-resize problem Doug mentioned above? I've tested a similar fix and it showed a slow-down of the put() operation. Hi Ivan, It might have to do with the # of bytecodes in the method and the in-lining threshold. I modified it once more, to make the put() method as short as possible: http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/ With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

    Map<Object, Object> map = new IdentityHashMap<Object, Object>();

    @Benchmark
    public void putNewObject(Blackhole bh) {
        Object o = new Object();
        bh.consume(map.put(o, o));
        if (map.size() > 10) {
            map = new IdentityHashMap<Object, Object>();
        }
    }
}

I get the following results on my i7/Linux using: java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0 -i 10 -wi 8 -gc 1 -t 1

Original:
Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s

Patched:
Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s

Can you run your test with webrev.05 and see what you get? Regards, Peter
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 03:56 PM, Martin Buchholz wrote:

On Tue, Jul 8, 2014 at 5:30 AM, Peter Levart <peter.lev...@gmail.com> wrote:
On 07/08/2014 02:20 PM, Peter Levart wrote:
That's right. Not in put(). But in putAll() it can overflow, since the argument Map can be of any size that fits in an int...

I took another look at putAll. I think we can do more simply, relying on the checks in capacity and resize:

     int n = m.size();
     if (n == 0)
         return;
-    if (n > threshold) // conservatively pre-expand
-        resize(capacity(n));
+    if (n > size)
+        resize(capacity(n)); // conservatively pre-expand
     for (Entry<? extends K, ? extends V> e : m.entrySet())
         put(e.getKey(), e.getValue());

Also, note I'm trying to avoid (relatively expensive) integer division, except at compile time.

Nice.

Peter
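The capacity() helper these checks lean on can be sketched as follows; the constants and the Integer.highestOneBit trick match the checks quoted later in the thread, but treat this as my reconstruction of the patch, not its authoritative text:

```java
public class CapacitySketch {
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = 1 << 29;

    // Smallest power of two that keeps the table no more than 2/3 full for
    // expectedMaxSize mappings. highestOneBit(3 * x) is the largest power of
    // two <= 3x, hence >= 3x/2. The first branch is the overflow guard.
    static int capacity(int expectedMaxSize) {
        return (expectedMaxSize > MAXIMUM_CAPACITY / 3) ? MAXIMUM_CAPACITY :
               (expectedMaxSize <= 2 * MINIMUM_CAPACITY / 3) ? MINIMUM_CAPACITY :
               Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));
    }

    public static void main(String[] args) {
        System.out.println(capacity(1));        // 4
        System.out.println(capacity(21));       // 32  (>= 1.5 * 21)
        System.out.println(capacity(1 << 29));  // 536870912 (clamped)
    }
}
```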
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Given the table size is a power of two, another possible optimization would be

     private static int nextKeyIndex(int i, int len) {
-        return (i + 2 < len ? i + 2 : 0);
+        return (i + 2) & (len - 1);
     }

Or even

+    int m = len - 1;
     while ((item = tab[i]) != null) {
         ...
-        i = nextKeyIndex(i, len);
+        i = (i + 2) & m;
     }

wherever applicable.

On 08.07.2014 19:14, Martin Buchholz wrote:
So ... using 3*x is never wrong, and optimizing to x + (x << 1) is at best only going to save a couple of cycles, so 3*x is the right choice except for the most performance sensitive code. In java.util, we tend to go further and write optimal code even when hotspot is likely to make the same optimizations, partly under Doug Lea's performance-obsessed influence. Also, microbenchmarking is notoriously difficult. If you've written a microbenchmark, it would be good to check it in somewhere. I don't know what current openjdk practice is for that...

On Tue, Jul 8, 2014 at 7:01 AM, Ivan Gerasimov <ivan.gerasi...@oracle.com> wrote:
On 08.07.2014 4:44, Martin Buchholz wrote:
On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz <marti...@google.com> wrote:
I'd like to offer an alternative version of this change. webrev coming soon.
Here's the promised webrev.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
- Fixes a typo in the javadoc.
- removes the threshold field (WAT, a cache to avoid multiplying by 3?)
- optimizes 3*x into x + (x << 1)
My personal preference would be x + x + x, but I thought the JIT can do this kind of optimization itself.
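The masking rewrite above is sound because keys live at even slots of a power-of-two-length table, so i + 2 never exceeds len for a valid index; a small standalone check (method names are mine):

```java
public class NextKeyIndexCheck {
    // Original: wrap with a compare; proposed: wrap with a mask. For even i
    // in [0, len) and power-of-two len, both produce the same next slot.
    static int withCompare(int i, int len) { return i + 2 < len ? i + 2 : 0; }
    static int withMask(int i, int len)    { return (i + 2) & (len - 1); }

    public static void main(String[] args) {
        for (int len = 4; len <= 1024; len <<= 1)
            for (int i = 0; i < len; i += 2)
                if (withCompare(i, len) != withMask(i, len))
                    throw new AssertionError("mismatch: i=" + i + " len=" + len);
        System.out.println("compare and mask agree for all even slots");
    }
}
```

The interesting case is the last slot: for i = len - 2 the compare branch yields 0, and len & (len - 1) is also 0 since len is a power of two.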
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 10:07 PM, Martin Buchholz wrote:
I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
(old webrev at http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/)
This incorporates Peter's idea of making resize return a boolean, keeps the map unchanged if resize throws, moves the check for capacity exceeded into resize, and minimizes bytecode in put(). I'm happy with this (except for degraded behavior near MAX_CAPACITY).

Nice solution. I hope that resize:

 458     private boolean resize(int newCapacity) {
 459         // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
 460         int newLength = newCapacity * 2;
 461
 462         Object[] oldTable = table;
 463         int oldLength = oldTable.length;
 464         if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
 465             if (size == MAXIMUM_CAPACITY - 1)
 466                 throw new IllegalStateException("Capacity exhausted.");
 467             return false;
 468         }
 469         if (oldLength >= newLength)
 470             return false;
         ...

...is never (going to be) called in a context that would make oldLength >= newLength && oldLength == 2 * MAXIMUM_CAPACITY && size == MAXIMUM_CAPACITY - 1. In that case it would be better to return false than to throw IllegalStateException, since no re-sizing was actually requested. The if statement (L469, 470) could be moved just after the newLength assignment (L460) to make resize() more robust to possible future changes of the calling code.

I wonder what the benchmarks will say.
Regards, Peter
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
If size == MAXIMUM_CAPACITY - 1 and you're in resize, presumably you're about to fill that last empty slot after returning, so you want to throw instead of returning false? Benchmarks welcome.
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Might be worth to add modCount++ before this line:

 487             table = newTable;
 488             return true;

On 09.07.2014 0:07, Martin Buchholz wrote:
I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
This incorporates Peter's idea of making resize return a boolean, keeps the map unchanged if resize throws, moves the check for capacity exceeded into resize, and minimizes bytecode in put(). I'm happy with this (except for degraded behavior near MAX_CAPACITY).
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/08/2014 11:39 PM, Ivan Gerasimov wrote:
Might be worth to add modCount++ before this line:
 487             table = newTable;
 488             return true;

Not quite, I think. The map has just been resized, but its contents have not changed yet, logically.

Regards, Peter
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Looks pretty good. I like the additional comments as well. Could you document the return conditions of resize()? A "since we're there already" suggestion for readObject:

    if (size < 0)
        throw new InvalidObjectException("Illegal mappings count: " + size);

Mike
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 09.07.2014 1:44, Peter Levart wrote:
Not quite, I think. The map has just been resized, but its contents have not changed yet, logically.

IdentityHashMapIterator's methods assume that if modCount didn't change, then the indices calculated earlier remain valid, and this is wrong in the case of resize.

Sincerely yours, Ivan
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/09/2014 12:06 AM, Ivan Gerasimov wrote:
IdentityHashMapIterator's methods assume that if modCount didn't change, then the indices calculated earlier remain valid, and this is wrong in the case of resize.

IdentityHashMapIterator:

 713     private abstract class IdentityHashMapIterator<T> implements Iterator<T> {
 714         int index = (size != 0 ? 0 : table.length); // current slot.
 715         int expectedModCount = modCount; // to support fast-fail
 716         int lastReturnedIndex = -1;      // to allow remove()
 717         boolean indexValid; // To avoid unnecessary next computation
 718         Object[] traversalTable = table; // reference to main table or copy

...takes a snapshot of the reference to the current table when created, so the indexes would still be valid...

...but resize() also clears the old table as it copies elements to the new table:

 478             oldTable[j] = null;
 479             oldTable[j+1] = null;

So it would appear that modCount should be incremented even before the copying loop. But as it seems, no user call-backs are possible during resize() in the same thread, normal writes are not ordered anyway for other threads, and after each successful resize() at least one new key will be added to the map, so modCount will be incremented before control is returned to user code anyway.

Regards, Peter
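The fail-fast behaviour being weighed here is observable from plain user code (this exercises the public API only, not the patched internals):

```java
import java.util.ConcurrentModificationException;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    // Returns true when an in-flight iterator throws after a structural
    // put(), i.e. after modCount diverges from its expectedModCount.
    static boolean iteratorFailsFast() {
        Map<Object, Object> map = new IdentityHashMap<>();
        map.put(new Object(), "a");
        map.put(new Object(), "b");
        Iterator<Object> it = map.keySet().iterator();
        it.next();
        map.put(new Object(), "c"); // structural change bumps modCount
        try {
            it.next();              // expectedModCount no longer matches
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("iterator failed fast: " + iteratorFailsFast());
    }
}
```

Note that, as with all fail-fast collections, this exception is best-effort; Peter's point is that resize() alone never hands control back to user code, so skipping modCount++ there does not weaken this guarantee.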
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Ah, yes, sure. I overlooked the reference to the table :-)
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 09.07.2014 3:20, Martin Buchholz wrote:
I agree with Peter that we do not need to increment modCount on resize. My latest webrev is again done. Ivan, feel free to take over.

Yes, thanks! The fix is much much better now. I think I see yet another very minor glitch, though. If the table is already of size 2 * MAXIMUM_CAPACITY, and we're merging in another map, which has more elements, with putAll(), then resize() will be called and it will throw, even though all the elements could fit into the map due to the key duplicates. To avoid this, the following check should be done first in resize():

 469         if (oldLength >= newLength)
 470             return false;

Sincerely yours, Ivan
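Ivan's corner case rests on the fact that putAll() from a map holding identical keys need not grow the target at all, so a pre-resize based on m.size() alone is pessimistic; a small illustration using ordinary API calls:

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class PutAllDuplicates {
    // putAll() from a source that shares the identical key only overwrites
    // the value; the size of the target map does not grow.
    static int sizeAfterPutAll() {
        Object k = new Object();
        Map<Object, Object> target = new IdentityHashMap<>();
        target.put(k, "v1");

        Map<Object, Object> source = new IdentityHashMap<>();
        source.put(k, "v2");

        target.putAll(source);
        return target.size(); // still 1: same identity, value replaced
    }

    public static void main(String[] args) {
        System.out.println("size after putAll: " + sizeAfterPutAll()); // 1
    }
}
```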
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Here's the same final webrev by Martin. I only shortened the comment before capacity() and moved the check in resize() higher up.
http://cr.openjdk.java.net/~igerasim/6904367/4/webrev/

Sincerely yours, Ivan
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 09.07.2014 4:46, Martin Buchholz wrote:
Let me understand - you're worried that when size is MAX_CAPACITY - 1, then a call to putAll that does not actually add any elements might throw? Well, I'm having a hard time considering that corner case a bug, given how unusable the map is at this point. Your suggested fix of returning immediately in case of no resize would cause put to succeed and reach size == MAX_CAPACITY, which you were trying to prevent.

Ah, yes. It's almost 5 AM here, I guess I need to have some sleep before trying to fix anything else :-)
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 08.07.2014 4:47, Martin Buchholz wrote:
I think this code has an off-by-factor-of-2 bug.

+    if (expectedMaxSize > MAXIMUM_CAPACITY / 3)
+        return MAXIMUM_CAPACITY;

No, even though it looks like a bug. (MAXIMUM_CAPACITY / 3) * (3 / 2) == MAXIMUM_CAPACITY / 2. If the expected size > MAXIMUM_CAPACITY / 3, then the minimum capacity must be > MAXIMUM_CAPACITY / 2, and then the minimum capacity == MAXIMUM_CAPACITY.
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Thank you Martin for the enhancement! It's a good idea to get rid of the threshold variable! When the table grows large, it will start trying to resize the table on every single put(). Though it shouldn't matter much, as in such a situation everything is already slow. On 08.07.2014 4:44, Martin Buchholz wrote: On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz marti...@google.com wrote: I'd like to offer an alternative version of this change. webrev coming soon. Here's the promised webrev. http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ - Fixes a typo in the javadoc. - removes the threshold field (WAT, a cache to avoid multiplying by 3?) - optimizes 3*x into x + (x << 1) Unfortunately, x + (x << 1) causes the same overflow bug as 3*x: (int)(1431655766 + (1431655766 << 1)) == 2 This can happen in capacity() and, theoretically, in putAll(). I propose to leave the check if (expectedMaxSize > MAXIMUM_CAPACITY / 3) return MAXIMUM_CAPACITY; which is overflow resistant. In putAll, I guess, something like this can be used: int len = table.length; if (n > len || n > len - n || n > len - (n >> 1)) resize(capacity(n)); - adds more test assertions - removes the unusual 4/3 slack space for expansion on deserialization. TODO: the test should be testng-ified, I think.
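[Editorial note] The overflow Ivan points out is easy to reproduce. A small sketch; the constant is the one from the mail, and the last line is an aside showing how the grow-by-half form from later in the thread fails differently:

```java
public class OverflowDemo {
    public static void main(String[] args) {
        int x = 1431655766; // large enough that 3*x exceeds Integer.MAX_VALUE

        // Both forms wrap past Integer.MAX_VALUE to a small, plausible-looking value.
        System.out.println(3 * x);        // 2
        System.out.println(x + (x << 1)); // 2 -- the "optimized" form has the same bug

        // Growing by half instead wraps to a negative value, which a plain
        // sign check can detect.
        System.out.println(x + (x >> 1)); // -2147483647
    }
}
```

The danger of the first two forms is precisely that the wrapped result looks like a valid small capacity, so no later check catches it.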
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
So, I reverted this change to the original code, but added a single check of the current table size before doing any modifications to the map. This is to address issue #4, and this doesn't seem to introduce any significant penalty. Would you please help review the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/3/webrev/ Sincerely yours, Ivan It's possible to avoid the capacity test for the general case:

put {
    ...while loop...
    if (size < threshold) {
        modCount++;
        tab[i] = k;
        tab[i + 1] = value;
        ++size;
    } else {
        putAndResize(k, value, i);
    }
    return null;
}

private void putAndResize(Object k, Object value, int i) {
    if (size == MAXIMUM_CAPACITY - 1)
        throw new IllegalStateException("Capacity exhausted.");
    modCount++;
    Object[] tab = table;
    tab[i] = k;
    tab[i + 1] = value;
    ++size;
    resize(tab.length); // len == 2 * current capacity.
}

It seems faster more often than not, but it adds more (and copied) code: not sure it's worth it. -Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Thanks for the suggestion, Jeff! I've tried it, but it doesn't seem to make a difference on my machine. Here are the numbers; I measured the time of a single put in nanoseconds.

            | vanilla | fix1  | fix2
client      |    8292 |  8220 |  8304
server      |    8197 |  8198 |  8221
interpreter |   13980 | 14478 | 14152

fix1 - the fix from webrev #3
fix2 - the fix with your suggestion

Sincerely yours, Ivan On 06.07.2014 15:13, Jeff Hain wrote: So, I reverted this change to the original code, but added a single check of the current table size before doing any modifications to the map. This is to address issue #4, and this doesn't seem to introduce any significant penalty. Would you please help review the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/3/webrev/ Sincerely yours, Ivan It's possible to avoid the capacity test for the general case:

put {
    ...while loop...
    if (size < threshold) {
        modCount++;
        tab[i] = k;
        tab[i + 1] = value;
        ++size;
    } else {
        putAndResize(k, value, i);
    }
    return null;
}

private void putAndResize(Object k, Object value, int i) {
    if (size == MAXIMUM_CAPACITY - 1)
        throw new IllegalStateException("Capacity exhausted.");
    modCount++;
    Object[] tab = table;
    tab[i] = k;
    tab[i + 1] = value;
    ++size;
    resize(tab.length); // len == 2 * current capacity.
}

It seems faster more often than not, but it adds more (and copied) code: not sure it's worth it. -Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Thank you Doug for your comment! On 04.07.2014 23:38, Doug Lea wrote: On 07/04/2014 01:33 PM, Ivan Gerasimov wrote: On 04.07.2014 8:14, David Holmes wrote: Hi Ivan, I find the change to capacity somewhat obfuscating and I can't see what the actual bug was. The bug was in rounding in the expression minCapacity = (3 * expectedMaxSize)/2. Suppose we want the expected size to be 11; then minCapacity becomes (int)(11 * 3 / 2) == 16. The threshold is calculated later to be (int)(16 * 2 / 3) == 10, which is less than the expected size 11. So the bug report was based on someone noticing that with odd-valued sizes, the undocumented 2/3 target load was rounded down, not up, which they didn't like? Not quite. There are two small issues that I'm trying to address here: 1) The wrong capacity is calculated based on the expected size due to a rounding error. This is only noticeable for the expected sizes 11, 43, 171, 683 and a few others. It leads to a redundant table resize when inserting the expected number of elements. 2) The wrong capacity due to an overflow error. For numbers N greater than or equal to 1431655765, the expression (int)(N * 3) / 2 wraps around and produces a small non-negative number. Frankly speaking, this is also true for my last proposed fix, so an updated webrev will follow. 3) In put() the table gets resized as soon as the threshold is reached. This is wrong for expected sizes 2, 3, 5, 10, 21, 42, and some others. 4) If put() was unsuccessful, the map gets corrupted. I don't object to rounding it up and modernizing the size calculations to use highestOneBit, but it's a weak pretense :-) Simply rounding the minimum capacity up, as minCapacity = (3 * expectedMaxSize + 1)/2, would of course produce the same result. I just took the chance to replace the while loop with a single function call. In the last webrev I also modified the check to address issue #2 from the list above. To address the issue I combined the division by 2 with the rounding up to the nearest power of two.
I also took the chance to replace a while-loop with a single call to the highestOneBit method, which calculates exactly what we need here. The recursive call to put after a resize seems very sub-optimal as you will re-search the map for the non-existent key. Can you not just recompute the correct indices and do the store? The initial rationale for post-insert-resize was to avoid these issues given the need to preserve at least one empty slot. Which I expect to remain faster than pre-resize even with your modified patch. Please provide systematic performance comparisons before committing! -Doug Yes, you are right. A microbenchmark showed a more than 20% slowdown of a single put operation. So, I reverted this change to the original code, but added a single check of the current table size before doing any modifications to the map. This is to address issue #4, and this doesn't seem to introduce any significant penalty. Would you please help review the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/3/webrev/ Sincerely yours, Ivan
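[Editorial note] The rounding glitch for expected size 11 and the highestOneBit fix can both be seen in a few lines. This is a sketch of the arithmetic only; the exact expressions in the webrev may differ slightly:

```java
public class RoundingDemo {
    public static void main(String[] args) {
        int expected = 11;

        // Old computation: 3*11/2 rounds 16.5 down to 16, which is already a
        // power of two; its 2/3 threshold is 10, below the expected size, so
        // the 11th put triggers a redundant resize.
        int oldMinCapacity = (3 * expected) / 2;     // 16
        int oldThreshold = (oldMinCapacity * 2) / 3; // 10
        System.out.println(oldThreshold < expected); // true: the bug

        // highestOneBit(3*expected) picks the power of two in the range
        // (3*expected/2, 3*expected], i.e. a capacity strictly above 1.5x.
        int fixed = Integer.highestOneBit(expected + (expected << 1)); // highestOneBit(33) == 32
        int fixedThreshold = (fixed * 2) / 3;           // 21
        System.out.println(fixedThreshold >= expected); // true: no early resize
    }
}
```

The key property used here is that Integer.highestOneBit(y) is always greater than y/2, so one call replaces the old doubling loop.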
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Thanks David for looking at the change! On 04.07.2014 8:14, David Holmes wrote: Hi Ivan, I find the change to capacity somewhat obfuscating and I can't see what the actual bug was. The bug was in rounding in the expression minCapacity = (3 * expectedMaxSize)/2. Suppose we want the expected size to be 11; then minCapacity becomes (int)(11 * 3 / 2) == 16. The threshold is calculated later to be (int)(16 * 2 / 3) == 10, which is less than the expected size 11. To address the issue I combined the division by 2 with the rounding up to the nearest power of two. I also took the chance to replace a while-loop with a single call to the highestOneBit method, which calculates exactly what we need here. The recursive call to put after a resize seems very sub-optimal as you will re-search the map for the non-existent key. Can you not just recompute the correct indices and do the store? Okay, I replaced the recursive call with the index recomputation. As was suggested by Jeff, I added a comment to the MAXIMUM_CAPACITY constant. I also reverted some part of the changes to the resize() function, as it turned out that the new code failed to deal correctly with a table of maximum capacity. I wanted to add a check of this behavior to the regression test, but faced an obstacle: I used reflection to make the MAXIMUM_CAPACITY constant smaller, so it wouldn't be necessary to insert 2**29 elements to fill the table up. The debugger was showing that the constant was changed, but the old value was still used in the conditions. Currently I have commented this test out. If anyone could suggest how I can properly change MAXIMUM_CAPACITY in this situation, it would be much appreciated. Here is the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/2/webrev/ Would you please help review it? Sincerely yours, Ivan. Alternatively use the original code and if an exception occurs on resize then simply undo the put.
I'm always worried that inverting the behaviour like this will have some unexpected side effect somewhere - this code has been like this for a very, very long time. Cheers, David On 4/07/2014 5:38 AM, Ivan Gerasimov wrote: Thank you Jeff! On 03.07.2014 23:07, Jeff Hain wrote: Hi. WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ The while loop in put(...) supposes that there is at least one free slot, which was the case before (that could be added as a comment). Now if you try to go past max capacity, instead of getting an IllegalStateException, you get an infinite loop. Yes, you're right! It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1 << 8 (= 256; needs to be larger than DEFAULT_CAPACITY, else it can be ignored). With the previous version you get an IllegalStateException on trying to add the 255th mapping (on the resize that occurs when the just-after-put size is 255 >= threshold = 255), and with the new version you get an infinite loop on trying to add the 257th mapping. Solutions I see would be adding a throw if size >= MAX_CAPACITY before the loop, This would break the case when an existing key is added to the already full table. Moreover, a full table without a single empty slot could cause infinite looping in get(), remove(), etc. or not adding that overhead and instead throwing when size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY. This seems appropriate. With the current implementation we can only put MAX_CAPACITY-1 elements anyway. I would also rather use == over >= for these tests, to underline the fact that size is not supposed to be larger, but there might be some reasons not to. I think it makes the code a bit more stable from the perspective of future changes. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/ Sincerely yours, Ivan -Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
On 07/04/2014 01:33 PM, Ivan Gerasimov wrote: On 04.07.2014 8:14, David Holmes wrote: Hi Ivan, I find the change to capacity somewhat obfuscating and I can't see what the actual bug was. The bug was in rounding in the expression minCapacity = (3 * expectedMaxSize)/2. Suppose, we want the expected size to be 11, then minCapacity becomes (int)(11 * 3 / 2) == 16. The threshold is calculated later to be (int)(16 * 2 / 3) == 10, which is less than expected size 11. So the bug report was based on someone noticing that with odd-valued sizes, the undocumented 2/3 target load was rounded down, not up, which they didn't like? I don't object to rounding it up and modernizing the size calculations to use highestOneBit, but it's a weak pretense :-) To address the issue I combined the division by 2 with the rounding up to the nearest power of two. I also took a chance to replace a while-loop with a single call to the highestOneBit method, which calculates exactly what we need here. The recursive call to put after a resize seems very sub-optimal as you will re-search the map for the non-existent key. Can you not just recompute the correct indices and do the store? The initial rationale for post-insert-resize was to avoid these issues given the need to preserve at least one empty slot. Which I expect to remain faster than pre-resize even with your modified patch. Please provide systematic performance comparisons before committing! -Doug
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Here is the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/2/webrev/ There is another bug with the initial implementation: if put throws, the element is still put (and the size incremented accordingly), and if a new mapping is put after the throwing one, the table gets full and get()/put()/remove() become subject to infinite loops. (And my idea that size was not supposed to go past MAX_CAPACITY was false in that case :-) With the new implementation, if put throws, the map is not modified (not even modCount), so it doesn't get broken. For performance, maybe isolating the code in the if into an int resizeAndPut(Object k, Object item, int len, int i) {...} method would help, reducing put() bytecode size. Having put() find the last free slots in an almost-full table would most often require a lot of time, so I wonder if anyone would have hit any of these infinite loops in practice. -Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Ivan, Fields like MAXIMUM_CAPACITY are compile-time constants. Their values are placed directly into the code that uses them - there is no associated getField to load them. Hence changing the field via reflection has no effect on any existing code that references the field. I'm still uncomfortable about the scope of the changes - and Doug rightly mentions doing detailed performance runs before submitting this. I still think recovering from a post-put resize failure might be better than resizing first. David On 5/07/2014 3:33 AM, Ivan Gerasimov wrote: Thanks David for looking at the change! On 04.07.2014 8:14, David Holmes wrote: Hi Ivan, I find the change to capacity somewhat obfuscating and I can't see what the actual bug was. The bug was in rounding in the expression minCapacity = (3 * expectedMaxSize)/2. Suppose we want the expected size to be 11; then minCapacity becomes (int)(11 * 3 / 2) == 16. The threshold is calculated later to be (int)(16 * 2 / 3) == 10, which is less than the expected size 11. To address the issue I combined the division by 2 with the rounding up to the nearest power of two. I also took the chance to replace a while-loop with a single call to the highestOneBit method, which calculates exactly what we need here. The recursive call to put after a resize seems very sub-optimal as you will re-search the map for the non-existent key. Can you not just recompute the correct indices and do the store? Okay, I replaced the recursive call with the index recomputation. As was suggested by Jeff, I added a comment to the MAXIMUM_CAPACITY constant. I also reverted some part of the changes to the resize() function, as it turned out that the new code failed to deal correctly with a table of maximum capacity. I wanted to add a check of this behavior to the regression test, but faced an obstacle: I used reflection to make the MAXIMUM_CAPACITY constant smaller, so it wouldn't be necessary to insert 2**29 elements to fill the table up.
The debugger was showing that the constant was changed, but the old value was still used in the conditions. Currently I have commented this test out. If anyone could suggest how I can properly change MAXIMUM_CAPACITY in this situation, it would be much appreciated. Here is the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/2/webrev/ Would you please help review it? Sincerely yours, Ivan. Alternatively use the original code and if an exception occurs on resize then simply undo the put. I'm always worried that inverting the behaviour like this will have some unexpected side effect somewhere - this code has been like this for a very, very long time. Cheers, David On 4/07/2014 5:38 AM, Ivan Gerasimov wrote: Thank you Jeff! On 03.07.2014 23:07, Jeff Hain wrote: Hi. WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ The while loop in put(...) supposes that there is at least one free slot, which was the case before (that could be added as a comment). Now if you try to go past max capacity, instead of getting an IllegalStateException, you get an infinite loop. Yes, you're right! It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1 << 8 (= 256; needs to be larger than DEFAULT_CAPACITY, else it can be ignored). With the previous version you get an IllegalStateException on trying to add the 255th mapping (on the resize that occurs when the just-after-put size is 255 >= threshold = 255), and with the new version you get an infinite loop on trying to add the 257th mapping. Solutions I see would be adding a throw if size >= MAX_CAPACITY before the loop, This would break the case when an existing key is added to the already full table. Moreover, a full table without a single empty slot could cause infinite looping in get(), remove(), etc. or not adding that overhead and instead throwing when size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY. This seems appropriate.
With the current implementation we can only put MAX_CAPACITY-1 elements anyway. I would also rather use == over >= for these tests, to underline the fact that size is not supposed to be larger, but there might be some reasons not to. I think it makes the code a bit more stable from the perspective of future changes. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/ Sincerely yours, Ivan -Jeff
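[Editorial note] David's point about compile-time constants can be observed without javap: per the JLS, a read of a constant variable is folded by the compiler and does not even initialize the declaring class. A self-contained sketch; the class and field names are made up for illustration:

```java
public class ConstDemo {
    static final StringBuilder log = new StringBuilder();

    static class WithConstant {
        // A constant variable: javac copies 256 into every use site.
        static final int MAX = 256;
        static { log.append("[WithConstant init]"); }
    }

    static class WithNonConstant {
        // Not a constant expression, so uses compile to a real field read.
        static final int MAX = Integer.valueOf(256);
        static { log.append("[WithNonConstant init]"); }
    }

    public static void main(String[] args) {
        int a = WithConstant.MAX;    // folded; WithConstant is never initialized
        int b = WithNonConstant.MAX; // real field access; triggers initialization
        System.out.println(log);     // prints only [WithNonConstant init]
        System.out.println(a == b);  // true
    }
}
```

This is exactly why Ivan's reflection trick could not work: code that referenced MAXIMUM_CAPACITY had the old value baked in at compile time, so changing the field at run time is invisible to it.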
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Martin's law of expanding capacity: Always grow by using the form newCapacity = oldCapacity + (oldCapacity >> n) for some suitable constant n. This will be efficient and more overflow resistant than the alternative newCapacity = oldCapacity * (2**n + 1) / (2**n). Here n == 1. On Thu, Jul 3, 2014 at 9:00 AM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hello! IdentityHashMap has a couple of small issues. The first one is a performance issue: if you create a map, specifying 42 as the expected number of elements, you'll be able to insert only 41 elements into the preallocated table. Inserting the 42nd element will trigger resizing of the storage. Another issue is that resizing occurs only after the element insertion. In the extreme case it can lead to a situation where the element is successfully inserted and then an exception is thrown due to the map being unable to grow. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-6904367 WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ Sincerely yours, Ivan
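[Editorial note] A sketch of why the additive form degrades more gracefully near Integer.MAX_VALUE; the concrete old capacity is an arbitrary large value chosen for illustration:

```java
public class GrowthDemo {
    public static void main(String[] args) {
        int old = 1_900_000_000; // close enough to Integer.MAX_VALUE to overflow

        // Multiply-then-divide wraps silently to a bogus but positive "capacity".
        System.out.println(old * 3 / 2); // 702516352

        // oldCapacity + (oldCapacity >> n) wraps negative instead, so the
        // caller can detect overflow with a plain comparison.
        int grown = old + (old >> 1);    // Martin's form with n == 1
        System.out.println(grown);       // -1444967296
        System.out.println(grown < old); // true -> overflow detected
    }
}
```

Both expressions compute 1.5x, but only the additive form leaves a sign bit that a `newCapacity < oldCapacity` guard can catch.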
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Thanks Martin! On 03.07.2014 21:12, Martin Buchholz wrote: Martin's law of expanding capacity: Always grow by using the form newCapacity = oldCapacity + (oldCapacity >> n) for some suitable constant n. This will be efficient and more overflow resistant than the alternative newCapacity = oldCapacity * (2**n + 1) / (2**n). Here n == 1. More precisely, here n == 0, because the capacity gets doubled when increased. The formula cap = expSize * 3 / 2 is used to estimate the capacity based on the expected number of items to be inserted. Sincerely yours, Ivan On Thu, Jul 3, 2014 at 9:00 AM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hello! IdentityHashMap has a couple of small issues. The first one is a performance issue: if you create a map, specifying 42 as the expected number of elements, you'll be able to insert only 41 elements into the preallocated table. Inserting the 42nd element will trigger resizing of the storage. Another issue is that resizing occurs only after the element insertion. In the extreme case it can lead to a situation where the element is successfully inserted and then an exception is thrown due to the map being unable to grow. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-6904367 WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ Sincerely yours, Ivan
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Hi. WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ The while loop in put(...) supposes that there is at least one free slot, which was the case before (that could be added as a comment). Now if you try to go past max capacity, instead of getting an IllegalStateException, you get an infinite loop. It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1 << 8 (= 256; needs to be larger than DEFAULT_CAPACITY, else it can be ignored). With the previous version you get an IllegalStateException on trying to add the 255th mapping (on the resize that occurs when the just-after-put size is 255 >= threshold = 255), and with the new version you get an infinite loop on trying to add the 257th mapping. Solutions I see would be adding a throw if size >= MAX_CAPACITY before the loop, or not adding that overhead and instead throwing when size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY. I would also rather use == over >= for these tests, to underline the fact that size is not supposed to be larger, but there might be some reasons not to. -Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Thank you Jeff! On 03.07.2014 23:07, Jeff Hain wrote: Hi. WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ The while loop in put(...) supposes that there is at least one free slot, which was the case before (that could be added as a comment). Now if you try to go past max capacity, instead of getting an IllegalStateException, you get an infinite loop. Yes, you're right! It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1 << 8 (= 256; needs to be larger than DEFAULT_CAPACITY, else it can be ignored). With the previous version you get an IllegalStateException on trying to add the 255th mapping (on the resize that occurs when the just-after-put size is 255 >= threshold = 255), and with the new version you get an infinite loop on trying to add the 257th mapping. Solutions I see would be adding a throw if size >= MAX_CAPACITY before the loop, This would break the case when an existing key is added to the already full table. Moreover, a full table without a single empty slot could cause infinite looping in get(), remove(), etc. or not adding that overhead and instead throwing when size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY. This seems appropriate. With the current implementation we can only put MAX_CAPACITY-1 elements anyway. I would also rather use == over >= for these tests, to underline the fact that size is not supposed to be larger, but there might be some reasons not to. I think it makes the code a bit more stable from the perspective of future changes. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/ Sincerely yours, Ivan -Jeff
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/ My test now terminates (exception on the MAX_CAPACITY-th put). Maybe where MAX_CAPACITY is defined you could indicate that you can actually have at most MAX_CAPACITY-1 mappings, to ensure that the table always contains some null key slot (I don't see this invariant stated explicitly already). -Jeff
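[Editorial note] The invariant Jeff asks to document is what keeps the probe loop finite. A stripped-down sketch of IdentityHashMap-style lookup (keys at even indices, values at odd, linear probing); probing starts at index 0 instead of a hash purely for illustration:

```java
public class ProbeDemo {
    // Returns the table index of `key`, or -1 if absent. Termination depends
    // entirely on the table containing at least one null key slot.
    static int indexOf(Object[] tab, Object key) {
        int len = tab.length;
        int i = 0;
        while (true) {
            Object item = tab[i];
            if (item == key) return i;   // reference equality, as in IdentityHashMap
            if (item == null) return -1; // the empty slot ends an unsuccessful probe
            i = (i + 2) % len;           // next key slot, wrapping around;
                                         // with no null slot, an absent key loops forever
        }
    }

    public static void main(String[] args) {
        Object a = new Object(), b = new Object();
        Object[] tab = {a, "va", b, "vb", null, null}; // one key slot kept free
        System.out.println(indexOf(tab, a));            // 0
        System.out.println(indexOf(tab, new Object())); // -1, thanks to the null slot
    }
}
```

If the table were completely full, the second lookup would cycle through all key slots indefinitely, which is why the map must cap its size at MAX_CAPACITY-1 mappings.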
Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
Hi Ivan, I find the change to capacity somewhat obfuscating and I can't see what the actual bug was. The recursive call to put after a resize seems very sub-optimal as you will re-search the map for the non-existent key. Can you not just recompute the correct indices and do the store? Alternatively use the original code and if an exception occurs on resize then simply undo the put. I'm always worried that inverting the behaviour like this will have some unexpected side effect somewhere - this code has been like this for a very, very long time. Cheers, David On 4/07/2014 5:38 AM, Ivan Gerasimov wrote: Thank you Jeff! On 03.07.2014 23:07, Jeff Hain wrote: Hi. WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ The while loop in put(...) supposes that there is at least one free slot, which was the case before (that could be added as a comment). Now if you try to go past max capacity, instead of getting an IllegalStateException, you get an infinite loop. Yes, you're right! It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1 << 8 (= 256; needs to be larger than DEFAULT_CAPACITY, else it can be ignored). With the previous version you get an IllegalStateException on trying to add the 255th mapping (on the resize that occurs when the just-after-put size is 255 >= threshold = 255), and with the new version you get an infinite loop on trying to add the 257th mapping. Solutions I see would be adding a throw if size >= MAX_CAPACITY before the loop, This would break the case when an existing key is added to the already full table. Moreover, a full table without a single empty slot could cause infinite looping in get(), remove(), etc. or not adding that overhead and instead throwing when size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY. This seems appropriate. With the current implementation we can only put MAX_CAPACITY-1 elements anyway.
I would also rather use == over >= for these tests, to underline the fact that size is not supposed to be larger, but there might be some reasons not to. I think it makes the code a bit more stable from the perspective of future changes. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/ Sincerely yours, Ivan -Jeff