Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-16 Thread Jeff Hain


Hi, took me some time to set up Maven/JMH and learn the basics.
(two tools in a day, phew, that's more than I usually do in a year! :)



JIT can sometimes optimize the code so aggressively

I was trying to bench this aggressively optimized
version of the code, with the idea that:
- If the JVM thinks this code is not to be optimized,
  it should not slow down the program too much, and
  having clean code should suffice.
- If it decides to optimize it, then the optimized
  version must be the fastest possible, so as not to slow
  down the whole program.
  In particular, I've once seen an optimization actually
  slow things down (a cast from double to int that was
  intentionally duplicated got factored into a common path,
  assuming that it was necessarily fast, but it wasn't...
  so I replaced one of the (int)x casts with -(int)-x and
  the awful performance went away).

But I agree I have no clue whether the optimization would
be similar in another program.



I suggest you try to express your benchmark in a tried framework like JMH:

    http://openjdk.java.net/projects/code-tools/jmh/

In particular I suggest studying included samples.

Regards, Peter


Below is the JMH version of my neanderthalian bench :)
followed by some results (with 1.7 bytecode, win7, core i7),

then a bench similar to the one you posted
(http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-July/027681.html),
and results again.



@State(Scope.Thread)
public class JmhIHMBench {

    private static final int MAX_NBR_OF_MAPPINGS = 1*1000;
    
    final Map<Object, Object> map_original = new IdentityHashMap<Object, Object>();
    // etc.
    final Map<Object, Object> map_peter8noloop = new IdentityHashMapPeter8NoLoop<Object, Object>();

    Object[] keys;
    int i;
    private static Object[] newKeys(int size) {
        final Object[] keys = new Object[size];
        for (int i=0;i<keys.length;i++) {
            keys[i] = new Object();
        }
        return keys;
    }
    
    @Setup(Level.Iteration)
    public void prepare() {
        keys = newKeys(MAX_NBR_OF_MAPPINGS);
        i = 0;
    }

    @Benchmark
    public void bench_put_original(Blackhole bh) {
        bench_put(bh, this.map_original);
    }
    // etc.
    @Benchmark
    public void bench_put_peter8noloop(Blackhole bh) {
        bench_put(bh, this.map_peter8noloop);
    }

    private void bench_put(Blackhole bh, Map<Object,Object> map) {
        if (map.size() == MAX_NBR_OF_MAPPINGS) {
            map.clear();
            i = 0;
        }
        final Object kv = keys[i++];
        bh.consume(map.put(kv, kv));
    }
    
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(".*" + JmhIHMBench.class.getSimpleName() + ".*")
                .warmupIterations(8)
                .measurementIterations(10)
                .threads(1)
                .forks(1)
                .shouldDoGC(false)
                .build();
        new Runner(opt).run();
    }
}
    /*
     * jdk7u51 (run 1):
Benchmark                                Mode   Samples          Score    Score error    Units
r.JmhIHMBench.bench_put_original        thrpt        10   84087111,944    6033159,942    ops/s
r.JmhIHMBench.bench_put_peter7          thrpt        10   68893501,546    1375694,507    ops/s
r.JmhIHMBench.bench_put_peter7noloop    thrpt        10   79900659,706    6637125,685    ops/s
r.JmhIHMBench.bench_put_peter8          thrpt        10   79062559,130    1392057,456    ops/s
r.JmhIHMBench.bench_put_peter8noloop    thrpt        10   89184331,506    2380981,771    ops/s
     */
    /*
     * jdk7u51 (run 2):
Benchmark                                Mode   Samples          Score    Score error    Units
r.JmhIHMBench.bench_put_original        thrpt        10   90430298,285    1749471,171    ops/s
r.JmhIHMBench.bench_put_peter7          thrpt        10   77895700,268    1118205,815    ops/s
r.JmhIHMBench.bench_put_peter7noloop    thrpt        10   87449460,422    2241899,330    ops/s
r.JmhIHMBench.bench_put_peter8          thrpt        10   79993955,944    1799749,550    ops/s
r.JmhIHMBench.bench_put_peter8noloop    thrpt        10   89818256,344     886977,083    ops/s
     */
    /*
     * jdk8u20 (run 1):
Benchmark                                Mode   Samples          Score    Score error    Units
r.JmhIHMBench.bench_put_original        thrpt        10  108675548,411    7130886,988    ops/s
r.JmhIHMBench.bench_put_peter7          thrpt        10   94452557,383    4030609,636    ops/s
r.JmhIHMBench.bench_put_peter7noloop    thrpt        10   99070373,936    2204912,685    ops/s
r.JmhIHMBench.bench_put_peter8          thrpt        10  106865766,992    6248051,431    ops/s
r.JmhIHMBench.bench_put_peter8noloop    thrpt        10  114880028,516    1176180,414    ops/s
     */
    /*
     * jdk8u20 (run 2):
Benchmark                                Mode   Samples          Score    Score error    Units
r.JmhIHMBench.bench_put_original        thrpt        10  117513077,228    2011080,121    ops/s

Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-14 Thread Patrick Wright
Hi

I've been watching this fascinating discussion - seeing Jeff's benchmark today,
I was wondering if there isn't already at least one benchmark written with
JMH? Wouldn't it make sense to make that part of the submission, as
standard practice for a refactoring like this?


Regards,
Patrick


On Mon, Jul 14, 2014 at 2:24 AM, Jeff Hain jeffh...@rocketmail.com wrote:



 Can you post the benchmark source?

 Before the source, here are the time ranges runs were stabilizing in
 for lucky executions (using 1.7 for compiler compliance level, and
 win7 / core i7) :

           | original | peter7   | peter7   | peter8   | peter8   |
           |          |          | no loop  |          | no loop  |
  ---------+----------+----------+----------+----------+----------+
  jdk7u51  | 0.56-59s | 0.66-68s | 0.62-64s | 0.60-63s | 0.70-74s |
  jdk8u20  | 0.54-58s | 0.64-66s | 0.58-60s | 0.58-61s | 0.73-76s |
  jdk9     | 0.59-61s | 0.65-67s | 0.73-75s | 0.60-63s | 0.76-78s |

 => The no-loop version seems only better for jdk <= 8 and your webrev 07,
    and for webrev 08, it seems actually (much) worse whatever the jdk.
 => jdk 8 looks faster than 7, but for some reason also faster than 9.


public class IdentityHashMapPerf {
    private static final int MAX_NBR_OF_MAPPINGS = 1*1000;
    private static final int MAX_NBR_OF_CALLS = 100*1000*1000;
    private static final int NBR_OF_RUNS = 10;

    public static void main(String[] args) {
        System.out.println(IdentityHashMapPerf.class.getSimpleName()+"...");
        for (int k=0;k<NBR_OF_RUNS;k++) {
            // Quick run for code discovery for k = 0.
            final int maxNbrOfCalls = (k == 0) ? MAX_NBR_OF_MAPPINGS
                                               : MAX_NBR_OF_CALLS;
            bench_put(new IdentityHashMap<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7NoLoop<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8NoLoop<Integer, Integer>(), maxNbrOfCalls);
        }
        System.out.println("..."+IdentityHashMapPerf.class.getSimpleName());
    }

    private static void bench_put(Map<Integer, Integer> map, int maxNbrOfCalls) {
        for (int k=0;k<2;k++) {
            final Integer[] keys = newKeys(MAX_NBR_OF_MAPPINGS);
            final int nbrOfClearAndPuts = maxNbrOfCalls/MAX_NBR_OF_MAPPINGS;
            long a = System.nanoTime();
            {
                for (int cap=0;cap<nbrOfClearAndPuts;cap++) {
                    map.clear();
                    for (int i=0;i<MAX_NBR_OF_MAPPINGS;i++) {
                        final Integer kv = keys[i];
                        map.put(kv, kv);
                    }
                    if (map.size() != MAX_NBR_OF_MAPPINGS) {
                        throw new AssertionError("anti optim");
                    }
                }
            }
            long b = System.nanoTime();
            System.out.println(nbrOfClearAndPuts+" * "+MAX_NBR_OF_MAPPINGS
                    +" "+map.getClass().getSimpleName()+".put(,) took "+((b-a)/1e9)+" s");
        }
    }

    private static Integer[] newKeys(int size) {
        final Integer[] keys = new Integer[size];
        for (int i=0;i<keys.length;i++) {
            keys[i] = i;
        }
        return keys;
    }
}


 -Jeff



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-14 Thread Peter Levart

Hi Jeff,

Home-grown loops are not the best way of micro-benchmarking (I have done 
it myself and learned that lesson). The JIT can sometimes optimize the code so 
aggressively that the performance differences you obtain from such 
benchmarks only show that your concrete program can be optimized better, 
not that an average program using the function you are trying to 
measure can run faster too.


I suggest you try to express your benchmark in a tried framework like JMH:

http://openjdk.java.net/projects/code-tools/jmh/

In particular I suggest studying included samples.

Regards, Peter

On 07/14/2014 02:24 AM, Jeff Hain wrote:



Can you post the benchmark source?

Before the source, here are the time ranges runs were stabilizing in
for lucky executions (using 1.7 for compiler compliance level, and
win7 / core i7) :

          | original | peter7   | peter7   | peter8   | peter8   |
          |          |          | no loop  |          | no loop  |
 ---------+----------+----------+----------+----------+----------+
 jdk7u51  | 0.56-59s | 0.66-68s | 0.62-64s | 0.60-63s | 0.70-74s |
 jdk8u20  | 0.54-58s | 0.64-66s | 0.58-60s | 0.58-61s | 0.73-76s |
 jdk9     | 0.59-61s | 0.65-67s | 0.73-75s | 0.60-63s | 0.76-78s |

=> The no-loop version seems only better for jdk <= 8 and your webrev 07,
   and for webrev 08, it seems actually (much) worse whatever the jdk.
=> jdk 8 looks faster than 7, but for some reason also faster than 9.


public class IdentityHashMapPerf {
    private static final int MAX_NBR_OF_MAPPINGS = 1*1000;
    private static final int MAX_NBR_OF_CALLS = 100*1000*1000;
    private static final int NBR_OF_RUNS = 10;

    public static void main(String[] args) {
        System.out.println(IdentityHashMapPerf.class.getSimpleName()+"...");
        for (int k=0;k<NBR_OF_RUNS;k++) {
            // Quick run for code discovery for k = 0.
            final int maxNbrOfCalls = (k == 0) ? MAX_NBR_OF_MAPPINGS
                                               : MAX_NBR_OF_CALLS;
            bench_put(new IdentityHashMap<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7NoLoop<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8NoLoop<Integer, Integer>(), maxNbrOfCalls);
        }
        System.out.println("..."+IdentityHashMapPerf.class.getSimpleName());
    }

    private static void bench_put(Map<Integer, Integer> map, int maxNbrOfCalls) {
        for (int k=0;k<2;k++) {
            final Integer[] keys = newKeys(MAX_NBR_OF_MAPPINGS);
            final int nbrOfClearAndPuts = maxNbrOfCalls/MAX_NBR_OF_MAPPINGS;
            long a = System.nanoTime();
            {
                for (int cap=0;cap<nbrOfClearAndPuts;cap++) {
                    map.clear();
                    for (int i=0;i<MAX_NBR_OF_MAPPINGS;i++) {
                        final Integer kv = keys[i];
                        map.put(kv, kv);
                    }
                    if (map.size() != MAX_NBR_OF_MAPPINGS) {
                        throw new AssertionError("anti optim");
                    }
                }
            }
            long b = System.nanoTime();
            System.out.println(nbrOfClearAndPuts+" * "+MAX_NBR_OF_MAPPINGS
                    +" "+map.getClass().getSimpleName()+".put(,) took "+((b-a)/1e9)+" s");
        }
    }

    private static Integer[] newKeys(int size) {
        final Integer[] keys = new Integer[size];
        for (int i=0;i<keys.length;i++) {
            keys[i] = i;
        }
        return keys;
    }
}


-Jeff




Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-13 Thread Jeff Hain


On 07/08/2014 10:07 PM, Martin Buchholz wrote:
 I updated my webrev and it is again feature-complete.
 http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
 (old webrev at
 http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/
 )

The additional loop hurts in my benchmarks (10-15 percent slower).
Removing it and adding a specific putAfterResize() method (without the
item == k test), replacing the continue with return putAfterResize(k, value),
is much better.
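
To illustrate the idea, here is a minimal standalone sketch on a toy
open-addressing identity map (class name, constants and the resize threshold
are made up here, so this is not the actual IdentityHashMap or webrev code):
after a resize the key is known to be absent, so the post-resize insert can
skip the item == k test.

import java.util.IdentityHashMap; // only for the cross-check in main()
import java.util.Map;

public class PutAfterResizeSketch {

    private Object[] tab = new Object[2 * 16]; // key/value pairs in even/odd slots
    private int size;

    private static int hash(Object k, int len) {
        int h = System.identityHashCode(k);
        return ((h << 1) - (h << 8)) & (len - 1); // multiply by -127, keep index even
    }

    private static int nextKeyIndex(int i, int len) {
        return (i + 2 < len) ? i + 2 : 0;
    }

    public Object put(Object key, Object value) {
        Object[] t = tab;
        int len = t.length;
        int i = hash(key, len);
        for (Object item; (item = t[i]) != null; i = nextKeyIndex(i, len)) {
            if (item == key) {                 // identity test needed on the normal path
                Object old = t[i + 1];
                t[i + 1] = value;
                return old;
            }
        }
        if (3 * (size + 1) > len) {            // table would get too full: grow, then insert
            resize(2 * len);
            return putAfterResize(key, value);
        }
        t[i] = key;
        t[i + 1] = value;
        size++;
        return null;
    }

    // The key is known to be absent here, so no "item == key" test is needed.
    private Object putAfterResize(Object key, Object value) {
        Object[] t = tab;
        int len = t.length;
        int i = hash(key, len);
        while (t[i] != null)
            i = nextKeyIndex(i, len);
        t[i] = key;
        t[i + 1] = value;
        size++;
        return null;
    }

    private void resize(int newLength) {
        Object[] oldTab = tab;
        Object[] newTab = new Object[newLength];
        for (int j = 0; j < oldTab.length; j += 2) {
            Object k = oldTab[j];
            if (k != null) {
                int i = hash(k, newLength);
                while (newTab[i] != null)
                    i = nextKeyIndex(i, newLength);
                newTab[i] = k;
                newTab[i + 1] = oldTab[j + 1];
            }
        }
        tab = newTab;
    }

    public static void main(String[] args) {
        // Cross-check the sketch against the real IdentityHashMap on 10000 distinct keys.
        PutAfterResizeSketch sketch = new PutAfterResizeSketch();
        Map<Object, Object> reference = new IdentityHashMap<Object, Object>();
        for (int n = 0; n < 10_000; n++) {
            Object k = new Object();
            sketch.put(k, k);
            reference.put(k, k);
        }
        System.out.println(sketch.size == reference.size()); // true
    }
}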

NB: In one bench I clear the map each time it hits 1000 mappings,
so that it always fits in some cache, which is about 4 times faster
than going up to 1000*1000 mappings (clear() cost included), and
makes variations in the cost of put() internals more obvious (hopefully).

-Jeff


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-13 Thread Peter Levart


On 07/13/2014 01:24 PM, Jeff Hain wrote:

On 07/08/2014 10:07 PM, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
(old webrev at
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/
)

The additional loop hurts in my benchmarks (10-15 percent slower).
Removing it and adding a specific putAfterResize() method (without the
item == k test), replacing the continue with return putAfterResize(k, value),
is much better.

NB: In one bench I clear the map each time it hits 1000 mappings,
so that it always fits in some cache, which is about 4 times faster
than going up to 1000*1000 mappings (clear() cost included), and
makes variations in the cost of put() internals more obvious (hopefully).

-Jeff


Can you post the benchmark source?

Thanks, Peter



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-13 Thread Peter Levart


On 07/13/2014 12:41 AM, Peter Levart wrote:


On 07/12/2014 10:46 PM, Ivan Gerasimov wrote:

Peter, it seems you need to fix capacity():
int capacity = Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));
does not necessarily produce a negative number in the case of overflow.


Good catch. You're right.

So here's a better variant:

return Math.min(
        MAXIMUM_CAPACITY,
        Math.max(
                MINIMUM_CAPACITY,
                expectedMaxSize > Integer.MAX_VALUE / 3
                        ? MAXIMUM_CAPACITY // 3 * expectedMaxSize would overflow
                        : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))
        )
);
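
For reference, a small standalone check (MINIMUM_CAPACITY and my non-power-of-2
MAXIMUM_CAPACITY assumed as elsewhere in this thread) that this variant stays in
range even where 3 * expectedMaxSize overflows int:

public class CapacityOverflowCheck {
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = 3 << 28;   // assumed non-power-of-2 maximum

    static int capacity(int expectedMaxSize) {
        return Math.min(MAXIMUM_CAPACITY,
                Math.max(MINIMUM_CAPACITY,
                        expectedMaxSize > Integer.MAX_VALUE / 3
                                ? MAXIMUM_CAPACITY // 3 * expectedMaxSize would overflow
                                : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))));
    }

    public static void main(String[] args) {
        int[] samples = {1, 100, 1 << 28, (1 << 28) + 1, Integer.MAX_VALUE / 3,
                Integer.MAX_VALUE / 3 + 1, Integer.MAX_VALUE};
        for (int n : samples)
            System.out.println(n + " -> " + capacity(n)); // always in [4, 3 << 28], never negative
    }
}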





Incorporated in new webrev:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.08/

Also in this webrev: changing the condition by which putAll() decides to 
call resize() makes a boolean return from resize() unnecessary again, 
since it's only called when a resize is actually needed (from put() when 
3*size reaches 2*capacity, and from putAll() when the capacity calculated 
from the argument map's size is greater than the current capacity). The 
retry loop in put() can consequently be simplified.
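
As a rough standalone illustration of the putAll() side of that condition
(constants assumed, not the webrev code itself), pre-expansion only fires
when the capacity computed from the argument map's size exceeds the current
capacity:

public class PutAllPreExpandDemo {
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = 1 << 29;

    // clamped highestOneBit(3 * expectedMaxSize), as discussed in this thread
    static int capacity(int expectedMaxSize) {
        return Math.min(MAXIMUM_CAPACITY,
                Math.max(MINIMUM_CAPACITY,
                        expectedMaxSize > Integer.MAX_VALUE / 3
                                ? MAXIMUM_CAPACITY
                                : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))));
    }

    public static void main(String[] args) {
        int currentCapacity = 64;                 // i.e. table.length / 2
        for (int n : new int[]{21, 42, 43, 100}) {
            boolean preExpand = capacity(n) > currentCapacity;
            System.out.println("argument size " + n + " -> capacity " + capacity(n)
                    + (preExpand ? " (pre-expand)" : " (no resize)"));
        }
    }
}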


Regards, Peter






Sincerely yours,
Ivan


On 12.07.2014 22:12, Peter Levart wrote:


On 07/12/2014 05:47 PM, Peter Levart wrote:
If we're willing to pay the price of special-casing the 
non-power-of-2 MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts 
to approx. 4% of performance,


Then here's a possible solution:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.06/



Two fixes:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.07/

One is a fix for possible overflow in resize() + rearrangement 
(which is specific to my proposal) and the other is replacement of 
the condition that triggers resize in put() from (3*size > length) to 
(3*size >= length). The latter should be applied to Martin's latest 
version too, I think, if it is decided that my proposal is inadequate.


Why is (3*size >= length) a more appropriate condition to trigger 
resize? Because it is more aligned with the capacity(size) function, 
which is basically a clamped Integer.highestOneBit(3 * size).


Map preallocation makes a table with length = 2 * 
capacity(expectedMaxSize)


(3 * size < 2 * highestOneBit(3*size)) is true for any size, so IHM 
will never be resized if filled with size elements and the table was 
preallocated with length = 2 * highestOneBit(3*size), even if the 
condition for resizing is changed from (3*size > length) to 
(3*size >= length). The current condition sometimes resizes a little 
too late, when preallocation would already create a bigger table.


Now if we change that, my proposed webrev.07 does not need 
MAXIMUM_SIZE constant any more. Attempt to insert the 2^29-th 
element (when size is about to become 2^29) triggers resize since at 
that time length == 2 * MAXIMUM_CAPACITY which is exactly 3 * 2^29 
and this alone can be used as a trigger to throw OOME in resize()...


Regards, Peter









Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-13 Thread Jeff Hain


Can you post the benchmark source?

Before the source, here are the time ranges runs were stabilizing in
for lucky executions (using 1.7 for compiler compliance level, and
win7 / core i7) :

          | original | peter7   | peter7   | peter8   | peter8   |
          |          |          | no loop  |          | no loop  |
 ---------+----------+----------+----------+----------+----------+
 jdk7u51  | 0.56-59s | 0.66-68s | 0.62-64s | 0.60-63s | 0.70-74s |
 jdk8u20  | 0.54-58s | 0.64-66s | 0.58-60s | 0.58-61s | 0.73-76s |
 jdk9     | 0.59-61s | 0.65-67s | 0.73-75s | 0.60-63s | 0.76-78s |

=> The no-loop version seems only better for jdk <= 8 and your webrev 07,
   and for webrev 08, it seems actually (much) worse whatever the jdk.
=> jdk 8 looks faster than 7, but for some reason also faster than 9.


public class IdentityHashMapPerf {
    private static final int MAX_NBR_OF_MAPPINGS = 1*1000;
    private static final int MAX_NBR_OF_CALLS = 100*1000*1000;
    private static final int NBR_OF_RUNS = 10;

    public static void main(String[] args) {
        System.out.println(IdentityHashMapPerf.class.getSimpleName()+"...");
        for (int k=0;k<NBR_OF_RUNS;k++) {
            // Quick run for code discovery for k = 0.
            final int maxNbrOfCalls = (k == 0) ? MAX_NBR_OF_MAPPINGS
                                               : MAX_NBR_OF_CALLS;
            bench_put(new IdentityHashMap<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter7NoLoop<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8<Integer, Integer>(), maxNbrOfCalls);
            //bench_put(new IdentityHashMapPeter8NoLoop<Integer, Integer>(), maxNbrOfCalls);
        }
        System.out.println("..."+IdentityHashMapPerf.class.getSimpleName());
    }

    private static void bench_put(Map<Integer, Integer> map, int maxNbrOfCalls) {
        for (int k=0;k<2;k++) {
            final Integer[] keys = newKeys(MAX_NBR_OF_MAPPINGS);
            final int nbrOfClearAndPuts = maxNbrOfCalls/MAX_NBR_OF_MAPPINGS;
            long a = System.nanoTime();
            {
                for (int cap=0;cap<nbrOfClearAndPuts;cap++) {
                    map.clear();
                    for (int i=0;i<MAX_NBR_OF_MAPPINGS;i++) {
                        final Integer kv = keys[i];
                        map.put(kv, kv);
                    }
                    if (map.size() != MAX_NBR_OF_MAPPINGS) {
                        throw new AssertionError("anti optim");
                    }
                }
            }
            long b = System.nanoTime();
            System.out.println(nbrOfClearAndPuts+" * "+MAX_NBR_OF_MAPPINGS
                    +" "+map.getClass().getSimpleName()+".put(,) took "+((b-a)/1e9)+" s");
        }
    }

    private static Integer[] newKeys(int size) {
        final Integer[] keys = new Integer[size];
        for (int i=0;i<keys.length;i++) {
            keys[i] = i;
        }
        return keys;
    }
}


-Jeff


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-12 Thread Peter Levart


On 07/11/2014 09:08 PM, Doug Lea wrote:


I've been content to just observe Martin and Peter's nice efforts
on this, but one note:

On 07/11/2014 03:00 PM, Martin Buchholz wrote:
On Wed, Jul 9, 2014 at 3:17 PM, Peter Levart peter.lev...@gmail.com 
wrote:




IHM resizing is arranged so that the table is always 33% ... 66% 
full (if
nothing is ever removed from it) except when capacity reaches 2**29, 
then

it can be filled up to the top.

avg. # of probes can be taken as a rough estimation of average 
slow-down,
max. # of probes can be taken as a rough estimation of 
worst-case-operation

slow-down.

So where to draw the line?



Linear probing has long been known to be prone to long worst-case 
probes,
but with recent computer architectures this is offset by its extreme 
cache

friendliness.


Bear in mind that the number of bits of identityHashCode is less
than 32 on all JVMs I know. It can be as low as 23, which means that
you are sure to see a lot of exact collisions on IHMs with only
tens of millions of elements, and there's nothing much you can
do that will help.

-Doug




We have max. size = 2^29 - 1. This cannot change because of 
serialization backwards compatibility. It could be a little less, but 
not much. Experiment shows that it is possible to fill IHM up to 99% 
with not too much CPU effort. If such an IHM is serialized, we must be able 
to deserialize it...


But why do we have to limit table length to 2^30? Because it has to be a 
power of 2 and 2^31 is already a negative number...

What would happen if max. table length was 2^30 + 2^29 ...

Here's what we get now:

// max. size = 2^29 - 1
// table length = 2^30
 10% max. size, probes:  0.1 avg.,  9 max.
 20% max. size, probes:  0.1 avg., 15 max.
 30% max. size, probes:  0.2 avg., 25 max.
 40% max. size, probes:  0.3 avg., 38 max.
 50% max. size, probes:  0.5 avg., 64 max.
 60% max. size, probes:  0.7 avg., 93 max.
 65% max. size, probes:  0.9 avg.,147 max.
 70% max. size, probes:  1.2 avg.,224 max.
 75% max. size, probes:  1.5 avg.,354 max.
 80% max. size, probes:  2.0 avg.,441 max.
 85% max. size, probes:  2.8 avg.,789 max.
 90% max. size, probes:  4.5 avg.,   1869 max.
 91% max. size, probes:  5.1 avg.,   2377 max.
 92% max. size, probes:  5.7 avg.,   3409 max.
 93% max. size, probes:  6.6 avg.,   3804 max.
 94% max. size, probes:  7.8 avg.,   5824 max.
 95% max. size, probes:  9.5 avg.,   7021 max.
 96% max. size, probes: 12.0 avg.,  12607 max.
 97% max. size, probes: 16.2 avg.,  17643 max.
 98% max. size, probes: 24.5 avg.,  34561 max.
 99% max. size, probes: 49.6 avg., 131371 max.
100% ... haven't waited long enough

If table length was 2^30 + 2^29, we would only fill it up to 66% like 
always and there would be no slow-down:


// max. size = 2^29 - 1
// table length = 2^30 + 2^29
 10% max. size, probes:  0.0 avg.,  7 max.
 20% max. size, probes:  0.1 avg., 11 max.
 30% max. size, probes:  0.1 avg., 13 max.
 40% max. size, probes:  0.2 avg., 20 max.
 50% max. size, probes:  0.3 avg., 28 max.
 60% max. size, probes:  0.4 avg., 40 max.
 65% max. size, probes:  0.4 avg., 42 max.
 70% max. size, probes:  0.5 avg., 52 max.
 75% max. size, probes:  0.5 avg., 65 max.
 80% max. size, probes:  0.6 avg., 87 max.
 85% max. size, probes:  0.7 avg., 87 max.
 90% max. size, probes:  0.8 avg.,128 max.
 91% max. size, probes:  0.8 avg.,128 max.
 92% max. size, probes:  0.8 avg.,128 max.
 93% max. size, probes:  0.8 avg.,129 max.
 94% max. size, probes:  0.9 avg.,129 max.
 95% max. size, probes:  0.9 avg.,131 max.
 96% max. size, probes:  0.9 avg.,150 max.
 97% max. size, probes:  0.9 avg.,150 max.
 98% max. size, probes:  1.0 avg.,159 max.
 99% max. size, probes:  1.0 avg.,159 max.
100% max. size, probes:  1.0 avg.,159 max.


If we're willing to pay the price of special-casing the non-power-of-2 
MAX_CAPACITY = (1  29) + (1  28), which amounts to approx. 4% of 
performance:


Original code:

Benchmark                  Mode   Samples          Score    Score error    Units
j.t.IHMBench.ihm0Put      thrpt        30   25217087.593      50117.867    ops/s
j.t.IHMBench.ihm1Get      thrpt        30   43017677.457     230902.599    ops/s

Patched code:

Benchmark                  Mode   Samples          Score    Score error    Units
j.t.IHMBench.ihm0Put      thrpt        30   24213091.899     122980.408    ops/s
j.t.IHMBench.ihm1Get      thrpt        30   40754537.027     936380.022    ops/s

Using JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

static final Object[] keys = new Object[1000_000];
static {
for (int i = 0; i < keys.length; i++) {
keys[i] = new Object();
}
}

static final Object value = new Object();

final Map<Object, Object> map = new IdentityHashMap<Object, Object>(keys.length);


int i = 0;

@Benchmark
public void ihm0Put(Blackhole bh) {
bh.consume(map.put(keys[i], value));

Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-12 Thread Peter Levart


On 07/12/2014 05:47 PM, Peter Levart wrote:
If we're willing to pay the price of special-casing the non-power-of-2 
MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts to approx. 4% of 
performance,


Then here's a possible solution:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.06/



Two fixes:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.07/

One is a fix for possible overflow in resize() + rearrangement (which is 
specific to my proposal) and the other is replacement of the condition that 
triggers resize in put() from (3*size > length) to (3*size >= length). 
The latter should be applied to Martin's latest version too, I think, if 
it is decided that my proposal is inadequate.


Why is (3*size >= length) a more appropriate condition to trigger resize? 
Because it is more aligned with the capacity(size) function, which is 
basically a clamped Integer.highestOneBit(3 * size).


Map preallocation makes a table with length = 2 * capacity(expectedMaxSize)

(3 * size < 2 * highestOneBit(3*size)) is true for any size, so IHM will 
never be resized if filled with size elements and the table was preallocated 
with length = 2 * highestOneBit(3*size), even if the condition for resizing 
is changed from (3*size > length) to (3*size >= length). The current 
condition sometimes resizes a little too late, when preallocation would 
already create a bigger table.
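
A quick brute-force check of that claim for small sizes (standalone, purely
illustrative):

public class NoResizeAfterPreallocationCheck {
    public static void main(String[] args) {
        for (int size = 1; size <= 10_000_000; size++) {
            int length = 2 * Integer.highestOneBit(3 * size); // preallocated table length
            if (3 * size >= length)                            // would trip the new >= resize condition
                throw new AssertionError("resize would trigger at size " + size);
        }
        System.out.println("ok: 3*size < 2*highestOneBit(3*size) for all tested sizes");
    }
}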


Now if we change that, my proposed webrev.07 does not need MAXIMUM_SIZE 
constant any more. Attempt to insert the 2^29-th element (when size is 
about to become 2^29) triggers resize since at that time length == 2 * 
MAXIMUM_CAPACITY which is exactly 3 * 2^29 and this alone can be used as 
a trigger to throw OOME in resize()...
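
In numbers (with the constants assumed as in my proposal):

public class MaxCapacityTriggerCheck {
    public static void main(String[] args) {
        final long MAXIMUM_CAPACITY = 3L << 28;    // proposed non-power-of-2 max capacity
        final long length = 2 * MAXIMUM_CAPACITY;  // largest table length
        System.out.println(length == 3L * (1 << 29));     // true: length is exactly 3 * 2^29
        long newSize = 1L << 29;                           // size reached by inserting the 2^29-th element
        System.out.println(3 * newSize >= length);         // true: the >= trigger fires on this insert
        System.out.println(3 * (newSize - 1) >= length);   // false: it does not fire one element earlier
    }
}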


Regards, Peter



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-12 Thread Peter Levart


On 07/12/2014 08:12 PM, Peter Levart wrote:
If we're willing to pay the price of special-casing the non-power-of-2 
MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts to approx. 4% of 
performance,


The cause of performance drop is of course the conditional in:

 297     private static int hash(Object x, int length) {
 298         int h = System.identityHashCode(x);
 299         // multiply by -127
 300         h -= h << 7;
 301         if (length < MAXIMUM_CAPACITY * 2) { // length is 2^n
 302             // left-shift to use least bit as part of hash
 303             return (h << 1) & (length - 1);
 304         }
 305         // assert length == MAXIMUM_CAPACITY * 2
 306         // clamp
 307         h &= (MAXIMUM_CAPACITY / 3 * 4 - 1);
 308         // Multiply by 3/2 and reset 0th bit
 309         return (h + (h >> 1)) & ~1;
 310     }

If I change it to:

 297     private static int hash(Object x, int length) {
 298         int h = System.identityHashCode(x);
 299         // multiply by -127
 300         h -= h << 7;
 302         // left-shift to use least bit as part of hash
 303         return (h << 1) & (length - 1);
 310     }


...performance is restored, but then of course it only works until table 
is resized to 2*MAX_CAPACITY. Does anybody have any idea how to make it 
faster?
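
For reference, a small standalone check (constants assumed as above) that the
clamped branch does produce even indices that fit the non-power-of-2 table:

import java.util.Random;

public class ClampedHashCheck {
    static final int MAXIMUM_CAPACITY = 3 << 28;

    static int clampedIndex(int h) {
        h -= h << 7;                             // multiply by -127
        h &= (MAXIMUM_CAPACITY / 3 * 4 - 1);     // clamp to [0, 2^30)
        return (h + (h >> 1)) & ~1;              // multiply by 3/2, reset 0th bit
    }

    public static void main(String[] args) {
        final int length = MAXIMUM_CAPACITY * 2; // 3 * 2^29
        Random rnd = new Random(42);
        for (int i = 0; i < 10_000_000; i++) {
            int idx = clampedIndex(rnd.nextInt());
            if (idx < 0 || idx >= length || (idx & 1) != 0)
                throw new AssertionError("bad index " + idx);
        }
        System.out.println("all indices even and < " + length);
    }
}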




Regards, Peter


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-12 Thread Peter Levart


On 07/12/2014 08:12 PM, Peter Levart wrote:
(3 * size < 2 * highestOneBit(3*size)) is true for any size, so IHM 
will never be resized if filled with size elements and the table was 
preallocated with length = 2 * highestOneBit(3*size), even if the 
condition for resizing is changed from (3*size > length) to 
(3*size >= length). The current condition sometimes resizes a little 
too late, when preallocation would already create a bigger table.


My latest statement is of course false. If length is a power of two, 
then it can never be equal to 3*size. For power-of-two lengths, both 
conditions are equivalent. But for my purpose the >= is better...


Peter



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-12 Thread Ivan Gerasimov

Peter, it seems you need to fix capacity():

int capacity = Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));

does not necessarily produce a negative number in the case of overflow.

Why don't you want to use the variant from Martin's latest webrev?
It seems to work correctly with your proposal too.

Sincerely yours,
Ivan


On 12.07.2014 22:12, Peter Levart wrote:


On 07/12/2014 05:47 PM, Peter Levart wrote:
If we're willing to pay the price of special-casing the 
non-power-of-2 MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts to 
approx. 4% of performance,


Then here's a possible solution:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.06/



Two fixes:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.07/

One is a fix for possible overflow in resize() + rearrangement (which 
is specific to my proposal) and the other is replacement of the condition 
that triggers resize in put() from (3*size > length) to (3*size >= length). 
The latter should be applied to Martin's latest version too, I 
think, if it is decided that my proposal is inadequate.


Why is (3*size >= length) a more appropriate condition to trigger 
resize? Because it is more aligned with the capacity(size) function, which 
is basically a clamped Integer.highestOneBit(3 * size).


Map preallocation makes a table with length = 2 * 
capacity(expectedMaxSize)


(3 * size < 2 * highestOneBit(3*size)) is true for any size, so IHM 
will never be resized if filled with size elements and the table was 
preallocated with length = 2 * highestOneBit(3*size), even if the 
condition for resizing is changed from (3*size > length) to 
(3*size >= length). The current condition sometimes resizes a little 
too late, when preallocation would already create a bigger table.


Now if we change that, my proposed webrev.07 does not need 
MAXIMUM_SIZE constant any more. Attempt to insert the 2^29-th element 
(when size is about to become 2^29) triggers resize since at that time 
length == 2 * MAXIMUM_CAPACITY which is exactly 3 * 2^29 and this 
alone can be used as a trigger to throw OOME in resize()...


Regards, Peter





Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-12 Thread Peter Levart


On 07/12/2014 10:46 PM, Ivan Gerasimov wrote:

Peter, seems you need to fix capacity():
int capacity = Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));
does not necessarily produce a negative number in the case of overflow.


Good catch. You're right.



Why don't you want to use the variant from Martin's latest 
webrev?

It seems to work correctly with your proposal too.


Not quite. Martin's version is:

return
    (expectedMaxSize > MAXIMUM_CAPACITY / 3) ? MAXIMUM_CAPACITY :
    (expectedMaxSize <= 2 * MINIMUM_CAPACITY / 3) ? MINIMUM_CAPACITY :
    Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));


My MAXIMUM_CAPACITY is not a power of two, but: 3 << 28;

If, for example, expectedMaxSize = (1 << 28) + 1, then Martin's 
capacity() would return MAXIMUM_CAPACITY, but I would like it to return 
2^29, since expectedMaxSize is only just past 50% of such capacity.


So here's a better variant:

return Math.min(
        MAXIMUM_CAPACITY,
        Math.max(
                MINIMUM_CAPACITY,
                expectedMaxSize > Integer.MAX_VALUE / 3
                        ? MAXIMUM_CAPACITY // 3 * expectedMaxSize would overflow
                        : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))
        )
);
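
A tiny standalone comparison of the two variants for that example
(MINIMUM_CAPACITY assumed to be 4, MAXIMUM_CAPACITY = 3 << 28 as above):

public class CapacityVariantsDemo {
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = 3 << 28;

    static int capacityMartin(int expectedMaxSize) {
        return (expectedMaxSize > MAXIMUM_CAPACITY / 3) ? MAXIMUM_CAPACITY
                : (expectedMaxSize <= 2 * MINIMUM_CAPACITY / 3) ? MINIMUM_CAPACITY
                : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));
    }

    static int capacityPeter(int expectedMaxSize) {
        return Math.min(MAXIMUM_CAPACITY,
                Math.max(MINIMUM_CAPACITY,
                        expectedMaxSize > Integer.MAX_VALUE / 3
                                ? MAXIMUM_CAPACITY
                                : Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))));
    }

    public static void main(String[] args) {
        int n = (1 << 28) + 1;
        System.out.println("Martin: " + capacityMartin(n)); // 805306368 (3 << 28)
        System.out.println("Peter : " + capacityPeter(n));  // 536870912 (1 << 29)
    }
}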


I'll put this and another simplification into a new webrev tomorrow.


Regards, Peter




Sincerely yours,
Ivan


On 12.07.2014 22:12, Peter Levart wrote:


On 07/12/2014 05:47 PM, Peter Levart wrote:
If we're willing to pay the price of special-casing the 
non-power-of-2 MAX_CAPACITY = (1 << 29) + (1 << 28), which amounts 
to approx. 4% of performance,


Then here's a possible solution:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.06/



Two fixes:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.07/

One is a fix for possible overflow in resize() + rearrangement (which 
is specific to my proposal) and the other is replacement of the condition 
that triggers resize in put() from (3*size > length) to (3*size >= length). 
The latter should be applied to Martin's latest version too, 
I think, if it is decided that my proposal is inadequate.


Why is (3*size >= length) a more appropriate condition to trigger 
resize? Because it is more aligned with the capacity(size) function, which 
is basically a clamped Integer.highestOneBit(3 * size).


Map preallocation makes a table with length = 2 * 
capacity(expectedMaxSize)


(3 * size < 2 * highestOneBit(3*size)) is true for any size, so IHM 
will never be resized if filled with size elements and the table was 
preallocated with length = 2 * highestOneBit(3*size), even if the 
condition for resizing is changed from (3*size > length) to 
(3*size >= length). The current condition sometimes resizes a little 
too late, when preallocation would already create a bigger table.


Now if we change that, my proposed webrev.07 does not need 
MAXIMUM_SIZE constant any more. Attempt to insert the 2^29-th element 
(when size is about to become 2^29) triggers resize since at that 
time length == 2 * MAXIMUM_CAPACITY which is exactly 3 * 2^29 and 
this alone can be used as a trigger to throw OOME in resize()...


Regards, Peter







Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-11 Thread Doug Lea


I've been content to just observe Martin and Peter's nice efforts
on this, but one note:

On 07/11/2014 03:00 PM, Martin Buchholz wrote:

On Wed, Jul 9, 2014 at 3:17 PM, Peter Levart peter.lev...@gmail.com wrote:



IHM resizing is arranged so that the table is always 33% ... 66% full (if
nothing is ever removed from it) except when capacity reaches 2**29, then
it can be filled up to the top.

avg. # of probes can be taken as a rough estimation of average slow-down,
max. # of probes can be taken as a rough estimation of worst-case-operation
slow-down.

So where to draw the line?



Linear probing has long been known to be prone to long worst-case probes,
but with recent computer architectures this is offset by its extreme cache
friendliness.


Bear in mind that the number of bits of identityHashCode is less
than 32 on all JVMs I know. It can be as low as 23, which means that
you are sure to see a lot of exact collisions on IHMs with only
tens of millions of elements, and there's nothing much you can
do that will help.
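
(A rough way to see this on a given JVM - not part of the patch, and results
are JVM-specific - is to count distinct identityHashCode values; with fewer
than 32 hash bits, exact collisions show up long before tens of millions of
objects:)

import java.util.HashSet;
import java.util.Set;

public class IdentityHashBitsProbe {
    public static void main(String[] args) {
        final int n = 1_000_000;
        Set<Integer> distinct = new HashSet<>(2 * n);
        int collisions = 0;
        for (int i = 0; i < n; i++) {
            if (!distinct.add(System.identityHashCode(new Object())))
                collisions++; // exact identity hash collision with an earlier object
        }
        System.out.println(n + " objects, " + distinct.size()
                + " distinct identity hash codes, " + collisions + " exact collisions");
    }
}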

-Doug




Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-10 Thread Peter Levart

On 07/08/2014 11:30 PM, Martin Buchholz wrote:

Benchmarks welcome.


I have run your latest webrev with the following benchmark:

@State(Scope.Thread)
public class IHMBench {

Map<Object, Object> map = new IdentityHashMap<Object, Object>();

@Benchmark
public void putNewObject(Blackhole bh) {
    Object o = new Object();
    bh.consume(map.put(o, o));
    if (map.size() > 10) {
        map = new IdentityHashMap<Object, Object>();
}
}
}

Using:

java -Xmx4G -Xms4G -jar benchmarks.jar -f 0 -i 30 -wi 10 -t 1 -gc 1

...and results are:

Original:

Benchmark                       Mode   Samples          Score    Score error    Units
j.t.IHMBench.putNewObject      thrpt        30   13305370.384      80122.977    ops/s

Patched:

Benchmark                       Mode   Samples          Score    Score error    Units
j.t.IHMBench.putNewObject      thrpt        30   13364374.454     124491.206    ops/s



Seems performance is the same.

Regards, Peter



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-09 Thread Peter Levart


On 07/09/2014 02:46 AM, Martin Buchholz wrote:
Let me understand - you're worried that when size is MAX_CAPACITY - 1, 
then a call to putAll that does not actually add any elements might 
throw?


This is not possible, because resize() is only called from putAll(map) 
if argument map.size() > this.size. At least one element will be added 
to the map and it's correct to throw if current size == MAX_CAPACITY - 1.


 Well, I'm having a hard time considering that corner case a bug, 
given how unusable the map is at this point.


It seems even this corner case is not present.



Your suggested fix of returning immediately in case of no resize would 
cause put to succeed and reach size == MAX_CAPACITY, which you were 
trying to prevent.


That's not possible either, since resize() is always called from put() 
with current table.length, which makes newLength == 2 * oldLength, 
therefore (oldLength >= newLength) will never succeed in this case.


Peter




On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov 
ivan.gerasi...@oracle.com wrote:



On 09.07.2014 3:20, Martin Buchholz wrote:

I agree with Peter that we do not need to increment modCount on
resize.

My latest webrev is again done.

Ivan, feel free to take over.


Yes, thanks! The fix is much much better now.

I think I see yet another very minor glitch, though.
If the table is already of size 2 * MAXIMUM_CAPACITY, and we're
merging in another map, which has more elements with putAll(),
resize() will be called and it will throw, even though all the
elements could fit into the map due to the key duplicates.

To avoid this the following check should be done first in resize():

  469 if (oldLength >= newLength)
  470     return false;
  


Sincerely yours,
Ivan






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-09 Thread Peter Levart


On 07/09/2014 09:23 AM, Peter Levart wrote:


On 07/09/2014 02:46 AM, Martin Buchholz wrote:
Let me understand - you're worried that when size is MAX_CAPACITY - 
1, then a call to putAll that does not actually add any elements 
might throw?


This is not possible, because resize() is only called from putAll(map) 
if argument map.size() > this.size. At least one element will be added 
to the map and it's correct to throw if current size == MAX_CAPACITY - 1.


...at least if the argument map obeys the rules of the Map contract. Even if 
it's a HashMap or another IdentityHashMap, it should not contain the 
same INSTANCE of the key in two or more mappings, should not overshoot 
when reporting its size() and should be stable during the whole putAll() 
operation... So calling IHM.putAll() with a live changing 
ConcurrentHashMap is not advisable. Even then, putAll() could only 
throw, because at some point the argument's size indicated that IHM 
could reach its max. capacity.


Peter



 Well, I'm having a hard time considering that corner case a bug, 
given how unusable the map is at this point.


It seems even this corner case is not present.



Your suggested fix of returning immediately in case of no resize 
would cause put to succeed and reach size == MAX_CAPACITY, which you 
were trying to prevent.


That's not possible either, since resize() is always called from put() 
with current table.length, which makes newLength == 2 * oldLength, 
therefore (oldLength >= newLength) will never succeed in this case.


Peter




On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov 
ivan.gerasi...@oracle.com wrote:



On 09.07.2014 3:20, Martin Buchholz wrote:

I agree with Peter that we do not need to increment modCount on
resize.

My latest webrev is again done.

Ivan, feel free to take over.


Yes, thanks! The fix is much much better now.

I think I see yet another very minor glitch, though.
If the table is already of size 2 * MAXIMUM_CAPACITY, and we're
merging in another map, which has more elements with putAll(),
resize() will be called and it will throw, even though all the
elements could fit into the map due to the key duplicates.

To avoid this the following check should be done first in resize():

  469 if (oldLength >= newLength)
  470     return false;
  


Sincerely yours,
Ivan








Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-09 Thread Martin Buchholz
OK, I think we're down to the smallest possible bug:

if size == MAX_CAPACITY - 1 and we putAll a concurrent map with greater
size, but the concurrent map shrinks while we're adding it, and no actual
elements get added to the IHM (but each element put takes ~  2**28 probes),
then an ISE is thrown when it was not strictly necessary.

Meanwhile, I'm actually wishing we could throw at something like 80% full...

Relatedly: It occurs to me that it's traditional in java.util to throw OOME
instead of ISE for capacity exceeded.


On Wed, Jul 9, 2014 at 12:33 AM, Peter Levart peter.lev...@gmail.com
wrote:


 On 07/09/2014 09:23 AM, Peter Levart wrote:


 On 07/09/2014 02:46 AM, Martin Buchholz wrote:

 Let me understand - you're worried that when size is MAX_CAPACITY - 1,
 then a call to putAll that does not actually add any elements might throw?


 This is not possible, because resize() is only called from putAll(map) if
 argument map.size() > this.size. At least one element will be added to the
 map and it's correct to throw if current size == MAX_CAPACITY - 1.


 ...at least if the argument map obeys the rules of the Map contract. Even if
 it's a HashMap or another IdentityHashMap, it should not contain the same
 INSTANCE of the key in two or more mappings, should not overshoot
 when reporting its size() and should be stable during the whole putAll()
 operation... So calling IHM.putAll() with a live changing ConcurrentHashMap
 is not advisable. Even then, putAll() could only throw, because at some
 point the argument's size indicated that IHM could reach its max. capacity.

 Peter



   Well, I'm having a hard time considering that corner case a bug, given
 how unusable the map is at this point.


 It seems even this corner case is not present.


  Your suggested fix of returning immediately in case of no resize would
 cause put to succeed and reach size == MAX_CAPACITY, which you were trying
 to prevent.


 That's not possible either, since resize() is always called from put()
 with current table.length, which makes newLength == 2 * oldLength,
 therefore (oldLength >= newLength) will never succeed in this case.

 Peter



 On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov ivan.gerasi...@oracle.com
 wrote:


 On 09.07.2014 3:20, Martin Buchholz wrote:

 I agree with Peter that we do not need to increment modCount on resize.

  My latest webrev is again done.

  Ivan, feel free to take over.


  Yes, thanks! The fix is much much better now.

 I think I see yet another very minor glitch, though.
 If the table is already of size 2 * MAXIMUM_CAPACITY, and we're merging
 in another map, which has more elements with putAll(), resize() will be
 called and it will throw, even though all the elements could fit into the
 map due to the key duplicates.

 To avoid this the following check should be done first in resize():

  469 if (oldLength >= newLength) 470     return false;

  Sincerely yours,
 Ivan







Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-09 Thread Peter Levart


On 07/09/2014 06:36 PM, Martin Buchholz wrote:

OK, I think we're down to the smallest possible bug:


I wouldn't call it a bug, since it's not present.



if size == MAX_CAPACITY - 1 and we putAll a concurrent map with 
greater size, but the concurrent map shrinks while we're adding it,
and no actual elements get added to the IHM (but each element put 
takes ~  2**28 probes), then an ISE is thrown when it was not strictly 
necessary.


We won't be adding any elements if resize() throws capacity exceeded, 
so this is just hypothetical thinking what would happen if we didn't 
call resize() and just started adding elements from the argument map. We 
might hit the ceiling or we might not.




Meanwhile, I'm actually wishing we could throw at something like 80% 
full...


I  created a little simulation:

import java.util.BitSet;

public class IMHSimulation {

    static final int MAXIMUM_CAPACITY = 1 << 29;
    static final int MASK = MAXIMUM_CAPACITY - 1;

    static int hash(Object x) {
        int h = System.identityHashCode(x);
        // Multiply by -127 to use least bit as part of hash
        return (h - (h << 7)) & MASK;
    }

    static int add(BitSet bs, Object o) {
        int i = hash(o);
        int probes = 0;
        while (bs.get(i)) {
            i = (i + 1) & MASK;
            probes++;
        }
        bs.set(i);
        return probes;
    }

    public static void main(String[] args) {
        BitSet bs = new BitSet(MAXIMUM_CAPACITY);
        long totalProbes = 0;
        int maxProbes = 0;
        int percent = 10;
        int nextReportAt = (int) ((long) percent * MAXIMUM_CAPACITY / 100);
        for (int i = 0; i < MAXIMUM_CAPACITY; i++) {
            int probes = add(bs, new Object());
            totalProbes += probes;
            maxProbes = Math.max(maxProbes, probes);
            if (i == nextReportAt) {
                System.out.printf(
                    "%3d%% full, avg. %4.1f, max. %6d probes\n",
                    percent, (double) totalProbes / i, maxProbes
                );
                if (percent < 60)
                    percent += 10;
                else if (percent < 90)
                    percent += 5;
                else
                    percent++;
                nextReportAt = (int) ((long) percent * MAXIMUM_CAPACITY / 100);
            }
        }
    }
}


Which produces the following:

 10% full, avg.  0.1, max.  9 probes
 20% full, avg.  0.1, max. 15 probes
 30% full, avg.  0.2, max. 25 probes
 40% full, avg.  0.3, max. 38 probes
 50% full, avg.  0.5, max. 64 probes
 60% full, avg.  0.7, max. 93 probes
 65% full, avg.  0.9, max.147 probes
 70% full, avg.  1.2, max.224 probes
 75% full, avg.  1.5, max.354 probes
 80% full, avg.  2.0, max.441 probes
 85% full, avg.  2.8, max.789 probes
 90% full, avg.  4.5, max.   1869 probes
 91% full, avg.  5.1, max.   2377 probes
 92% full, avg.  5.7, max.   3409 probes
 93% full, avg.  6.6, max.   3804 probes
 94% full, avg.  7.8, max.   5824 probes
 95% full, avg.  9.5, max.   7021 probes
 96% full, avg. 12.0, max.  12607 probes
 97% full, avg. 16.2, max.  17643 probes
 98% full, avg. 24.5, max.  34561 probes
 99% full, avg. 49.6, max. 131371 probes


IHM resizing is arranged so that the table is always 33% ... 66% full 
(if nothing is ever removed from it) except when capacity reaches 2**29, 
then it can be filled up to the top.


avg. # of probes can be taken as a rough estimation of average 
slow-down, max. # of probes can be taken as a rough estimation of 
worst-case-operation slow-down.


So where to draw the line?

Regards, Peter



Relatedly: It occurs to me that it's traditional in java.util to throw 
OOME instead of ISE for capacity exceeded.





On Wed, Jul 9, 2014 at 12:33 AM, Peter Levart peter.lev...@gmail.com wrote:



On 07/09/2014 09:23 AM, Peter Levart wrote:


On 07/09/2014 02:46 AM, Martin Buchholz wrote:

Let me understand - you're worried that when size is
MAX_CAPACITY - 1, then a call to putAll that does not actually
add any elements might throw?


This is not possible, because resize() is only called from
putAll(map) if argument map.size() > this.size. At least one
element will be added to the map and it's correct to throw if
current size == MAX_CAPACITY - 1.


...at least if the argument map obeys the rules of Map contract.
Even if it's a HashMap or another IdentityHashMap, it should not
contain the same INSTANCE of the key in two or more mappings,
should not overshoot when reporting its size() and should be stable
during the whole putAll() operation... So calling IHM.addAll()
with a live changing ConcurrentHashMap is not advisable. Even
then, addAll() could only throw, because at some point the
argument's size indicated that IHM could reach it's max. capacity.

Peter





 Well, I'm having a hard time considering that corner case a
bug, given how unusable the map is at 

Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart

On 07/08/2014 09:33 AM, Martin Buchholz wrote:

I've updated the webrev
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
It now has all my TODOs done.
The test case has been testng-ified.


Hi Martin,

What happened to the desire that when OOME is thrown on resizing, IHM is 
left unchanged?


Regards, Peter



On Mon, Jul 7, 2014 at 6:54 PM, Ivan Gerasimov ivan.gerasi...@oracle.com
wrote:


Unfortunately, x + (x << 1) causes the same overflow bug as 3*x:


x + (x << 1) is merely supposed to be possibly more efficient than 3*x.



(int)(1431655766 + (1431655766 << 1)) == 2


OK, I think my latest version doesn't have any overflow bugs.
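
For the record, a small standalone check of that wrap-around:

public class OverflowDemo {
    public static void main(String[] args) {
        int x = 1431655766;                                // > Integer.MAX_VALUE / 3
        System.out.println(3 * x);                         // 2 (int arithmetic wraps around)
        System.out.println(x + (x << 1));                  // 2 as well: same wrap-around
        System.out.println(Integer.highestOneBit(3 * x));  // 2 -> far too small a capacity
    }
}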




Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart

On 07/08/2014 12:12 PM, Peter Levart wrote:

On 07/08/2014 09:33 AM, Martin Buchholz wrote:

I've updated the webrev
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 


It now has all my TODOs done.
The test case has been testng-ified.


Hi Martin,

What happened to the desire that when OOME is thrown on resizing, IHM 
is left unchanged?


Regards, Peter


Hi Martin,

I took your latest version of the patch and modified it a little:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

Here's a diff to your version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 
12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java2014-07-08 
12:34:25.739767669 +0200

***
*** 437,449 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
! modCount++;
! tab[i] = k;
! tab[i + 1] = value;
! size++;
! int x = size + (size << 1); // Optimized form of 3 * size
! if (x > len)
! resize(len); // len == 2 * current capacity.
  return null;
}

--- 437,457 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
!
! int x = size + (size << 1) + 3; // Optimized form of 3 * (size + 1)

! if (x > len) {
! if (resize(len)) { // len == 2 * current capacity.
! tab = table;
! len = tab.length;
! i = hash(key, len);
! while (tab[i] != null)
! i = nextKeyIndex(i, len);
! }
! modCount++;
! tab[i] = k;
! tab[i + 1] = value;
! size++;
! }
  return null;
  }

***
*** 452,468 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private void resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

  int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any 
further

! return;
  }
  if (oldLength >= newLength)
! return;

  Object[] newTable = new Object[newLength];

--- 460,476 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private boolean resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

  int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any 
further

! return false;
  }
  if (oldLength >= newLength)
! return false;

  Object[] newTable = new Object[newLength];

***
*** 480,485 
--- 488,494 
  }
  }
  table = newTable;
+ return true;
  }

  /**


Regards, Peter





On Mon, Jul 7, 2014 at 6:54 PM, Ivan Gerasimov 
ivan.gerasi...@oracle.com

wrote:


Unfortunately, x + (x << 1) causes the same overflow bug as 3*x:


x + (x << 1) is merely supposed to be possibly more efficient than 3*x.



(int)(1431655766 + (1431655766 << 1)) == 2


OK, I think my latest version doesn't have any overflow bugs.






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart

Hi again,

Here's a version with (3*size > len) replaced with (size >= len/3) as 
suggested by Ivan Gerasimov to avoid overflow, and also a fixed if block in 
put() that enclosed too much code (in my changed version of Martin's 
latest webrev):


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.02/

I don't think there's a danger of overflow in capacity() since 
(expectedMaxSize + (expectedMaxSize << 1)) is evaluated only if 
expectedMaxSize <= MAXIMUM_CAPACITY / 3;


Here's a diff to latest Martin's version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 
12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java2014-07-08 
13:05:46.351810826 +0200

***
*** 437,449 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
- int x = size + (size << 1); // Optimized form of 3 * size
- if (x > len)
- resize(len); // len == 2 * current capacity.
  return null;
  }

--- 437,454 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
+
+ if (size >= len / 3 && resize(len)) { // len == 2 * current capacity.

+ tab = table;
+ len = tab.length;
+ i = hash(key, len);
+ while (tab[i] != null)
+ i = nextKeyIndex(i, len);
+ }
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
  return null;
  }

***
*** 452,468 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private void resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

  int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any 
further

! return;
  }
  if (oldLength >= newLength)
! return;

  Object[] newTable = new Object[newLength];

--- 457,473 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private boolean resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

  int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any 
further

! return false;
  }
  if (oldLength >= newLength)
! return false;

  Object[] newTable = new Object[newLength];

***
*** 480,485 
--- 485,491 
  }
  }
  table = newTable;
+ return true;
  }

  /**
***
*** 494,500 
  int n = m.size();
  if (n == 0)
  return;
! if (n + (n << 1) > table.length) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 500,506 
  int n = m.size();
  if (n == 0)
  return;
! if (n > table.length / 3) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())


Regards, Peter

On 07/08/2014 12:41 PM, Peter Levart wrote:

On 07/08/2014 12:12 PM, Peter Levart wrote:

On 07/08/2014 09:33 AM, Martin Buchholz wrote:

I've updated the webrev
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 


It now has all my TODOs done.
The test case has been testng-ified.


Hi Martin,

What happened to the desire that when OOME is thrown on resizing, IHM 
is left unchanged?


Regards, Peter


Hi Martin,

I took your latest version of the patch and modified it a little:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

Here's a diff to your version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 
12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java2014-07-08 
12:34:25.739767669 +0200

***
*** 437,449 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
! modCount++;
! tab[i] = k;
! tab[i + 1] = value;
! size++;
! int x = size + (size << 1); // Optimized form of 3 * size
! if (x > len)
! resize(len); // len == 2 * current capacity.
  return null;
}

--- 437,457 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
!
! int x = size + (size << 1) + 3; // Optimized form of 3 * (size + 1)

! if (x > len) {
! 

Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov


On 08.07.2014 15:25, Peter Levart wrote:

Hi again,

Here's a version with (3*size > len) replaced with (size >= len/3) as 
suggested by Ivan Gerasimov to avoid overflow, and also a fixed if block 
in put() that enclosed too much code (in my changed version of 
Martin's latest webrev):



It shouldn't be needed, since size < MAXIMUM_CAPACITY-1 at this point.


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.02/

I don't think there's a danger of overflow in capacity() since 
(expectedMaxSize + (expectedMaxSize << 1)) is evaluated only if 
expectedMaxSize <= MAXIMUM_CAPACITY / 3;


Here's a diff to latest Martin's version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 
12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java2014-07-08 
13:05:46.351810826 +0200

***
*** 437,449 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
- int x = size + (size  1); // Optimized form of 3 * size
- if (x  len)
- resize(len); // len == 2 * current capacity.
  return null;
  }

--- 437,454 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
+
+ if (size >= len / 3 && resize(len)) { // len == 2 * current capacity.

+ tab = table;
+ len = tab.length;
+ i = hash(key, len);
+ while (tab[i] != null)
+ i = nextKeyIndex(i, len);
+ }
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
  return null;
  }

***
*** 452,468 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private void resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

  int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further

! return;
  }
  if (oldLength >= newLength)
! return;

  Object[] newTable = new Object[newLength];

--- 457,473 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private boolean resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

  int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further

! return false;
  }
  if (oldLength >= newLength)
! return false;

  Object[] newTable = new Object[newLength];

***
*** 480,485 
--- 485,491 
  }
  }
  table = newTable;
+ return true;
  }

  /**
***
*** 494,500 
  int n = m.size();
  if (n == 0)
  return;
! if (n + (n << 1) > table.length) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 500,506 
  int n = m.size();
  if (n == 0)
  return;
! if (n > table.length / 3) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())


Regards, Peter

On 07/08/2014 12:41 PM, Peter Levart wrote:

On 07/08/2014 12:12 PM, Peter Levart wrote:

On 07/08/2014 09:33 AM, Martin Buchholz wrote:

I've updated the webrev
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 


It now has all my TODOs done.
The test case has been testng-ified.


Hi Martin,

What happened to the desire that when OOME is thrown on resizing, 
IHM is left unchanged?


Regards, Peter


Hi Martin,

I took your latest version of the patch and modified it a little:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

Here's a diff to your version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 
12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java 2014-07-08 
12:34:25.739767669 +0200

***
*** 437,449 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException(Capacity exhausted.);
! modCount++;
! tab[i] = k;
! tab[i + 1] = value;
! size++;
! int x = size + (size  1); // Optimized form of 3 * size
! if (x  len)
! resize(len); // len == 2 * current capacity.
  return null;
}

--- 437,457 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException(Capacity exhausted.);
!
!

Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart

On 07/08/2014 01:53 PM, Ivan Gerasimov wrote:

On 08.07.2014 15:25, Peter Levart wrote:

Hi again,

Here's a version with (3*size > len) replaced with (size > len/3), as 
suggested by Ivan Gerasimov to avoid overflow, and with the if block in 
put() that enclosed too much code fixed (in my changed version of 
Martin's latest webrev):



It shouldn't be needed, since size < MAXIMUM_CAPACITY-1 at this point.


That's right. Not in put(). But in putAll() it can overflow, since the 
argument Map can be of any size that fits in int... So here's my 3rd 
variation of Martin's latest version:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.03/


And a diff to Martin's latest version:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 
12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java2014-07-08 
14:07:57.782843612 +0200

***
*** 437,449 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
- int x = size + (size << 1); // Optimized form of 3 * size
- if (x > len)
- resize(len); // len == 2 * current capacity.
  return null;
  }

--- 437,455 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
+
+ int x = size + (size << 1) + 3; // Optimized form of 3 * (size + 1)

+ if (x > len && resize(len)) { // len == 2 * current capacity.
+ tab = table;
+ len = tab.length;
+ i = hash(key, len);
+ while (tab[i] != null)
+ i = nextKeyIndex(i, len);
+ }
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
  return null;
  }

***
*** 452,468 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private void resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

- int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further

! return;
  }
  if (oldLength >= newLength)
! return;

  Object[] newTable = new Object[newLength];

--- 458,474 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private boolean resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2


  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further

! return false;
  }
+ int newLength = newCapacity * 2;
  if (oldLength >= newLength)
! return false;

  Object[] newTable = new Object[newLength];

***
*** 480,485 
--- 486,492 
  }
  }
  table = newTable;
+ return true;
  }

  /**
***
*** 494,500 
  int n = m.size();
  if (n == 0)
  return;
! if (n + (n << 1) > table.length) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 501,507 
  int n = m.size();
  if (n == 0)
  return;
! if (n > table.length / 3) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart

On 07/08/2014 02:20 PM, Peter Levart wrote:
That's right. Not in put(). But in putAll() it can overflow, since the 
argument Map can be of any size that fits in int... So here's my 3rd 
variation of Martin's latest version:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.03/

Another bug on my part 8-( I should've used the null-masked 'k' instead 
of 'key' in the index re-compute block:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.04/

And diff to Martin's patch:

*** src/share/classes/java/util/IdentityHashMap.java.mb 2014-07-08 
12:37:57.267916926 +0200
--- src/share/classes/java/util/IdentityHashMap.java2014-07-08 
14:23:25.141249319 +0200

***
*** 437,449 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
- int x = size + (size << 1); // Optimized form of 3 * size
- if (x > len)
- resize(len); // len == 2 * current capacity.
  return null;
  }

--- 437,455 

  if (size == MAXIMUM_CAPACITY - 1)
  throw new IllegalStateException("Capacity exhausted.");
+
+ int x = size + (size << 1) + 3; // Optimized form of 3 * (size + 1)

+ if (x > len && resize(len)) { // len == 2 * current capacity.
+ tab = table;
+ len = tab.length;
+ i = hash(k, len);
+ while (tab[i] != null)
+ i = nextKeyIndex(i, len);
+ }
  modCount++;
  tab[i] = k;
  tab[i + 1] = value;
  size++;
  return null;
  }

***
*** 452,468 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private void resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2

- int newLength = newCapacity * 2;

  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further

! return;
  }
  if (oldLength >= newLength)
! return;

  Object[] newTable = new Object[newLength];

--- 458,474 
   *
   * @param newCapacity the new capacity, must be a power of two.
   */
! private boolean resize(int newCapacity) {
  // assert (newCapacity & -newCapacity) == newCapacity; // power of 2


  Object[] oldTable = table;
  int oldLength = oldTable.length;
  if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further

! return false;
  }
+ int newLength = newCapacity * 2;
  if (oldLength >= newLength)
! return false;

  Object[] newTable = new Object[newLength];

***
*** 480,485 
--- 486,492 
  }
  }
  table = newTable;
+ return true;
  }

  /**
***
*** 494,500 
  int n = m.size();
  if (n == 0)
  return;
! if (n + (n << 1) > table.length) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())
--- 501,507 
  int n = m.size();
  if (n == 0)
  return;
! if (n > table.length / 3) // conservatively pre-expand
  resize(capacity(n));

  for (Entry<? extends K, ? extends V> e : m.entrySet())


Regards, Peter



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov


On 08.07.2014 4:44, Martin Buchholz wrote:




On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz marti...@google.com 
mailto:marti...@google.com wrote:


I'd like to offer an alternative version of this change.  webrev
coming soon.


Here's the promised webrev.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/


- Fixes a typo in the javadoc.
- removes the threshold field (WAT, a cache to avoid multiplying by 3?)
- optimizes 3*x into x + x << 1


My personal preference would be x + x + x, but I thought the JIT could do this 
kind of optimization itself.

Out of curiosity I've created a microbenchmark:

Benchmark                          Mode  Samples  Score  Score error  Units
o.s.MyBenchmark.testMethod_01_X3   avgt      200  5.900        0.041  ns/op
o.s.MyBenchmark.testMethod_02_PPP  avgt      200  6.029        0.035  ns/op
o.s.MyBenchmark.testMethod_03_PSH  avgt      200  5.907        0.071  ns/op


On my machine 3*x and x + (x<<1) appear to be of the same speed (#1 and 
#3 above).

x + x + x (case #2) happens to be ~2% slower.

Given the optimization doesn't introduce any speedup, wouldn't it be 
better to use 3*x for its clarity?


Sincerely yours,
Ivan



- adds more test assertions
- removes the unusual 4/3 slack space for expansion on deserialization.

TODO: the test should be testng-ified, I think.






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart

On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

I took your latest version of the patch and modified it a little:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/



But isn't it the post-insert-resize vs pre-insert-resize problem Doug 
mentioned above?
I've tested a similar fix and it showed a slowdown of the put() operation. 

Hi Ivan,

Might be that it has to do with # of bytecodes in the method and 
in-lining threshold. I modified it once more, to make put() method as 
short as possible:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/

With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

    Map<Object, Object> map = new IdentityHashMap<Object, Object>();

    @Benchmark
    public void putNewObject(Blackhole bh) {
        Object o = new Object();
        bh.consume(map.put(o, o));
        if (map.size() > 10) {
            map = new IdentityHashMap<Object, Object>();
        }
    }
}

I get the following results on my i7/Linux using:

java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0 -i 10 -wi 
8 -gc 1 -t 1


Original:
Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s


Patched:
Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s



Can you run your test with webrev.05 and see what you get ?

Regards, Peter



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart

On 07/08/2014 03:56 PM, Martin Buchholz wrote:

On Tue, Jul 8, 2014 at 5:30 AM, Peter Levart peter.lev...@gmail.com wrote:


On 07/08/2014 02:20 PM, Peter Levart wrote:


That's right. Not in put(). But in putAll() it can overflow, since the
argument Map can be of any size that fits in int...



I took another look at putAll.  I think we can do more simply, relying on
the checks in capacity and resize:

   int n = m.size();
  if (n == 0)
  return;
-if (n > threshold) // conservatively pre-expand
-resize(capacity(n));
+if (n > size)
+resize(capacity(n)); // conservatively pre-expand

  for (Entry<? extends K, ? extends V> e : m.entrySet())
  put(e.getKey(), e.getValue());

Also, note I'm trying to avoid (relatively expensive) integer division,
except at compile time.


Nice.

Peter



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov
Given the table size is a power of two, another possible optimization 
would be

 private static int nextKeyIndex(int i, int len) {
-return (i + 2 < len ? i + 2 : 0);
+return (i + 2) & (len - 1);
 }

Or even
+int m = len - 1;
 while ( (item = tab[i]) != null) {
...
-i = nextKeyIndex(i, len);
+i = (i + 2) & m;
 }
wherever applicable.
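
Both forms agree exactly on the inputs IdentityHashMap can produce (len a
power of two, i an even key index below len); a small standalone check along
these lines (illustrative only, not part of any webrev) confirms it:

public class NextKeyIndexCheck {

    static int branching(int i, int len) {
        return (i + 2 < len ? i + 2 : 0);
    }

    static int masking(int i, int len) {
        return (i + 2) & (len - 1);
    }

    public static void main(String[] args) {
        for (int len = 4; len <= (1 << 16); len <<= 1) {    // power-of-two table lengths
            for (int i = 0; i < len; i += 2) {              // even key indices
                if (branching(i, len) != masking(i, len))
                    throw new AssertionError("mismatch at i=" + i + ", len=" + len);
            }
        }
        System.out.println("branching and masking forms are equivalent");
    }
}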

On 08.07.2014 19:14, Martin Buchholz wrote:
So ... using 3*x is never wrong, and optimizing to x + (x << 1) is at 
best only going to save a couple of cycles, so 3*x is the right choice 
except for the most performance sensitive code.


In java.util, we tend to go further and write optimal code even when 
hotspot is likely to make the same optimizations, partly under Doug 
Lea's performance-obsessed influence.


Also, microbenchmarking is notoriously difficult.

If you've written a microbenchmark, it would be good to check it in 
somewhere.  I don't know what current openjdk practice is for that...



On Tue, Jul 8, 2014 at 7:01 AM, Ivan Gerasimov 
ivan.gerasi...@oracle.com mailto:ivan.gerasi...@oracle.com wrote:



On 08.07.2014 4:44, Martin Buchholz wrote:




On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz
marti...@google.com mailto:marti...@google.com wrote:

I'd like to offer an alternative version of this change.
 webrev coming soon.


Here's the promised webrev.

http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/

http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/

- Fixes a typo in the javadoc.
- removes the threshold field (WAT, a cache to avoid
multiplying by 3?)
- optimizes 3*x into x + x  1


My personal preference would be x + x + x, but I thought JIT can
do this kind of optimizations itself.
Out of curiosity I've created a microbenchmark:

Benchmark Mode SamplesScore 
Score errorUnits
o.s.MyBenchmark.testMethod_01_X3  avgt 2005.900   
0.041ns/op
o.s.MyBenchmark.testMethod_02_PPP avgt 2006.029   
0.035ns/op
o.s.MyBenchmark.testMethod_03_PSH avgt 2005.907   
0.071ns/op


On my machine 3*x and x + (x1) appear to be of the same speed
(#1 and #3 above).
x + x + x (case #2) happens to be ~2% slower.

Given the optimization doesn't introduce any speedup, wouldn't it
be better to use 3*x for its clarity?

Sincerely yours,
Ivan




- adds more test assertions
- removes the unusual 4/3 slack space for expansion on
deserialization.

TODO: the test should be testng-ified, I think.









Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart


On 07/08/2014 10:07 PM, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/

(old webrev at
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity.0/

)

This incorporates Peter's idea of making resize return a boolean, 
keeps the map unchanged if resize throws, moves the check for capacity 
exceeded into resize, and minimizes bytecode in put().  I'm happy with 
this (except for degraded behavior near MAX_CAPACITY).


Nice solution. I hope that resize:

 458 private boolean resize(int newCapacity) {
 459 // assert (newCapacity & -newCapacity) == newCapacity; // power of 2
 460 int newLength = newCapacity * 2;
 461
 462 Object[] oldTable = table;
 463 int oldLength = oldTable.length;
 464 if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any further
 465 if (size == MAXIMUM_CAPACITY - 1)
 466 throw new IllegalStateException("Capacity exhausted.");
 467 return false;
 468 }
 469 if (oldLength >= newLength)
 470 return false;

... is never (going to be) called in a context that would make oldLength >= 
newLength && oldLength == 2 * MAXIMUM_CAPACITY && size == 
MAXIMUM_CAPACITY - 1 ...


In that case it would be better to return false than to throw 
IllegalStateException, since no re-sizing was actually requested. The if 
statement (L469,470) could be moved just after the newLength assignment 
(L460) to make resize() more robust to possible future changes of the 
calling code.


I wonder what the benchmarks will say.

Regards, Peter




On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart peter.lev...@gmail.com 
mailto:peter.lev...@gmail.com wrote:


On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

I took your latest version of the patch and modified it a
little:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.01/


But isn't it post-insert-resize vs pre-insert-resize problem
Doug mentioned above?
I've tested a similar fix and it showed slow down of the put()
operation.

Hi Ivan,

Might be that it has to do with # of bytecodes in the method and
in-lining threshold. I modified it once more, to make put() method
as short as possible:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/
http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.05/

With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

MapObject, Object map = new IdentityHashMapObject, Object();

@Benchmark
public void putNewObject(Blackhole bh) {
Object o = new Object();
bh.consume(map.put(o, o));
if (map.size()  10) {
map = new IdentityHashMapObject, Object();
}
}
}

I get the following results on my i7/Linux using:

java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0 -i
10 -wi 8 -gc 1 -t 1

Original:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s

Patched:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s


Can you run your test with webrev.05 and see what you get ?

Regards, Peter






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Martin Buchholz
If size == MAXIMUM_CAPACITY - 1 and you're in resize, presumably you're
about to fill that last empty slot after returning, so you want to throw
instead of returning false?

Benchmarks welcome.


On Tue, Jul 8, 2014 at 2:15 PM, Peter Levart peter.lev...@gmail.com wrote:


 On 07/08/2014 10:07 PM, Martin Buchholz wrote:

 I updated my webrev and it is again feature-complete.

 http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
  (old webrev at

 http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/
  )

  This incorporates Peter's idea of making resize return a boolean, keeps
 the map unchanged if resize throws, moves the check for capacity exceeded
 into resize, and minimizes bytecode in put().  I'm happy with this (except
 for degraded behavior near MAX_CAPACITY).


 Nice solution. I hope that resize:

  458 private boolean resize(int newCapacity) {
  459 // assert (newCapacity  -newCapacity) == newCapacity; // power 
 of 2
  460 int newLength = newCapacity * 2;
  461
  462 Object[] oldTable = table;
  463 int oldLength = oldTable.length;
  464 if (oldLength == 2 * MAXIMUM_CAPACITY) { // can't expand any 
 further
  465 if (size == MAXIMUM_CAPACITY - 1)
  466 throw new IllegalStateException(Capacity exhausted.);
  467 return false;
  468 }
  469 if (oldLength = newLength)
  470 return false;

 ... is never (going to be) called in context that would make oldLength =
 newLength  oldLength == 2 * MAXIMUM_CAPACITY  size == MAXIMUM_CAPACITY
 - 1 ...

 In that case it would be better to return false than to throw
 IllegalStateException, since no re-sizing was actually requested. if
 statement (L469,470) could be moved just after newLength assignment (L460)
 to make resize() more robust to possible future changes of calling code.

 I wonder what the benchmarks will say.

 Regards, Peter




 On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart peter.lev...@gmail.com
 wrote:

 On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

  I took your latest version of the patch and modified it a little:

 http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/


 But isn't it post-insert-resize vs pre-insert-resize problem Doug
 mentioned above?
 I've tested a similar fix and it showed slow down of the put()
 operation.

  Hi Ivan,

 Might be that it has to do with # of bytecodes in the method and
 in-lining threshold. I modified it once more, to make put() method as short
 as possible:

 http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/

 With this, I ran the following JMH benchmark:

 @State(Scope.Thread)
 public class IHMBench {

 MapObject, Object map = new IdentityHashMapObject, Object();

 @Benchmark
 public void putNewObject(Blackhole bh) {
 Object o = new Object();
 bh.consume(map.put(o, o));
 if (map.size()  10) {
 map = new IdentityHashMapObject, Object();
 }
 }
 }

 I get the following results on my i7/Linux using:

 java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0 -i 10 -wi
 8 -gc 1 -t 1

 Original:

 Benchmark Mode   SamplesScore  Score error
  Units
  j.t.IHMBench.putNewObjectthrpt10 13088296.198 403446.449
  ops/s

 Patched:

 Benchmark Mode   SamplesScore  Score error
  Units
  j.t.IHMBench.putNewObjectthrpt10 13180594.537 282047.154
  ops/s


 Can you run your test with webrev.05 and see what you get ?

 Regards, Peter






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov

Might be worth to add modCount++ before this line:

 487 table = newTable;
 488 return true;


On 09.07.2014 0:07, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/

(old webrev at
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity.0/

)

This incorporates Peter's idea of making resize return a boolean, 
keeps the map unchanged if resize throws, moves the check for capacity 
exceeded into resize, and minimizes bytecode in put().  I'm happy with 
this (except for degraded behavior near MAX_CAPACITY).





On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart peter.lev...@gmail.com 
mailto:peter.lev...@gmail.com wrote:


On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

I took your latest version of the patch and modified it a
little:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.01/


But isn't it post-insert-resize vs pre-insert-resize problem
Doug mentioned above?
I've tested a similar fix and it showed slow down of the put()
operation.

Hi Ivan,

Might be that it has to do with # of bytecodes in the method and
in-lining threshold. I modified it once more, to make put() method
as short as possible:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/
http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.05/

With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

MapObject, Object map = new IdentityHashMapObject, Object();

@Benchmark
public void putNewObject(Blackhole bh) {
Object o = new Object();
bh.consume(map.put(o, o));
if (map.size()  10) {
map = new IdentityHashMapObject, Object();
}
}
}

I get the following results on my i7/Linux using:

java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0 -i
10 -wi 8 -gc 1 -t 1

Original:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s

Patched:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s


Can you run your test with webrev.05 and see what you get ?

Regards, Peter






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart


On 07/08/2014 11:39 PM, Ivan Gerasimov wrote:

Might be worth to add modCount++ before this line:

  487 table = newTable;
  488 return true;

Not quite, I think. The map has just been resized, but its contents have 
not changed yet, logically.


Regards, Peter


On 09.07.2014 0:07, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/

(old webrev at
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity.0/

)

This incorporates Peter's idea of making resize return a boolean, 
keeps the map unchanged if resize throws, moves the check for 
capacity exceeded into resize, and minimizes bytecode in put().  I'm 
happy with this (except for degraded behavior near MAX_CAPACITY).





On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart peter.lev...@gmail.com 
mailto:peter.lev...@gmail.com wrote:


On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

I took your latest version of the patch and modified it a
little:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.01/


But isn't it post-insert-resize vs pre-insert-resize problem
Doug mentioned above?
I've tested a similar fix and it showed slow down of the
put() operation.

Hi Ivan,

Might be that it has to do with # of bytecodes in the method and
in-lining threshold. I modified it once more, to make put()
method as short as possible:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/
http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.05/

With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

MapObject, Object map = new IdentityHashMapObject, Object();

@Benchmark
public void putNewObject(Blackhole bh) {
Object o = new Object();
bh.consume(map.put(o, o));
if (map.size()  10) {
map = new IdentityHashMapObject, Object();
}
}
}

I get the following results on my i7/Linux using:

java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0 -i
10 -wi 8 -gc 1 -t 1

Original:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s

Patched:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s


Can you run your test with webrev.05 and see what you get ?

Regards, Peter








Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Mike Duigou
Looks pretty good. I like the additional comments as well.

Could you document the return conditions of resize()?

A "since we're there already" suggestion for readObject:

if (size < 0)
  throw new InvalidObjectException("Illegal mappings count: " + size);

Mike

On Jul 8 2014, at 13:07 , Martin Buchholz marti...@google.com wrote:

 I updated my webrev and it is again feature-complete.
 http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/
 (old webrev at
 http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/
 )
 
 This incorporates Peter's idea of making resize return a boolean, keeps the
 map unchanged if resize throws, moves the check for capacity exceeded into
 resize, and minimizes bytecode in put().  I'm happy with this (except for
 degraded behavior near MAX_CAPACITY).
 
 
 
 
 On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart peter.lev...@gmail.com wrote:
 
 On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:
 
 I took your latest version of the patch and modified it a little:
 
 http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/
 
 
 But isn't it post-insert-resize vs pre-insert-resize problem Doug
 mentioned above?
 I've tested a similar fix and it showed slow down of the put() operation.
 
 Hi Ivan,
 
 Might be that it has to do with # of bytecodes in the method and in-lining
 threshold. I modified it once more, to make put() method as short as
 possible:
 
 http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/
 
 With this, I ran the following JMH benchmark:
 
 @State(Scope.Thread)
 public class IHMBench {
 
MapObject, Object map = new IdentityHashMapObject, Object();
 
@Benchmark
public void putNewObject(Blackhole bh) {
Object o = new Object();
bh.consume(map.put(o, o));
if (map.size()  10) {
map = new IdentityHashMapObject, Object();
}
}
 }
 
 I get the following results on my i7/Linux using:
 
 java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0 -i 10 -wi 8
 -gc 1 -t 1
 
 Original:
 
 Benchmark Mode   SamplesScore  Score error
 Units
 j.t.IHMBench.putNewObjectthrpt10 13088296.198 403446.449
 ops/s
 
 Patched:
 
 Benchmark Mode   SamplesScore  Score error
 Units
 j.t.IHMBench.putNewObjectthrpt10 13180594.537 282047.154
 ops/s
 
 
 Can you run your test with webrev.05 and see what you get ?
 
 Regards, Peter
 
 



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov


On 09.07.2014 1:44, Peter Levart wrote:


On 07/08/2014 11:39 PM, Ivan Gerasimov wrote:

Might be worth to add modCount++ before this line:

  487 table = newTable;
  488 return true;

Not quite, I think. The map has just been resized, but it's contents 
has not changed yet logically.


IdentityHashMapIterator's methods assume that if modCount didn't change, 
then the indices calculated earlier remain valid, and this is wrong in 
the case of resize.


Sincerely yours,
Ivan



Regards, Peter


On 09.07.2014 0:07, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/

(old webrev at
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity.0/

)

This incorporates Peter's idea of making resize return a boolean, 
keeps the map unchanged if resize throws, moves the check for 
capacity exceeded into resize, and minimizes bytecode in put().  I'm 
happy with this (except for degraded behavior near MAX_CAPACITY).





On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart peter.lev...@gmail.com 
mailto:peter.lev...@gmail.com wrote:


On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

I took your latest version of the patch and modified it
a little:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.01/


But isn't it post-insert-resize vs pre-insert-resize problem
Doug mentioned above?
I've tested a similar fix and it showed slow down of the
put() operation.

Hi Ivan,

Might be that it has to do with # of bytecodes in the method and
in-lining threshold. I modified it once more, to make put()
method as short as possible:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/
http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.05/

With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

MapObject, Object map = new IdentityHashMapObject, Object();

@Benchmark
public void putNewObject(Blackhole bh) {
Object o = new Object();
bh.consume(map.put(o, o));
if (map.size()  10) {
map = new IdentityHashMapObject, Object();
}
}
}

I get the following results on my i7/Linux using:

java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0
-i 10 -wi 8 -gc 1 -t 1

Original:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s

Patched:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s


Can you run your test with webrev.05 and see what you get ?

Regards, Peter










Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Peter Levart


On 07/09/2014 12:06 AM, Ivan Gerasimov wrote:


On 09.07.2014 1:44, Peter Levart wrote:


On 07/08/2014 11:39 PM, Ivan Gerasimov wrote:

Might be worth to add modCount++ before this line:

  487 table = newTable;
  488 return true;

Not quite, I think. The map has just been resized, but it's contents 
has not changed yet logically.


IdentityHashMapIterator's methods assume that if modCount didn't 
change, then the indices calculated earlier remain valid, and this is 
wrong in the case of resize.


Sincerely yours,
Ivan


IdentityHashMapIterator:

 713 private abstract class IdentityHashMapIterator<T> implements Iterator<T> {
 714 int index = (size != 0 ? 0 : table.length); // current slot.
 715 int expectedModCount = modCount; // to support fast-fail
 716 int lastReturnedIndex = -1;  // to allow remove()
 717 boolean indexValid; // To avoid unnecessary next computation
 718 Object[] traversalTable = table; // reference to main table or copy


...takes a snap-shot of reference to current table when created, so 
indexes would still be valid ...


...but resize() also clears old table as it copies elements to new table:

 478 oldTable[j] = null;
 479 oldTable[j+1] = null;


So it would appear that modCount should be incremented even before the 
copying loop.


But as it seems, no user call-backs are possible during resize() in the same 
thread, normal writes are not ordered anyway for other threads, and 
after each successful resize() at least one new key will be added to 
the map, so modCount will be incremented before control is returned to 
user code anyway.
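
As a user-level illustration (not from the webrevs) of the best-effort
fail-fast contract that modCount exists to support:

import java.util.ConcurrentModificationException;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<Object, Object> map = new IdentityHashMap<Object, Object>();
        for (int i = 0; i < 4; i++) {
            Object o = new Object();
            map.put(o, o);
        }
        Iterator<Object> it = map.keySet().iterator();
        it.next();
        Object o = new Object();
        map.put(o, o);     // structural modification: modCount is bumped here
        try {
            it.next();     // the stale iterator notices the modCount mismatch
        } catch (ConcurrentModificationException expected) {
            System.out.println("fail-fast triggered as expected");
        }
    }
}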


Regards, Peter





Regards, Peter


On 09.07.2014 0:07, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/

(old webrev at
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity.0/

)

This incorporates Peter's idea of making resize return a boolean, 
keeps the map unchanged if resize throws, moves the check for 
capacity exceeded into resize, and minimizes bytecode in put(). 
 I'm happy with this (except for degraded behavior near MAX_CAPACITY).





On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart 
peter.lev...@gmail.com mailto:peter.lev...@gmail.com wrote:


On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

I took your latest version of the patch and modified it
a little:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.01/


But isn't it post-insert-resize vs pre-insert-resize
problem Doug mentioned above?
I've tested a similar fix and it showed slow down of the
put() operation.

Hi Ivan,

Might be that it has to do with # of bytecodes in the method
and in-lining threshold. I modified it once more, to make put()
method as short as possible:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/
http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.05/

With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

MapObject, Object map = new IdentityHashMapObject,
Object();

@Benchmark
public void putNewObject(Blackhole bh) {
Object o = new Object();
bh.consume(map.put(o, o));
if (map.size()  10) {
map = new IdentityHashMapObject, Object();
}
}
}

I get the following results on my i7/Linux using:

java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0
-i 10 -wi 8 -gc 1 -t 1

Original:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s

Patched:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s


Can you run your test with webrev.05 and see what you get ?

Regards, Peter












Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov

Ah, yes, sure.
I overlooked the reference to the table :-)

On 09.07.2014 2:42, Peter Levart wrote:


On 07/09/2014 12:06 AM, Ivan Gerasimov wrote:


On 09.07.2014 1:44, Peter Levart wrote:


On 07/08/2014 11:39 PM, Ivan Gerasimov wrote:

Might be worth to add modCount++ before this line:

  487 table = newTable;
  488 return true;

Not quite, I think. The map has just been resized, but it's contents 
has not changed yet logically.


IdentityHashMapIterator's methods assume that if modCount didn't 
change, then the indices calculated earlier remain valid, and this is 
wrong in the case of resize.


Sincerely yours,
Ivan


IdentityHashMapIterator:

  713 private abstract class IdentityHashMapIteratorT implements 
IteratorT {
  714 int index = (size != 0 ? 0 : table.length); // current slot.
  715 int expectedModCount = modCount; // to support fast-fail
  716 int lastReturnedIndex = -1;  // to allow remove()
  717 boolean indexValid; // To avoid unnecessary next computation
  718 Object[] traversalTable = table; // reference to main table or 
copy

...takes a snap-shot of reference to current table when created, so 
indexes would still be valid ...


...but resize() also clears old table as it copies elements to new table:

  478 oldTable[j] = null;
  479 oldTable[j+1] = null;

So it would appear that modCount should be incremented even before the 
copying loop.


But as it seems, no user call-backs are possible during resize() in 
same thread and normal writes are not ordered anyway for other threads 
and after each successful resize() at least one new key will be 
added to the map so modCount will be incremented before control is 
returned to user code anyway.


Regards, Peter





Regards, Peter


On 09.07.2014 0:07, Martin Buchholz wrote:

I updated my webrev and it is again feature-complete.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/

(old webrev at
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity.0/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity.0/

)

This incorporates Peter's idea of making resize return a boolean, 
keeps the map unchanged if resize throws, moves the check for 
capacity exceeded into resize, and minimizes bytecode in put(). 
 I'm happy with this (except for degraded behavior near MAX_CAPACITY).





On Tue, Jul 8, 2014 at 8:06 AM, Peter Levart 
peter.lev...@gmail.com mailto:peter.lev...@gmail.com wrote:


On 07/08/2014 03:00 PM, Ivan Gerasimov wrote:

I took your latest version of the patch and modified
it a little:


http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.01/

http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.01/


But isn't it post-insert-resize vs pre-insert-resize
problem Doug mentioned above?
I've tested a similar fix and it showed slow down of the
put() operation.

Hi Ivan,

Might be that it has to do with # of bytecodes in the method
and in-lining threshold. I modified it once more, to make
put() method as short as possible:

http://cr.openjdk.java.net/~plevart/jdk9-dev/IdentityHashMap/webrev.05/
http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/IdentityHashMap/webrev.05/

With this, I ran the following JMH benchmark:

@State(Scope.Thread)
public class IHMBench {

MapObject, Object map = new IdentityHashMapObject,
Object();

@Benchmark
public void putNewObject(Blackhole bh) {
Object o = new Object();
bh.consume(map.put(o, o));
if (map.size()  10) {
map = new IdentityHashMapObject, Object();
}
}
}

I get the following results on my i7/Linux using:

java -Xmx4G -Xms4G -XX:+UseParallelGC -jar benchmarks.jar -f 0
-i 10 -wi 8 -gc 1 -t 1

Original:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13088296.198    403446.449  ops/s

Patched:

Benchmark                  Mode  Samples         Score   Score error  Units
j.t.IHMBench.putNewObject  thrpt      10  13180594.537    282047.154  ops/s


Can you run your test with webrev.05 and see what you get ?

Regards, Peter














Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov


On 09.07.2014 3:20, Martin Buchholz wrote:

I agree with Peter that we do not need to increment modCount on resize.

My latest webrev is again done.

Ivan, feel free to take over.


Yes, thanks! The fix is much much better now.

I think I see yet another very minor glitch, though.
If the table is already of size 2 * MAXIMUM_CAPACITY, and we're merging 
in another map with more elements via putAll(), resize() will be 
called and it will throw, even though all the elements could fit into 
the map due to the key duplicates.


To avoid this the following check should be done first in resize():

 469 if (oldLength >= newLength)
 470 return false;
 


Sincerely yours,
Ivan


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov

Here's the same final webrev by Martin.
I only shortened the comment before capacity() and moved the check in 
resize() up.


http://cr.openjdk.java.net/~igerasim/6904367/4/webrev/

Sincerely yours,
Ivan

On 09.07.2014 4:25, Ivan Gerasimov wrote:


On 09.07.2014 3:20, Martin Buchholz wrote:

I agree with Peter that we do not need to increment modCount on resize.

My latest webrev is again done.

Ivan, feel free to take over.


Yes, thanks! The fix is much much better now.

I think I see yet another very minor glitch, though.
If the table is already of size 2 * MAXIMUM_CAPACITY, and we're 
merging in another map, which has more elements with putAll(), 
resize() will be called and it will throw, even though all the 
elements could fit into the map due to the key duplicates.


To avoid this the following check should be done first in resize():

 469 if (oldLength >= newLength)
 470 return false;


Sincerely yours,
Ivan






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-08 Thread Ivan Gerasimov


On 09.07.2014 4:46, Martin Buchholz wrote:
Let me understand - you're worried that when size is MAX_CAPACITY - 1, 
then a call to putAll that does not actually add any elements might 
throw?  Well, I'm having a hard time considering that corner case a 
bug, given how unusable the map is at this point.


Your suggested fix of returning immediately in case of no resize would 
cause put to succeed and reach size == MAX_CAPACITY, which you were 
trying to prevent.



Ah, yes.
It's almost 5 AM here, I guess I need to have some sleep before trying 
to fix anything else :-)




On Tue, Jul 8, 2014 at 5:25 PM, Ivan Gerasimov 
ivan.gerasi...@oracle.com mailto:ivan.gerasi...@oracle.com wrote:



On 09.07.2014 3:20, Martin Buchholz wrote:

I agree with Peter that we do not need to increment modCount on
resize.

My latest webrev is again done.

Ivan, feel free to take over.


Yes, thanks! The fix is much much better now.

I think I see yet another very minor glitch, though.
If the table is already of size 2 * MAXIMUM_CAPACITY, and we're
merging in another map, which has more elements with putAll(),
resize() will be called and it will throw, even though all the
elements could fit into the map due to the key duplicates.

To avoid this the following check should be done first in resize():

  469 if (oldLength >= newLength)
  470 return false;
  


Sincerely yours,
Ivan






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-07 Thread Ivan Gerasimov


On 08.07.2014 4:47, Martin Buchholz wrote:

I think this code has an off-by-factor-of-2 bug.

+if (expectedMaxSize > MAXIMUM_CAPACITY / 3)
+return MAXIMUM_CAPACITY;



No, even though it looks like a bug.

(MAXIMUM_CAPACITY / 3) * (3 / 2) == MAXIMUM_CAPACITY / 2.

If the expected size > MAXIMUM_CAPACITY / 3,
then the minimum capacity must be > MAXIMUM_CAPACITY / 2,
so the minimum capacity, rounded up to a power of two, == MAXIMUM_CAPACITY.
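
A quick boundary check of that claim, with MAXIMUM_CAPACITY = 1 << 29 as in
IdentityHashMap and Integer.highestOneBit(3*s) as the rounding step used in
the earlier webrevs (an illustrative sketch only):

public class CapacityBoundary {
    public static void main(String[] args) {
        final int MAXIMUM_CAPACITY = 1 << 29;
        int s = MAXIMUM_CAPACITY / 3 + 1;          // smallest size taking the early return
        // 3 * s just crosses MAXIMUM_CAPACITY, so the power of two covering 3*s/2
        // is already MAXIMUM_CAPACITY itself:
        System.out.println(Integer.highestOneBit(s + (s << 1)) == MAXIMUM_CAPACITY); // true
    }
}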




Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-07 Thread Ivan Gerasimov


Thank you Martin for the enhancement!

It's a good idea to get rid of the threshold variable!
When the table grows large, it will start trying to resize the table on 
every single put().
Though it shouldn't matter much, as in such a situation everything is 
already slow.



On 08.07.2014 4:44, Martin Buchholz wrote:


On Mon, Jul 7, 2014 at 9:41 AM, Martin Buchholz marti...@google.com 
mailto:marti...@google.com wrote:


I'd like to offer an alternative version of this change.  webrev
coming soon.


Here's the promised webrev.
http://cr.openjdk.java.net/~martin/webrevs/openjdk9/IdentityHashMap-capacity/ 
http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/IdentityHashMap-capacity/


- Fixes a typo in the javadoc.
- removes the threshold field (WAT, a cache to avoid multiplying by 3?)
- optimizes 3*x into x + x << 1


Unfortunately, x + (x << 1) causes the same overflow bug as 3*x:
(int)(1431655766 + (1431655766 << 1)) == 2

this can happen in capacity() and, theoretically, in putAll().

I propose to leave the check
 if (expectedMaxSize > MAXIMUM_CAPACITY / 3)
 return MAXIMUM_CAPACITY;
which is overflow resistant.

In putAll, I guess, something like this can be used:
int len = table.length;
if (n > len || n > len - n || n > len - (n << 1))
resize(capacity(n));
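
A tiny standalone demo of the wrap-around being described, with
MAXIMUM_CAPACITY = 1 << 29 as in IdentityHashMap (illustration only):

public class OverflowDemo {

    static final int MAXIMUM_CAPACITY = 1 << 29;

    public static void main(String[] args) {
        int n = 1431655766;                            // the value quoted above
        System.out.println(3 * n);                     // 2 -- the multiplication wraps around
        System.out.println(n + (n << 1));              // 2 -- the shift form wraps the same way
        System.out.println(n > MAXIMUM_CAPACITY / 3);  // true -- the overflow-resistant guard catches it
    }
}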



- adds more test assertions
- removes the unusual 4/3 slack space for expansion on deserialization.
TODO: the test should be testng-ified, I think.






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-06 Thread Jeff Hain



So, I reverted this change to the original code, but added a single 
check of the current table size before doing any modifications to the map.
This is to address the issue #4, and this doesn't seem to introduce any 
significant penalty.

Would you please help review the updated webrev:

http://cr.openjdk.java.net/~igerasim/6904367/3/webrev/

Sincerely yours,

Ivan

It's possible to avoid the capacity test for the general case:

put {
    ...while loop...

    if (size < threshold) {
        modCount++;
        tab[i] = k;
        tab[i + 1] = value;
        ++size;
    } else {
        putAndResize(k, value, i);
    }
    return null;
}

private void putAndResize(Object k, Object value, int i) {
    if (size == MAXIMUM_CAPACITY - 1)
        throw new IllegalStateException("Capacity exhausted.");
    modCount++;
    Object[] tab = table;
    tab[i] = k;
    tab[i + 1] = value;
    ++size;
    resize(tab.length); // len == 2 * current capacity.
}

It seems faster more often than not, but it adds more (and copied) code:
not sure it's worth it.

-Jeff


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-06 Thread Ivan Gerasimov

Thanks for suggestion Jeff!

I've tried it, but it doesn't seem to make a difference on my machine.
Here are the numbers. I measured the time of a single put in nanoseconds.


---         | vanilla | fix1  | fix2
client      | 8292    | 8220  | 8304
server      | 8197    | 8198  | 8221
interpreter | 13980   | 14478 | 14152

fix1 - the fix from webrev #3
fix2 - the fix with your suggestion

Sincerely yours,
Ivan


On 06.07.2014 15:13, Jeff Hain wrote:



So, I reverted this change to the original code, but added a single
check of the current table size before doing any modifications to the 
map.

This is to address the issue #4, and this doesn't seem to introduce any
significant penalty.

Would you please help review the updated webrev:

http://cr.openjdk.java.net/~igerasim/6904367/3/webrev/ 
http://cr.openjdk.java.net/%7Eigerasim/6904367/3/webrev/


Sincerely yours,

Ivan

It's possible to avoid the capacity test for the general case:
put {
...while loop...

if (size  threshold) {
modCount++;
tab[i] = k;
tab[i + 1] = value;
++size;
} else {
putAndResize(k, value, i);
}
return null;
}
private void putAndResize(Object k, Object value, int i) {
if (size == MAXIMUM_CAPACITY - 1)
throw new IllegalStateException(Capacity exhausted.);
modCount++;
Object[] tab = table;
tab[i] = k;
tab[i + 1] = value;
++size;
resize(tab.length); // len == 2 * current capacity.
}

It seems faster more often than not, but it adds more (and copied) code:
not sure it's worth it.

-Jeff






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-05 Thread Ivan Gerasimov

Thank you Doug for your comment!

On 04.07.2014 23:38, Doug Lea wrote:

On 07/04/2014 01:33 PM, Ivan Gerasimov wrote:

On 04.07.2014 8:14, David Holmes wrote:

Hi Ivan,

I find the change to capacity somewhat obfuscating and I can't see 
what the

actual bug was.

The bug was in rounding in the expression minCapacity = (3 * 
expectedMaxSize)/2.
Suppose, we want the expected size to be 11, then minCapacity becomes 
(int)(11 *

3 / 2) == 16.
The threshold is calculated later to be (int)(16 * 2 / 3) == 10, 
which is less

than expected size 11.


So the bug report was based on someone noticing that with
odd-valued sizes, the undocumented 2/3 target load was
rounded down, not up, which they didn't like?


Not quite.

There are two small issues that I'm trying to address here:
1) The wrong capacity is calculated based on the expected size due to 
rounding error. This is only noticeable for the expected sizes 11, 43, 
171, 683 and a few others. This leads to a redundant table resize when 
inserting the expected number of elements.
2) The wrong capacity due to overflow error. For numbers N larger or 
equal to 1431655765, the expression (int)(N * 3) / 2 produces 
non-negative small numbers. Frankly speaking, this is also true for my 
last proposed fix, so the updated webrev will follow.
3) In put() the table gets resized as soon as the threshold is reached. 
This is wrong for expected sizes 2, 3, 5, 10, 21, 42, and some others.

4) If put() was unsuccessful, the map gets corrupt.



I don't object to rounding it up and modernizing the size
calculations to use highestOneBit, but it's a weak pretense :-)



Simply rounding the minimum capacity up as minCapacity = (3 * 
expectedMaxSize + 1)/2 would produce the same result, of course.

I just took a chance to replace the while loop with a single function call.

In the last webrev I also modified the check to address the issue #2 
from the list above.




To address the issue I combined the division by 2 with the rounding 
up to the

nearest power of two.
I also took a chance to replace a while-loop with a single call to the
highestOneBit method, which calculates exactly what we need here.

The recursive call to put after a resize seems very sub-optimal as 
you will
re-search the map for the non-existent key. Can you not just 
recompute the

correct indices and do the store?


The initial rationale for post-insert-resize was to avoid these issues
given the need to preserve at least one empty slot. Which I
expect to remain faster than pre-resize even with your modified patch.
Please provide systematic performance comparisons before committing!

-Doug 


Yes, you are right.
Microbenchmark showed more than 20% slowdown of a single put operation.
So, I reverted this change to the original code, but added a single 
check of the current table size before doing any modifications to the map.
This is to address the issue #4, and this doesn't seem to introduce any 
significant penalty.


Would you please help review the updated webrev:

http://cr.openjdk.java.net/~igerasim/6904367/3/webrev/

Sincerely yours,
Ivan



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-04 Thread Ivan Gerasimov

Thanks David for looking at the change!

On 04.07.2014 8:14, David Holmes wrote:

Hi Ivan,

I find the change to capacity somewhat obfuscating and I can't see 
what the actual bug was.


The bug was in rounding in the expression minCapacity = (3 * 
expectedMaxSize)/2.
Suppose, we want the expected size to be 11, then minCapacity becomes 
(int)(11 * 3 / 2) == 16.
The threshold is calculated later to be (int)(16 * 2 / 3) == 10, which 
is less than expected size 11.


To address the issue I combined the division by 2 with the rounding up 
to the nearest power of two.
I also took a chance to replace a while-loop with a single call to the 
highestOneBit method, which calculates exactly what we need here.
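
Put together, the computation would look roughly like this (a sketch only;
the MINIMUM_CAPACITY/MAXIMUM_CAPACITY bounds follow the existing
IdentityHashMap constants, and the exact clause layout is not the verbatim
webrev text):

final class CapacitySketch {

    private static final int MINIMUM_CAPACITY = 4;
    private static final int MAXIMUM_CAPACITY = 1 << 29;

    // Smallest power of two (within the bounds) whose 2/3 threshold
    // is at least expectedMaxSize.
    static int capacity(int expectedMaxSize) {
        return (expectedMaxSize > MAXIMUM_CAPACITY / 3) ? MAXIMUM_CAPACITY :
               (expectedMaxSize <= 2 * MINIMUM_CAPACITY / 3) ? MINIMUM_CAPACITY :
               Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));
    }

    public static void main(String[] args) {
        System.out.println(capacity(11)); // 32 -- threshold 21 >= 11, no premature resize
        System.out.println(capacity(10)); // 16 -- threshold 10 >= 10
    }
}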


The recursive call to put after a resize seems very sub-optimal as you 
will re-search the map for the non-existent key. Can you not just 
recompute the correct indices and do the store?



Okay, I replaced the recursive call with the index recomputation.

As was suggested by Jeff, I added a comment to the MAXIMUM_CAPACITY 
constant.


I also reverted some part of the changes to the resize() function, as it 
turned out that the new code failed to deal correctly with the table of 
maximum capacity.


I wanted to add a check of this behavior to the regression test, but 
faced an obstacle: I used reflection to make the MAXIMUM_CAPACITY 
constant smaller, so that it wouldn't be necessary to insert 2**29 elements to 
fill the table up. The debugger was showing that the constant was 
changed, but the old value was still used in the conditions.


Currently I have commented this test out. If anyone could suggest how I 
can properly change MAXIMUM_CAPACITY in this situation, it would be very 
much appreciated.


Here is the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/2/webrev/

Would you please help review it?

Sincerely yours,
Ivan.

Alternatively use the original code and if an exception occurs on 
resize then simply undo the put. I'm always worried that inverting 
the behaviour like you will have some unexpected side effect somewhere 
- this code has been like this for a very, very long time.


Cheers,
David

On 4/07/2014 5:38 AM, Ivan Gerasimov wrote:

Thank you Jeff!


On 03.07.2014 23:07, Jeff Hain wrote:


Hi.


WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/
http://cr.openjdk.java.net/%7Eigerasim/6904367/0/webrev/

The while loop in put(...) supposes that there is at least one free
slot,
which was the case before (that could be added as comment).

Now if you try to go past max capacity, instead of getting an
IllegalStateException,
you get an infinite loop.


Yes, you're right!


It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1 << 8
(=256, needs to be larger than DEFAULT_CAPACITY, else it can be
ignored).
With the previous version you get IllegalStateException on trying to add
the 255th mapping
(on the resize that occurs when just-after-put-size is 255 >=
threshold = 255),
and with the new version you get an infinite loop on trying to add the 257th
mapping.

Solutions I see would be adding a throw if size >= MAX_CAPACITY
before the loop,


This would break the case when an existing key is added to the already
full table.
Moreover, a full table without a single empty slot could cause
infinite looping in get(), remove(), etc.


or not adding that overhead and instead throwing when
size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY.



This seems appropriate.
With the current implementation we can only put MAX_CAPACITY-1 elements
anyway.


I would also rather use == over >= for these tests, to underline
the fact
that size is not supposed to be larger, but there might be some
reasons not to.


I think it makes the code a bit more robust with respect to
future changes.

Here's the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/

Sincerely yours,
Ivan




-Jeff












Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-04 Thread Doug Lea

On 07/04/2014 01:33 PM, Ivan Gerasimov wrote:

On 04.07.2014 8:14, David Holmes wrote:

Hi Ivan,

I find the change to capacity somewhat obfuscating and I can't see what the
actual bug was.


The bug was the rounding in the expression minCapacity = (3 * expectedMaxSize)/2.
Suppose we want the expected size to be 11; then minCapacity becomes (int)(11 *
3 / 2) == 16.
The threshold is later calculated as (int)(16 * 2 / 3) == 10, which is less
than the expected size of 11.


So the bug report was based on someone noticing that with
odd-valued sizes, the undocumented 2/3 target load was
rounded down, not up, which they didn't like?

I don't object to rounding it up and modernizing the size
calculations to use highestOneBit, but it's a weak pretense :-)



To address the issue I combined the division by 2 with the rounding up to the
nearest power of two.
I also took the opportunity to replace a while-loop with a single call to the
highestOneBit method, which calculates exactly what we need here.


The recursive call to put after a resize seems very sub-optimal as you will
re-search the map for the non-existent key. Can you not just recompute the
correct indices and do the store?


The initial rationale for post-insert-resize was to avoid these issues
given the need to preserve at least one empty slot, which I
expect to remain faster than pre-resize even with your modified patch.
Please provide systematic performance comparisons before committing!

-Doug



Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-04 Thread Jeff Hain


Here is the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/2/webrev/

There is another bug with the initial implementation:
if put throws, the element is still put (and the size incremented accordingly),
and if you then put a new mapping after the one that threw, the table gets full
and get()/put()/remove() become subject to an infinite loop.
(And my idea that size was not supposed to go past MAX_CAPACITY was false in
that case :-)

With the new implementation, if put throws, the map is not modified
(not even modCount) so it doesn't get broken.

For performance, maybe isolating the code inside the if into an
int resizeAndPut(Object k, Object item, int len, int i) {...}
method would help, by reducing put()'s bytecode size.
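
Something along these lines (a generic illustration of the idea with
made-up names, not IdentityHashMap code): keep the hot path of put()
short and push the rare grow-and-store work into its own method, so
that the hot method stays cheap for the JIT to inline.

    final class GrowableStore {
        private Object[] slots = new Object[8];
        private int size;

        void put(Object v) {
            if (size == slots.length) {
                resizeAndPut(v);      // rare path, factored out of the hot method
                return;
            }
            slots[size++] = v;        // hot path stays short
        }

        private void resizeAndPut(Object v) {
            slots = java.util.Arrays.copyOf(slots, slots.length * 2);
            slots[size++] = v;
        }
    }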

Having put() find the last free slots in an almost-full table
would most often require a lot of time, so I wonder if
anyone would have hit any of these infinite loops in practice.


-Jeff


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-04 Thread David Holmes

Ivan,

Fields like MAXIMUM_CAPACITY are compile-time constants. Their values
are placed directly into the code that uses them - there is no
associated getstatic to load them. Hence changing the field via reflection
has no effect on any existing code that references the field.
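
A tiny illustration of that rule (class and field names made up for
the example): a static final int initialized from a constant
expression is inlined by javac at every use site, so a reflective
change to the field cannot reach code that was compiled against it.

    public class ConstantFoldingDemo {
        static final int LIMIT = 1 << 8;   // compile-time constant: 256

        static boolean atLimit(int size) {
            return size >= LIMIT;          // compiled against the literal 256; no field read here
        }

        public static void main(String[] args) throws Exception {
            java.lang.reflect.Field f =
                ConstantFoldingDemo.class.getDeclaredField("LIMIT");
            System.out.println(f.getInt(null));  // the field itself holds 256
            // Even if the field's value were forced to something else with the
            // old "strip the final modifier" reflection hack, atLimit() would
            // still compare against 256, because that literal is baked into its
            // bytecode.
            System.out.println(atLimit(300));    // true, decided by the inlined 256
        }
    }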


I'm still uncomfortable about the scope of the changes - and Doug 
rightly mentions doing detailed performance runs before submitting this. 
I still think recovering from a post-put resize failure might be better 
than resizing first.


David

On 5/07/2014 3:33 AM, Ivan Gerasimov wrote:

Thanks David for looking at the change!

On 04.07.2014 8:14, David Holmes wrote:

Hi Ivan,

I find the change to capacity somewhat obfuscating and I can't see
what the actual bug was.


The bug was the rounding in the expression minCapacity = (3 *
expectedMaxSize)/2.
Suppose we want the expected size to be 11; then minCapacity becomes
(int)(11 * 3 / 2) == 16.
The threshold is later calculated as (int)(16 * 2 / 3) == 10, which
is less than the expected size of 11.

To address the issue I combined the division by 2 with the rounding up
to the nearest power of two.
I also took the opportunity to replace a while-loop with a single call
to the highestOneBit method, which calculates exactly what we need here.


The recursive call to put after a resize seems very sub-optimal as you
will re-search the map for the non-existent key. Can you not just
recompute the correct indices and do the store?


Okay, I replaced the recursive call with the index recomputation.

As was suggested by Jeff, I added a comment to the MAXIMUM_CAPACITY
constant.

I also reverted part of the changes to the resize() function, as it
turned out that the new code failed to deal correctly with a table at
maximum capacity.

I wanted to add a check of this behavior to the regression test, but
faced an obstacle: I used reflection to make the MAXIMUM_CAPACITY
constant smaller, so that it wouldn't be necessary to insert 2**29
elements to fill the table up. The debugger showed that the constant
had changed, but the old value was still used in the conditions.

For now I have commented this test out. If anyone could suggest how I
can properly change MAXIMUM_CAPACITY in this situation, it would be
much appreciated.

Here is the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/2/webrev/

Would you please help review it?

Sincerely yours,
Ivan.


Alternatively, use the original code and, if an exception occurs on
resize, simply undo the put. I'm always worried that inverting
the behaviour like this will have some unexpected side effect somewhere
- this code has been like this for a very, very long time.

Cheers,
David

On 4/07/2014 5:38 AM, Ivan Gerasimov wrote:

Thank you Jeff!


On 03.07.2014 23:07, Jeff Hain wrote:


Hi.


WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/

The while loop in put(...) supposes that there is at least one free
slot,
which was the case before (that could be added as comment).

Now if you try to go past max capacity, instead of getting an
IllegalStateException,
you get an infinite loop.


Yes, you're right!


It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1<<8
(=256; it needs to be larger than DEFAULT_CAPACITY, else it can be
ignored).
With the previous version you get an IllegalStateException on trying
to add the 255th mapping
(on the resize that occurs when the just-after-put size is 255 >=
threshold = 255),
and with the new version you get an infinite loop on trying to add the
257th mapping.

Solutions I see would be adding a throw if size >= MAX_CAPACITY
before the loop,


This would break the case when an existing key is added to the already
full table.
Moreover, a full table without a single empty slot could cause
infinite looping in get(), remove(), etc.


or not adding that overhead and instead throwing when
size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY.



This seems appropriate.
With the current implementation we can only put MAX_CAPACITY-1 elements
anyway.


I would also rather use == over >= for these tests, to underline
the fact
that size is not supposed to be larger, but there might be some
reasons not to.


I think it makes the code a bit more robust with respect to
future changes.

Here's the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/

Sincerely yours,
Ivan




-Jeff












Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-03 Thread Martin Buchholz
Martin's law of expanding capacity:

Always grow by using the form

newCapacity = oldCapacity + oldCapacity >> n

for some suitable constant n.  This will be efficient and more overflow
resistant than the alternative

newCapacity = oldCapacity * (2**n + 1) / (2**n)

Here n == 1.
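
As a sketch (names invented for the example; parentheses added, since
in Java '+' binds tighter than '>>'):

    static int grownCapacity(int oldCapacity, int n) {
        // n == 1 grows by 1.5x, n == 0 doubles; the add-of-a-shift form
        // avoids the overflow-prone intermediate of oldCapacity * (2**n + 1).
        return oldCapacity + (oldCapacity >> n);
    }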


On Thu, Jul 3, 2014 at 9:00 AM, Ivan Gerasimov ivan.gerasi...@oracle.com
wrote:

 Hello!

 IdentityHashMap has a couple of small issues.
 The first one is a performance issue: if you create a map, specifying 42
 as the expected number of elements, you'll be able to insert only 41
 elements into the preallocated table. Inserting the 42nd element will
 trigger resizing of the storage.

 Another issue is that resizing occurs only after the element insertion.
 In the extreme case it can lead to the situation when the element is
 successfully inserted and then an exception is thrown due to the map
 being unable to grow.

 Would you please help review the fix?

 BUGURL: https://bugs.openjdk.java.net/browse/JDK-6904367
 WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/

 Sincerely yours,
 Ivan




Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-03 Thread Ivan Gerasimov

Thanks Martin!

On 03.07.2014 21:12, Martin Buchholz wrote:

Martin's law of expanding capacity:

Always grow by using the form

newCapacity = oldCapacity + oldCapacity >> n

for some suitable constant n.  This will be efficient and more 
overflow resistant than the alternative


newCapacity = oldCapacity * (2**n + 1) / (2**n)

Here n == 1.



More precisely, here n == 0, because the capacity gets doubled when
increased.


The formula cap = expSize * 3 / 2 is used to estimate the capacity based
on the expected number of items to be inserted.
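(That estimate matches the 2/3 load threshold: threshold = cap * 2 / 3
= (expSize * 3 / 2) * 2 / 3 = expSize, so, apart from integer rounding,
a table sized at 1.5x the expected size reaches its threshold exactly
at the expected number of mappings.)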


Sincerely yours,
Ivan



On Thu, Jul 3, 2014 at 9:00 AM, Ivan Gerasimov
ivan.gerasi...@oracle.com wrote:


Hello!

IdentityHashMap has a couple of small issues.
The first one is a performance issue: if you create a map,
specifying 42 as the expected number of elements, you'll be able to
insert only 41 elements into the preallocated table. Inserting the
42nd element will trigger resizing of the storage.

Another issue is that resizing occurs only after the element
insertion.
In the extreme case it can lead to the situation when the element
is successfully inserted and then an exception is thrown due to the
map being unable to grow.

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-6904367
WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/

Sincerely yours,
Ivan






Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-03 Thread Jeff Hain


Hi.




WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/
The while loop in put(...) supposes that there is at least one free slot,
which was the case before (that could be added as comment).

Now if you try to go past max capacity, instead of getting an 
IllegalStateException,
you get an infinite loop.

It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1<<8
(=256; it needs to be larger than DEFAULT_CAPACITY, else it can be ignored).
With the previous version you get an IllegalStateException on trying to add
the 255th mapping
(on the resize that occurs when the just-after-put size is 255 >= threshold = 255),
and with the new version you get an infinite loop on trying to add the 257th mapping.
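
The shape of that test is roughly the following (a sketch only: to
actually hit the limit you need a private copy of the class with
MAX_CAPACITY lowered to 1<<8; with the stock IdentityHashMap you would
need on the order of 2**29 insertions):

    import java.util.IdentityHashMap;
    import java.util.Map;

    public class CapacityOverflowRepro {
        public static void main(String[] args) {
            // Swap in a copy of IdentityHashMap with MAX_CAPACITY = 1<<8 to
            // reproduce; the stock class is used here only so the sketch compiles.
            Map<Object, Object> map = new IdentityHashMap<>();
            for (int i = 0; i < 1000; i++) {
                Object k = new Object();
                // Old code: IllegalStateException around the 255th put.
                // Webrev 0: the 257th put never returns (infinite probe loop).
                map.put(k, k);
            }
            System.out.println("done: " + map.size());
        }
    }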

Solutions I see would be adding a throw if size >= MAX_CAPACITY
before the loop, or not adding that overhead and instead throwing when
size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY.

I would also rather use == over >= for these tests, to underline the fact
that size is not supposed to be larger, but there might be some reasons not to.


-Jeff


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-03 Thread Ivan Gerasimov

Thank you Jeff!


On 03.07.2014 23:07, Jeff Hain wrote:


Hi.


WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/ 


The while loop in put(...) supposes that there is at least one free 
slot,

which was the case before (that could be added as comment).

Now if you try to go past max capacity, instead of getting an 
IllegalStateException,

you get an infinite loop.


Yes, you're right!


It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1<<8
(=256; it needs to be larger than DEFAULT_CAPACITY, else it can be
ignored).
With the previous version you get an IllegalStateException on trying to add
the 255th mapping
(on the resize that occurs when the just-after-put size is 255 >=
threshold = 255),
and with the new version you get an infinite loop on trying to add the 257th mapping.

Solutions I see would be adding a throw if size >= MAX_CAPACITY
before the loop,

This would break the case when an existing key is added to the already
full table.
Moreover, a full table without a single empty slot could cause
infinite looping in get(), remove(), etc.



or not adding that overhead and instead throwing when
size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY.



This seems appropriate.
With the current implementation we can only put MAX_CAPACITY-1 elements 
anyway.


I would also rather use == over >= for these tests, to underline
the fact
that size is not supposed to be larger, but there might be some
reasons not to.


I think it makes the code a bit more robust with respect to
future changes.


Here's the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/

Sincerely yours,
Ivan




-Jeff







Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-03 Thread Jeff Hain


http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/

My test now terminates (exception on MAX_CAPACITY-th put).

Maybe where MAX_CAPACITY is defined you could indicate
that you can actually have at most MAX_CAPACITY-1 mappings,
to ensure that the table always contains some null key slot
(I don't see this invariant made explicit already).

-Jeff


Re: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size

2014-07-03 Thread David Holmes

Hi Ivan,

I find the change to capacity somewhat obfuscating and I can't see what 
the actual bug was.


The recursive call to put after a resize seems very sub-optimal as you 
will re-search the map for the non-existent key. Can you not just 
recompute the correct indices and do the store?


Alternatively, use the original code and, if an exception occurs on resize,
simply undo the put. I'm always worried that inverting the
behaviour like this will have some unexpected side effect somewhere -
this code has been like this for a very, very long time.


Cheers,
David

On 4/07/2014 5:38 AM, Ivan Gerasimov wrote:

Thank you Jeff!


On 03.07.2014 23:07, Jeff Hain wrote:


Hi.


WEBREV: http://cr.openjdk.java.net/~igerasim/6904367/0/webrev/

The while loop in put(...) supposes that there is at least one free
slot,
which was the case before (that could be added as comment).

Now if you try to go past max capacity, instead of getting an
IllegalStateException,
you get an infinite loop.


Yes, you're right!


It's easy to test with a loop of 1000 puts and MAX_CAPACITY = 1<<8
(=256; it needs to be larger than DEFAULT_CAPACITY, else it can be
ignored).
With the previous version you get an IllegalStateException on trying
to add the 255th mapping
(on the resize that occurs when the just-after-put size is 255 >=
threshold = 255),
and with the new version you get an infinite loop on trying to add the
257th mapping.

Solutions I see would be adding a throw if size >= MAX_CAPACITY
before the loop,


This would break the case when an existing key is added to the already
full table.
Moreover, a full table without a single empty slot could cause
infinite looping in get(), remove(), etc.


or not adding that overhead and instead throwing when
size >= MAX_CAPACITY-1 instead of when size >= MAX_CAPACITY.



This seems appropriate.
With the current implementation we can only put MAX_CAPACITY-1 elements
anyway.


I would also rather use == over >= for these tests, to underline
the fact
that size is not supposed to be larger, but there might be some
reasons not to.


I think it makes the code a bit more robust with respect to
future changes.

Here's the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/1/webrev/

Sincerely yours,
Ivan




-Jeff