Hi David!
On 10/10/2014 05:22 PM, David Holmes wrote:
Hi Claes,
On 11/10/2014 1:10 AM, Claes Redestad wrote:
Hi all,
please review this patch which attempts to clean up synchronization and
improve scalability when defining and getting java.lang.Package objects.
Is this a real problem or a theoretical one?
Deadlocks in getPackage() have historically caused issues
(https://bugs.openjdk.java.net/browse/JDK-7001933), so contention can
happen. Still somewhat theoretical, though.
I've not previously heard of getting Package objects as being
critical. ConcurrentHashMap improves the contended case but if we
aren't normally contended then it will degrade performance. How does
this perform on real code?
Experiments (like the trivial microbenchmark below) indicate this
improves performance even in uncontended cases, which can probably be
explained by the removal of explicit synchronization in favor of
volatile reads inside CHM. I've run a number of larger benchmarks mostly
to ensure there are no unexpected regressions, but not found a case
where this patch has a significant impact. There's a small footprint win
in effect by merging two maps in java.lang.Package and not saving
ancestral entries in the ClassLoader packages map which shows up on some
footprint measures and might have more impact when scaling up to larger
apps.
I'm also wondering how this impacts the initialization order of the VM
as we need a lot more classes to be loaded and initialized when the
first Package is created.
Since this is a concern, I've ran a number of internal startup
benchmarks but not found the changes to have any statistical significant
startup impact in either direction. Since ConcurrentHashMap is already
in use by any parallel capable java.lang.ClassLoader, java.lang.Thread
etc, it would seem it'll already be loaded and initialized early.
/Claes
Thanks,
David
webrev: http://cr.openjdk.java.net/~redestad/8060130/webrev.02/
bug: https://bugs.openjdk.java.net/browse/JDK-8060130
Testing: jtreg, UTE vm.parallel_class_loading.testlist, various
benchmarks
Torturing the retrieval code with a simple microbenchmark[1] shows that
the existing code
could cause bottlenecks, but also that the proposed patch works slightly
faster even in
uncontended cases:
Benchmark Mode Samples Score Score error
Units
baseline, 1 thread:
o.s.SimpleBench.getClassPackage thrpt 10 11.621 0.618
ops/us
o.s.SimpleBench.getPackage thrpt 10 41.754 3.381
ops/us
o.s.SimpleBench.getPackages thrpt 10 0.009 0.000
ops/us
baseline, 2 threads:
o.s.SimpleBench.getClassPackage thrpt 10 7.884 1.977
ops/us
o.s.SimpleBench.getPackage thrpt 10 17.013 8.079
ops/us
o.s.SimpleBench.getPackages thrpt 10 0.004 0.001
ops/us
patch applied, 1 thread:
o.s.SimpleBench.getClassPackage thrpt 10 13.519 0.170
ops/us
o.s.SimpleBench.getPackage thrpt 10 59.999 0.988
ops/us
o.s.SimpleBench.getPackages thrpt 10 0.019 0.001
ops/us
patch applied, 2 threads:
o.s.SimpleBench.getClassPackage thrpt 10 19.181 3.688
ops/us
o.s.SimpleBench.getPackage thrpt 10 79.708 31.220
ops/us
o.s.SimpleBench.getPackages thrpt 10 0.025 0.006
ops/us
/Claes
[1]
package org.sample;
import org.openjdk.jmh.annotations.*;
@State(Scope.Thread)
public class SimpleBench {
@Benchmark
public Package[] getPackages() {
return Package.getPackages();
}
@Benchmark
public Package getClassPackage() {
return this.getClass().getPackage();
}
@Benchmark
public Package getPackage() {
return Package.getPackage("java.util.zip");
}
}