Hi David!

On 10/10/2014 05:22 PM, David Holmes wrote:
Hi Claes,

On 11/10/2014 1:10 AM, Claes Redestad wrote:
Hi all,

please review this patch which attempts to clean up synchronization and
improve scalability when defining and getting java.lang.Package objects.

Is this a real problem or a theoretical one?
Deadlocks in getPackage() have historically caused issues (https://bugs.openjdk.java.net/browse/JDK-7001933), so contention can happen. Still somewhat theoretical, though.
I've not previously heard of getting Package objects as being critical. ConcurrentHashMap improves the contended case but if we aren't normally contended then it will degrade performance. How does this perform on real code?
Experiments (like the trivial microbenchmark below) indicate this improves performance even in uncontended cases, which can probably be explained by the removal of explicit synchronization in favor of volatile reads inside CHM. I've run a number of larger benchmarks, mostly to ensure there are no unexpected regressions, but haven't found a case where this patch has a significant impact. There's also a small footprint win from merging two maps in java.lang.Package and from not saving ancestral entries in the ClassLoader packages map; this shows up in some footprint measures and might have more impact when scaling up to larger apps.
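For reference, the shape of the change, swapping a synchronized map lookup for a lock-free ConcurrentHashMap read, can be sketched roughly like this (names and the String stand-in for Package are illustrative, not the actual java.lang.Package code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the caching pattern discussed above; not the
// real implementation in java.lang.Package.
public class PackageCacheSketch {

    // Before: every lookup synchronizes on the map, even when uncontended.
    private static final Map<String, String> legacyCache = new HashMap<>();

    static String legacyGet(String name) {
        synchronized (legacyCache) {
            String pkg = legacyCache.get(name);
            if (pkg == null) {
                pkg = "Package[" + name + "]"; // stands in for defining a Package
                legacyCache.put(name, pkg);
            }
            return pkg;
        }
    }

    // After: a hit is a lock-free volatile read inside ConcurrentHashMap;
    // only the first (defining) access for a key may contend internally.
    private static final ConcurrentHashMap<String, String> cache =
            new ConcurrentHashMap<>();

    static String get(String name) {
        return cache.computeIfAbsent(name, n -> "Package[" + n + "]");
    }

    public static void main(String[] args) {
        System.out.println(legacyGet("java.util.zip"));
        System.out.println(get("java.util.zip"));
        // Repeated lookups return the same cached instance.
        System.out.println(get("java.util.zip") == get("java.util.zip"));
    }
}
```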

I'm also wondering how this impacts the initialization order of the VM as we need a lot more classes to be loaded and initialized when the first Package is created.
Since this is a concern, I've run a number of internal startup benchmarks but haven't found the changes to have any statistically significant startup impact in either direction. Since ConcurrentHashMap is already in use by any parallel-capable java.lang.ClassLoader, java.lang.Thread etc., it would seem it'll already be loaded and initialized early.

/Claes


Thanks,
David

webrev: http://cr.openjdk.java.net/~redestad/8060130/webrev.02/
bug: https://bugs.openjdk.java.net/browse/JDK-8060130

Testing: jtreg, UTE vm.parallel_class_loading.testlist, various benchmarks

Torturing the retrieval code with a simple microbenchmark[1] shows that
the existing code could cause bottlenecks, but also that the proposed
patch works slightly faster even in uncontended cases:

Benchmark                          Mode  Samples   Score  Score error  Units

baseline, 1 thread:
o.s.SimpleBench.getClassPackage   thrpt       10  11.621        0.618  ops/us
o.s.SimpleBench.getPackage        thrpt       10  41.754        3.381  ops/us
o.s.SimpleBench.getPackages       thrpt       10   0.009        0.000  ops/us

baseline, 2 threads:
o.s.SimpleBench.getClassPackage   thrpt       10   7.884        1.977  ops/us
o.s.SimpleBench.getPackage        thrpt       10  17.013        8.079  ops/us
o.s.SimpleBench.getPackages       thrpt       10   0.004        0.001  ops/us

patch applied, 1 thread:
o.s.SimpleBench.getClassPackage   thrpt       10  13.519        0.170  ops/us
o.s.SimpleBench.getPackage        thrpt       10  59.999        0.988  ops/us
o.s.SimpleBench.getPackages       thrpt       10   0.019        0.001  ops/us

patch applied, 2 threads:
o.s.SimpleBench.getClassPackage   thrpt       10  19.181        3.688  ops/us
o.s.SimpleBench.getPackage        thrpt       10  79.708       31.220  ops/us
o.s.SimpleBench.getPackages       thrpt       10   0.025        0.006  ops/us

/Claes

[1]
package org.sample;

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class SimpleBench {

     @Benchmark
     public Package[] getPackages() {
         return Package.getPackages();
     }

     @Benchmark
     public Package getClassPackage() {
         return this.getClass().getPackage();
     }

     @Benchmark
     public Package getPackage() {
         return Package.getPackage("java.util.zip");
     }
}

