FYI: it is probably better/simpler to review this via webrev.

After the migration from kcms to lcms, the performance of some operations 
regressed. One possible workaround is to split the operation across multiple 
threads. Unfortunately, we have too many bottlenecks that prevent effective 
multithreading. This is a request to remove/minimize such bottlenecks (at 
least some of them). It does not target single-threaded performance, which 
should be handled separately.

The main code pattern optimized here is this:
    activate();
    byte[] theHeader = getData(cmmProfile, icSigHead);
    ---->  CMSManager.getModule().getTagData(p, tagSignature);
Notes about the code above:

1. Before the change, the "activate()" method checked that the "cmmProfile" 
field was not null. After that we usually passed "cmmProfile" as a parameter 
to some other method. This involved two volatile reads, and also required 
checking whether "activate()" had to be called before each use of the 
"cmmProfile" field.
Solution: "activate()" is renamed to "cmmProfile()", which becomes an accessor 
for the field, so we get one volatile read and can easily monitor the 
usage of the field itself (it is used directly only in this method).
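
A minimal sketch of the accessor idea in note 1, not the actual ICC_Profile 
code. The class name "ProfileSketch", the "Supplier" loader and the 
"describe()" caller are hypothetical; only the "cmmProfile" field and 
"cmmProfile()" accessor names come from the description above. The value is 
copied into a local once, so the fast path performs one volatile read:

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a profile class; illustrative only.
final class ProfileSketch {
    // Lazily initialized handle; read directly only inside cmmProfile().
    private volatile Object cmmProfile;
    // Assumption: some loader that produces the handle on first use.
    private final Supplier<Object> loader;

    ProfileSketch(Supplier<Object> loader) {
        this.loader = loader;
    }

    // Accessor that replaces the "activate(); use(cmmProfile);" pair.
    Object cmmProfile() {
        Object p = cmmProfile;          // the only volatile read on the fast path
        if (p != null) {
            return p;
        }
        synchronized (this) {           // slow path, taken only until initialized
            p = cmmProfile;
            if (p == null) {
                p = loader.get();
                cmmProfile = p;
            }
            return p;
        }
    }

    // Example caller: uses the returned value instead of re-reading the field.
    String describe() {
        return "profile:" + cmmProfile();
    }
}
```

Because every caller goes through the accessor, there is no separate 
"did we call activate() first?" question left to audit.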

2. The synchronized static method "CMSManager.getModule()" is reimplemented 
to have only one volatile read.
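
One way this could look, as a sketch under assumptions rather than the real 
CMSManager: "ModuleHolderSketch", the "Module" interface and "createModule()" 
are hypothetical names. The synchronized part moves to a slow path that runs 
only until the module exists, so steady-state callers never take the monitor:

```java
// Hypothetical stand-in for a static module holder; illustrative only.
final class ModuleHolderSketch {
    // Placeholder for the real module type.
    interface Module {
        String name();
    }

    private static volatile Module module;

    // Fast path: one volatile read, no monitor.
    static Module getModule() {
        Module m = module;
        return m != null ? m : createModule();
    }

    // Slow path: synchronized so the module is created at most once.
    private static synchronized Module createModule() {
        if (module == null) {
            module = () -> "default";   // assumption: trivial module for the sketch
        }
        return module;
    }
}
```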

3. The locking in "getTagData()" is changed. Instead of synchronized 
instance methods, we now use a mix of "ConcurrentHashMap" and 
"StampedLock".
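
A rough sketch of how the two can be combined, not the code in this patch. 
"TagCacheSketch" and its "backing" map (standing in for the native profile 
data) are hypothetical; only "getTagData", "ConcurrentHashMap" and 
"StampedLock" come from the note above. Readers share the read lock and fill 
the cache concurrently; a writer takes the write lock, updates the data and 
drops the cached copies:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.StampedLock;

// Hypothetical tag store; illustrative only.
final class TagCacheSketch {
    // Assumption: stands in for the native profile, guarded by the lock.
    private final Map<Integer, byte[]> backing = new HashMap<>();
    // Cache of tags already read; safe for concurrent readers.
    private final ConcurrentHashMap<Integer, byte[]> cache = new ConcurrentHashMap<>();
    private final StampedLock lock = new StampedLock();

    // Many threads may read at once; they only exclude writers.
    byte[] getTagData(int tagSignature) {
        long stamp = lock.readLock();
        try {
            // A null result from the mapping function leaves the key unmapped.
            byte[] data = cache.computeIfAbsent(tagSignature, backing::get);
            return data == null ? null : data.clone();
        } finally {
            lock.unlockRead(stamp);
        }
    }

    // Writers are exclusive and invalidate the cache.
    void setTagData(int tagSignature, byte[] data) {
        long stamp = lock.writeLock();
        try {
            backing.put(tagSignature, data.clone());
            cache.clear();
        } finally {
            lock.unlockWrite(stamp);
        }
    }
}
```

The returned arrays are defensive copies, so callers cannot modify the 
cached data.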

See some comments inline.

Some numbers (smaller is better):

1. Performance of
    ((ICC_ProfileRGB) ICC_Profile.getInstance(ColorSpace.CS_sRGB)).getMatrix();

jdk 15.0.2
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetMatrix       avgt    5   19,624 ±    0,059  us/op
CMMPerf.testGetMatrix                  avgt    5    0,154 ±    0,001  us/op

jdk - before the fix
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetMatrix       avgt    5   12,935 ±    0,042  us/op
CMMPerf.testGetMatrix                  avgt    5    0,127 ±    0,007  us/op

jdk - after the fix
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetMatrix       avgt    5    0,561 ±    0,005  us/op
CMMPerf.testGetMatrix                  avgt    5    0,092 ±    0,001  us/op

2. Part of the performance gain in jdk17 comes from other fixes, for example 
the performance of
    ICC_Profile.getInstance(ColorSpace.CS_sRGB); and 
    ColorSpace.getInstance(ColorSpace.CS_sRGB);

jdk 15.0.2
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetSRGBProfile  avgt    5    2,299 ±    0,032  us/op
CMMPerf.ThreadsMAX.testGetSRGBSpace    avgt    5    2,210 ±    0,051  us/op
CMMPerf.testGetSRGBProfile             avgt    5    0,019 ±    0,001  us/op
CMMPerf.testGetSRGBSpace               avgt    5    0,018 ±    0,001  us/op

jdk - same before/after the fix
Benchmark                              Mode  Cnt    Score      Error  Units
CMMPerf.ThreadsMAX.testGetSRGBProfile  avgt    5    0,005 ±    0,001  us/op
CMMPerf.ThreadsMAX.testGetSRGBSpace    avgt    5    0,005 ±    0,001  us/op
CMMPerf.testGetSRGBProfile             avgt    5    0,005 ±    0,001  us/op
CMMPerf.testGetSRGBSpace               avgt    5    0,005 ±    0,001  us/op

Note: "ThreadsMAX" is 32 threads.

-------------

Commit messages:
 - Update ICC_Profile.java
 - Merge branch 'master' into threads
 - Update LCMS.c
 - Initial fix

Changes: https://git.openjdk.java.net/jdk/pull/2957/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2957&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8263482
  Stats: 212 lines in 5 files changed: 52 ins; 83 del; 77 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2957.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2957/head:pull/2957

PR: https://git.openjdk.java.net/jdk/pull/2957
