Re: RFR: 8285398: Cache the results of constraint checks

2022-04-26 Thread Daniel Jeliński
On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:

> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
> portion of time is spent in HandshakeContext initialization, specifically in 
> DisabledAlgorithmConstraints class.
> 
> There are only a few instances of that class, and they are immutable. Caching 
> the results should be a low-risk operation.
> 
> The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
> can be removed from memory after a period of inactivity. Under normal 
> circumstances the cache holds no more than 100 algorithms.

Thanks for reviewing!

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Xue-Lei Andrew Fan
On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:

> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
> portion of time is spent in HandshakeContext initialization, specifically in 
> DisabledAlgorithmConstraints class.
> 
> There are only a few instances of that class, and they are immutable. Caching 
> the results should be a low-risk operation.
> 
> The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
> can be removed from memory after a period of inactivity. Under normal 
> circumstances the cache holds no more than 100 algorithms.

Marked as reviewed by xuelei (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Xue-Lei Andrew Fan
On Mon, 25 Apr 2022 16:04:22 GMT, Anthony Scarpino  
wrote:

> It also shows that more caching probably would help further.

+1.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Xue-Lei Andrew Fan
On Mon, 25 Apr 2022 17:22:57 GMT, Daniel Jeliński  wrote:

>>> With all the above in mind I decided not to use `sun.security.util.Cache` 
>>> here
>> 
>> I was not meant to use Cache and timeout for this update.
>> 
>> SoftReference and  this patch should work good in larger memory boxes.  
>> Performance improvement sometimes is a trade-off game between memory and 
>> CPU. Did you have a chance to check the performance in the limited memory 
>> circumstance?  like '-mx64M".
>
> I run a few tests:
> `-Xmx16m`, with this patch, still better than the original version:
> 
> Benchmark (resume)  (tlsVersion)   Mode  Cnt Score 
> Error  Units
> SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  4477.444 ± 
> 375.975  ops/s
> SSLHandshake.doHandshake  true   TLS  thrpt5   681.471 ±  
> 72.531  ops/s
> SSLHandshake.doHandshake false   TLSv1.2  thrpt5   335.366 ±  
> 89.056  ops/s
> SSLHandshake.doHandshake false   TLS  thrpt5   308.711 ±  
> 90.134  ops/s
> 
> 
> `-Xmx8m`, before patch:
> 
> Benchmark (resume)  (tlsVersion)   Mode  CntScore
> Error  Units
> SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  153.760 ± 
> 12.025  ops/s
> SSLHandshake.doHandshake  true   TLS  thrpt5   70.700 ±  
> 7.506  ops/s
> SSLHandshake.doHandshake false   TLSv1.2  thrpt5   67.459 ±  
> 4.325  ops/s
> SSLHandshake.doHandshake false   TLS  thrpt5   64.695 ±  
> 1.647  ops/s
> 
> after:
> 
> Benchmark (resume)  (tlsVersion)   Mode  CntScore
> Error  Units
> SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  557.324 ± 
> 50.269  ops/s
> SSLHandshake.doHandshake  true   TLS  thrpt5  155.258 ± 
> 13.866  ops/s
> SSLHandshake.doHandshake false   TLSv1.2  thrpt5  181.755 ± 
> 29.319  ops/s
> SSLHandshake.doHandshake false   TLS  thrpt5  120.627 ± 
> 25.832  ops/s
> 
> Much slower, but still faster with the patch.
> With `-Xmx4m` the test failed with OOM.

Thanks for the additional testing.  The improvement is impressive.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Daniel Jeliński
On Mon, 25 Apr 2022 14:33:13 GMT, Xue-Lei Andrew Fan  wrote:

>> `SoftReference`s are guaranteed to survive one GC after use; beyond that 
>> their lifespan is determined by `SoftRefLRUPolicyMSPerMB` and the amount of 
>> memory available.
>
>> With all the above in mind I decided not to use `sun.security.util.Cache` 
>> here
> 
> I was not meant to use Cache and timeout for this update.
> 
> SoftReference and  this patch should work good in larger memory boxes.  
> Performance improvement sometimes is a trade-off game between memory and CPU. 
> Did you have a chance to check the performance in the limited memory 
> circumstance?  like '-mx64M".

I run a few tests:
`-Xmx16m`, with this patch, still better than the original version:

Benchmark (resume)  (tlsVersion)   Mode  Cnt Score 
Error  Units
SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  4477.444 ± 
375.975  ops/s
SSLHandshake.doHandshake  true   TLS  thrpt5   681.471 ±  
72.531  ops/s
SSLHandshake.doHandshake false   TLSv1.2  thrpt5   335.366 ±  
89.056  ops/s
SSLHandshake.doHandshake false   TLS  thrpt5   308.711 ±  
90.134  ops/s


`-Xmx8m`, before patch:

Benchmark (resume)  (tlsVersion)   Mode  CntScoreError  
Units
SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  153.760 ± 12.025  
ops/s
SSLHandshake.doHandshake  true   TLS  thrpt5   70.700 ±  7.506  
ops/s
SSLHandshake.doHandshake false   TLSv1.2  thrpt5   67.459 ±  4.325  
ops/s
SSLHandshake.doHandshake false   TLS  thrpt5   64.695 ±  1.647  
ops/s

after:

Benchmark (resume)  (tlsVersion)   Mode  CntScoreError  
Units
SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  557.324 ± 50.269  
ops/s
SSLHandshake.doHandshake  true   TLS  thrpt5  155.258 ± 13.866  
ops/s
SSLHandshake.doHandshake false   TLSv1.2  thrpt5  181.755 ± 29.319  
ops/s
SSLHandshake.doHandshake false   TLS  thrpt5  120.627 ± 25.832  
ops/s

Much slower, but still faster with the patch.
With `-Xmx4m` the test failed with OOM.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Sean Coffey
On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:

> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
> portion of time is spent in HandshakeContext initialization, specifically in 
> DisabledAlgorithmConstraints class.
> 
> There are only a few instances of that class, and they are immutable. Caching 
> the results should be a low-risk operation.
> 
> The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
> can be removed from memory after a period of inactivity. Under normal 
> circumstances the cache holds no more than 100 algorithms.

Another nice performance boost in this area. Looks good to me.

-

Marked as reviewed by coffeys (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Anthony Scarpino
On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:

> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
> portion of time is spent in HandshakeContext initialization, specifically in 
> DisabledAlgorithmConstraints class.
> 
> There are only a few instances of that class, and they are immutable. Caching 
> the results should be a low-risk operation.
> 
> The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
> can be removed from memory after a period of inactivity. Under normal 
> circumstances the cache holds no more than 100 algorithms.

I'm good with this idea.  It's not caching the whole list like I struggled 
with, but it's a good solution for the algorithm name based constraints.  It 
also shows that more caching probably would help further.
I'll let Xuelei finish his review comments

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Xue-Lei Andrew Fan
On Mon, 25 Apr 2022 13:59:44 GMT, Daniel Jeliński  wrote:

>> FWIW: I wouldn't expect `SoftReference` (as opposed to `WeakReference`) to 
>> be eagerly cleaned.
>
> `SoftReference`s are guaranteed to survive one GC after use; beyond that 
> their lifespan is determined by `SoftRefLRUPolicyMSPerMB` and the amount of 
> memory available.

> With all the above in mind I decided not to use `sun.security.util.Cache` here

I was not meant to use Cache and timeout for this update.

SoftReference and  this patch should work good in larger memory boxes.  
Performance improvement sometimes is a trade-off game between memory and CPU. 
Did you have a chance to check the performance in the limited memory 
circumstance?  like '-mx64M".

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Daniel Jeliński
On Mon, 25 Apr 2022 13:22:34 GMT, Daniel Fuchs  wrote:

>> Right, soft references are likely to be cleaned if they are not used in an 
>> entire GC cycle.
>> Using a soft reference for each map entry would not help here; note that all 
>> keys and all values in this map are GC roots (keys are enum names, values 
>> are `TRUE` and `FALSE`), so in practice they would never be collected.
>> Even if we made the entries collectable, it would not help with the TLS 
>> case; SSLEngine & SSLSocket only check the constraints once at the beginning 
>> of the handshake, so they would all expire at the same time anyway. What's 
>> even worse, we would then have to clear the expired entries from the map.
>> 
>> I also considered limiting the size of the map. That's a tricky subject; we 
>> would need to get the size limit exactly right. Given that the algorithms 
>> are always checked in the same order, if the cache were too small, we would 
>> get zero hits.
>> With all the above in mind I decided not to use `sun.security.util.Cache` 
>> here.
>
> FWIW: I wouldn't expect `SoftReference` (as opposed to `WeakReference`) to be 
> eagerly cleaned.

`SoftReference`s are guaranteed to survive one GC after use; beyond that their 
lifespan is determined by `SoftRefLRUPolicyMSPerMB` and the amount of memory 
available.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Daniel Fuchs
On Mon, 25 Apr 2022 06:46:20 GMT, Daniel Jeliński  wrote:

>> src/java.base/share/classes/sun/security/util/DisabledAlgorithmConstraints.java
>>  line 105:
>> 
>>> 103: private final Set disabledAlgorithms;
>>> 104: private final Constraints algorithmConstraints;
>>> 105: private volatile SoftReference> cacheRef =
>> 
>> It looks like a one-clear-for-all mechanism.  Once the clearing happens, the 
>> full map will be collected.  For SoftReferences, it clears them fairly 
>> eagerly.  Maybe, the performance could be further improved in practice by 
>> using soft reference for each map entry (See sun.security.util.Cache code 
>> for an example). 
>> 
>> I will have another look next week.
>
> Right, soft references are likely to be cleaned if they are not used in an 
> entire GC cycle.
> Using a soft reference for each map entry would not help here; note that all 
> keys and all values in this map are GC roots (keys are enum names, values are 
> `TRUE` and `FALSE`), so in practice they would never be collected.
> Even if we made the entries collectable, it would not help with the TLS case; 
> SSLEngine & SSLSocket only check the constraints once at the beginning of the 
> handshake, so they would all expire at the same time anyway. What's even 
> worse, we would then have to clear the expired entries from the map.
> 
> I also considered limiting the size of the map. That's a tricky subject; we 
> would need to get the size limit exactly right. Given that the algorithms are 
> always checked in the same order, if the cache were too small, we would get 
> zero hits.
> With all the above in mind I decided not to use `sun.security.util.Cache` 
> here.

FWIW: I wouldn't expect `SoftReference` (as opposed to `WeakReference`) to be 
eagerly cleaned.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-25 Thread Daniel Jeliński
On Sat, 23 Apr 2022 14:57:01 GMT, Xue-Lei Andrew Fan  wrote:

>> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
>> portion of time is spent in HandshakeContext initialization, specifically in 
>> DisabledAlgorithmConstraints class.
>> 
>> There are only a few instances of that class, and they are immutable. 
>> Caching the results should be a low-risk operation.
>> 
>> The cache is implemented as a softly reachable ConcurrentHashMap; this way 
>> it can be removed from memory after a period of inactivity. Under normal 
>> circumstances the cache holds no more than 100 algorithms.
>
> src/java.base/share/classes/sun/security/util/DisabledAlgorithmConstraints.java
>  line 105:
> 
>> 103: private final Set disabledAlgorithms;
>> 104: private final Constraints algorithmConstraints;
>> 105: private volatile SoftReference> cacheRef =
> 
> It looks like a one-clear-for-all mechanism.  Once the clearing happens, the 
> full map will be collected.  For SoftReferences, it clears them fairly 
> eagerly.  Maybe, the performance could be further improved in practice by 
> using soft reference for each map entry (See sun.security.util.Cache code for 
> an example). 
> 
> I will have another look next week.

Right, soft references are likely to be cleaned if they are not used in an 
entire GC cycle.
Using a soft reference for each map entry would not help here; note that all 
keys and all values in this map are GC roots (keys are enum names, values are 
`TRUE` and `FALSE`), so in practice they would never be collected.
Even if we made the entries collectable, it would not help with the TLS case; 
SSLEngine & SSLSocket only check the constraints once at the beginning of the 
handshake, so they would all expire at the same time anyway. What's even worse, 
we would then have to clear the expired entries from the map.

I also considered limiting the size of the map. That's a tricky subject; we 
would need to get the size limit exactly right. Given that the algorithms are 
always checked in the same order, if the cache were too small, we would get 
zero hits.
With all the above in mind I decided not to use `sun.security.util.Cache` here.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-23 Thread Xue-Lei Andrew Fan
On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:

> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
> portion of time is spent in HandshakeContext initialization, specifically in 
> DisabledAlgorithmConstraints class.
> 
> There are only a few instances of that class, and they are immutable. Caching 
> the results should be a low-risk operation.
> 
> The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
> can be removed from memory after a period of inactivity. Under normal 
> circumstances the cache holds no more than 100 algorithms.

src/java.base/share/classes/sun/security/util/DisabledAlgorithmConstraints.java 
line 105:

> 103: private final Set disabledAlgorithms;
> 104: private final Constraints algorithmConstraints;
> 105: private volatile SoftReference> cacheRef =

It looks like a one-clear-for-all mechanism.  Once the clearing happens, the 
full map will be collected.  For SoftReferences, it clears them fairly eagerly. 
 Maybe, the performance could be further improved in practice by using soft 
reference for each map entry (See sun.security.util.Cache code for an example). 

I will have another look next week.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-22 Thread Daniel Jeliński
On Fri, 22 Apr 2022 02:13:14 GMT, David Schlosnagle  
wrote:

>> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
>> portion of time is spent in HandshakeContext initialization, specifically in 
>> DisabledAlgorithmConstraints class.
>> 
>> There are only a few instances of that class, and they are immutable. 
>> Caching the results should be a low-risk operation.
>> 
>> The cache is implemented as a softly reachable ConcurrentHashMap; this way 
>> it can be removed from memory after a period of inactivity. Under normal 
>> circumstances the cache holds no more than 100 algorithms.
>
> src/java.base/share/classes/sun/security/util/DisabledAlgorithmConstraints.java
>  line 969:
> 
>> 967: result = checkAlgorithm(disabledAlgorithms, algorithm, 
>> decomposer);
>> 968: cache.put(algorithm, result);
>> 969: return result;
> 
> Would it be worth using `cache.computeIfAbsent` or do you want to avoid 
> lambda allocation overhead and potentially blocking concurrent handshakes on 
> writer thread?
> 
> Suggestion:
> 
> return cache.computeIfAbsent(algorithm, algo -> 
> checkAlgorithm(disabledAlgorithms, algo, decomposer));

I generally prefer using `get` over `computeIfAbsent` when optimizing for 
performance. While `computeIfAbsent` is more concise, it's also much slower 
than `get` when the entry is already present.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-21 Thread David Schlosnagle
On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:

> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
> portion of time is spent in HandshakeContext initialization, specifically in 
> DisabledAlgorithmConstraints class.
> 
> There are only a few instances of that class, and they are immutable. Caching 
> the results should be a low-risk operation.
> 
> The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
> can be removed from memory after a period of inactivity. Under normal 
> circumstances the cache holds no more than 100 algorithms.

src/java.base/share/classes/sun/security/util/DisabledAlgorithmConstraints.java 
line 969:

> 967: result = checkAlgorithm(disabledAlgorithms, algorithm, 
> decomposer);
> 968: cache.put(algorithm, result);
> 969: return result;

Would it be worth using `cache.computeIfAbsent` or do you want to avoid lambda 
allocation overhead and potentially blocking concurrent handshakes on writer 
thread?

Suggestion:

return cache.computeIfAbsent(algorithm, algo -> 
checkAlgorithm(disabledAlgorithms, algo, decomposer));

-

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-21 Thread Peter Firmstone

Nice.

On 22/04/2022 6:47 am, Daniel Jeliński wrote:

On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:


Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
portion of time is spent in HandshakeContext initialization, specifically in 
DisabledAlgorithmConstraints class.

There are only a few instances of that class, and they are immutable. Caching 
the results should be a low-risk operation.

The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
can be removed from memory after a period of inactivity. Under normal 
circumstances the cache holds no more than 100 algorithms.

before:

Benchmark (resume)  (tlsVersion)   Mode  Cnt Score  
Error  Units
SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  2165.081 ± 
440.204  ops/s
SSLHandshake.doHandshake  true   TLS  thrpt5   534.681 ± 
146.931  ops/s
SSLHandshake.doHandshake false   TLSv1.2  thrpt5   369.104 ±  
11.143  ops/s
SSLHandshake.doHandshake false   TLS  thrpt5   253.903 ±  
58.056  ops/s

after:

Benchmark (resume)  (tlsVersion)   Mode  Cnt  Score 
Error  Units
SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  10440.501 ± 
478.177  ops/s
SSLHandshake.doHandshake  true   TLS  thrpt5762.995 ±  
33.842  ops/s
SSLHandshake.doHandshake false   TLSv1.2  thrpt5440.471 ±  
52.867  ops/s
SSLHandshake.doHandshake false   TLS  thrpt5305.928 ±  
57.847  ops/s

After this patch the HandshakeContext initialization practically disappears 
from the CPU profile; it only takes ~5% in TLS1.2 resumption, and much less in 
the remaining scenarios.

-

PR: https://git.openjdk.java.net/jdk/pull/8349


--
Regards,
 
Peter Firmstone

0498 286 363
Zeus Project Services Pty Ltd.



RFR: 8285398: Cache the results of constraint checks

2022-04-21 Thread Daniel Jeliński
Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
portion of time is spent in HandshakeContext initialization, specifically in 
DisabledAlgorithmConstraints class.

There are only a few instances of that class, and they are immutable. Caching 
the results should be a low-risk operation.

The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
can be removed from memory after a period of inactivity. Under normal 
circumstances the cache holds no more than 100 algorithms.

-

Commit messages:
 - Implement constraint cache

Changes: https://git.openjdk.java.net/jdk/pull/8349/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk=8349=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8285398
  Stats: 26 lines in 1 file changed: 23 ins; 1 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8349.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8349/head:pull/8349

PR: https://git.openjdk.java.net/jdk/pull/8349


Re: RFR: 8285398: Cache the results of constraint checks

2022-04-21 Thread Daniel Jeliński
On Thu, 21 Apr 2022 19:58:39 GMT, Daniel Jeliński  wrote:

> Profiling the TLS handshakes using SSLHandshake benchmark shows that a large 
> portion of time is spent in HandshakeContext initialization, specifically in 
> DisabledAlgorithmConstraints class.
> 
> There are only a few instances of that class, and they are immutable. Caching 
> the results should be a low-risk operation.
> 
> The cache is implemented as a softly reachable ConcurrentHashMap; this way it 
> can be removed from memory after a period of inactivity. Under normal 
> circumstances the cache holds no more than 100 algorithms.

before:

Benchmark (resume)  (tlsVersion)   Mode  Cnt Score  
Error  Units
SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  2165.081 ± 
440.204  ops/s
SSLHandshake.doHandshake  true   TLS  thrpt5   534.681 ± 
146.931  ops/s
SSLHandshake.doHandshake false   TLSv1.2  thrpt5   369.104 ±  
11.143  ops/s
SSLHandshake.doHandshake false   TLS  thrpt5   253.903 ±  
58.056  ops/s

after:

Benchmark (resume)  (tlsVersion)   Mode  Cnt  Score 
Error  Units
SSLHandshake.doHandshake  true   TLSv1.2  thrpt5  10440.501 ± 
478.177  ops/s
SSLHandshake.doHandshake  true   TLS  thrpt5762.995 ±  
33.842  ops/s
SSLHandshake.doHandshake false   TLSv1.2  thrpt5440.471 ±  
52.867  ops/s
SSLHandshake.doHandshake false   TLS  thrpt5305.928 ±  
57.847  ops/s

After this patch the HandshakeContext initialization practically disappears 
from the CPU profile; it only takes ~5% in TLS1.2 resumption, and much less in 
the remaining scenarios.

-

PR: https://git.openjdk.java.net/jdk/pull/8349