Hi Eirik,

Thanks for thinking about this issue. First, I agree that some changes are probably useful, but we probably need to discuss this more, and maybe try to address the more pressing issues first. Some of my thoughts:

- I don't think the cache should be removed. I think it can improve performance (especially for certs), and is not an unreasonable thing for the JDK to try to help with.

- I do think there should be a *supported* way to turn each of the caches off individually. If an application has their own caching mechanism, then they should be able to use it instead of the JDKs.

- We have evidence from at least 2 customers that the CRL cache is the more problematic issue. I don't remember the history, but I think the "750" size may have been chosen arbitrarily to match the certificate cache. As a short-term fix, I would probably be ok with lowering this default value to something more reasonable, but I am not sure what that should be offhand.

- I think knobs to control the cache properties sound like they would be useful, but I also think many users would not know what to use as reasonable values, and may end up just turning off the cache and seeing if that helps.

- I'm a little surprised the SoftReferences still caused so many entries to be retained. A comment (which may be out of date) in sun.security.util.Cache.java says that they may be purged too eagerly, which would seem to have helped the issues that the customers reported. Some testing may be in order to see what the current behavior is with different VM settings.

--Sean

On 4/24/23 2:32 AM, Eirik Bjørsnøs wrote:
Hi,

When reaching out to the BouncyCastle community regarding the deprecated javax.security.cert APIs, I got some interesting feedback from Matti Aarnio of Methics Ltd:

    Long ago we did encounter problems with JRE's X509CertImpl.java
      class, and more so with X509CRLImpl.java. Both have internal
    caches of binary form input that is then converted to parsed forms.

    Especially the CRL cache did not (still does not?) have nice "that
    version is obsoleted, drop it to garbage collection" handling of
    parsed CRLs, while it had way too large cache size.
    A server running for months and getting a new CRL every 12 hours
    eventually grew the memory footprint of obsolete parsed data up to
    ridiculous amounts of gigabytes of heap footprint.

    Once we realized that also certificate parser had such a cache and
    that there is no official way to control that cache, we became very
much averse of using those APIs. It seems the lack of cache control of these static implementation caches can be problematic, especially in the CRL case. The default for both caches is a max number of entries of 750 with an unlimited lifetime. The inability of expiring outdated CRLs seems particularly problematic for Matti.

A similar issue was reported in JDK-8059007 back in 2014, with the following comment from Sean:

    We need to introduce some system properties (or something similar)
    to allow the crl/cert caches to be adjusted. Changing to RFE.


While it would certainly be nice to add some form of cache control, the solution space is large:

-1: Remove the cache. One could claim that the parser/factory level is the wrong level to introduce parsing. There are so many ways to tune parsing, there exists many excellent libraries. If apps need caching, they should take care of it at the app level.

0: Do nothing, there haven't been that many complaints

1: Add an option to entirely disable the cache such that apps can do their own caching (or not) without wasting space in the default cache.

2: Add two system properties to control max entries of the cert and crl cache. Setting these to 0 would disable the cache.

3: Add four system properties to control max entries and max lifetime for each of the cert and crl caches. Setting max entries to 0 would disable a cache.

4: Tune the default max entries and lifetime to make everyone happy.

5: Add a pluggable API to control caching. Think AbstractCacheControllingMaxEntrySizeProxyFactoryContext :-)

6: Something I did not consider.

-1 is probably not desired/possible because of the behavioural change. 1 is simple but very constrained. 2 and 3 are similar in shape where 3 allows more flexibility introducing lifetime. 4 seems like wishful thinking. You could probably guess I'm not a big fan of 5. 6 seems promising :-)

I have a sketch implementation of 3 available, but it would be great to gather some opinions before I put more work into this:

https://github.com/openjdk/jdk/compare/master...eirbjo:jdk:x509factory-cache-config 
<https://github.com/openjdk/jdk/compare/master...eirbjo:jdk:x509factory-cache-config>

Thanks,
Eirik.

[1] https://bugs.openjdk.org/browse/JDK-8059007 <https://bugs.openjdk.org/browse/JDK-8059007>



Reply via email to