Minor update (webrev 06): * Rebased to cece972575ac [1] (latest JDK revision today) * Compiler warnings fixed
* http://cr.openjdk.java.net/~mbalao/webrevs/6913047/6913047.webrev.06/ * http://cr.openjdk.java.net/~mbalao/webrevs/6913047/6913047.webrev.06.zip Kind regards, Martin.- -- [1] - http://hg.openjdk.java.net/jdk/jdk/rev/cece972575ac On Thu, Oct 12, 2017 at 10:00 AM, Martin Balao <mba...@redhat.com> wrote: > Webrev 04 uploaded to cr.openjdk.java.net: > > * http://cr.openjdk.java.net/~sgehwolf/webrevs/mbalaoal/ > JDK-6913047/webrev.04/ (browse online) > * http://cr.openjdk.java.net/~sgehwolf/webrevs/mbalaoal/ > JDK-6913047/webrev.04/6913047.webrev.04.zip (download) > > On Wed, Oct 11, 2017 at 10:31 AM, Martin Balao <mba...@redhat.com> wrote: > >> Hi, >> >> I'd like to propose a fix for bug JDK-6913047: "Long term memory leak >> when using PKCS11 and JCE exceeds 32 bit process address space" [1]. This >> fix does not contain changes in the GC and is SunPKCS11 internal only. >> >> PROBLEM >> ........................................................ >> >> When using the SunPKCS11 crypto provider (for cipher, signature, mac, key >> generation or any other operation), multiple key objects may be created. >> I.e.: every time a TLS session is established, a unique master key (derived >> from the pre-master-key) has to be created and then used for encryption and >> decryption operations. This is a legitimate use case in which key caching >> does not make sense as each key is unique per session. These keys are of >> P11Key type and have a corresponding native key object created. In the case >> of NSS SunPKCS11 backend (PKCS11 software token), this native key object is >> temporarily stored in the process native heap. The interface is simple: a >> JNI call is done to create a native key object (C_CreateObject, >> C_CopyObject, C_DeriveKey, C_GenerateKeys, etc., according to the PKCS11 >> interface) and an integer handler is kept in the Java side (P11Key). When >> the P11Key object is destroyed, a finalizer code is executed to free the >> native key object (through C_DestroyObject JNI call). The problem is that >> finalizer code execution happens only if the JVM garbage-collector cleans >> up the P11Key object. That may be delayed or not done at all, depending on >> different GC algorithms, parameters and environment conditions. As a >> result, the native heap may be exhausted with not freed native key objects, >> and the JVM will then crash -this is particularly true for 32 bits VMs >> where the virtual address space can be exhausted-. >> >> >> SCOPE >> ........................................................ >> >> The fix is proposed for SunPKCS11 with NSS backend only. Other PKCS11 >> backends are not currently under scope. It's likely that hardware PKCS11 >> backends store native key objects in their own memory, preventing a native >> heap exhaustion and a JVM crash. However, it might be possible to cause an >> exhaustion on their own memory blocking key objects creation at some point. >> In any case, this is speculative as no tests were done on our side with >> real hardware. >> >> >> SOLUTION >> ........................................................ >> >> Assuming that native keys are extractable, the idea is to hold native key >> data in the Java heap while keys are not in use. When a P11Key is created, >> every CK_ATTRIBUTE (PKCS11) value for the native key is queried, data >> stored in an opaque Java byte[] (inside the P11Key object) and native key >> destroyed. Every time the P11Key is about to be used, the native key is >> created with the stored data. After usage, the native key is again >> destroyed. Thus, it's not necessary to wait for a finalizer execution to >> cleanup native resources: cleanup is done at deterministic and >> previously-known points. This comes with a resposibility for key users >> -which are all SunPKCS11 internal services like P11Signature, P11Cipher, >> P11KeyGenerator, etc.-: create and destroy native keys through a reference >> counting scheme exposed by P11Key class. There are two kind of usages: >> >> 1) stateless: the native key is "atomically" created, used and >> destroyed. I.e.: MAC calculation, getEncodedInternal operation (on P11Key >> objects), signature operations, TLS key derivation, etc. >> >> 2) statefull: the native key is created, one or multiple intermediate >> actions are performed by the key user, a final action is performed and >> finally the native key is destroyed. I.e.: cipher operations. >> >> For keys that are extractable but sensitive (CKA_SENSITIVE attribute is >> true), as the case when operating in FIPS mode, wrapping/unwrapping is used >> as a workaround to extract session keys. Wrapper key is global and lives >> forever. >> >> There are no interface changes for SunPKCS11 external users. >> >> If keys are not extractable or the feature cannot be enabled for any >> other reason, the previous finalizer scheme is used as a fallback. >> >> >> ADDITIONAL IMPLEMENTATION NOTES >> ........................................................ >> >> When a P11Key is created, a constructor parameter exists to indicate if >> the feature is enabled for that specific key. For this feature to be >> enabled, 2 additional conditions apply: 1) SunPKCS11 backend has to be NSS, >> and 2) key has to be extractable. If the feature is not enabled for a key, >> behavior is as previous to this patch (native key destruction relies on >> finalizer execution). >> >> The only P11Key user that explicitly does not use this feature is >> P11KeyStore. This is because these keys (token keys) are managed by alias >> names and makes no sense to remove them from the key store (they need to be >> accessible by an alias at any time). >> >> Because P11Key objects can be used by multiple threads at a time, there >> is a thread-safe reference counting scheme in order to decide when a native >> key object has to be created or destroyed. The SunPKCS11 internal API to >> use a P11Key is as follows: 1) increment the reference counter (which will >> eventually create the native key object if it doesn't exist), 2) use the >> key and 3) decrement the reference counter (which will eventually destroy >> the native key if there it's not being used by anyone else). >> >> The reason why an opaque byte[] is used in P11Key objects to store native >> keys data (instead of a CK_ATTRIBUTE[] Java objects, queried by Java's >> C_GetAttributeValue function) is performance. My prototypes show a >> difference of 4x in speed. 2 functions were added to libj2pkcs11 library: >> getNativeKeyInfo (to extract the opaque byte[] from a native key object) >> and createNativeKey (to create a native key object from an opaque byte[]). >> >> >> CHANGESET >> ........................................................ >> >> This changeset is JDK-10 (at jdk c8796a577885 rev) based: >> >> * http://people.redhat.com/mbalaoal/webrevs/jdk_6913047_sunpkc >> s11_nss_memory_leak/2017_10_06/6913047.webrev.04/ (browse online) >> * http://people.redhat.com/mbalaoal/webrevs/jdk_6913047_sunpkc >> s11_nss_memory_leak/2017_10_06/6913047.webrev.04.zip (download) >> >> >> TESTING >> ........................................................ >> >> Test suite for 32 bits JVMs only: http://people.redhat.com/mbala >> oal/webrevs/jdk_6913047_sunpkcs11_nss_memory_leak/2017_10_ >> 06/Bug6913047.java >> >> * Suite (Bug6913047.java) >> * Tests JVM memory exhaustion while using keys for different services: >> P11Cipher, P11Signature, P11KeyAgreement, P11Mac, P11Digest, >> P11KeyGenerator, P11KeyFactory, etc. >> * Tests functional regression. >> * Including Key Stores (P11KeyStore) >> >> Parameters to run the reproducer (on JDK-10): >> * javac: --add-modules jdk.crypto.cryptoki --add-exports >> java.base/sun.security.internal.spec=ALL-UNNAMED >> * java: -XX:+UseParallelGC -Xmx3500m --add-modules jdk.crypto.cryptoki >> --add-opens java.base/javax.crypto=ALL-UNNAMED --add-opens >> jdk.crypto.cryptoki/sun.security.pkcs11=ALL-UNNAMED >> >> You can also use jtreg. >> >> >> PERFORMANCE >> ........................................................ >> >> For a quick reproducer previously developed (which looped 100000 times >> creating P11Cipher and P11Key objects to encrypt a plaintext), these are >> the figures I got: >> >> * real 1m11.328s (without fix) >> * real 1m12.795s (with fix) >> >> Performance penalty seems to be low in current state. >> >> OTHER >> ....................................................... >> >> My employer has an OCA agreement with Oracle and this work has been done >> in that context. >> >> Look forward to your comments. >> >> Kind regards, >> Martin.- >> >> -- >> [1] - https://bugs.openjdk.java.net/browse/JDK-6913047 >> > >