On Tue, 6 Sep 2022 20:25:13 GMT, Valerie Peng <[email protected]> wrote:
>> There is a race condition in JDK's SessionManager, which can lead to random
>> exceptions.
>>
>> **Exception:**
>>
>> javax.net.ssl.SSLException: Internal error: close session with active objects
>> at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:133)
>> at
>> java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:371)
>> at
>> java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:314)
>> at
>> java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:309)
>> at
>> java.base/sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1707)
>> at
>> java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1080)
>> at
>> java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:971)
>> at SSLSocketServer.serverLoop(SSLSocketServer.java:133)
>> at SSLSocketServer$1.run(SSLSocketServer.java:75)
>> at java.base/java.lang.Thread.run(Thread.java:833)
>> Caused by: java.security.ProviderException: Internal error: close session
>> with active objects
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.Session.close(Session.java:127)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.Session.close(Session.java:114)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.SessionManager.closeSession(SessionManager.java:237)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.SessionManager$Pool.release(SessionManager.java:270)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.SessionManager.demoteObjSession(SessionManager.java:210)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.Session.removeObject(Session.java:101)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.SessionKeyRef.updateNativeKey(P11Key.java:1396)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.SessionKeyRef.removeNativeKey(P11Key.java:1377)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.NativeKeyHolder.releaseKeyID(P11Key.java:1329)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.P11Key.releaseKeyID(P11Key.java:156)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.reset(P11AEADCipher.java:529)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.ensureInitialized(P11AEADCipher.java:436)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.implDoFinal(P11AEADCipher.java:732)
>> at
>> jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.engineDoFinal(P11AEADCipher.java:624)
>> at java.base/javax.crypto.Cipher.doFinal(Cipher.java:2500)
>> at
>> java.base/sun.security.ssl.SSLCipher$T12GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1659)
>> at
>> java.base/sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:260)
>> at
>> java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:181)
>> at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)
>> at
>> java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1508)
>> at
>> java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1479)
>> at
>> java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1064)
>> ... 4 more
>>
>>
>> **Reproducibility:**
>> I started getting this exception quite reliably on JDK17 on my machine with
>> one particular test setup using ssl-tests testsuite. Unfortunately setup
>> itself needed some RH specific patches and also ability to reproduce depends
>> on other factors such number keys in keystore, machine where testing was
>> performed... I tried a bit to create some reproducer, but I couldn't find a
>> way to easily reproduce this issue :(
>>
>> **Problem:**
>> SunPKCS11 provider does session pooling. This is done in SessionManager [1]
>> (one per SunPKCS11 provider). Released sessions are kept by SessionManager
>> for a while, for reuse (in limited number). This however is a bit
>> complicated as some sessions can own objects (e.g. keys). So there are
>> actually 2 pools. One for sessions with objects ("objSessions") and one for
>> sessions without objects ("opSessions"). This is because sessions without
>> objects, which are not being used, can be safely closed (SessionManager only
>> keeps around limited amount of these), while sessions with objects cannot be
>> safely closed (until all objects are removed from them). Session manager has
>> methods for getting Session for given purpose (object creation or just doing
>> other operations), prioritizing appropriate pool. Each session has counter
>> (called "createdObjects") to track how many objects it owns. When session is
>> being returned to pool this counter is checked and session is placed to
>> appropriate pool. Also wh
en counter for some Session in "objSessions" pool reaches zero it is moved
("demoted") to "opSessions" pool.
>>
>> And here comes complicated part. As far as I understand it,
>> Session.addObject() [2] (which increases "createdObjects" counter) is always
>> being called by thread "holding" session which owns the created object.
>> (That is: thread gets a session, uses it to create an object and calls
>> Session.addObject() on that session to increase the counter, before
>> returning the session to pool. See e.g.: [3]) However this is not true for
>> Session.removeObject() [4]. (That is: thread gets session, which is not
>> necessary the same one owning object being removed, performs object removal,
>> but then calls Session.removeObject() on session which owned that object.
>> See e.g.: [5]) That is Session.removeObject() can be called on Session which
>> is in "objSessions" pool or which is being used be other thread. (object
>> removal can happen as result of releasing key, either explicitly or as
>> result of GC etc..).
>>
>> And finally, there is a problem in code handling object removal from a
>> session. Session.removeObject() [4] first checks if "createdObjects" counter
>> reached zero. If so, it calls SessionManager.demoteObjSession(this) [6],
>> which attempts to remove Session from objSessions pool, if session is
>> successfully removed from there, meaning no other thread "holds" this
>> session, session is put to opSessions pool, if not (meaning other thread
>> "holds" it), method just returns, since that other thread puts this session
>> to appropriate pool, when it is done with it by calling
>> SessionManager.releaseSession(session).
>>
>> There is race condition here. Consider following scenario:
>>
>> // Thread T1 runs:
>> Session.removeObject() // [4]
>> createdObjects.decrementAndGet() // returns zero
>>
>> // Thread T2 steps in (operating on the same session instance):
>> Session.addObject() // increases "createdObjects" counter [2]
>> SessionManager.releaseSession(session) // releases session to objSessions
>> pool
>>
>> // Thread T1 continues:
>> SessionManager.demoteObjSession(this) // [6]
>> objSessions.remove(session) // returns true
>> opSessions.release(session) // puts session (with objects!) to opSessions
>> pool
>> // if opSessions is already full, close of session with objects is attempted
>> throwing Exception..
>>
>>
>> **Fix:**
>> SessionManager.demoteObjSession [6] method was changed, so that check for
>> objects is done once again if session was successfully removed from
>> "objSessions" pool (now that it is out of pool and other threads should not
>> be adding objects to it). Based on this check session is either released to
>> "opSessions" pool or returned to "objSessions" pool. This can be achieved by
>> calling releaseSession(session) instead of opSessions.release(session).
>>
>> **Testing:**
>> jdk_security tests passed for me locally with this change.
>> I have also tested this change on top of custom JDK17 build which allows
>> scenario, where I can reproduce this issue. Problem got fixed.
>>
>> [1]
>> https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/SessionManager.java
>> [2]
>> https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/Session.java#L93
>> [3]
>> https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/P11Key.java#L1339
>> [4]
>> https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/Session.java#L98
>> [5]
>> https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/P11Key.java#L1360
>> [6]
>> https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/SessionManager.java#L195
>
> Thanks for the suggested fix. I share your opinion about the potential race
> condition regarding demoting object session. Will take a look.
@valeriepeng, Thank you for your review
-------------
PR: https://git.openjdk.org/jdk/pull/10125