We have been troubleshooting SSL issues using Apache with mod_nss and
Safenet HSMs for quite a while, so hopefully you can provide some insight
as the issue seems to be in the NSS component.

Red Hat Enterprise Linux Server release 7.6 (Maipo)
httpd.x86_64                      2.4.6-90.el7
@rhel-7-server-rpms
mod_nss.x86_64                    1.0.14-12.el7
 @rhel-7-server-rpms
nss.x86_64                        3.36.0-7.1.el7_6
@rhel-7-server-rpms

Apache is configured with mod_nss and Safenet libcryptoki, using TLSv1.2.
We see that in a seemingly random fashion, the Apache worker suddenly can't
talk to the HSM anymore, and can't recover from that. All subsequent
requests handled by this worker fail with the same error message. Only way
to recover is to kill the worker (or restart whole Apache):

[Tue Sep 24 20:21:19.375686 2019] [:error] [pid 2646] SSL Library Error:
-8152 The key does not support the requested operation

Packet captures show that the incoming TLS Client Hello that triggers the
error is identical to a successful one. We have noticed, nevertheless, that
there *might* be some correspondence with TLS session reuse. There have
been several instances in which a worker fails with this error soon after
handling a resumed TLS session (so the worker receives a Client Hello with
a session-id, successfully handles the request without renegotiation but
fails soon after on a subsequent request).

Here's the interesting part: we have also sniffed the PKCS11 conversation
between NSS and the HSM and we can see where the problem occurs. It looks
that the Apache worker is trying to perform operations on an invalid object
handle (pay attention to hObject=0x00001A60). PID 2646 is the Apache worker
that failed in the example before:

pid(2646) tid(140580153710720) time(24/09/2019,20:21:19.313)     >
C_GetAttributeValue hSession=0x00020001 hObject=0x00001A60
pTemplate=0x0x7ffe7173b7c0 count=1
pid(2646) tid(140580153710720) time(24/09/2019,20:21:19.348)     <<
C_GetAttributeValue rv=0x00000082{object handle invalid}
pTemplate=0x0x7ffe7173b7c0

pid(2646) tid(140580153710720) time(24/09/2019,20:21:19.357)     >
C_SignInit hSession=0x0002005B
pMechanism=0x0x7ffe7173b760{type=0x1{RSA_PKCS} pParam=0x(nil) paramLen=0}
hObject=0x00001A60
pid(2646) tid(140580153710720) time(24/09/2019,20:21:19.374)     <<
C_SignInit rv=0x00000060{key handle invalid}


That object handler 0x00001A60 seems to have been explicitly destroyed by
the same process more than 2 hours before in this example:

pid(2646) tid(140580153710720) time(24/09/2019,18:00:09.521)     >
C_DestroyObject hSession=0x00020001 hObject=0x00001A60
pid(2646) tid(140580153710720) time(24/09/2019,18:00:09.523)     <
C_DestroyObject rv=0x00000000{success}


Any further information or ideas are welcome.

Best regards!
-- 
dev-tech-crypto mailing list
dev-tech-crypto@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-crypto

Reply via email to