Hi folks,
I think we found a bug in OpenSSL's RSA blinding that can lead to Bad Record
MAC errors, and we would like to discuss with you how to fix it.
Let me first describe the setup: we are using OpenSER, a SIP proxy. What's
unusual about it is that its design is multi-process, that is, it forks a lot
of processes to do the work. There is a central TCP/TLS receiver process that
notices when a connection needs to be handled (i.e. there's new data). It
then passes the connection and its file descriptor through a pipe, using
sendmsg and the SCM_RIGHTS feature, to a worker process, which handles the
reading and processing of the SIP messages.
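
To make the descriptor hand-off concrete, here is a minimal sketch of passing
a file descriptor with sendmsg and SCM_RIGHTS. It is not OpenSER's code: the
helper names send_fd, recv_fd and demo_pass_fd are made up for illustration,
and a Unix domain socketpair stands in for the channel between receiver and
worker (an assumption; SCM_RIGHTS needs a Unix domain socket). For brevity
both ends live in one process.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor fd over the Unix domain socket chan. Returns 0 on success. */
int send_fd(int chan, int fd)
{
    char byte = 'F';  /* one byte of ordinary data travels with the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;          /* payload is an array of descriptors */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from chan. Returns the new descriptor or -1. */
int recv_fd(int chan)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Pass the read end of a pipe across a socketpair, then read through the
 * received descriptor. Returns 0 if the byte written to the pipe arrives. */
int demo_pass_fd(void)
{
    int chan[2], p[2], got;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, chan) != 0) return -1;
    if (pipe(p) != 0) return -1;
    if (write(p[1], "x", 1) != 1) return -1;
    if (send_fd(chan[0], p[0]) != 0) return -1;
    got = recv_fd(chan[1]);
    if (got < 0) return -1;
    if (read(got, &c, 1) != 1) return -1;

    close(got); close(p[0]); close(p[1]); close(chan[0]); close(chan[1]);
    return c == 'x' ? 0 : -1;
}
```

The received descriptor is a new number that refers to the same open file,
which is why a worker has to tell OpenSSL about it again.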
The worker process has a lock for each SSL connection, and before making the
necessary OpenSSL calls it updates the file descriptor with the SSL_set_fd
call.
So far so good, this all works very well. However, we and a customer
occasionally saw Bad Record MAC errors in connection attempts from the phone.
After weeks of debugging (the Bad Record MACs appeared only rarely), reading
and understanding the OpenSSL source, and inserting tons of debugging output,
we finally traced the error back to RSA_private_decrypt and think that we
have found the root issue: RSA blinding.
Since we disabled RSA blinding for the private RSA key via the
RSA_blinding_off call, the Bad Record MACs are gone. So something about the
blinding must be wrong.
I suspect that the locking as implemented for the blinding is to blame: it
seems to have a hole. If you look at the function rsa_get_blinding in
crypto/rsa/rsa_eay.c, you will notice an exception (an optimization) in the
locking: if the current process is the one that created the blinding, the
function tells its caller that no locking needs to be done. But what about
other processes running at the same time? The original creator never takes
the lock and can happily step on the others' toes, which results in a failed
decryption and, several handshake messages later, in a Bad Record MAC.
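
To illustrate why a disturbed blinding state would produce a wrong
decryption, here is a toy sketch of RSA blinding with tiny textbook numbers.
It does not use OpenSSL. All names (TOY_N, blinded_decrypt and so on) are
hypothetical, and the "stale" case assumes the blinding pair gets updated by
squaring between the blind and the unblind step, which is how I understand
the update to work.

```c
#include <assert.h>
#include <stdint.h>

/* Toy key, far too small for real use: n = 61 * 53, e * d = 1 mod 3120 */
enum { TOY_N = 3233, TOY_E = 17, TOY_D = 2753 };

/* Square-and-multiply modular exponentiation. */
uint64_t modpow(uint64_t base, uint64_t exp, uint64_t mod)
{
    uint64_t result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

/* Modular inverse via extended Euclid; assumes gcd(a, mod) == 1. */
uint64_t modinv(uint64_t a, uint64_t mod)
{
    int64_t t = 0, newt = 1;
    int64_t r = (int64_t)mod, newr = (int64_t)(a % mod);
    while (newr != 0) {
        int64_t q = r / newr, tmp;
        tmp = t - q * newt; t = newt; newt = tmp;
        tmp = r - q * newr; r = newr; newr = tmp;
    }
    if (t < 0)
        t += (int64_t)mod;
    return (uint64_t)t;
}

/* Blind c with A = r^e, apply the private exponent, unblind with Ai. */
uint64_t blinded_decrypt(uint64_t c, uint64_t A, uint64_t Ai)
{
    uint64_t blinded = (c * A) % TOY_N;
    uint64_t m_blinded = modpow(blinded, TOY_D, TOY_N);
    return (m_blinded * Ai) % TOY_N;
}

/* Matching pair: A = r^e and Ai = r^-1 cancel out, the plaintext comes back. */
uint64_t demo_matching(uint64_t m, uint64_t r)
{
    uint64_t c = modpow(m, TOY_E, TOY_N);
    return blinded_decrypt(c, modpow(r, TOY_E, TOY_N), modinv(r, TOY_N));
}

/* Stale pair: Ai was already squared by someone else before we unblind,
 * so the factors no longer cancel and the result is not the plaintext. */
uint64_t demo_stale(uint64_t m, uint64_t r)
{
    uint64_t c = modpow(m, TOY_E, TOY_N);
    uint64_t Ai = modinv(r, TOY_N);
    return blinded_decrypt(c, modpow(r, TOY_E, TOY_N), (Ai * Ai) % TOY_N);
}
```

With m = 65 and r = 7, demo_matching returns 65 again, while demo_stale
returns a different value. In a TLS handshake a wrong decryption like that
would not be noticed immediately, which would fit the error surfacing only
later as a Bad Record MAC.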
We have yet to confirm that the locking is the cause, but so far I couldn't
spot anything else suspicious in the blinding code. Maybe one of you has
another idea?
My suggestion would be to introduce an internal flag: once rsa_get_blinding
has noticed that locking needs to be done, the flag is set, and from then on
locking is always done. This is easy to implement, has no API impact, and
should be quite reliable. The question is where to anchor the flag: in
BN_BLINDING or in RSA?
If you think this is the way to go, I can provide a patch.
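
A rough model of the proposed flag, to show the intended behaviour. This is
not a patch and not OpenSSL code: struct blinding_model, its fields and both
functions are hypothetical names, and the owner check is reduced to a plain
ID comparison.

```c
#include <assert.h>

/* Hypothetical stand-in for whatever structure ends up holding the flag. */
struct blinding_model {
    long owner;      /* ID of the process/thread that created the blinding */
    int  must_lock;  /* sticky: set once any non-owner has been seen */
};

/* Behaviour as described above: the creator is always told to skip the lock. */
int original_needs_lock(const struct blinding_model *b, long caller)
{
    return caller != b->owner;
}

/* Proposed behaviour: the first non-owner caller sets the flag, and from
 * then on every caller, the creator included, is told to lock. */
int sticky_needs_lock(struct blinding_model *b, long caller)
{
    if (caller != b->owner)
        b->must_lock = 1;   /* in real code this write needs protection too */
    return b->must_lock;
}
```

With owner 100, sticky_needs_lock returns 0 for caller 100, then 1 for caller
200, and 1 for caller 100 on every later call. One open point in this model:
the creator could already be inside an unlocked operation at the moment the
flag is first set.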
Bye,
Marc
--
Marc Haisenko
Comdasys AG
Rüdesheimer Str. 7
80686 München
Germany
Tel.: +49 (0)89 548 433 321
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [email protected]
Automated List Manager [email protected]