Nauman Akbar wrote:
Dear Users
I have been having this problem for a long time. Initially I thought it
was an issue with the multi-threading configuration, but the problem
remains even with multi-threading removed.
I have developed a simple ssl based multi-threaded server application.
Previously, openssl data was shared among threads but now all ssl
functions are performed in a single thread. I am developing this
application on RH9 using openssl 0.9.7a. There is only one client
connecting to this server using the same credentials. Both client and
server only use ADH with SSLv3.
The problem I am having is that SSL_accept sometimes fails at seemingly
random times, taking down the server with it. It may be a segmentation fault
or some other exception. Since I am connecting to the machine remotely,
it is not possible for me to monitor the application at all times
(although I have tried). This is why I don’t know for certain what error
is generated when the server application crashes.
One thing is always common. The server terminates while doing a new
SSL_accept. The client receives this error on the other side:
21298:error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record
mac:s3_pkt.c:1052:SSL alert number 20.
Even more bizarre, sometimes it handles fewer than 500 connections,
sometimes fewer than 1000. There have been a few cases of 1000-5000
requests. Last week it crashed after about 2-3 weeks with a request
count in excess of 11000. It crashed again yesterday after running for
less than 24 hours and handling 40000 requests. It crashed again today
within 24 hours with 700 requests. After every crash, I changed
different multi-threading options (both generic and openssl based) to
make it work. However, during the last two runs no ssl based
functions/data were shared among threads, so I do not think
multi-threading or a race condition is causing the crash. Additionally,
the application explicitly keeps the thread count under 10, so it
should not be running out of memory. The server program is quite linear
and does not use dynamic blocks of memory except for certain
class/structure objects (but no arrays etc), so an index overrun or
anything similar is also not plausible. As a sanity check, I am also
having my code reviewed by others.
The situation has become very urgent as I have to deliver this by this
coming Friday and I still don’t know what is causing this. The only
plausible explanation I am left with is that 0.9.7a has some issue with
SSL_accept. I am trying to get a newer version installed on the system.
In the meantime, can anyone guide me with respect to this problem? Is
this really a version issue or is there anything else I need to look at?
Regards
Nauman Akbar
Concise Solutions
Nauman -
We have been battling this exact same situation for the last three
weeks, to no avail. We're regretfully considering other options such as
GnuTLS which looks promising, although this may not be an option for you.
I wish we could have worked out the issues with OpenSSL. Perhaps it's
our coding that is messing up, but we are unable to get any help with
our problems. It seems as if this error is quite common, and many
people have had it, yet not many can explain it, and even fewer know how
to fix it. We've tried compiling and linking our app against
OpenSSL-0.9.5 through OpenSSL-0.9.7g, with almost the exact same error.
I am telling you this to save you the time.
I'm not sure that this helps, but at least we understand what you're
going through :)
Thanks
-dant
______________________________________________________________________
OpenSSL Project http://www.openssl.org
User Support Mailing List [email protected]
Automated List Manager [EMAIL PROTECTED]