On Tue, May 24, 2005, dan wrote:

> Nauman Akbar wrote:
> >Dear Users
> >
> >I have been having this problem for a long time. Initially I thought it was an 
> >issue with configuration of multi-threading but the problem seems to 
> >remain with multi-threading removed.
> >
> >I have developed a simple ssl based multi-threaded server application. 
> >Previously, openssl data was shared among threads but now all ssl 
> >functions are performed in a single thread. I am developing this 
> >application on RH9 using openssl 0.9.7a. There is only one client 
> >connecting to this server using the same credentials. Both client and 
> >server only use ADH with SSLv3.
> >
> >The problem I am having is, sometimes SSL_accept fails completely 
> >randomly, taking down the server with it. It may be a segmentation fault 
> >or some other exception. Since I am connecting to the machine remotely, 
> >it is not possible for me to monitor the application at all times 
> >(although I have tried). This is why I don’t know for certain what error 
> >is generated when the server application crashes.
> >
> >One thing is always common. The server terminates while doing a new 
> >SSL_accept. The client receives this error on the other side: 
> >21298:error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record 
> >mac:s3_pkt.c:1052:SSL alert number 20.
> >
> >Even more bizarre, sometimes it handles fewer than 500 
> >connections, sometimes fewer than 1000. There have been a few cases of 
> >1000-5000 requests. Last week it crashed after about 2-3 weeks with a request 
> >count in excess of 11000. It crashed again yesterday after running for 
> >less than 24 hours and handling 40000 requests. It crashed again today 
> >within 24 hours with 700 requests. After every crash, I changed 
> >different multi-threading options (both generic and openssl based) to 
> >make it work. However, during last 2 runs no ssl based functions/data 
> >are shared among threads. So it is not a case of multi threading failing 
> >or any race condition causing the crash. Additionally, the application 
> >is explicitly made to keep thread count under 10 so it can’t be an issue 
> >of memory unavailability. The server program is quite linear and does not 
> >use dynamic blocks of memory except for certain class/structure objects 
> >(but no arrays etc.), so an index overrun or anything similar is also 
> >not plausible. As a sanity check, I am also having my code reviewed 
> >by others.
> >
> >The situation has become very urgent as I have to deliver this by coming 
> >Friday and I still don’t know what is causing this. The only plausible 
> >option I am left with is 0.9.7a has some issues with SSL_accept. I am 
> >trying to get new version installed on the system. In the meantime, can 
> >anyone guide me with respect to this problem? Is this really a version 
> >issue or is there anything else I need to look at?
> >
> >Regards
> >
> >Nauman Akbar
> >
> >Concise Solutions
> >
> 
> 
> Nauman -
> 
> We have been battling this exact same situation for the last three 
> weeks, to no avail.  We're regretfully considering other options such as 
> GnuTLS which looks promising, although this may not be an option for you.
> 
> I wish we could have worked out the issues with OpenSSL.  Perhaps it's 
> our coding that is messing up, but we are unable to get any help with 
> our problems.  It seems as if this error is quite common, and many 
> people have had it, yet not many can explain it, and even fewer know how 
> it's fixed.  We've tried compiling and linking our app against 
> OpenSSL-0.9.5 through OpenSSL-0.9.7g, with almost the exact same error. 
>  I am telling you this to save you the time.
> 
> I'm not sure that this helps, but at least we understand what you're 
> going through :)
> 

In brief, the "bad record mac" error occurs when the record MAC (a checksum of
sorts) that OpenSSL calculates doesn't agree with the value the peer has sent.

There are a very large number of possible causes for this error. It could be
an implementation bug (in either OpenSSL or the peer), corruption of network
data, a badly written application or even malicious activity.
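
To make the MAC check concrete, here is a minimal sketch of the idea, not
OpenSSL's actual code. It assumes HMAC-SHA1 over a simplified record header
(SSLv3 itself uses an older, HMAC-like construction), and the names
record_mac and verify_record are hypothetical. It shows that a corrupted
payload, an out-of-step sequence number or a mismatched key each make the
check fail, which is the point at which a real peer would send the alert.

```python
import hashlib
import hmac
import struct


def record_mac(mac_key: bytes, seq_num: int, content_type: int, payload: bytes) -> bytes:
    """Toy record MAC (assumption: HMAC-SHA1, simplified header layout)."""
    # 8-byte sequence number, 1-byte content type, 2-byte payload length.
    header = struct.pack("!QBH", seq_num, content_type, len(payload))
    return hmac.new(mac_key, header + payload, hashlib.sha1).digest()


def verify_record(mac_key: bytes, seq_num: int, content_type: int,
                  payload: bytes, received_mac: bytes) -> bool:
    """Recompute the MAC locally and compare it with the peer's value."""
    expected = record_mac(mac_key, seq_num, content_type, payload)
    return hmac.compare_digest(expected, received_mac)


key = b"k" * 20                      # illustrative MAC secret
payload = b"hello over ssl"
mac = record_mac(key, 0, 23, payload)  # what the sender attaches

# Intact record: the MACs agree.
ok = verify_record(key, 0, 23, payload, mac)

# One flipped bit in transit (network or memory corruption).
damaged = bytearray(payload)
damaged[0] ^= 0x01
corrupted_ok = verify_record(key, 0, 23, bytes(damaged), mac)

# Receiver's sequence counter out of step with the sender's.
wrong_seq_ok = verify_record(key, 1, 23, payload, mac)

# Receiver holding different key material than the sender.
wrong_key_ok = verify_record(b"x" * 20, 0, 23, payload, mac)

print(ok, corrupted_ok, wrong_seq_ok, wrong_key_ok)
```

Only the first check succeeds. The other three cases look identical from the
peer's side, which is why the alert on its own says little about the
underlying cause.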

If this is causing a crash then a stack trace is needed to have a reasonable
chance of tracing the cause.

Steve.
--
Dr Stephen N. Henson. Email, S/MIME and PGP keys: see homepage
OpenSSL project core developer and freelance consultant.
Funding needed! Details on homepage.
Homepage: http://www.drh-consultancy.demon.co.uk
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           [EMAIL PROTECTED]
