I realize that I must be doing all that. The difference I see from errno (and the reason I wrote this) is that if you fail to read errno, it does not affect the outcome of the NEXT system call (save for a few documented cases which specifically instruct you to clear errno before calling the function). That strikes me as a design problem.
It's difficult, in a large system, to make sure that everyone plays by the book, and with non-blocking IO it's common for the same thread to deal with unrelated tasks (a select() loop dispatching to "socket handlers"). So what can happen here is that "my code" runs OpenSSL with full error checking, while "somebody else's" code runs OpenSSL with no error checking, breaking "my code". It's preferable for a general-purpose library to be designed to avoid such scenarios, and in this particular case there appears to be a solution: check the socket's blocking state before checking the error queue. That's what I was getting at.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Tuesday, January 04, 2011 5:51 AM
To: [email protected]
Subject: Re: Non empty error stack failing non-blocking SSL IO

If your program ignores the error queue, it is doing the equivalent of not checking errno after every system call. The program is required to deal with the error queue, because it is OpenSSL's only mechanism for informing the application code of the wide variety of potential protocol and authentication issues.

The program should absolutely not be doing the same thing when SSL_get_error() returns SSL_ERROR_SSL as when it returns SSL_ERROR_WANT_READ. (It may be that someone missed a break statement at the end of one case and it's falling through to the next.)

Either way, this is not anomalous behavior on OpenSSL's part. After you call SSL_read() and get zero bytes, you must determine why you got zero bytes, and that's where you should call SSL_get_error(). If it returns SSL_ERROR_SSL, you must check the error queue to determine exactly why the SSL session is in an error state. (The reason for the queue is that you're supposed to be interested in and handle every error that comes up in the process, not merely the most recent one.)
-Kyle H

On Mon, Jan 3, 2011 at 4:22 AM, Uri Simchoni <[email protected]> wrote:
> I'm using OpenSSL 0.9.8i, and have noticed the following scenario:
>
> - Some OpenSSL crypto function returns with an error, leaving a
>   description of the error on the error queue
> - The application neglects to call ERR_clear_error()
> - SSL_read() is then called on a non-blocking socket and returns
>   because there's no input available
> - Calling SSL_get_error() returns SSL_ERROR_SSL instead of
>   SSL_ERROR_WANT_READ, because the error queue is not empty.
>
> Would it be possible to modify the code so that the socket's blocking
> state takes precedence over the error queue?
>
> If not, what is the recommended programming practice with non-blocking
> sockets?
>
> - Ensure that everybody calls ERR_clear_error() after an error
> - Call ERR_clear_error() before SSL read/write (but if that's
>   recommended, why isn't it inside SSL_read/SSL_write?)
>
> Thanks,
> Uri
