A long reply, because I fear the Python module may be misunderstanding what SSL_shutdown() actually does and why it exists, which in turn means that users of the Python module may also misuse it (castles built on sand and all that).

Antoine Pitrou wrote:
While testing Python's SSL support with OpenSSL >= 0.9.8m, we have
encountered a strange error return from SSL_shutdown on a non-blocking
socket (note: this is a different problem from the one described by
Victor Stinner in an earlier thread last month). Basically:

- SSL_shutdown(<ssl object>) returns -1
- SSL_get_error(<ssl object>, -1) returns SSL_ERROR_SYSCALL
- ERR_get_errno() returns 0
- errno is equal to 0

This situation was not hit before 0.9.8m. Our tentative workaround
right now (not yet committed, awaiting your insight :-)) is to detect
this particular situation and consider the call successful rather than
raise an exception.

It depends what you mean by "consider the call successful". There are two normal, non-error return values from SSL_shutdown(): 0 (the shutdown has been started but is not yet complete) and 1 (the two-way shutdown is complete).

You should never consider a return of -1 to mean 1. Also a return of 1 is really the only value that indicates "success".

Then you have errors that are either recoverable (what I term soft-errors) or non-recoverable (hard-errors).

But as the recent mailing list thread indicates ( http://www.mail-archive.com/openssl-users@openssl.org/msg60444.html ), you may treat the specific soft-error returns -1/WANT_READ and -1/WANT_WRITE as if SSL_shutdown() had returned 0, if you are happy to keep the less descriptive behavior of older OpenSSL releases.

The SSL_ERROR_SYSCALL is probably because the underlying socket is no longer functional (see the comments about EPIPE / ZERO_RETURN in the recent openssl-users list thread).

You must understand that it is SSL_shutdown()'s job to commence, advance and confirm that a cryptographically secure two-way shutdown has been performed. This is its purpose in the world. If you are seeing -1/ERROR_SYSCALL, then that is a _CORRECT_ thing for it to return in response to observing that state while trying to perform its mission.

What SSL_shutdown() is saying by returning -1/ERROR_SYSCALL is that a cryptographically secure two-way shutdown of the stream was _NOT_ completed and that it will probably never be completed, probably because the underlying socket died on us. This is a fact of life you have to live with and deal with in your application now. I say "probably" because I'm sure other causes exist, but in practice most people who see this error indication at this stage will see it for those reasons.

So treating SSL_shutdown() as successful here would be incorrect, given my definition of its purpose. A cryptographically secure shutdown was not completed, therefore SSL_shutdown() was not successful.
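
The return-value rules described above can be summarised as a small classifier. This is only a sketch: ShutdownState and classify_shutdown() are hypothetical names invented for illustration, not part of OpenSSL or of Python's ssl module. It assumes the caller already has the SSL_shutdown() return value and, for -1, the SSL_get_error() code. The SSL_ERROR_* constants are taken from Python's ssl module, which mirrors the OpenSSL values.

```python
import ssl
from enum import Enum


class ShutdownState(Enum):
    """Hypothetical labels for the outcomes discussed in this thread."""
    COMPLETE = "complete"        # returned 1: two-way shutdown finished
    IN_PROGRESS = "in progress"  # returned 0: started, not yet finished
    RETRY = "retry"              # -1 with WANT_READ/WANT_WRITE (soft error)
    FAILED = "failed"            # any other -1 (hard error), e.g. SYSCALL


def classify_shutdown(ret, ssl_error=None):
    """Map an SSL_shutdown() return value (and SSL_get_error() code) to a state."""
    if ret == 1:
        return ShutdownState.COMPLETE
    if ret == 0:
        return ShutdownState.IN_PROGRESS
    if ssl_error in (ssl.SSL_ERROR_WANT_READ, ssl.SSL_ERROR_WANT_WRITE):
        # Soft error: call SSL_shutdown() again once the socket is ready.
        return ShutdownState.RETRY
    # Never treat any other -1 as success.
    return ShutdownState.FAILED
```

With this mapping, the case reported at the top of the thread (-1 with SSL_ERROR_SYSCALL) lands in FAILED, regardless of what errno says.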

I'm sorry that I've introduced this quasi-fuzziness into what was a nice clean wonderland of the Python SSL module. But it is a reality that an application should deal with and make its own choice about.

Many applications don't care about a cryptographically secure shutdown of the communication transport, since they can signal their intention to "QUIT" in the normal application payload data. The other end then sends back a "Bye bye" quit response (also in the normal application payload data), and the server end goes into a state of never accepting any further commands from the client after that.

Over and above all this, once an end has queued its last command/response data for the "QUIT" processing and that payload has successfully cleared the SSL_write() API call, that end can immediately proceed to call SSL_shutdown(). This commences a secure cryptographic shutdown by denying any further SSL_write() calls (from your side) and by sending an end-of-stream indication packet to the other end. You then have to wait (and hope) for the other end to send its end-of-stream indication packet. Only once you have both sent and received the end-of-stream indication packet will SSL_shutdown() return 1.

Many client and server implementations just "hang up" on each other once the QUIT command response has been processed. I would guess the issue you are seeing with -1/ERROR_SYSCALL is due to this hanging up. But to be a good, well-meaning TLS/SSL citizen, both ends should keep running their non-blocking event loops for a reasonable amount of time (on the order of 5 seconds up to the TCP timeout) even after the last SSL_write() has been made. During this time both ends retry SSL_shutdown() each time they get a non-blocking wakeup indication, until it returns 1.
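
The "good citizen" behaviour described above might look roughly like the loop below. It is a sketch under stated assumptions, not an implementation against a real socket: drive_shutdown(), try_shutdown and wait_for_io are hypothetical names. try_shutdown is assumed to wrap SSL_shutdown() and return 1 when complete, 0 when the shutdown is still in progress (with -1/WANT_READ and -1/WANT_WRITE already mapped to 0), and -1 on a hard error. wait_for_io is assumed to block until the socket is readable or writable, or until the given number of seconds has passed.

```python
import time


def drive_shutdown(try_shutdown, wait_for_io, timeout=5.0, clock=time.monotonic):
    """Retry a non-blocking shutdown until it completes, fails, or times out.

    Returns "complete", "failed" or "timed out".
    """
    deadline = clock() + timeout
    while True:
        ret = try_shutdown()
        if ret == 1:
            return "complete"      # both end-of-stream indications exchanged
        if ret < 0:
            return "failed"        # hard error: stop retrying and clean up
        remaining = deadline - clock()
        if remaining <= 0:
            return "timed out"     # peer never answered within the grace period
        wait_for_io(remaining)     # sleep until the next wakeup indication
```

The default of 5 seconds is only the lower end of the range suggested above. An application that gets "failed" or "timed out" still has to release its SSL objects and socket itself.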

So you have to stand back for a moment, examine Python's use of the OpenSSL API, and decide whether you are trying to stay as close to 1:1 as possible, supporting and passing on all the cryptographic guarantees that OpenSSL makes, or whether you are trying to provide a simplified view of the world that Noddy and Big-Ears could use. Or maybe both, by creating Python-specific API calls built on top of this understanding that iron out the issue by providing easy-to-digest error returns that users might like.

If you are able to observe a -1 error state where you think a 1 should have been returned, that may be considered a new bug, i.e. SSL_shutdown() should return 1 at least once (and possibly stay sticky/latched) once that point in proceedings has been passed, regardless of the overall status of the underlying transport/socket.

I am interested in the issue of errno==0; this may be indicative of the real errno value being lost. OpenSSL should, if necessary, preserve the first errno value it didn't expect to see, even if OpenSSL itself continues to make kernel calls that could reset the value of errno to 0.

Maybe this situation can be simulated by being a bad citizen and forcing a socket disconnection after one or both ends have called SSL_shutdown() at least once. I must say my testing and applications are good citizens, so it may never have been noticed. I may also have treated the -1/ERROR_SYSCALL case as "unrecoverable" once SSL_shutdown() has been started, and therefore never checked whether errno!=0 (since I don't care about the specific reason in my usage).

What encouraged me in that workaround is that some LightHTTPd users have
encountered what looks like the same issue, also starting from 0.9.8m:

        « SSL_shutdown failed, SSL_get_error returned SSL_ERROR_SYSCALL,
        but errno == 0 - I think there is something wrong with your ssl
        lib. »

        « Since I updated to openssl 0.9.8m I have noticed the same
        error messages in my log. (using lighttpd 1.4.26 with the same
        patch applied) »

I would welcome any explanations and suggestions concerning this
situation. Is it an OpenSSL bug? Or does this error return correspond to
an applicative error? (in which case, which error exactly, since the
return codes don't point to anything precise)

Well, the simplified view of it is this (the exact errno reason isn't important in the decision-making process, since it does not change the outcome).

I still think it is probably due to the state of the network socket changing to being no longer operational BEFORE SSL_shutdown() could complete the two-way cryptographic shutdown.

As such, this situation is unrecoverable.

The correct course of action is to accept that SSL_shutdown() did not complete, deallocate the SSL objects, and clean up your side's affairs by doing such things as closing the socket handle you are holding.
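
On the Python side, the "accept it and clean up" advice could be sketched as below. close_tls() is a hypothetical helper, not something the ssl module provides. It assumes an object with the unwrap() and close() methods of ssl.SSLSocket, where unwrap() performs the SSL shutdown handshake and raises an ssl.SSLError (a subclass of OSError) if that fails.

```python
import ssl


def close_tls(tls_sock):
    """Attempt a clean TLS shutdown, then always release the socket.

    Returns True if unwrap() finished without error, False otherwise.
    """
    clean = False
    try:
        tls_sock.unwrap()
        clean = True
    except (ssl.SSLError, OSError):
        # The shutdown did not complete (for example, the peer hung up).
        # That is unrecoverable, so report it instead of pretending success.
        pass
    finally:
        tls_sock.close()  # release the socket handle in every case
    return clean
```

The boolean lets the caller decide whether an unclean shutdown matters for its protocol, rather than hiding the failure or raising over it.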

I think you are correct to assert that an OpenSSL bug exists if you are able to observe -1/ERROR_SYSCALL and errno==0.

But it is not a bug to observe -1/ERROR_SYSCALL from SSL_shutdown().


OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org
