I hope I am using the RT correctly in this manner. At this time I am seeking approval / consensus / agreement in principal on this patch inclusion. Once that is reached then I'd like to work on related matters about this issue like updating man pages / API documentation, having a new FAQ item, updating example code in openssl builtin apps, authoring a release note comment, all the stuff I don't need to do for a private OpenSSL release but we do for the public one.
This ticket is request submission of a patch: http://marc.info/?l=openssl-dev&m=115154030723033&q=p3 [Direct download for patch *.diff format from previous email attachment, applied to openssl-0.9.8b ] http://marc.info/?t=115154004000001&r=1&w=2 [Path covering email thread] Here is a posting made of a tool called "sslregress" which can expose this specific problem and verify the fix: http://marc.info/?t=115129081200001&r=1&w=2 [The thread discussing sslregress] http://marc.info/?t=115153962600003&r=1&w=2 [The thread with the patch] Here are related threads of discussion over time on the matter: http://marc.info/?l=openssl-users&m=115088475305680&w=2 [Post my me on the initial problem] http://marc.info/?t=121249812000001&r=1&w=2 [Thor Lance came to similar conclusions to me about a fix from his viewpoint and accepts my more detail analysis of the problem as being more correct] http://marc.info/?t=119109061500001&r=1&w=2 [Discussion about the issues around SSL_shutdown] http://marc.info/?l=openssl-dev&m=116417462427122&w=2 [2nd call for assistance for inclusion Nov 2006] http://marc.info/?l=openssl-dev&m=116525974320575&w=2 [3rd call for assistance for inclusion Dec 2006] The issue in question is over problems in relation to SSL_shutdown() functionality and also API inadequacies surrounding that issue: * The SSL_shutdown() is not handling all possible soft-error conditions correctly, specifically if the lower-layer buffer (BIO/kernel) is under a full-buffer on write condition (EAGAIN) whilst SSL_shutdown() attempts to send/push-down the "ssl shutdown notify" packet from the SSL stream. My memory is hazy of the exact details of this problem, as my main driving issue was the API inadequacy. The thought was that is the last byte of the encoded "ssl shutdown notify" packet did not make it into the BIO/kernel buffer the first time, then this is never sent and therefore connections may hang open. * As an OpenSSL libssl.so API user many application want to know when the OpenSSL library no longer requires use of the write half of a socket. This is so that a TCP shutdown(sockfd, SHUTDOWN_WR) can be issued reliability once that shutdown progress checkpoint has been passed. This is an API inadequacy. The SSL_shutdown() method is just a restartable state machine, that enters the following logic sequence each time it is called: * Mark the SSL handle as not allowing any more application data to be sent to it (if not already marked). * Flush all existing "application data packets" from BIO to kernel (if data exists and the flush did not complete return -1/WANT_WRITE and return to caller). * Write out / enqueue the "shutdown notify" packet (it not already enqueued during a previous SSL_shutdown invocation). * Flush "shutdown notify" packet from libssl to BIO/Kernel layer below (if data still exists after flush; i.e. the flush did not complete return -1/WANT_WRITE and return to caller) * A this point due to the "shutdown notify" having been flushed 100% to from BIO to kernel we may be allowed to return the value "0" from SSL_shutdown(), if the following conditions are met; a) we have not previously returned 0 from SSL_shutdown before; b) we never cover up / mask a hard-error condition, e.g. -1/SSL_ERROR_{SSL,WHATEVER}; c) we would have returned -1/WANT_READ (sorry this is a superfluous rule and self-referencing to emphasis the point) * Read in / process packets from socket (if there are "application data packets" or no data exists return -1/WANT_READ by default or 0 according the the rules on zero-return above) * We received the "shutdown notify" packet, mark SSL handle that it has been received. * Return successful SSL_shutdown() state (always return 1 regardless of any zero-return rules above, it is legal for the API to completely skip returning zero if we get to here in the state machine). I hope most OpenSSL API users will agree that the above does not actually change the externally visible mechanics of how SSL_shutdown works. For all intents and purposes it works the same as it did before but applications may now commonly see -1/WANT_READ getting returned and -1/WANT_WRITE very rarely getting returned, the simplest modification to their existing application code to get old return code would be to treat the -1/WANT_READ and -1/WANT_WRITE soft-errors as-if zero was returned. A simple adjustment to make once understood. Impact on existing code users: * My guess is that most applications keep repeatably calling SSL_shutdown() until a "1" return value is received. This situation does not change. * Currently SSL_shutdown() will only ever return 0 or 1. This patch will change that, now its possible for -1/WANT_WRITE or 0 or -1/WANT_READ or 1 to be nominal return values. So existing code needs to accept the soft-error returns from this API call. Again I think the impact on this is low, in that since SSL_shutdown() has been called already the application has no further use for the socket so a developer who wrote code that simply treated all return codes that were not 0 or 1 as a generic fatal-error, that developer would have freed/cleanup the socket and resources since there is nothing else you can do. * The meaning of the "0" return code after this patch is now made crystal clear (previously it did not have any specific meaning about the shutdown progress other than it is not "1"). After the patch a zero will be returned on the first SSL_shutdown() invocation that successfully confirms the shutdown progress to have reached the point when OpenSSL libssl.so no longer has any further use for the write half of the underlying bidirectional socket. Subsequent invocation of SSL_shutdown() whilst the will return -1/WANT_READ. Any existing code wanting a simple conversion, should treat -1/WANT_WRITE and -1/WANT_READ as it would a 0 return value. Documentation improvements: It should be documented in SSL_shutdown that it is possible for inbound application data to stall the completion of shutdown. So while a "0" or "-1/WANT_READ" return value are seen application should attempt to call SSL_read() at least once and discard the application data that is seen. This can cause application who use the -1/WANT_READ to look for select() readability to enter an infinite loop when there is unread application data still coming in from the peer. Once all that has been read in only then can the "recieved shutdown notify" packet be seen and processed. This should be at least an FAQ point. I've taken quite a bit of time to understand, verify and access the situation in relation to this so if something is unclear please do not hesitate to contact me or share your thought process via openssl-dev list. Thanks, Darryl L. Miles ______________________________________________________________________ OpenSSL Project http://www.openssl.org Development Mailing List [email protected] Automated List Manager [email protected]
