I hope I am using the RT correctly in this manner.

At this time I am seeking approval / consensus / agreement in principal 
on this patch inclusion.  Once that is reached then I'd like to work on 
related matters about this issue like updating man pages / API 
documentation, having a new FAQ item, updating example code in openssl 
builtin apps, authoring a release note comment, all the stuff I don't 
need to do for a private OpenSSL release but we do for the public one.


This ticket is request submission of a patch:

  http://marc.info/?l=openssl-dev&m=115154030723033&q=p3  [Direct 
download for patch *.diff format from previous email attachment, applied 
to openssl-0.9.8b ]
  http://marc.info/?t=115154004000001&r=1&w=2  [Path covering email thread]


Here is a posting made of a tool called "sslregress" which can expose 
this specific problem and verify the fix:

   http://marc.info/?t=115129081200001&r=1&w=2  [The thread discussing 
sslregress]
   http://marc.info/?t=115153962600003&r=1&w=2  [The thread with the patch]


Here are related threads of discussion over time on the matter:

   http://marc.info/?l=openssl-users&m=115088475305680&w=2  [Post my me 
on the initial problem]
   http://marc.info/?t=121249812000001&r=1&w=2  [Thor Lance came to 
similar conclusions to me about a fix from his viewpoint and accepts my 
more detail analysis of the problem as being more correct]
   http://marc.info/?t=119109061500001&r=1&w=2  [Discussion about the 
issues around SSL_shutdown]
   http://marc.info/?l=openssl-dev&m=116417462427122&w=2  [2nd call for 
assistance for inclusion Nov 2006]
   http://marc.info/?l=openssl-dev&m=116525974320575&w=2  [3rd call for 
assistance for inclusion Dec 2006]



The issue in question is over problems in relation to SSL_shutdown() 
functionality and also API inadequacies surrounding that issue:

 * The SSL_shutdown() is not handling all possible soft-error conditions 
correctly, specifically if the lower-layer buffer (BIO/kernel) is under 
a full-buffer on write condition (EAGAIN) whilst SSL_shutdown() attempts 
to send/push-down the "ssl shutdown notify" packet from the SSL stream.  
My memory is hazy of the exact details of this problem, as my main 
driving issue was the API inadequacy.  The thought was that is the last 
byte of the encoded "ssl shutdown notify" packet did not make it into 
the BIO/kernel buffer the first time, then this is never sent and 
therefore connections may hang open.

 * As an OpenSSL libssl.so API user many application want to know when 
the OpenSSL library no longer requires use of the write half of a 
socket.  This is so that a TCP shutdown(sockfd, SHUTDOWN_WR) can be 
issued reliability once that shutdown progress checkpoint has been 
passed.  This is an API inadequacy.




The SSL_shutdown() method is just a restartable state machine, that 
enters the following logic sequence each time it is called:

 * Mark the SSL handle as not allowing any more application data to be 
sent to it (if not already marked).

 * Flush all existing "application data packets" from BIO to kernel (if 
data exists and the flush did not complete return -1/WANT_WRITE and 
return to caller).

 * Write out / enqueue the "shutdown notify" packet (it not already 
enqueued during a previous SSL_shutdown invocation).

 * Flush "shutdown notify" packet from libssl to BIO/Kernel layer below 
(if data still exists  after flush; i.e. the flush did not complete 
return -1/WANT_WRITE and return to caller)

 * A this point due to the "shutdown notify" having been flushed 100% to 
from BIO to kernel we may be allowed to return the value "0" from 
SSL_shutdown(), if the following conditions are met; a) we have not 
previously returned 0 from SSL_shutdown before; b) we never cover up / 
mask a hard-error condition, e.g. -1/SSL_ERROR_{SSL,WHATEVER}; c) we 
would have returned -1/WANT_READ (sorry this is a superfluous rule and 
self-referencing to emphasis the point)

 * Read in / process packets from socket (if there are "application data 
packets" or no data exists return -1/WANT_READ by default or 0 according 
the the rules on zero-return above)

 * We received the "shutdown notify" packet, mark SSL handle that it has 
been received.

* Return successful SSL_shutdown() state (always return 1 regardless of 
any zero-return rules above, it is legal for the API to completely skip 
returning zero if we get to here in the state machine).



I hope most OpenSSL API users will agree that the above does not 
actually change the externally visible mechanics of how SSL_shutdown 
works. For all intents and purposes it works the same as it did before 
but applications may now commonly see -1/WANT_READ getting returned and 
-1/WANT_WRITE very rarely getting returned, the simplest modification to 
their existing application code to get old return code would be to treat 
the -1/WANT_READ and -1/WANT_WRITE soft-errors as-if zero was returned. 
A simple adjustment to make once understood.




Impact on existing code users:

 * My guess is that most applications keep repeatably calling 
SSL_shutdown() until a "1" return value is received.  This situation 
does not change.

 * Currently SSL_shutdown() will only ever return 0 or 1.  This patch 
will change that, now its possible for -1/WANT_WRITE or 0 or 
-1/WANT_READ or 1 to be nominal return values.  So existing code needs 
to accept the soft-error returns from this API call.  Again I think the 
impact on this is low, in that since SSL_shutdown() has been called 
already the application has no further use for the socket so a developer 
who wrote code that simply treated all return codes that were not 0 or 1 
as a generic fatal-error, that developer would have freed/cleanup the 
socket and resources since there is nothing else you can do.

 * The meaning of the "0" return code after this patch is now made 
crystal clear (previously it did not have any specific meaning about the 
shutdown progress other than it is not "1").  After the patch a zero 
will be returned on the first SSL_shutdown() invocation that 
successfully confirms the shutdown progress to have reached the point 
when OpenSSL libssl.so no longer has any further use for the write half 
of the underlying bidirectional socket.  Subsequent invocation of 
SSL_shutdown() whilst the will return -1/WANT_READ.  Any existing code 
wanting a simple conversion, should treat -1/WANT_WRITE and -1/WANT_READ 
as it would a 0 return value.



Documentation improvements:

It should be documented in SSL_shutdown that it is possible for inbound 
application data to stall the completion of shutdown.  So while a "0" or 
"-1/WANT_READ" return value are seen application should attempt to call 
SSL_read() at least once and discard the application data that is seen.  
This can cause application who use the -1/WANT_READ to look for select() 
readability to enter an infinite loop when there is unread application 
data still coming in from the peer.  Once all that has been read in only 
then can the "recieved shutdown notify" packet be seen and processed.   
This should be at least an FAQ point.



I've taken quite a bit of time to understand, verify and access the 
situation in relation to this so if something is unclear please do not 
hesitate to contact me or share your thought process via openssl-dev list.

Thanks,

Darryl L. Miles



______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]

Reply via email to