Thor Lancelot Simon wrote:
Darryl (who wrote the patch) has a set of regression tests for nonblocking
operation with OpenSSL. He posted about it here way back in 2006 when he
originally pointed out this problem. It's kind of hard to figure out which
message in some of these old, old mailing list threads actually has which
patch or pointer to an external software distribution embedded in it -- maybe
he can just tell us again where to fetch them from?
The thing that's not 100% obvious about this patch is the mechanism by
which WANT_READ or WANT_WRITE propagate back up to the application code.
I believe what occurs is that, in the cases in which the patch makes it
possible for SSL_shutdown to return -1, the BIO has already set the error
for us and nothing higher up than the code in s3_lib/s3_pkt changes it.
I'd appreciate confirmation of this from Darryl as well.
Good to talk about this old chestnut once again!
AFAIK there has been no change between 0.9.8b and 0.9.8g to this part of
SSL (in fact, SSL hardly seems to be developed, fixed or changed over
time; maybe this is because there is no real maintainer, which also
contributes to the difficulty of getting this patch accepted).
Thor is correct in that there is nothing for the high-level API to do
with regard to setting up the soft-error conditions WANT_READ/WANT_WRITE,
since the lower-level layers already set these up when you attempt to
push or pull data with BIO.
Thor is also correct in that this patch brings SSL_shutdown() semantics
100% in line with the rest of the API. At this time, in conditions where
the soft-error WANT_WRITE comes into play for SSL_shutdown(), OpenSSL
exhibits a clear and demonstrable bug.
> On Tue, Jun 03, 2008 at 11:37:02AM -0400, Geoff Thorpe wrote:
>> A quick skim of this patch seems to indicate that it makes sense, though
>> the litmus test will be to get some kind of regression coverage. Eg. do
>> connections get left dangling in any common scenarios?
For a regression testing suite please google for "sslregress"
http://marc.info/?l=openssl-dev&m=115153957132635&w=2
This can easily demonstrate the non-blocking SSL_shutdown() problem and
verify that the patch fixes it. Should there be a concern over a new
problem occurring, it is also very easy to set up a "testcase" for that
situation and quickly check old and new code.
Right now, with this bug that exists in all current OpenSSL versions, it
is possible to get dangling connections. They will only occur 0.000001%
of the time (aka "very very rare") in the wild, since the bug relies on
there being a transmit-buffer-full condition in the kernel layer at the
time the shutdown notify is sent. The resulting short write causes only
part of the shutdown notify to be queued (from BIO to kernel), and no
subsequent call to SSL_shutdown() can make the remaining part get written
from OpenSSL to the kernel. This is what causes rare connection hangs in
the current OpenSSL code.
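The short-write case above can be illustrated with a small stand-alone
simulation. This is only a sketch under stated assumptions: struct
mock_bio, mock_flush() and the two shutdown_send_*() functions are
hypothetical names made up for illustration, not OpenSSL types or
functions, and the real code paths in s3_lib/s3_pkt are more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BIO write buffer; not an OpenSSL type. */
struct mock_bio {
    size_t pending;     /* bytes buffered, not yet accepted by the kernel */
    size_t kernel_room; /* bytes the kernel send buffer will accept now   */
};

/* Push as much as the kernel will take; return bytes still pending. */
size_t mock_flush(struct mock_bio *b)
{
    assert(b != NULL);
    size_t n = b->pending < b->kernel_room ? b->pending : b->kernel_room;
    b->pending -= n;
    b->kernel_room -= n;
    return b->pending;
}

/* The behaviour described as the bug: 0 is reported whether or not the
 * "shutdown notify" bytes were fully handed to the kernel. */
int shutdown_send_unpatched(struct mock_bio *b)
{
    (void)mock_flush(b);
    return 0;
}

/* The behaviour described for the patch: -1 (think WANT_WRITE) until the
 * flush completes, so the caller knows it has to call again. */
int shutdown_send_patched(struct mock_bio *b)
{
    return mock_flush(b) == 0 ? 0 : -1;
}
```

With a buffer holding an arbitrary 23 bytes and a kernel that accepts only
10, the unpatched variant returns 0 with 13 bytes stranded, while the
patched variant returns -1 until a later call completes the flush.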
An application I was developing that makes use of OpenSSL was hitting the
problem regularly a few years back, and it uses a custom version of
OpenSSL to alleviate this problem. I've done my bit for the community
here on this issue (which is all I can do).
To date no maintainer has publicly taken a proper look and documented
legitimate reasons or concerns over the patch. The SSL_shutdown()
method is simply a restartable state machine:
* Mark the SSL handle as not allowing any more application data to be
  sent to it.
* Flush all existing "application data packets" from BIO to kernel (if
  data exists and the flush did not complete, return -1/WANT_WRITE).
* Write out the "shutdown notify" packet.
* Flush the "shutdown notify" packet from BIO to kernel (if data exists
  and the flush did not complete, return -1/WANT_WRITE).
* At this point, due to the "shutdown notify" having been flushed 100%
  from BIO to kernel, we may be allowed to return the value "0" from
  SSL_shutdown().
* Read in packets from the socket; if they are "application data
  packets" or no data exists, return 0 or -1/WANT_READ.
* If a "shutdown notify" packet was received, then mark on the SSL
  handle that it has been received.
* Return successful SSL_shutdown() state (return 1).
I may be a little rusty after a few years, but the above is the crux of
what is going on in SSL_shutdown(). The transition point from returning
-1/WANT_WRITE to 0 (or -1/WANT_READ) is an _IMPORTANT_ milestone that
any sane application wants to know about. It means the "shutdown
notify" packet is with the kernel buffer layer now (and will soon be on
its way to the remote SSL layer). It means the application can issue a
socket level TCP shutdown for the sending side too.
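From the application's side, that milestone might be acted on roughly as
follows. This is a sketch assuming the patched semantics described above;
enum app_err, enum app_action and app_next_action() are hypothetical
names, and a real caller would obtain the error class from
SSL_get_error() and perform the half-close with shutdown(fd, SHUT_WR).

```c
#include <assert.h>

/* Hypothetical error classes an application might derive after a -1. */
enum app_err { APP_ERR_WANT_READ, APP_ERR_WANT_WRITE, APP_ERR_OTHER };

enum app_action {
    APP_WAIT_WRITABLE,                /* notify not yet fully in the kernel */
    APP_HALF_CLOSE_AND_WAIT_READABLE, /* safe to shut down the sending side */
    APP_DONE,                         /* both notifies exchanged            */
    APP_ABORT                         /* hard error                         */
};

/* Map a shutdown return value (and error class, if -1) to the next step. */
enum app_action app_next_action(int ret, enum app_err err)
{
    assert(ret >= -1 && ret <= 1);
    if (ret == 1)
        return APP_DONE;
    /* 0 or -1/WANT_READ: the notify has left the BIO, so the sending
     * side of the socket can be closed while we wait for the peer. */
    if (ret == 0 || err == APP_ERR_WANT_READ)
        return APP_HALF_CLOSE_AND_WAIT_READABLE;
    if (err == APP_ERR_WANT_WRITE)
        return APP_WAIT_WRITABLE;
    return APP_ABORT;
}
```

The mapping is only trustworthy if 0 is never returned while notify bytes
are still buffered.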
But OpenSSL right now will always return "0", even in cases where the
last byte of the "shutdown notify" data is actually still inside BIO
because the kernel would not accept it due to a short write() system
call. Rare, but it can happen.
Darryl
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List openssl-dev@openssl.org
Automated List Manager [EMAIL PROTECTED]