> OpenSSL v9.8.2 on Windows 2000 Server in support of PostgreSQL 8.2 - my
> original bug report to the Postgres developers can be found at:
> http://archives.postgresql.org/pgsql-bugs/2006-12/msg00122.php
> 
> A follow-up to my report posted by Tom Lane, a member of Postgres' core
> dev team, includes the following:
> 
> [quote]
> [ pokes around... ]  I'm inclined to say it's an OpenSSL bug --- the
> message string corresponds to WSAEINTR, which to the extent I know
> anything about Windows (which is admittedly nil) seems like it should be
> treated the same as EINTR on Unix, ie, retry.  That is how Postgres
> treats it on regular unsecured socket connections.  But a look in the
> OpenSSL sources finds no indication that they treat it specially,
> which means they're going to think it's a hard error.
> [/quote]
> 
> Basically, I've had my server brought down on three occasions by full
> hard drives. They fill with Postgres' log files, logging the error
> message: "SSL SYSCALL error: A blocking operation was interrupted by a
> call to WSACancelBlockingCall." As I understand Tom's follow-up to my
> bug report, some part of my client-server interaction is causing what
> should be a soft error, but OpenSSL reports it to Postgres as a hard
> error, and Postgres in turn dutifully logs it... again and again and again.

According to MSDN, this error occurs only when WSACancelBlockingCall is
called. Now, if we assume that it's OpenSSL that calls
WSACancelBlockingCall, then it should be noted that it does so just
before WSACleanup, which in turn "terminates use of the winsock
interface," after which all winsock calls will fail. Googling also
suggests that the error code in question can be returned if the socket
is closed elsewhere. In other words, under these premises there is no
reason to treat it as "a soft error." One can argue that OpenSSL
shouldn't call BIO_sock_cleanup (that's where WSACancelBlockingCall and
WSACleanup are called), at least not the function as it presently
stands, unless it knows that the application is exiting [but it never
knows]. And it shouldn't have registered it as a signal handler. The
latter can also be the cause of the problem, i.e. when OpenSSL simply
overrides the application's signal handler: if the application sets
one, then the signal won't be handled the way the application expects.
Even though signal(3)-ing in Windows is a kind of misnomer...
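
To illustrate the override concern in isolation, here is a minimal
sketch in portable C. It is not OpenSSL's code: lib_init_overriding,
app_handler and lib_handler are hypothetical names, and SIGTERM is an
arbitrary choice. It only shows that signal() replaces whatever handler
was installed before and hands the old one back, so a library that
ignores the return value silently displaces the application's handler.

```c
#include <assert.h>
#include <signal.h>

typedef void (*sig_handler)(int);

static volatile sig_atomic_t app_handler_runs = 0;
static volatile sig_atomic_t lib_handler_runs = 0;

/* Handler the application installs for itself (hypothetical). */
static void app_handler(int sig) { (void)sig; app_handler_runs++; }

/* Handler a library installs behind the application's back (hypothetical). */
static void lib_handler(int sig) { (void)sig; lib_handler_runs++; }

/* Hypothetical library init routine: registers its own handler and
 * returns the handler it displaced. A library that discards this
 * return value can never chain to or restore the application's one. */
sig_handler lib_init_overriding(int sig)
{
    return signal(sig, lib_handler);
}

/* Runs the scenario once; returns 1 if the application's handler was
 * bypassed when the signal was delivered, 0 otherwise. */
int app_handler_is_bypassed(void)
{
    sig_handler displaced;

    app_handler_runs = 0;
    lib_handler_runs = 0;

    signal(SIGTERM, app_handler);              /* application sets its handler */
    displaced = lib_init_overriding(SIGTERM);  /* library overrides it */
    assert(displaced == app_handler);          /* old handler is handed back */

    raise(SIGTERM);                            /* only lib_handler runs */
    signal(SIGTERM, SIG_DFL);                  /* leave default disposition */

    return app_handler_runs == 0 && lib_handler_runs == 1;
}
```

Calling app_handler_is_bypassed() reports whether the application's
handler was skipped. A library that must hook a signal could at least
keep the displaced pointer and call it from its own handler; not
registering one at all avoids the question.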

I can see that Postgres aliases WSAEINTR to EINTR, but I can't find any
evidence that this error code is in any way equivalent to Unix EINTR, or
that it should be treated in a "just instantly retry the same call"
manner rather than as a fatal condition. I'd insist on starting by
removing signal(3) from b_sock.c [according to
http://cvs.openssl.org/chngview?cn=16556]. A.
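
For reference, the two treatments being contrasted can be sketched as
follows. This is an illustration only, under stated assumptions: it is
neither Postgres' nor OpenSSL's code, it uses the Unix EINTR from
<errno.h> rather than WSAEINTR, and call_with_policy, fake_call and
struct fake_io are hypothetical names. The retry cap is my own addition,
so that an error which never clears cannot loop forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* An I/O operation: returns >= 0 on success, -1 with errno set on failure. */
typedef long (*io_call)(void *ctx);

enum eintr_policy {
    EINTR_RETRY,    /* "soft" error: issue the same call again */
    EINTR_IS_FATAL  /* "hard" error: report failure to the caller */
};

/* Invokes fn under the given policy. With EINTR_RETRY the call is
 * reissued at most max_retries times after the first attempt. The
 * total number of attempts is stored in *attempts when it is non-NULL. */
long call_with_policy(io_call fn, void *ctx, enum eintr_policy policy,
                      int max_retries, int *attempts)
{
    int tries = 0;

    for (;;) {
        long rc;

        errno = 0;
        rc = fn(ctx);
        tries++;
        if (attempts != NULL)
            *attempts = tries;

        if (rc >= 0)
            return rc;                 /* success */
        if (errno != EINTR)
            return -1;                 /* unrelated error: always fatal */
        if (policy == EINTR_IS_FATAL)
            return -1;                 /* interruption treated as hard error */
        if (tries > max_retries)
            return -1;                 /* retry budget exhausted */
    }
}

/* Test double (hypothetical): fails with EINTR a fixed number of times,
 * then succeeds with the configured result. */
struct fake_io {
    int  interrupts_left;
    long result;
};

long fake_call(void *ctx)
{
    struct fake_io *f = ctx;

    if (f->interrupts_left > 0) {
        f->interrupts_left--;
        errno = EINTR;
        return -1;
    }
    return f->result;
}
```

With a fake_io that is interrupted twice, EINTR_RETRY succeeds on the
third attempt, while EINTR_IS_FATAL gives up on the first. Which policy
is right for WSAEINTR is exactly the open question above.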


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List
