Re: [HACKERS] Failing SSL connection due to weird interaction with openssl

2012-12-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
> FWICS, this kind of problem is endemic in OpenSSL, which
> also doesn't seem to believe in comprehensive documentation or code
> comments.  It would be nice if we had an API to some other, less
> crappy encryption library; or maybe even some generic API that lets
> you easily wire it into any library you happen to wish to use.

A while back Red Hat was trying to get people to switch to NSS or GnuTLS,
which apparently are better designed.

> Not that I'm volunteering to write the patch... :-(

Me either ... and in fact the lack of interest among upstreams in
rewriting their TLS code is what made the aforesaid effort crash and
burn.  But FWIW, there are better alternatives out there.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Failing SSL connection due to weird interaction with openssl

2012-12-08 Thread Andres Freund
On 2012-11-26 21:45:32 -0500, Tom Lane wrote:
> Alvaro Herrera alvhe...@2ndquadrant.com writes:
> > I gather that this is supposed to be back-patched to all supported
> > branches.
>
> FWIW, I don't like that patch any better than Robert does.  It seems
> as likely to do harm as good.  If there are places where libpq itself
> is leaving entries on the error stack, we should fix them -- retail.
> But it's not our business to clobber global state because there might
> be bugs in some other part of an application.

As there hasn't been any new input since this comment I am marking the
patch as Rejected in the CF application.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Failing SSL connection due to weird interaction with openssl

2012-11-26 Thread Alvaro Herrera
Lars Kanis wrote:
> While investigating a ruby-pg issue [1], we noticed that a libpq SSL
> connection can fail, if the running application uses OpenSSL for
> other work, too. Root cause is the thread local error queue of
> OpenSSL, that is used to transmit textual error messages to the
> application after a failed crypto operation. In case that the
> application leaves errors on the queue, the communication to the
> PostgreSQL server can fail with a message left from the previous
> failed OpenSSL operation, in particular when using non-blocking
> operations on the socket. This issue with openssl is quite old now -
> see [3].

I gather that this is supposed to be back-patched to all supported
branches.

> [3]
> http://www.educatedguesswork.org/movabletype/archives/2005/03/curse_you_opens.html

This link is dead.  Here's one that works:
http://www.educatedguesswork.org/2005/03/curse_you_opens.html


-- 
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Failing SSL connection due to weird interaction with openssl

2012-11-26 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
> I gather that this is supposed to be back-patched to all supported
> branches.

FWIW, I don't like that patch any better than Robert does.  It seems
as likely to do harm as good.  If there are places where libpq itself
is leaving entries on the error stack, we should fix them -- retail.
But it's not our business to clobber global state because there might
be bugs in some other part of an application.

regards, tom lane




Re: [HACKERS] Failing SSL connection due to weird interaction with openssl

2012-11-11 Thread Lars Kanis
On 06.11.2012 at 21:40, Robert Haas wrote:
> On Tue, Oct 23, 2012 at 4:09 AM, Lars Kanis l...@greiz-reinsdorf.de wrote:
> > While investigating a ruby-pg issue [1], we noticed that a libpq SSL
> > connection can fail, if the running application uses OpenSSL for other work,
> > too. Root cause is the thread local error queue of OpenSSL, that is used to
> > transmit textual error messages to the application after a failed crypto
> > operation. In case that the application leaves errors on the queue, the
> > communication to the PostgreSQL server can fail with a message left from the
> > previous failed OpenSSL operation, in particular when using non-blocking
> > operations on the socket. This issue with openssl is quite old now - see
> > [3].
> >
> > For [1] it turned out that the issue is subdivided into these three parts:
> > 1. the ruby-openssl binding does not clear the thread local error queue of
> > OpenSSL after a certificate verify
> > 2. OpenSSL makes use of a shared error queue for different crypto contexts.
> > 3. libpq does not ensure a cleared error queue when doing SSL_* calls
> >
> > To 1: Remaining messages on the error queue can generally lead to failing
> > operations, later on. I'd talk to the ruby-openssl developers, to discuss
> > how we can avoid any remaining messages on the queue.
> >
> > To 2: SSL_get_error() inspects the shared error queue under some conditions.
> > It's maybe poor API design, but it's documented behaviour [2]. So we
> > certainly have to get along with it.
> >
> > To 3: To make libpq independent to a previous error state, the error queue
> > might be cleared with a call to ERR_clear_error() prior
> > SSL_connect/read/write as in the attached trivial patch. This would make
> > libpq robust against other uses of openssl within the application.
> >
> > What do you think about clearing the OpenSSL error queue in libpq in that
> > way?
> Shouldn't it be the job of whatever code is consuming the error to
> clear the error queue afterwards?

Yes, of course. I already filed a bug for ruby-openssl, some weeks ago [1].

But IMHO libpq should be changed too, for the following reasons:

1. The behavior of libpq isn't consistent, since blocking calls are
already agnostic to remaining errors in the OpenSSL queue, but
non-blocking calls are not. This is an OpenSSL quirk that is exposed
through the libpq API this way.

2. libpq throws wrong errors. The error from libpq isn't "Remaining
errors in openssl error queue. libpq requires a clear error queue in
order to work correctly.", but instead it throws arbitrary foreign
errors that may or may not relate to libpq's communication. The
documentation for SSL_get_error(3) is pretty unambiguous about the need
to clear the error queue first.

3. The sensitivity of libpq to the error queue can lead to bugs that
are hard to track down, like this one [2]. This is because a libpq error
leads the developer to look for a bug related to the database
connection, although the issue is in a very different part of the code.

Regards,
Lars

[1] http://bugs.ruby-lang.org/issues/7215
[2]
https://bitbucket.org/ged/ruby-pg/issue/142/async_exec-over-ssl-connection-can-fail-on






Re: [HACKERS] Failing SSL connection due to weird interaction with openssl

2012-11-06 Thread Robert Haas
On Tue, Oct 23, 2012 at 4:09 AM, Lars Kanis l...@greiz-reinsdorf.de wrote:
> While investigating a ruby-pg issue [1], we noticed that a libpq SSL
> connection can fail, if the running application uses OpenSSL for other work,
> too. Root cause is the thread local error queue of OpenSSL, that is used to
> transmit textual error messages to the application after a failed crypto
> operation. In case that the application leaves errors on the queue, the
> communication to the PostgreSQL server can fail with a message left from the
> previous failed OpenSSL operation, in particular when using non-blocking
> operations on the socket. This issue with openssl is quite old now - see
> [3].
>
> For [1] it turned out that the issue is subdivided into these three parts:
> 1. the ruby-openssl binding does not clear the thread local error queue of
> OpenSSL after a certificate verify
> 2. OpenSSL makes use of a shared error queue for different crypto contexts.
> 3. libpq does not ensure a cleared error queue when doing SSL_* calls
>
> To 1: Remaining messages on the error queue can generally lead to failing
> operations, later on. I'd talk to the ruby-openssl developers, to discuss
> how we can avoid any remaining messages on the queue.
>
> To 2: SSL_get_error() inspects the shared error queue under some conditions.
> It's maybe poor API design, but it's documented behaviour [2]. So we
> certainly have to get along with it.
>
> To 3: To make libpq independent to a previous error state, the error queue
> might be cleared with a call to ERR_clear_error() prior
> SSL_connect/read/write as in the attached trivial patch. This would make
> libpq robust against other uses of openssl within the application.
>
> What do you think about clearing the OpenSSL error queue in libpq in that
> way?

Shouldn't it be the job of whatever code is consuming the error to
clear the error queue afterwards?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Failing SSL connection due to weird interaction with openssl

2012-10-23 Thread Lars Kanis
While investigating a ruby-pg issue [1], we noticed that a libpq SSL
connection can fail if the running application also uses OpenSSL for
other work. The root cause is OpenSSL's thread-local error queue, which
is used to pass textual error messages to the application after a
failed crypto operation. If the application leaves errors on the queue,
communication with the PostgreSQL server can fail with a message left
over from the earlier failed OpenSSL operation, in particular when
using non-blocking operations on the socket. This issue with OpenSSL is
quite old now; see [3].


For [1] it turned out that the issue is subdivided into these three parts:
1. the ruby-openssl binding does not clear the thread local error queue
of OpenSSL after a certificate verify
2. OpenSSL makes use of a shared error queue for different crypto contexts.
3. libpq does not ensure a cleared error queue when doing SSL_* calls

To 1: Remaining messages on the error queue can generally lead to 
failing operations, later on. I'd talk to the ruby-openssl developers, 
to discuss how we can avoid any remaining messages on the queue.


To 2: SSL_get_error() inspects the shared error queue under some 
conditions. It's maybe poor API design, but it's documented behaviour 
[2]. So we certainly have to get along with it.


To 3: To make libpq independent of any previous error state, the error
queue might be cleared with a call to ERR_clear_error() prior to
SSL_connect/read/write, as in the attached trivial patch. This would make
libpq robust against other uses of OpenSSL within the application.


What do you think about clearing the OpenSSL error queue in libpq in 
that way?
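
To make the failure mode concrete, here is a minimal, self-contained C model of it. This is only a sketch: every toy_* name and the two read_* helpers are hypothetical and are not OpenSSL or libpq code. The ordering inside toy_get_error() (a successful call is never misread, but a failed or would-block call is classified by looking at the shared queue first) is an assumption modeled on the behaviour described above for SSL_get_error().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for OpenSSL's per-thread error queue (hypothetical). */
enum { TOY_QUEUE_CAP = 16 };
static unsigned long toy_queue[TOY_QUEUE_CAP];
static size_t toy_queue_len = 0;

/* Stand-in for some other OpenSSL user leaving an error behind. */
void toy_put_error(unsigned long code)
{
    if (toy_queue_len < TOY_QUEUE_CAP)
        toy_queue[toy_queue_len++] = code;
}

/* Oldest queued error code, or 0 if the queue is empty. */
unsigned long toy_peek_error(void)
{
    return toy_queue_len ? toy_queue[0] : 0;
}

/* Stand-in for ERR_clear_error(): empty the queue. */
void toy_clear_error(void)
{
    toy_queue_len = 0;
}

enum toy_result { TOY_NONE, TOY_WANT_READ, TOY_ERROR_SSL };

/* Classify the result of an I/O call. The queue is consulted only when
 * the call did not succeed, which is why calls that return data are
 * unaffected while would-block returns are. */
enum toy_result toy_get_error(int ret)
{
    if (ret > 0)
        return TOY_NONE;
    if (toy_peek_error() != 0)
        return TOY_ERROR_SSL;   /* stale entry is taken for a real failure */
    return TOY_WANT_READ;       /* the actual condition: no data yet */
}

/* A non-blocking read with no data available: returns -1, queues nothing. */
int toy_nonblocking_read(void)
{
    return -1;
}

/* Current behaviour in this model: classify without clearing first. */
enum toy_result read_without_clear(void)
{
    int ret = toy_nonblocking_read();
    return toy_get_error(ret);
}

/* Proposed behaviour in this model: clear the queue before the call. */
enum toy_result read_with_clear(void)
{
    int ret;

    toy_clear_error();
    ret = toy_nonblocking_read();
    return toy_get_error(ret);
}
```

In this model, after toy_put_error(42) a call to read_without_clear() returns TOY_ERROR_SSL although the read merely had no data yet, while read_with_clear() returns TOY_WANT_READ. The cost of the second variant is that it also discards whatever the other code left on the queue.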


[1] 
https://bitbucket.org/ged/ruby-pg/issue/142/async_exec-over-ssl-connection-can-fail-on

[2] http://www.openssl.org/docs/ssl/SSL_get_error.html
[3] 
http://www.educatedguesswork.org/movabletype/archives/2005/03/curse_you_opens.html


diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index b1ad776..2a09c5c 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -323,6 +323,8 @@ pqsecure_read(PGconn *conn, void *ptr, size_t len)
 
 		/* SSL_read can write to the socket, so we need to disable SIGPIPE */
 		DISABLE_SIGPIPE(conn, spinfo, return -1);
+		/* There could be errors left on OpenSSL's error queue from the application */
+		ERR_clear_error();
 
 rloop:
 		SOCK_ERRNO_SET(0);
@@ -485,6 +487,8 @@ pqsecure_write(PGconn *conn, const void *ptr, size_t len)
 		int			err;
 
 		DISABLE_SIGPIPE(conn, spinfo, return -1);
+		/* There could be errors left on OpenSSL's error queue from the application */
+		ERR_clear_error();
 
 		SOCK_ERRNO_SET(0);
 		n = SSL_write(conn->ssl, ptr, len);
@@ -1375,6 +1379,9 @@ open_client_SSL(PGconn *conn)
 {
 	int			r;
 
+	/* There could be errors left on OpenSSL's error queue from the application */
+	ERR_clear_error();
+
 	r = SSL_connect(conn->ssl);
 	if (r <= 0)
 	{
