On 01/26/2015 12:14 PM, Andres Freund wrote:
Hi,

When working on getting rid of ImmediateInterruptOK I wanted to verify
that ssl still works correctly. Turned out it didn't. But neither did it
in master.

Turns out there's two major things we do wrong:

1) We ignore the rule that once called and returning
    SSL_ERROR_WANTS_(READ|WRITE) SSL_read()/write() have to be called
    again with the same parameters. Unfortunately that rule doesn't mean
    just that the same parameters have to be passed in, but also that we
    can't just constantly switch between _read()/write(). Especially
    nonblocking backend code (i.e. walsender) and the whole frontend code
    violate this rule.

I don't buy that. Googling around, and looking at the man pages, I just don't see anything to support that claim. I believe it is supported to switch between reads and writes.

2) We start renegotiations in be_tls_write() while in nonblocking mode,
    but don't properly retry to handle socket readyness. There's a loop
    that retries handshakes twenty times (???), but what actually is
    needed is to call SSL_get_error() and ensure that there's actually
    data available.

Yeah, that's just crazy.

Actually, I don't think we need to call SSL_do_handshake() at all. At least on modern OpenSSL versions. OpenSSL internally uses a state machine, and SSL_read() and SSL_write() calls will complete the handshake just as well.

2) is easy enough to fix, but 1) is pretty hard. Before anybody says
that 1) isn't an important rule: It reliably causes connection aborts
within a couple renegotiations. The higher the latency the higher the
likelihood of aborts. Even locally it doesn't take very long to
abort. Errors usually are something like "SSL connection has been closed
unexpectedly" or "SSL Error: sslv3 alert unexpected message" and a host
of other similar messages. There's a couple reports of those in the
archives and I've seen many more in client logs.

Yeah, I can reproduce that. There's clearly something wrong.

I believe this is an OpenSSL bug. I traced the "unexpected message" error to this code in OpenSSL's s3_read_bytes() function:

    case SSL3_RT_APPLICATION_DATA:
        /*
         * At this point, we were expecting handshake data, but have
         * application data.  If the library was running inside ssl3_read()
         * (i.e. in_read_app_data is set) and it makes sense to read
         * application data at this point (session renegotiation not yet
         * started), we will indulge it.
         */
        if (s->s3->in_read_app_data &&
            (s->s3->total_renegotiations != 0) &&
            (((s->state & SSL_ST_CONNECT) &&
              (s->state >= SSL3_ST_CW_CLNT_HELLO_A) &&
              (s->state <= SSL3_ST_CR_SRVR_HELLO_A)
             ) || ((s->state & SSL_ST_ACCEPT) &&
                   (s->state <= SSL3_ST_SW_HELLO_REQ_A) &&
                   (s->state >= SSL3_ST_SR_CLNT_HELLO_A)
             )
            )) {
            s->s3->in_read_app_data = 2;
            return (-1);
        } else {
            al = SSL_AD_UNEXPECTED_MESSAGE;
            SSLerr(SSL_F_SSL3_READ_BYTES, SSL_R_UNEXPECTED_RECORD);
            goto f_err;
        }

So that quite clearly throws an error, *unless* the original application call was SSL_read(). As you also concluded, OpenSSL doesn't like it when SSL_write() is called when it's in renegotiation. I think that's OpenSSL's fault and it should "indulge" the application data message even in SSL_write().

Can we work-around that easily? I believe we can. The crucial part is that we mustn't let SSL_write() to process any incoming application data. We can achieve that if we always call SSL_read() to drain such data, before calling SSL_write(). But that's not quite enough; if we're in renegotiation, SSL_write() might still try to read more data from the socket that has arrived after the SSL_read() call. Fortunately, we can easily prevent that by hacking pqsecure_raw_read() to always return EWOULDBLOCK, when it's called through SSL_write().

The attached patch does that. I've been running your pg_receivexlog test case with this for about 15 minutes without any errors now, with ssl_renegotiation_limit=50kB, while before it errored out within seconds.

In theory, I guess we should do similar hacks in the server, and always call SSL_read() before SSL_write(), but it seems to work without it. Or maybe not; OpenSSL server and client code is not symmetric, so perhaps it works in server mode without that.

PS. The server code started renegotiation when it's 1kB from reaching ssl_renegotiation_limit. I increased that to 32kB, because in testing, I got some "SSL failed to renegotiate connection before limit expired" errors in testing before I did that.

PPS. I did all testing of this patch with my sleeping logic simplification patch applied (http://www.postgresql.org/message-id/54d3821e.9060...@vmware.com). It applies without it too, and I don't think it matters, but I thought I'd mention it.

- Heikki

From 4b1c8ef107dc195c0b4a493d72c6ffe97077318c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 5 Feb 2015 22:44:58 +0200
Subject: [PATCH 1/1] SSL renegotiation fixes.

Remove the ugly "retry twenty times" from server renegotiation code. In
fact, calling SSL_do_handshake() after SSL_renegotiate() is not necessary
at all. SSL_read() and SSL_write() will perform the handshake as well.

In the client-side, avoid letting SSL_write() to read data. That sometimes
makes OpenSSL throw "unexpected message" and other errors, if renegotiation
is happening. Before calling SSL_write(), set a flag that prevents
pqsecure_raw_read() from reading any more data from the socket, and call
SSL_read() to drain any remaining data from OpenSSL's buffer. That makes
ensures that SSL_read() always handles all reads, which seems to be
necessary to make renegotiations robust with OpenSSL.

Also start renegotiation a little bit earlier than before. We used to start
it 1kB before reaching ssl_renegotiation_limit. That's too little, there can
easily be 1kB of data in-flight from the client, before it gets our
renegotiation request.
---
 src/backend/libpq/be-secure-openssl.c    | 25 ++-----------------------
 src/interfaces/libpq/fe-misc.c           | 16 +++++++++++++++-
 src/interfaces/libpq/fe-secure-openssl.c |  1 -
 src/interfaces/libpq/fe-secure.c         | 15 +++++++++++++++
 src/interfaces/libpq/libpq-int.h         |  3 +++
 5 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 39587e8..e847f41 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -576,11 +576,11 @@ be_tls_write(Port *port, void *ptr, size_t len, int *waitfor)
 	/*
 	 * If SSL renegotiations are enabled and we're getting close to the
 	 * limit, start one now; but avoid it if there's one already in
-	 * progress.  Request the renegotiation 1kB before the limit has
+	 * progress.  Request the renegotiation 32kB before the limit has
 	 * actually expired.
 	 */
 	if (ssl_renegotiation_limit && !in_ssl_renegotiation &&
-		port->count > (ssl_renegotiation_limit - 1) * 1024L)
+		port->count > (ssl_renegotiation_limit - 32) * 1024L)
 	{
 		in_ssl_renegotiation = true;
 
@@ -601,27 +601,6 @@ be_tls_write(Port *port, void *ptr, size_t len, int *waitfor)
 			ereport(COMMERROR,
 					(errcode(ERRCODE_PROTOCOL_VIOLATION),
 					 errmsg("SSL failure during renegotiation start")));
-		else
-		{
-			int			retries;
-
-			/*
-			 * A handshake can fail, so be prepared to retry it, but only
-			 * a few times.
-			 */
-			for (retries = 0;; retries++)
-			{
-				if (SSL_do_handshake(port->ssl) > 0)
-					break;	/* done */
-				ereport(COMMERROR,
-						(errcode(ERRCODE_PROTOCOL_VIOLATION),
-						 errmsg("SSL handshake failure on renegotiation, retrying")));
-				if (retries >= 20)
-					ereport(FATAL,
-							(errcode(ERRCODE_PROTOCOL_VIOLATION),
-							 errmsg("could not complete SSL handshake on renegotiation, too many failures")));
-			}
-		}
 	}
 
 	errno = 0;
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 945e283..4c47821 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -834,7 +834,7 @@ pqSendSome(PGconn *conn, int len)
 {
 	char	   *ptr = conn->outBuffer;
 	int			remaining = conn->outCount;
-	int			result = 0;
+	int			result;
 
 	if (conn->sock == PGINVALID_SOCKET)
 	{
@@ -845,11 +845,24 @@ pqSendSome(PGconn *conn, int len)
 		return -1;
 	}
 
+#ifdef USE_OPENSSL
+	if (conn->ssl)
+	{
+		block_reads = true;
+		result = pqReadData(conn);
+		block_reads = false;
+		if (result < 0)
+			return -1;
+	}
+#endif
+
 	/* while there's still data to send */
+	result = 0;
 	while (len > 0)
 	{
 		int			sent;
 
+		block_reads = true;
 #ifndef WIN32
 		sent = pqsecure_write(conn, ptr, len);
 #else
@@ -861,6 +874,7 @@ pqSendSome(PGconn *conn, int len)
 		 */
 		sent = pqsecure_write(conn, ptr, Min(len, 65536));
 #endif
+		block_reads = false;
 
 		if (sent < 0)
 		{
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index a32af34..38e7071 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -100,7 +100,6 @@ static long win32_ssl_create_mutex = 0;
 #endif
 #endif   /* ENABLE_THREAD_SAFETY */
 
-
 /* ------------------------------------------------------------ */
 /*			 Procedures common to all secure sessions			*/
 /* ------------------------------------------------------------ */
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index 57c572b..aac405b 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -124,6 +124,9 @@ struct sigpipe_info
 #define RESTORE_SIGPIPE(conn, spinfo)
 #endif   /* WIN32 */
 
+bool block_reads = false;
+bool block_writes = false;
+
 /* ------------------------------------------------------------ */
 /*			 Procedures common to all secure sessions			*/
 /* ------------------------------------------------------------ */
@@ -227,6 +230,12 @@ pqsecure_raw_read(PGconn *conn, void *ptr, size_t len)
 	int			result_errno = 0;
 	char		sebuf[256];
 
+	if (block_reads)
+	{
+		errno = EWOULDBLOCK;
+		return -1;
+	}
+
 	n = recv(conn->sock, ptr, len, 0);
 
 	if (n < 0)
@@ -307,6 +316,12 @@ pqsecure_raw_write(PGconn *conn, const void *ptr, size_t len)
 
 	DECLARE_SIGPIPE_INFO(spinfo);
 
+	if (block_writes)
+	{
+		errno = EWOULDBLOCK;
+		return -1;
+	}
+
 #ifdef MSG_NOSIGNAL
 	if (conn->sigpipe_flag)
 		flags |= MSG_NOSIGNAL;
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 008fd67..ce55411 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -618,6 +618,9 @@ extern int	pqWriteReady(PGconn *conn);
 
 /* === in fe-secure.c === */
 
+extern bool block_reads;
+extern bool block_writes;
+
 extern int	pqsecure_initialize(PGconn *);
 extern void pqsecure_destroy(void);
 extern PostgresPollingStatusType pqsecure_open_client(PGconn *);
-- 
2.1.4

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to