Attached patch fixes an issue reported by William Felipe Welter about
a year ago [1]. It is loosely based on his original patch.

As Heikki goes into on that thread, the appropriate action seems to be
to constantly reset the error queue, and to make sure that we
ourselves clear the queue consistently. (Note that we might not have
consistently called ERR_get_error() in the event of an OOM within
SSLerrmessage(), for example). I have not changed backend code in the
patch, though; I felt that we had enough control of the general
situation there for it to be unnecessary to lock everything down.

The interface that OpenSSL exposes for all of this is very poorly
thought out. It's not exactly clear how a client of OpenSSL can be a
"good citizen" in handling the error queue. Correctly using the
library is only ever described in terms of a very exact thing that
must happen or must not happen. There is no overarching concept of how
things fit together so that each OpenSSL client doesn't clobber the
other. It's all rather impractical. Clearly, this patch needs careful
review.

[1] 
http://www.postgresql.org/message-id/flat/20150224030956.2529.83...@wrigleys.postgresql.org#20150224030956.2529.83...@wrigleys.postgresql.org
-- 
Peter Geoghegan
From d5433bc562f792afd75ef8664729db9e6a60ee62 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghega...@gmail.com>
Date: Tue, 26 Jan 2016 15:11:15 -0800
Subject: [PATCH] Distrust external OpenSSL clients; clear err queue

OpenSSL has an unfortunate tendency to mix up per-session state error
handling, and per-thread OpenSSL error handling.  This can cause
problems when programs that link to libpq with OpenSSL enabled have some
other use of OpenSSL; without care, one caller of OpenSSL may cause
problems for the other caller.

To fix, don't trust other users of OpenSSL to clear the per-thread error
queue.  Instead, clear the entire per-thread queue ahead of certain I/O
operations (these I/O operations mostly need to call SSL_get_error() to
check for success, which relies on the queue being empty).  This is a
bit aggressive, but it's pretty clear that the other callers have a very
dubious claim to ownership of the per-thread queue; problem reports
involving PHP scripts that use both PHP's OpenSSL extension, and the
pgsql PHP extension (libpq) are themselves evidence of this.

Finally, be more careful about clearing our own error queue, so as to
not cause these problems ourself.  It's possibly that control previously
did not always reach SSLerrmessage(), where ERR_get_error() was supposed
to be called to clear the queue's earliest code.  Make sure
ERR_get_error() is always called, so as to spare other users of OpenSSL
the possibility of similar problems caused by libpq (as opposed to
problems caused by a third party OpenSSL library like PHP's OpenSSL
extension).
---
 src/interfaces/libpq/fe-secure-openssl.c | 125 +++++++++++++++++++++++++------
 1 file changed, 104 insertions(+), 21 deletions(-)

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 133546b..270d184 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -70,7 +70,7 @@ static int verify_peer_name_matches_certificate_name(PGconn *conn,
 static void destroy_ssl_system(void);
 static int	initialize_SSL(PGconn *conn);
 static PostgresPollingStatusType open_client_SSL(PGconn *);
-static char *SSLerrmessage(void);
+static char *SSLerrmessage(unsigned long errcode);
 static void SSLerrfree(char *buf);
 
 static int	my_sock_read(BIO *h, char *buf, int size);
@@ -152,7 +152,7 @@ pgtls_open_client(PGconn *conn)
 			!SSL_set_app_data(conn->ssl, conn) ||
 			!my_SSL_set_fd(conn, conn->sock))
 		{
-			char	   *err = SSLerrmessage();
+			char	   *err = SSLerrmessage(ERR_get_error());
 
 			printfPQExpBuffer(&conn->errorMessage,
 				   libpq_gettext("could not establish SSL connection: %s\n"),
@@ -209,11 +209,37 @@ pgtls_read(PGconn *conn, void *ptr, size_t len)
 	int			result_errno = 0;
 	char		sebuf[256];
 	int			err;
+	unsigned long errcode;
 
 rloop:
 	SOCK_ERRNO_SET(0);
+
+	/*
+	 * Prepare to call SSL_get_error() by clearing thread's OpenSSL error
+	 * queue.  In general, the current thread's error queue must be empty
+	 * before the TLS/SSL I/O operation is attempted, or SSL_get_error()
+	 * will not work reliably.  Since the possibility exists that other
+	 * OpenSSL clients running in the same thread but not under our control
+	 * will fail to call ERR_get_error() themselves (after their own I/O
+	 * operations), pro-actively clear the per-thread error queue now.
+	 */
+	ERR_clear_error();
 	n = SSL_read(conn->ssl, ptr, len);
 	err = SSL_get_error(conn->ssl, n);
+
+	/*
+	 * Other clients of OpenSSL may fail to call ERR_get_error(), but we
+	 * always do (so as to not cause problems for OpenSSL clients that
+	 * don't call ERR_clear_error() defensively, but are responsible enough
+	 * to call ERR_get_error() to clear error codes they add to the queue).
+	 * This should be harmless in the common case where there is no such
+	 * entry.  Be sure that this happens by calling immediately.
+	 * SSL_get_error() relies on the OpenSSL per-thread error queue being
+	 * intact, so this is the earliest possible point ERR_get_error() may
+	 * be called.
+	 */
+	errcode = ERR_get_error();
+
 	switch (err)
 	{
 		case SSL_ERROR_NONE:
@@ -266,7 +292,7 @@ rloop:
 			break;
 		case SSL_ERROR_SSL:
 			{
-				char	   *errm = SSLerrmessage();
+				char	   *errm = SSLerrmessage(errcode);
 
 				printfPQExpBuffer(&conn->errorMessage,
 								  libpq_gettext("SSL error: %s\n"), errm);
@@ -318,10 +344,36 @@ pgtls_write(PGconn *conn, const void *ptr, size_t len)
 	int			result_errno = 0;
 	char		sebuf[256];
 	int			err;
+	unsigned long errcode;
 
 	SOCK_ERRNO_SET(0);
+
+	/*
+	 * Prepare to call SSL_get_error() by clearing thread's OpenSSL error
+	 * queue.  In general, the current thread's error queue must be empty
+	 * before the TLS/SSL I/O operation is attempted, or SSL_get_error()
+	 * will not work reliably.  Since the possibility exists that other
+	 * OpenSSL clients running in the same thread but not under our control
+	 * will fail to call ERR_get_error() themselves (after their own I/O
+	 * operations), pro-actively clear the per-thread error queue now.
+	 */
+	ERR_clear_error();
 	n = SSL_write(conn->ssl, ptr, len);
 	err = SSL_get_error(conn->ssl, n);
+
+	/*
+	 * Other clients of OpenSSL may fail to call ERR_get_error(), but we
+	 * always do (so as to not cause problems for OpenSSL clients that
+	 * don't call ERR_clear_error() defensively, but are responsible enough
+	 * to call ERR_get_error() to clear error codes they add to the queue).
+	 * This should be harmless in the common case where there is no such
+	 * entry.  Be sure that this happens by calling immediately.
+	 * SSL_get_error() relies on the OpenSSL per-thread error queue being
+	 * intact, so this is the earliest possible point ERR_get_error() may
+	 * be called.
+	 */
+	errcode = ERR_get_error();
+
 	switch (err)
 	{
 		case SSL_ERROR_NONE:
@@ -372,7 +424,7 @@ pgtls_write(PGconn *conn, const void *ptr, size_t len)
 			break;
 		case SSL_ERROR_SSL:
 			{
-				char	   *errm = SSLerrmessage();
+				char	   *errm = SSLerrmessage(errcode);
 
 				printfPQExpBuffer(&conn->errorMessage,
 								  libpq_gettext("SSL error: %s\n"), errm);
@@ -839,7 +891,7 @@ pgtls_init(PGconn *conn)
 		SSL_context = SSL_CTX_new(SSLv23_method());
 		if (!SSL_context)
 		{
-			char	   *err = SSLerrmessage();
+			char	   *err = SSLerrmessage(ERR_get_error());
 
 			printfPQExpBuffer(&conn->errorMessage,
 						 libpq_gettext("could not create SSL context: %s\n"),
@@ -1009,7 +1061,7 @@ initialize_SSL(PGconn *conn)
 #endif
 		if (SSL_CTX_use_certificate_chain_file(SSL_context, fnbuf) != 1)
 		{
-			char	   *err = SSLerrmessage();
+			char	   *err = SSLerrmessage(ERR_get_error());
 
 			printfPQExpBuffer(&conn->errorMessage,
 			   libpq_gettext("could not read certificate file \"%s\": %s\n"),
@@ -1024,7 +1076,7 @@ initialize_SSL(PGconn *conn)
 
 		if (SSL_use_certificate_file(conn->ssl, fnbuf, SSL_FILETYPE_PEM) != 1)
 		{
-			char	   *err = SSLerrmessage();
+			char	   *err = SSLerrmessage(ERR_get_error());
 
 			printfPQExpBuffer(&conn->errorMessage,
 			   libpq_gettext("could not read certificate file \"%s\": %s\n"),
@@ -1079,7 +1131,7 @@ initialize_SSL(PGconn *conn)
 			conn->engine = ENGINE_by_id(engine_str);
 			if (conn->engine == NULL)
 			{
-				char	   *err = SSLerrmessage();
+				char	   *err = SSLerrmessage(ERR_get_error());
 
 				printfPQExpBuffer(&conn->errorMessage,
 					 libpq_gettext("could not load SSL engine \"%s\": %s\n"),
@@ -1091,7 +1143,7 @@ initialize_SSL(PGconn *conn)
 
 			if (ENGINE_init(conn->engine) == 0)
 			{
-				char	   *err = SSLerrmessage();
+				char	   *err = SSLerrmessage(ERR_get_error());
 
 				printfPQExpBuffer(&conn->errorMessage,
 				libpq_gettext("could not initialize SSL engine \"%s\": %s\n"),
@@ -1107,7 +1159,7 @@ initialize_SSL(PGconn *conn)
 										   NULL, NULL);
 			if (pkey == NULL)
 			{
-				char	   *err = SSLerrmessage();
+				char	   *err = SSLerrmessage(ERR_get_error());
 
 				printfPQExpBuffer(&conn->errorMessage,
 								  libpq_gettext("could not read private SSL key \"%s\" from engine \"%s\": %s\n"),
@@ -1121,7 +1173,7 @@ initialize_SSL(PGconn *conn)
 			}
 			if (SSL_use_PrivateKey(conn->ssl, pkey) != 1)
 			{
-				char	   *err = SSLerrmessage();
+				char	   *err = SSLerrmessage(ERR_get_error());
 
 				printfPQExpBuffer(&conn->errorMessage,
 								  libpq_gettext("could not load private SSL key \"%s\" from engine \"%s\": %s\n"),
@@ -1177,7 +1229,7 @@ initialize_SSL(PGconn *conn)
 
 		if (SSL_use_PrivateKey_file(conn->ssl, fnbuf, SSL_FILETYPE_PEM) != 1)
 		{
-			char	   *err = SSLerrmessage();
+			char	   *err = SSLerrmessage(ERR_get_error());
 
 			printfPQExpBuffer(&conn->errorMessage,
 			   libpq_gettext("could not load private key file \"%s\": %s\n"),
@@ -1191,7 +1243,7 @@ initialize_SSL(PGconn *conn)
 	if (have_cert &&
 		SSL_check_private_key(conn->ssl) != 1)
 	{
-		char	   *err = SSLerrmessage();
+		char	   *err = SSLerrmessage(ERR_get_error());
 
 		printfPQExpBuffer(&conn->errorMessage,
 						  libpq_gettext("certificate does not match private key file \"%s\": %s\n"),
@@ -1229,7 +1281,7 @@ initialize_SSL(PGconn *conn)
 #endif
 		if (SSL_CTX_load_verify_locations(SSL_context, fnbuf, NULL) != 1)
 		{
-			char	   *err = SSLerrmessage();
+			char	   *err = SSLerrmessage(ERR_get_error());
 
 			printfPQExpBuffer(&conn->errorMessage,
 							  libpq_gettext("could not read root certificate file \"%s\": %s\n"),
@@ -1259,7 +1311,7 @@ initialize_SSL(PGconn *conn)
 				X509_STORE_set_flags(cvstore,
 						  X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL);
 #else
-				char	   *err = SSLerrmessage();
+				char	   *err = SSLerrmessage(ERR_get_error());
 
 				printfPQExpBuffer(&conn->errorMessage,
 								  libpq_gettext("SSL library does not support CRL certificates (file \"%s\")\n"),
@@ -1326,8 +1378,31 @@ static PostgresPollingStatusType
 open_client_SSL(PGconn *conn)
 {
 	int			r;
+	unsigned long errcode;
 
+	/*
+	 * Prepare to call SSL_get_error() by clearing thread's OpenSSL error
+	 * queue.  In general, the current thread's error queue must be empty
+	 * before the TLS/SSL I/O operation is attempted, or SSL_get_error()
+	 * will not work reliably (we don't actually call it here, but be
+	 * consistent).  Since the possibility exists that other OpenSSL
+	 * clients running in the same thread but not under our control will
+	 * fail to call ERR_get_error() themselves (after their own I/O
+	 * operations), pro-actively clear the per-thread error queue now.
+	 */
+	ERR_clear_error();
 	r = SSL_connect(conn->ssl);
+
+	/*
+	 * Other clients of OpenSSL may fail to call ERR_get_error(), but we
+	 * always do (so as to not cause problems for OpenSSL clients that
+	 * don't call ERR_clear_error() defensively, but are responsible enough
+	 * to call ERR_get_error() to clear error codes they add to the queue).
+	 * This should be harmless in the common case where there is no such
+	 * entry.  Be sure that this happens by calling immediately.
+	 */
+	errcode = ERR_get_error();
+
 	if (r <= 0)
 	{
 		int			err = SSL_get_error(conn->ssl, r);
@@ -1356,7 +1431,7 @@ open_client_SSL(PGconn *conn)
 				}
 			case SSL_ERROR_SSL:
 				{
-					char	   *err = SSLerrmessage();
+					char	   *err = SSLerrmessage(errcode);
 
 					printfPQExpBuffer(&conn->errorMessage,
 									  libpq_gettext("SSL error: %s\n"),
@@ -1384,7 +1459,12 @@ open_client_SSL(PGconn *conn)
 	conn->peer = SSL_get_peer_certificate(conn->ssl);
 	if (conn->peer == NULL)
 	{
-		char	   *err = SSLerrmessage();
+
+		char	   *err;
+
+		/* Call ERR_get_error() again for this case */
+		errcode = ERR_get_error();
+		err = SSLerrmessage(errcode);
 
 		printfPQExpBuffer(&conn->errorMessage,
 					libpq_gettext("certificate could not be obtained: %s\n"),
@@ -1456,7 +1536,12 @@ pgtls_close(PGconn *conn)
 
 
 /*
- * Obtain reason string for last SSL error
+ * Obtain reason string for passed SSL errcode
+ *
+ * ERR_get_error() is used to get errcode for reporting.  This is
+ * left to caller, since ERR_get_error() must be called at a
+ * point where the thread's OpenSSL error queue needs to be
+ * cleared.
  *
  * Some caution is needed here since ERR_reason_error_string will
  * return NULL if it doesn't recognize the error code.  We don't
@@ -1467,16 +1552,14 @@ static char ssl_nomem[] = "out of memory allocating error description";
 #define SSL_ERR_LEN 128
 
 static char *
-SSLerrmessage(void)
+SSLerrmessage(unsigned long errcode)
 {
-	unsigned long errcode;
 	const char *errreason;
 	char	   *errbuf;
 
 	errbuf = malloc(SSL_ERR_LEN);
 	if (!errbuf)
 		return ssl_nomem;
-	errcode = ERR_get_error();
 	if (errcode == 0)
 	{
 		snprintf(errbuf, SSL_ERR_LEN, libpq_gettext("no SSL error reported"));
-- 
1.9.1

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to