Re: libpq: Fix wrong connection status on invalid "connect_timeout"

2019-10-20 Thread Michael Paquier
On Fri, Oct 18, 2019 at 02:01:23PM +0200, Lars Kanis wrote:
> Am 18.10.19 um 05:06 schrieb Michael Paquier:
>> So attached is a patch to skip trailing whitespaces as well,
>> which also fixes the issue with ECPG.  I have refactored the parsing
>> logic a bit while on it.  The comment at the top of parse_int_param()
>> needs to be reworked a bit more.
> 
> I tested this and it looks good to me. Maybe you could omit some
> redundant 'end' checks, as in the attached patch. Or was your intention
> to verify non-NULL 'end'?

Yes.  Here are the connection patterns I have tested.  These now pass:
'postgresql:///postgres?host=/tmp=5432  =postgres'
'postgresql:///postgres?host=/tmp=  5432=postgres'
And these fail (overflow on third one):
'postgresql:///postgres?host=/tmp=5432 s =postgres'
'postgresql:///postgres?host=/tmp= s 5432=postgres'
'postgresql:///postgres?host=/tmp=  50=postgres'

Before the patch any trailing characters caused a failures even if
there were just whitespaces as trailing characters (first case
listed).

>> Perhaps we could add directly regression
>> tests for libpq.  I'll start a new thread about that once we are done
>> here, the topic is larger.
> 
> We have around 650 tests on ruby-pg to ensure everything runs as
> expected and I always wondered how the API of libpq is being verified.

For advanced test scenarios like connection handling, we use perl's
TAP tests.  The situation regarding libpq-related testing is a bit
messy though.  We have some tests in src/test/recovery/ for a couple
of things, and we should have more things to stress anything related
to the protocol (say message injection, etc.).

I'll try to start a new thread about that with a patch adding some
basics for discussion.

I have applied the parsing fix and your fix as two separate commits as
these are at the end two separate bugs, then back-patched down to v12.
Ed has been credited for the report, and I have marked the author as
you, Lars.  Thanks!
--
Michael


signature.asc
Description: PGP signature


Re: libpq: Fix wrong connection status on invalid "connect_timeout"

2019-10-18 Thread Lars Kanis
Am 18.10.19 um 05:06 schrieb Michael Paquier:

> So attached is a patch to skip trailing whitespaces as well,
> which also fixes the issue with ECPG.  I have refactored the parsing
> logic a bit while on it.  The comment at the top of parse_int_param()
> needs to be reworked a bit more.

I tested this and it looks good to me. Maybe you could omit some
redundant 'end' checks, as in the attached patch. Or was your intention
to verify non-NULL 'end'?


> Perhaps we could add directly regression
> tests for libpq.  I'll start a new thread about that once we are done
> here, the topic is larger.

We have around 650 tests on ruby-pg to ensure everything runs as
expected and I always wondered how the API of libpq is being verified.


--
Kind Regards,
Lars Kanis

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..0eeabb8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, , 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(>errorMessage,
 	  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 	  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, , conn,
 			 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{


Re: libpq: Fix wrong connection status on invalid "connect_timeout"

2019-10-18 Thread Lars Kanis
Am 18.10.19 um 05:06 schrieb Michael Paquier:

> So attached is a patch to skip trailing whitespaces as well,
> which also fixes the issue with ECPG.  I have refactored the parsing
> logic a bit while on it.  The comment at the top of parse_int_param()
> needs to be reworked a bit more.

I tested this and it looks good to me. Maybe you could omit some
redundant 'end' checks, as in the attached patch. Or was your intention
to verify non-NULL 'end'?


> Perhaps we could add directly regression
> tests for libpq.  I'll start a new thread about that once we are done
> here, the topic is larger.

We have around 650 tests on ruby-pg to ensure everything runs as
expected and I always wondered how the API of libpq is being verified.


--
Kind Regards,
Lars Kanis

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..0eeabb8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, , 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(>errorMessage,
 	  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 	  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, , conn,
 			 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{


Re: libpq: Fix wrong connection status on invalid "connect_timeout"

2019-10-18 Thread Lars Kanis
Am 18.10.19 um 05:06 schrieb Michael Paquier:

> So attached is a patch to skip trailing whitespaces as well,
> which also fixes the issue with ECPG.  I have refactored the parsing
> logic a bit while on it.  The comment at the top of parse_int_param()
> needs to be reworked a bit more.

I tested this and it looks good to me. Maybe you could omit some
redundant 'end' checks, as in the attached patch. Or was your intention
to verify non-NULL 'end'?


> Perhaps we could add directly regression
> tests for libpq.  I'll start a new thread about that once we are done
> here, the topic is larger.

We have around 650 tests on ruby-pg to ensure everything runs as
expected and I always wondered how the API of libpq is being verified.


--
Kind Regards,
Lars Kanis

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..0eeabb8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, , 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(>errorMessage,
 	  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 	  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, , conn,
 			 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{


Re: libpq: Fix wrong connection status on invalid "connect_timeout"

2019-10-17 Thread Michael Paquier
On Thu, Oct 17, 2019 at 10:10:17PM +0200, Lars Kanis wrote:
> That's why I changed connectDBComplete() only, instead of setting the
> status directly in parse_int_param().

Yes, you shouldn't do that as the keepalive parameters and
tcp_user_timeout have some specific handling when it comes to defaults
depending on the platform and we have some retry logic when specifying
multiple hosts.

Now, there is actually more to it than it looks at first glance.  Your
patch is pointing out at a failure within the regression tests of the
ECPG driver, as any parameters part of a connection string may have
trailing spaces which are considered as incorrect by the patch,
causing the connection to fail.

In short, on HEAD this succeeds but would fail with your patch:
$ psql 'postgresql:///postgres?host=/tmp_timeout=14 =5432'
psql: error: could not connect to server: invalid integer value "14 "
for connection option "connect_timeout"

Parameter names are more restrictive, as URLs don't allow leading or
trailing spaces for them.  On HEAD, we allow leading spaces for
integer parameters as the parsing uses strtol(3), but not for the
trailing spaces, which is a bit crazy and I think that we had better
not break that if the parameter value correctly defines a proper
integer.  So attached is a patch to skip trailing whitespaces as well,
which also fixes the issue with ECPG.  I have refactored the parsing
logic a bit while on it.  The comment at the top of parse_int_param()
needs to be reworked a bit more.

We could add some TAP tests for that, but I don't see a good area to
check after connection parameters.  We have tests for multi-host
strings in 001_stream_rep.pl but that already feels misplaced as those
tests are for recovery.  Perhaps we could add directly regression
tests for libpq.  I'll start a new thread about that once we are done
here, the topic is larger.

(Note to self: Ed Morley needs to be credited for the report as well.)
--
Michael
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2efe..9af830321c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, , 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end && *end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end && *end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(>errorMessage,
 	  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 	  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, , conn,
 			 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{


signature.asc
Description: PGP signature


Re: libpq: Fix wrong connection status on invalid "connect_timeout"

2019-10-17 Thread Lars Kanis
I verified that all other integer parameters properly set CONNECTION_BAD
in case of invalid values. These are:

* port
* keepalives_idle
* keepalives_interval
* keepalives_count
* tcp_user_timeout

That's why I changed connectDBComplete() only, instead of setting the
status directly in parse_int_param().

--

Kind Regards,
Lars Kanis


Am 17.10.19 um 20:04 schrieb Lars Kanis:
> Greetings,
>
> libpq since PostgreSQL-12 has stricter checks for integer values in
> connection parameters. They were introduced by commit
> https://github.com/postgres/postgres/commit/e7a2217978d9cbb2149bfcb4ef1e45716cfcbefb
> .
>
> However in case of "connect_timeout" such an invalid integer value leads
> to a connection status other than CONNECTION_OK or CONNECTION_BAD. The
> wrong parameter is therefore not properly reported to user space. This
> patch fixes this by explicit setting CONNECTION_BAD.
>
> The issue was raised on ruby-pg: https://github.com/ged/ruby-pg/issues/302
>
> It originally came up at Heroku:
> https://github.com/heroku/stack-images/issues/147
>
-- 
--
Kind Regards,
Lars Kanis






libpq: Fix wrong connection status on invalid "connect_timeout"

2019-10-17 Thread Lars Kanis
Greetings,

libpq since PostgreSQL-12 has stricter checks for integer values in
connection parameters. They were introduced by commit
https://github.com/postgres/postgres/commit/e7a2217978d9cbb2149bfcb4ef1e45716cfcbefb
.

However in case of "connect_timeout" such an invalid integer value leads
to a connection status other than CONNECTION_OK or CONNECTION_BAD. The
wrong parameter is therefore not properly reported to user space. This
patch fixes this by explicit setting CONNECTION_BAD.

The issue was raised on ruby-pg: https://github.com/ged/ruby-pg/issues/302

It originally came up at Heroku:
https://github.com/heroku/stack-images/issues/147

-- 

Kind Regards,
Lars Kanis

From f0c0dc3511c66a1e0fe0e4cc7ad2995f5f40bd7c Mon Sep 17 00:00:00 2001
From: Lars Kanis 
Date: Thu, 17 Oct 2019 19:37:51 +0200
Subject: [PATCH] Fix wrong connection status on invalid "connect_timeout"

Commit e7a2217978d9cbb2149bfcb4ef1e45716cfcbefb introduced
stricter checks for integer values in connection parameters.
However in case of "connect_timeout" such invalid integer
values leaded to a connection status other than CONNECTION_OK
or CONNECTION_BAD. This patch explicit sets CONNECTION_BAD.

Signed-off-by: Lars Kanis 
---
 src/interfaces/libpq/fe-connect.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..44d927e 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2008,7 +2008,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, , conn,
 			 "connect_timeout"))
+		{
+			/* invalid integer */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{
-- 
2.17.1