Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-04-04 Thread Tom Lane
[ from a week ago ]

Alvaro Herrera  writes:
> Hm, indri failed:
> ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith 
> -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new 
> -Wendif-labels -Wmissing-format-attribute -Wcast-function-type 
> -Wformat-security -fno-strict-aliasing -fwrapv 
> -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -g -O2 
> -fno-common -Werror  -fvisibility=hidden -bundle -o dblink.dylib  dblink.o 
> -L../../src/port -L../../src/common -L../../src/interfaces/libpq -lpq 
> -isysroot 
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.4.sdk
>   -L/opt/local/libexec/llvm-15/lib -L/opt/local/lib -L/opt/local/lib 
> -L/opt/local/lib  -L/opt/local/lib -Wl,-dead_strip_dylibs  -Werror  
> -fvisibility=hidden -bundle_loader ../../src/backend/postgres

> Undefined symbols for architecture arm64:
>   "_libintl_gettext", referenced from:
>   _libpqsrv_cancel in dblink.o
>   _libpqsrv_cancel in dblink.o
> ld: symbol(s) not found for architecture arm64
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> make[1]: *** [dblink.dylib] Error 1
> make: *** [all-dblink-recurse] Error 2

Having just fixed the same issue for test_json_parser, I now realize
what's going on there: dblink's link command doesn't actually mention
any of the external libraries that we might need, such as libintl.
You can get away with that on some platforms, but not macOS.
It would probably be possible to fix that if anyone cared to.
I'm not sufficiently excited about it to do so right now --- as
you say, we don't support translation in contrib anyway.

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 10:45, Alvaro Herrera  wrote:
> Yeah, this seems to work and I no longer get that complaint from
> headerscheck.

patch LGTM




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-04-04 Thread Alvaro Herrera
On 2024-Apr-03, Tom Lane wrote:

> On my machine, headerscheck does not like this:
> 
> $ src/tools/pginclude/headerscheck --cplusplus
> In file included from /tmp/headerscheck.4gTaW5/test.cpp:3:
> ./src/include/libpq/libpq-be-fe-helpers.h: In function 'char* 
> libpqsrv_cancel(PGconn*, TimestampTz)':
> ./src/include/libpq/libpq-be-fe-helpers.h:393:10: warning: ISO C++ forbids 
> converting a string constant to 'char*' [-Wwrite-strings]
>return "out of memory";
>   ^~~
> ./src/include/libpq/libpq-be-fe-helpers.h:421:13: warning: ISO C++ forbids 
> converting a string constant to 'char*' [-Wwrite-strings]
>  error = "cancel request timed out";
>  ^~
> 
> The second part of that could easily be fixed by declaring "error" as
> "const char *".  As for the first part, can we redefine the whole
> function as returning "const char *"?  (If not, this coding is very
> questionable anyway.)

Yeah, this seems to work and I no longer get that complaint from
headerscheck.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
>From 0af8c7039b2c8ed80bc0bddacfe4a9abb8f527b3 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 4 Apr 2024 10:13:07 +0200
Subject: [PATCH] Make libpqsrv_cancel's return type const char *

---
 contrib/dblink/dblink.c | 2 +-
 contrib/postgres_fdw/connection.c   | 2 +-
 src/include/libpq/libpq-be-fe-helpers.h | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index de858e165a..755293456f 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1347,7 +1347,7 @@ Datum
 dblink_cancel_query(PG_FUNCTION_ARGS)
 {
 	PGconn	   *conn;
-	char	   *msg;
+	const char *msg;
 	TimestampTz endtime;
 
 	dblink_init();
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 2532e453c4..cac9d96d33 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -1332,7 +1332,7 @@ pgfdw_cancel_query(PGconn *conn)
 static bool
 pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime)
 {
-	char	   *errormsg = libpqsrv_cancel(conn, endtime);
+	const char *errormsg = libpqsrv_cancel(conn, endtime);
 
 	if (errormsg != NULL)
 		ereport(WARNING,
diff --git a/src/include/libpq/libpq-be-fe-helpers.h b/src/include/libpq/libpq-be-fe-helpers.h
index 8be9aa1f2f..fe50829274 100644
--- a/src/include/libpq/libpq-be-fe-helpers.h
+++ b/src/include/libpq/libpq-be-fe-helpers.h
@@ -382,11 +382,11 @@ libpqsrv_get_result(PGconn *conn, uint32 wait_event_info)
  * Note: this function leaks a string's worth of memory when reporting
  * libpq errors.  Make sure to call it in a transient memory context.
  */
-static inline char *
+static inline const char *
 libpqsrv_cancel(PGconn *conn, TimestampTz endtime)
 {
 	PGcancelConn *cancel_conn;
-	char	   *error = NULL;
+	const char *error = NULL;
 
 	cancel_conn = PQcancelCreate(conn);
 	if (cancel_conn == NULL)
-- 
2.39.2



Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-04-03 Thread Tom Lane
Alvaro Herrera  writes:
> Great, thanks for looking.  Pushed now, I'll be closing the commitfest
> entry shortly.

On my machine, headerscheck does not like this:

$ src/tools/pginclude/headerscheck --cplusplus
In file included from /tmp/headerscheck.4gTaW5/test.cpp:3:
./src/include/libpq/libpq-be-fe-helpers.h: In function 'char* 
libpqsrv_cancel(PGconn*, TimestampTz)':
./src/include/libpq/libpq-be-fe-helpers.h:393:10: warning: ISO C++ forbids 
converting a string constant to 'char*' [-Wwrite-strings]
   return "out of memory";
  ^~~
./src/include/libpq/libpq-be-fe-helpers.h:421:13: warning: ISO C++ forbids 
converting a string constant to 'char*' [-Wwrite-strings]
 error = "cancel request timed out";
 ^~

The second part of that could easily be fixed by declaring "error" as
"const char *".  As for the first part, can we redefine the whole
function as returning "const char *"?  (If not, this coding is very
questionable anyway.)

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-29 Thread Noah Misch
On Fri, Mar 29, 2024 at 09:17:55AM +0100, Jelte Fennema-Nio wrote:
> On Thu, 28 Mar 2024 at 19:03, Tom Lane  wrote:
> > Could we make this test bulletproof by using an injection point?
> > If not, I remain of the opinion that we're better off without it.
> 
> Possibly, and if so, I agree that would be better than the currently
> added test. But I honestly don't feel like spending the time on
> creating such a test.

The SQL test is more representative of real applications, and it's way simpler
to understand.  In general, I prefer 6-line SQL tests that catch a problem 10%
of the time over injection point tests that catch it 100% of the time.  For
low detection rate to be exciting, it needs to be low enough to have a serious
chance of all buildfarm members reporting green for the bad commit.  With ~115
buildfarm members running in the last day, 0.1% detection rate would have been
low enough to bother improving, but 4% would be high enough to call it good.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-29 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 19:03, Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > It doesn't fail when it's too fast -- it's just that it doesn't cover
> > the case we want to cover.
>
> That's hardly better, because then you think you have test
> coverage but maybe you don't.

Honestly, that seems quite a lot better. Instead of having randomly
failing builds, you have a test that creates coverage 80+% of the
time. And that also seems a lot better than having no coverage at all
(which is what we had for the last 7 years since introduction of
cancellations to postgres_fdw). It would be good to expand the comment
in the test though saying that the test might not always cover the
intended code path, due to timing problems.

> Could we make this test bulletproof by using an injection point?
> If not, I remain of the opinion that we're better off without it.

Possibly, and if so, I agree that would be better than the currently
added test. But I honestly don't feel like spending the time on
creating such a test. And given 7 years have passed without someone
adding any test for this codepath at all, I don't expect anyone else
will either.

If you both feel we're better off without the test, feel free to
remove it. This was just some small missing test coverage that I
noticed while working on this patch, that I thought I'd quickly
address. I don't particularly care a lot about the specific test.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Tom Lane
Alvaro Herrera  writes:
> On 2024-Mar-28, Tom Lane wrote:
>> If the test fails both when the machine is too slow and when it's
>> too fast, then there's zero hope of making it stable and we should
>> just remove it.

> It doesn't fail when it's too fast -- it's just that it doesn't cover
> the case we want to cover.

That's hardly better, because then you think you have test
coverage but maybe you don't.

Could we make this test bulletproof by using an injection point?
If not, I remain of the opinion that we're better off without it.

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Alvaro Herrera
On 2024-Mar-28, Tom Lane wrote:

> Jelte Fennema-Nio  writes:
> 
> > I think we don't really want to make the timeout too short. Otherwise
> > the query might get cancelled before we push any query down to the
> > FDW. I guess that means that for some slow machines even 10ms is not
> > enough to make the test do the intended purpose. I'd keep it at 10ms,
> > which seems long enough for normal systems, while still being pretty
> > short.
> 
> If the test fails both when the machine is too slow and when it's
> too fast, then there's zero hope of making it stable and we should
> just remove it.

It doesn't fail when it's too fast -- it's just that it doesn't cover
the case we want to cover.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Escucha y olvidarás; ve y recordarás; haz y entenderás" (Confucio)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Tom Lane
Jelte Fennema-Nio  writes:
> On Thu, 28 Mar 2024 at 17:43, Alvaro Herrera  wrote:
>> Hah, you're right, I can reproduce with a smaller timeout, and using SET
>> LOCAL works as a fix.  If we're doing that, why not reduce the timeout
>> to 1ms?  We don't need to wait extra 9ms ...

> I think we don't really want to make the timeout too short. Otherwise
> the query might get cancelled before we push any query down to the
> FDW. I guess that means that for some slow machines even 10ms is not
> enough to make the test do the intended purpose. I'd keep it at 10ms,
> which seems long enough for normal systems, while still being pretty
> short.

If the test fails both when the machine is too slow and when it's
too fast, then there's zero hope of making it stable and we should
just remove it.

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 17:43, Alvaro Herrera  wrote:
> Hah, you're right, I can reproduce with a smaller timeout, and using SET
> LOCAL works as a fix.  If we're doing that, why not reduce the timeout
> to 1ms?  We don't need to wait extra 9ms ...

I think we don't really want to make the timeout too short. Otherwise
the query might get cancelled before we push any query down to the
FDW. I guess that means that for some slow machines even 10ms is not
enough to make the test do the intended purpose. I'd keep it at 10ms,
which seems long enough for normal systems, while still being pretty
short.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Alvaro Herrera
On 2024-Mar-28, Jelte Fennema-Nio wrote:

> Ugh that's annoying, the RESET is timing out too I guess.

Hah, you're right, I can reproduce with a smaller timeout, and using SET
LOCAL works as a fix.  If we're doing that, why not reduce the timeout
to 1ms?  We don't need to wait extra 9ms ...

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
“Cuando no hay humildad las personas se degradan” (A. Christie)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 17:34, Alvaro Herrera  wrote:
>
> Eh, kestrel has also failed[1], apparently every query after the large
> JOIN that this commit added as test fails with a statement timeout error.
>
> [1] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel=2024-03-28%2016%3A01%3A14

Ugh that's annoying, the RESET is timing out too I guess. That can
hopefully be easily fixed by changing the new test to:

BEGIN;
SET LOCAL statement_timeout = '10ms';
select count(*) from ft1 CROSS JOIN ft2 CROSS JOIN ft4 CROSS JOIN ft5;
-- this takes very long
ROLLBACK;




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Alvaro Herrera
Eh, kestrel has also failed[1], apparently every query after the large
JOIN that this commit added as test fails with a statement timeout error.

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel=2024-03-28%2016%3A01%3A14

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Alvaro Herrera
On 2024-Mar-28, Alvaro Herrera wrote:

> Undefined symbols for architecture arm64:
>   "_libintl_gettext", referenced from:
>   _libpqsrv_cancel in dblink.o
>   _libpqsrv_cancel in dblink.o
> ld: symbol(s) not found for architecture arm64
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> make[1]: *** [dblink.dylib] Error 1
> make: *** [all-dblink-recurse] Error 2

I just removed the _() from the new function.  There's not much point in
wasting more time on this, given that contrib doesn't have translation
support anyway, and we're not using this in libpqwalreceiver.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Crear es tan difícil como ser libre" (Elsa Triolet)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Alvaro Herrera
Hm, indri failed:

ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new 
-Wendif-labels -Wmissing-format-attribute -Wcast-function-type 
-Wformat-security -fno-strict-aliasing -fwrapv 
-Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -g -O2 
-fno-common -Werror  -fvisibility=hidden -bundle -o dblink.dylib  dblink.o 
-L../../src/port -L../../src/common -L../../src/interfaces/libpq -lpq -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.4.sdk
  -L/opt/local/libexec/llvm-15/lib -L/opt/local/lib -L/opt/local/lib 
-L/opt/local/lib  -L/opt/local/lib -Wl,-dead_strip_dylibs  -Werror  
-fvisibility=hidden -bundle_loader ../../src/backend/postgres

Undefined symbols for architecture arm64:
  "_libintl_gettext", referenced from:
  _libpqsrv_cancel in dblink.o
  _libpqsrv_cancel in dblink.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [dblink.dylib] Error 1
make: *** [all-dblink-recurse] Error 2

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Alvaro Herrera
On 2024-Mar-28, Jelte Fennema-Nio wrote:

> Your changes look good, apart from the prverror stuff indeed. If you
> remove the prverror stuff again I think this is ready to commit.

Great, thanks for looking.  Pushed now, I'll be closing the commitfest
entry shortly.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Always assume the user will do much worse than the stupidest thing
you can imagine."(Julien PUYDT)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Wed, 27 Mar 2024 at 19:27, Alvaro Herrera  wrote:
> I ended up reducing the two PG_TRY blocks to a single one.  I see no
> reason to split them up, and this way it looks more legible.

I definitely agree this looks better. Not sure why I hadn't done that,
maybe it wasn't possible in one of the earlier iterations of the API.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Wed, 27 Mar 2024 at 19:46, Alvaro Herrera  wrote:
>
> On 2024-Mar-27, Alvaro Herrera wrote:
>
> > I changed it so that the error messages are returned as translated
> > phrases, and was bothered by the fact that if errors happen repeatedly,
> > the memory for them might be leaked.  Maybe this is fine depending on
> > the caller's memory context, but since it's only at most one string each
> > time, it's quite easy to just keep track of it so that we can release it
> > on the next.
>
> (Actually this sounds clever but fails pretty obviously if the caller
> does free the string, such as in a memory context reset.  So I guess we
> have to just accept the potential leakage.)

Your changes look good, apart from the prverror stuff indeed. If you
remove the prverror stuff again I think this is ready to commit.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-27 Thread Alvaro Herrera
On 2024-Mar-27, Alvaro Herrera wrote:

> I changed it so that the error messages are returned as translated
> phrases, and was bothered by the fact that if errors happen repeatedly,
> the memory for them might be leaked.  Maybe this is fine depending on
> the caller's memory context, but since it's only at most one string each
> time, it's quite easy to just keep track of it so that we can release it
> on the next.

(Actually this sounds clever but fails pretty obviously if the caller
does free the string, such as in a memory context reset.  So I guess we
have to just accept the potential leakage.)

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La conclusión que podemos sacar de esos estudios es que
no podemos sacar ninguna conclusión de ellos" (Tanenbaum)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-27 Thread Alvaro Herrera
On 2024-Mar-22, Jelte Fennema-Nio wrote:

> On Thu, 21 Mar 2024 at 03:54, Noah Misch  wrote:
> >
> > On Mon, Mar 18, 2024 at 07:40:10PM +0100, Alvaro Herrera wrote:
> > > I enabled the test again and also pushed the changes to dblink,
> > > isolationtester and fe_utils (which AFAICS is used by pg_dump,
> >
> > I recommend adding a libpqsrv_cancel() function to libpq-be-fe-helpers.h, to
> > use from dblink and postgres_fdw.  pgxn modules calling PQcancel() from the
> > backend (citus pg_bulkload plproxy pmpp) then have a better chance to adopt
> > the new way.
> 
> Done

Nice, thanks.  I played with it a bit, mostly trying to figure out if
the chosen API is usable.  I toyed with making it return boolean success
and the error message as an output argument, because I was nervous about
what'd happen in OOM.  But since this is backend environment, what
actually happens is that we elog(ERROR) anyway, so we never return a
NULL error message.  So after the detour I think Jelte's API is okay.

I changed it so that the error messages are returned as translated
phrases, and was bothered by the fact that if errors happen repeatedly,
the memory for them might be leaked.  Maybe this is fine depending on
the caller's memory context, but since it's only at most one string each
time, it's quite easy to just keep track of it so that we can release it
on the next.

I ended up reducing the two PG_TRY blocks to a single one.  I see no
reason to split them up, and this way it looks more legible.

What do you think?

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Tiene valor aquel que admite que es un cobarde" (Fernandel)
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index edbc9ab02a..de858e165a 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1347,25 +1347,16 @@ Datum
 dblink_cancel_query(PG_FUNCTION_ARGS)
 {
 	PGconn	   *conn;
-	PGcancelConn *cancelConn;
 	char	   *msg;
+	TimestampTz endtime;
 
 	dblink_init();
 	conn = dblink_get_named_conn(text_to_cstring(PG_GETARG_TEXT_PP(0)));
-	cancelConn = PQcancelCreate(conn);
-
-	PG_TRY();
-	{
-		if (!PQcancelBlocking(cancelConn))
-			msg = pchomp(PQcancelErrorMessage(cancelConn));
-		else
-			msg = "OK";
-	}
-	PG_FINALLY();
-	{
-		PQcancelFinish(cancelConn);
-	}
-	PG_END_TRY();
+	endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
+		  3);
+	msg = libpqsrv_cancel(conn, endtime);
+	if (msg == NULL)
+		msg = "OK";
 
 	PG_RETURN_TEXT_P(cstring_to_text(msg));
 }
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4931ebf591..2532e453c4 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -133,7 +133,7 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
 static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
 static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
 static bool pgfdw_cancel_query(PGconn *conn);
-static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime);
 static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
    bool consume_input);
 static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
@@ -1315,36 +1315,31 @@ pgfdw_cancel_query(PGconn *conn)
 	endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
 		  CONNECTION_CLEANUP_TIMEOUT);
 
-	if (!pgfdw_cancel_query_begin(conn))
+	if (!pgfdw_cancel_query_begin(conn, endtime))
 		return false;
 	return pgfdw_cancel_query_end(conn, endtime, false);
 }
 
+/*
+ * Submit a cancel request to the given connection, waiting only until
+ * the given time.
+ *
+ * We sleep interruptibly until we receive confirmation that the cancel
+ * request has been accepted, and if it is, return true; if the timeout
+ * lapses without that, or the request fails for whatever reason, return
+ * false.
+ */
 static bool
-pgfdw_cancel_query_begin(PGconn *conn)
+pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime)
 {
-	PGcancel   *cancel;
-	char		errbuf[256];
+	char	   *errormsg = libpqsrv_cancel(conn, endtime);
 
-	/*
-	 * Issue cancel request.  Unfortunately, there's no good way to limit the
-	 * amount of time that we might block inside PQgetCancel().
-	 */
-	if ((cancel = PQgetCancel(conn)))
-	{
-		if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
-		{
-			ereport(WARNING,
-	(errcode(ERRCODE_CONNECTION_FAILURE),
-	 errmsg("could not send cancel request: %s",
-			errbuf)));
-			PQfreeCancel(cancel);
-			return false;
-		}
-		PQfreeCancel(cancel);
-	}
+	if (errormsg != NULL)
+		ereport(WARNING,
+errcode(ERRCODE_CONNECTION_FAILURE),
+errmsg("could not send cancel request: %s", errormsg));
 
-	return true;
+	return errormsg == NULL;
 }
 
 static bool
@@ -1685,7 +1680,11 @@ pgfdw_abort_cleanup_begin(ConnCacheEntry *entry, bool toplevel,
 	 */
 	if (PQtransactionStatus(entry->conn) == 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-22 Thread Jelte Fennema-Nio
On Thu, 21 Mar 2024 at 03:54, Noah Misch  wrote:
>
> On Mon, Mar 18, 2024 at 07:40:10PM +0100, Alvaro Herrera wrote:
> > I enabled the test again and also pushed the changes to dblink,
> > isolationtester and fe_utils (which AFAICS is used by pg_dump,
>
> I recommend adding a libpqsrv_cancel() function to libpq-be-fe-helpers.h, to
> use from dblink and postgres_fdw.  pgxn modules calling PQcancel() from the
> backend (citus pg_bulkload plproxy pmpp) then have a better chance to adopt
> the new way.

Done


v37-0001-postgres_fdw-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-20 Thread Noah Misch
On Mon, Mar 18, 2024 at 07:40:10PM +0100, Alvaro Herrera wrote:
> I enabled the test again and also pushed the changes to dblink,
> isolationtester and fe_utils (which AFAICS is used by pg_dump,

I recommend adding a libpqsrv_cancel() function to libpq-be-fe-helpers.h, to
use from dblink and postgres_fdw.  pgxn modules calling PQcancel() from the
backend (citus pg_bulkload plproxy pmpp) then have a better chance to adopt
the new way.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-18 Thread Alvaro Herrera
I enabled the test again and also pushed the changes to dblink,
isolationtester and fe_utils (which AFAICS is used by pg_dump,
pg_amcheck, reindexdb and vacuumdb).  I chickened out of committing the
postgres_fdw changes though, so here they are again.  Not sure I'll find
courage to get these done by tomorrow, or whether I should just let them
for Fujita-san or Noah, who have been the last committers to touch this.


-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No renuncies a nada. No te aferres a nada."
>From 737578fcdc6ed0de64838d2ad905054b18eb9ec1 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Mon, 18 Mar 2024 19:37:40 +0100
Subject: [PATCH v39] postgres_fdw: Start using new libpq cancel APIs

Commit 61461a300c1c introduced new functions to libpq for cancelling
queries.  This replaces the usage of the old ones in postgres_fdw.

Author: Jelte Fennema-Nio 
Discussion: https://postgr.es/m/cageczqt_vgowwenuqvuv9xqmbacyxjtrrayo8w07oqashk_...@mail.gmail.com
---
 contrib/postgres_fdw/connection.c | 108 +++---
 .../postgres_fdw/expected/postgres_fdw.out|  15 +++
 contrib/postgres_fdw/sql/postgres_fdw.sql |   7 ++
 3 files changed, 113 insertions(+), 17 deletions(-)

diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4931ebf591..0c66eaa001 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -133,7 +133,7 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
 static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
 static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
 static bool pgfdw_cancel_query(PGconn *conn);
-static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime);
 static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
    bool consume_input);
 static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
@@ -1315,36 +1315,106 @@ pgfdw_cancel_query(PGconn *conn)
 	endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
 		  CONNECTION_CLEANUP_TIMEOUT);
 
-	if (!pgfdw_cancel_query_begin(conn))
+	if (!pgfdw_cancel_query_begin(conn, endtime))
 		return false;
 	return pgfdw_cancel_query_end(conn, endtime, false);
 }
 
+/*
+ * Submit a cancel request to the given connection, waiting only until
+ * the given time.
+ *
+ * We sleep interruptibly until we receive confirmation that the cancel
+ * request has been accepted, and if it is, return true; if the timeout
+ * lapses without that, or the request fails for whatever reason, return
+ * false.
+ */
 static bool
-pgfdw_cancel_query_begin(PGconn *conn)
+pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime)
 {
-	PGcancel   *cancel;
-	char		errbuf[256];
+	bool		timed_out = false;
+	bool		failed = false;
+	PGcancelConn *cancel_conn = PQcancelCreate(conn);
 
-	/*
-	 * Issue cancel request.  Unfortunately, there's no good way to limit the
-	 * amount of time that we might block inside PQgetCancel().
-	 */
-	if ((cancel = PQgetCancel(conn)))
+	if (!PQcancelStart(cancel_conn))
 	{
-		if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
+		PG_TRY();
 		{
 			ereport(WARNING,
 	(errcode(ERRCODE_CONNECTION_FAILURE),
 	 errmsg("could not send cancel request: %s",
-			errbuf)));
-			PQfreeCancel(cancel);
-			return false;
+			pchomp(PQcancelErrorMessage(cancel_conn);
 		}
-		PQfreeCancel(cancel);
+		PG_FINALLY();
+		{
+			PQcancelFinish(cancel_conn);
+		}
+		PG_END_TRY();
+		return false;
 	}
 
-	return true;
+	/* In what follows, do not leak any PGcancelConn on an error. */
+	PG_TRY();
+	{
+		while (true)
+		{
+			PostgresPollingStatusType pollres = PQcancelPoll(cancel_conn);
+			TimestampTz now = GetCurrentTimestamp();
+			long		cur_timeout;
+			int			waitEvents = WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH;
+
+			if (pollres == PGRES_POLLING_OK)
+break;
+
+			/* If timeout has expired, give up, else get sleep time. */
+			cur_timeout = TimestampDifferenceMilliseconds(now, endtime);
+			if (cur_timeout <= 0)
+			{
+timed_out = true;
+failed = true;
+goto exit;
+			}
+
+			switch (pollres)
+			{
+case PGRES_POLLING_READING:
+	waitEvents |= WL_SOCKET_READABLE;
+	break;
+case PGRES_POLLING_WRITING:
+	waitEvents |= WL_SOCKET_WRITEABLE;
+	break;
+default:
+	failed = true;
+	goto exit;
+			}
+
+			/* Sleep until there's something to do */
+			WaitLatchOrSocket(MyLatch, waitEvents, PQcancelSocket(cancel_conn),
+			  cur_timeout, PG_WAIT_EXTENSION);
+			ResetLatch(MyLatch);
+
+			CHECK_FOR_INTERRUPTS();
+		}
+exit:
+		if (failed)
+		{
+			if (timed_out)
+ereport(WARNING,
+		(errmsg("could not cancel request due to timeout")));
+			else
+ereport(WARNING,
+		(errcode(ERRCODE_CONNECTION_FAILURE),
+		 errmsg("could not send cancel request: %s",
+

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-14 Thread Jelte Fennema-Nio
On Thu, 14 Mar 2024 at 11:33, Alvaro Herrera  wrote:
> Hmm, isn't this basically saying that we're giving up on reliably
> canceling queries altogether?  I mean, maybe we'd like to instead fix
> the bug about canceling queries in extended query protocol ...
> Isn't that something you're worried about?

In any case I think it's worth having (non-flaky) test coverage of our
libpq cancellation sending code. So I think it makes sense to commit
the patch I proposed, even if the backend code to handle that code is
arguably buggy.

Regarding the question if the backend code is actually buggy or not:
the way cancel requests are defined to work is a bit awkward. They
cancel whatever operation is running on the session when they arrive.
So if the session is just in the middle of a Bind and Execute message
there is nothing to cancel. While surprising and probably not what
someone would want, I don't think this behaviour is too horrible in
practice in this case. Most of the time people cancel queries while
the Execute message is being processed. The new test really only runs
into this problem because it sends a cancel request, immediately after
sending the query.

I definitely think it's worth rethinking the way we do query
cancellations though. I think what we would probably want is a way to
cancel a specific query/message on a session. Instead of cancelling
whatever is running at the moment when the cancel request is processed
by Postgres. Because this "cancel whatever is running" behaviour is
fraught with issues, this Bind/Execute issue being only one of them.
One really annoying race condition of a cancel request cancelling
another query than intended can happen with this flow (that I spend
lots of time on addressing in PgBouncer):
1. You send query A on session 1
2. You send a cancel request for session 1 (intending to cancel query A)
3. Query A completes by itself
4. You now send query B
5. The cancel request is now processed
6. Query B is now cancelled

But solving that race condition would involve changing the postgres
protocol. Which I'm trying to make possible with the first few commits
in[1]. And while those first few commits might still land in PG17, I
don't think a large protocol change like adding query identifiers to
cancel requests is feasible for PG17 anymore.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-14 Thread Alvaro Herrera
On 2024-Mar-14, Jelte Fennema-Nio wrote:

> On Wed, 13 Mar 2024 at 20:08, Jacob Champion
>  wrote:
> > I hit this on my machine. With the attached diff I can reproduce
> > constantly (including with the most recent test patch); I think the
> > cancel must be arriving between the bind/execute steps?
> 
> Nice find! Your explanation makes total sense. Attached a patchset
> that fixes/works around this issue by using the simple query protocol
> in the cancel test.

Hmm, isn't this basically saying that we're giving up on reliably
canceling queries altogether?  I mean, maybe we'd like to instead fix
the bug about canceling queries in extended query protocol ...
Isn't that something you're worried about?

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"World domination is proceeding according to plan"(Andrew Morton)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-14 Thread Jelte Fennema-Nio
On Wed, 13 Mar 2024 at 20:08, Jacob Champion
 wrote:
> I hit this on my machine. With the attached diff I can reproduce
> constantly (including with the most recent test patch); I think the
> cancel must be arriving between the bind/execute steps?

Nice find! Your explanation makes total sense. Attached a patchset
that fixes/works around this issue by using the simple query protocol
in the cancel test.


v38-0001-Use-simple-query-protocol-for-cancel-test.patch
Description: Binary data


v38-0003-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v38-0002-Revert-Comment-out-noisy-libpq_pipeline-test.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-13 Thread Jacob Champion
On Wed, Mar 13, 2024 at 12:01 PM Alvaro Herrera  wrote:
> On 2024-Mar-13, Jelte Fennema-Nio wrote:
> > Sadly I'm having a hard time reliably reproducing this race condition
> > locally. So it's hard to be sure what is happening here. Attached is a
> > patch with a wild guess as to what the issue might be (i.e. seeing an
> > outdated "active" state and thus passing the check even though the
> > query is not running yet)
>
> I tried leaving the original running in my laptop to see if I could
> reproduce it, but got no hits ... and we didn't get any other failures
> apart from the three ones already reported ... so it's not terribly high
> probability.  Anyway I pushed your patch now since the theory seems
> plausible; let's see if we still get the issue to reproduce.  If it
> does, we could make the script more verbose to hunt for further clues.

I hit this on my machine. With the attached diff I can reproduce
constantly (including with the most recent test patch); I think the
cancel must be arriving between the bind/execute steps?

Thanks,
--Jacob
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6b7903314a..22ce7c07d9 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2073,6 +2073,9 @@ exec_bind_message(StringInfo input_message)
valgrind_report_error_query(debug_query_string);
 
debug_query_string = NULL;
+
+   if (strstr(psrc->query_string, "pg_sleep"))
+   sleep(1);
 }
 
 /*


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-13 Thread Alvaro Herrera
On 2024-Mar-13, Jelte Fennema-Nio wrote:

> I agree it's probably a timing issue. The cancel being received after
> the query is done seems very unlikely, since the query takes 180
> seconds (assuming PG_TEST_TIMEOUT_DEFAULT is not lowered for these
> animals). I think it's more likely that the cancel request arrives too
> early, and thus being ignored because no query is running yet. The
> test already had logic to wait until the query backend was in the
> "active" state, before sending a cancel to solve that issue. But my
> guess is that that somehow isn't enough.
> 
> Sadly I'm having a hard time reliably reproducing this race condition
> locally. So it's hard to be sure what is happening here. Attached is a
> patch with a wild guess as to what the issue might be (i.e. seeing an
> outdated "active" state and thus passing the check even though the
> query is not running yet)

I tried leaving the original running in my laptop to see if I could
reproduce it, but got no hits ... and we didn't get any other failures
apart from the three ones already reported ... so it's not terribly high
probability.  Anyway I pushed your patch now since the theory seems
plausible; let's see if we still get the issue to reproduce.  If it
does, we could make the script more verbose to hunt for further clues.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Here's a general engineering tip: if the non-fun part is too complex for you
to figure out, that might indicate the fun part is too ambitious." (John Naylor)
https://postgr.es/m/CAFBsxsG4OWHBbSDM%3DsSeXrQGOtkPiOEOuME4yD7Ce41NtaAD9g%40mail.gmail.com




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-13 Thread Jelte Fennema-Nio
On Wed, 13 Mar 2024 at 04:53, Tom Lane  wrote:
> I suspect it's basically just a
> timing dependency.  Have you thought about the fact that a cancel
> request is a no-op if it arrives after the query's done?

I agree it's probably a timing issue. The cancel being received after
the query is done seems very unlikely, since the query takes 180
seconds (assuming PG_TEST_TIMEOUT_DEFAULT is not lowered for these
animals). I think it's more likely that the cancel request arrives too
early, and thus being ignored because no query is running yet. The
test already had logic to wait until the query backend was in the
"active" state, before sending a cancel to solve that issue. But my
guess is that that somehow isn't enough.

Sadly I'm having a hard time reliably reproducing this race condition
locally. So it's hard to be sure what is happening here. Attached is a
patch with a wild guess as to what the issue might be (i.e. seeing an
outdated "active" state and thus passing the check even though the
query is not running yet)


v37-0001-Hopefully-make-cancel-test-more-reliable.patch
Description: Binary data


v37-0002-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Tom Lane
Jelte Fennema-Nio  writes:
> On Tue, 12 Mar 2024 at 19:28, Alvaro Herrera  wrote:
>>> Hmm, buildfarm member kestrel (which uses
>>> -fsanitize=undefined,alignment) failed:

>> Hm, I tried using the same compile flags, couldn't reproduce.

> Okay, it passed now it seems so I guess this test is flaky somehow.

Two more intermittent failures:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bushmaster=2024-03-13%2003%3A15%3A09

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=taipan=2024-03-13%2003%3A15%3A31

These animals all belong to Andres' flotilla, but other than that
I'm not seeing a connection.  I suspect it's basically just a
timing dependency.  Have you thought about the fact that a cancel
request is a no-op if it arrives after the query's done?

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 19:28, Alvaro Herrera  wrote:
>
> On 2024-Mar-12, Alvaro Herrera wrote:
>
> > Hmm, buildfarm member kestrel (which uses
> > -fsanitize=undefined,alignment) failed:
> >
> > # Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc
> > dbname='postgres'
> > test cancellations...
> > libpq_pipeline:260: query did not fail when it was expected
>
> Hm, I tried using the same compile flags, couldn't reproduce.

Okay, it passed now it seems so I guess this test is flaky somehow.
The error message and the timing difference between failed and
succeeded buildfarm run clearly indicates that the pg_sleep ran its
180 seconds to completion (so cancel was never processed for some
reason).

**failed case**
282/285 postgresql:libpq_pipeline / libpq_pipeline/001_libpq_pipeline
   ERROR   191.56s   exit status 1

**succeeded case**

252/285 postgresql:libpq_pipeline / libpq_pipeline/001_libpq_pipeline
   OK   10.01s   21 subtests passed

I don't see any obvious reason for how this test can be flaky, but
I'll think a bit more about it tomorrow.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Alvaro Herrera
On 2024-Mar-12, Alvaro Herrera wrote:

> Hmm, buildfarm member kestrel (which uses
> -fsanitize=undefined,alignment) failed:
> 
> # Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc
> dbname='postgres'
> test cancellations... 
> libpq_pipeline:260: query did not fail when it was expected

Hm, I tried using the same compile flags, couldn't reproduce.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Alvaro Herrera
Hmm, buildfarm member kestrel (which uses
-fsanitize=undefined,alignment) failed:

# Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc
dbname='postgres'
test cancellations... 
libpq_pipeline:260: query did not fail when it was expected

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel=2024-03-12%2016%3A41%3A27

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The saddest aspect of life right now is that science gathers knowledge faster
 than society gathers wisdom."  (Isaac Asimov)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Alvaro Herrera
On 2024-Mar-12, Jelte Fennema-Nio wrote:

> On Tue, 12 Mar 2024 at 10:19, Alvaro Herrera  wrote:
> > Here's a last one for the cfbot.
> 
> Thanks for committing the first 3 patches btw. Attached a tiny change
> to 0001, which adds "(backing struct for PGcancelConn)" to the comment
> on pg_cancel_conn.

Thanks, I included it.  I hope there were no other changes, because I
didn't verify :-) but if there were, please let me know to incorporate
them.

I made a number of other small changes, mostly to the documentation,
nothing fundamental.  (Someday we should stop using  to
document the libpq functions and use refentry's instead ... it'd be
useful to have manpages for these functions.)

One thing I don't like very much is release_conn_addrinfo(), which is
called conditionally in two places but unconditionally in other places.
Maybe it'd make more sense to put this conditionality inside the
function itself, possibly with a "bool force" flag to suppress that in
the cases where it is not desired.

In pqConnectDBComplete, we cast the PGconn * to PGcancelConn * in order
to call PQcancelPoll, which is a bit abusive, but I don't know how to do
better.  Maybe we just accept this ... but if PQcancelStart is the only
way to have pqConnectDBStart called from a cancel connection, maybe it'd
be saner to duplicate pqConnectDBStart for cancel conns.

Thanks!

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 15:04, Jelte Fennema-Nio  wrote:
> Attached a tiny change to 0001

One more tiny comment change, stating that pg_cancel is used by the
deprecated PQcancel function.


v36-0001-libpq-Add-encrypted-and-non-blocking-query-cance.patch
Description: Binary data


v36-0002-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 10:19, Alvaro Herrera  wrote:
> Here's a last one for the cfbot.

Thanks for committing the first 3 patches btw. Attached a tiny change
to 0001, which adds "(backing struct for PGcancelConn)" to the comment
on pg_cancel_conn.
From d340fde6883a249fd7c1a90033675a3b5edb603e Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Thu, 14 Dec 2023 13:39:09 +0100
Subject: [PATCH v35 2/2] Start using new libpq cancel APIs

A previous commit introduced new APIs to libpq for cancelling queries.
This replaces the usage of the old APIs in most of the codebase with
these newer ones. This specifically leaves out changes to psql and
pgbench as those would need a much larger refactor to be able to call
them, due to the new functions not being signal-safe.
---
 contrib/dblink/dblink.c   |  30 +++--
 contrib/postgres_fdw/connection.c | 105 +++---
 .../postgres_fdw/expected/postgres_fdw.out|  15 +++
 contrib/postgres_fdw/sql/postgres_fdw.sql |   7 ++
 src/fe_utils/connect_utils.c  |  11 +-
 src/test/isolation/isolationtester.c  |  29 ++---
 6 files changed, 145 insertions(+), 52 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 19a362526d2..98dcca3e6fd 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1346,22 +1346,32 @@ PG_FUNCTION_INFO_V1(dblink_cancel_query);
 Datum
 dblink_cancel_query(PG_FUNCTION_ARGS)
 {
-	int			res;
 	PGconn	   *conn;
-	PGcancel   *cancel;
-	char		errbuf[256];
+	PGcancelConn *cancelConn;
+	char	   *msg;
 
 	dblink_init();
 	conn = dblink_get_named_conn(text_to_cstring(PG_GETARG_TEXT_PP(0)));
-	cancel = PQgetCancel(conn);
+	cancelConn = PQcancelCreate(conn);
 
-	res = PQcancel(cancel, errbuf, 256);
-	PQfreeCancel(cancel);
+	PG_TRY();
+	{
+		if (!PQcancelBlocking(cancelConn))
+		{
+			msg = pchomp(PQcancelErrorMessage(cancelConn));
+		}
+		else
+		{
+			msg = "OK";
+		}
+	}
+	PG_FINALLY();
+	{
+		PQcancelFinish(cancelConn);
+	}
+	PG_END_TRY();
 
-	if (res == 1)
-		PG_RETURN_TEXT_P(cstring_to_text("OK"));
-	else
-		PG_RETURN_TEXT_P(cstring_to_text(errbuf));
+	PG_RETURN_TEXT_P(cstring_to_text(msg));
 }
 
 
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4931ebf5915..dcc13dc3b24 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -133,7 +133,7 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
 static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
 static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
 static bool pgfdw_cancel_query(PGconn *conn);
-static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime);
 static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
    bool consume_input);
 static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
@@ -1315,36 +1315,104 @@ pgfdw_cancel_query(PGconn *conn)
 	endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
 		  CONNECTION_CLEANUP_TIMEOUT);
 
-	if (!pgfdw_cancel_query_begin(conn))
+	if (!pgfdw_cancel_query_begin(conn, endtime))
 		return false;
 	return pgfdw_cancel_query_end(conn, endtime, false);
 }
 
 static bool
-pgfdw_cancel_query_begin(PGconn *conn)
+pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime)
 {
-	PGcancel   *cancel;
-	char		errbuf[256];
+	bool		timed_out = false;
+	bool		failed = false;
+	PGcancelConn *cancel_conn = PQcancelCreate(conn);
 
-	/*
-	 * Issue cancel request.  Unfortunately, there's no good way to limit the
-	 * amount of time that we might block inside PQgetCancel().
-	 */
-	if ((cancel = PQgetCancel(conn)))
+
+	if (!PQcancelStart(cancel_conn))
 	{
-		if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
+		PG_TRY();
 		{
 			ereport(WARNING,
 	(errcode(ERRCODE_CONNECTION_FAILURE),
 	 errmsg("could not send cancel request: %s",
-			errbuf)));
-			PQfreeCancel(cancel);
-			return false;
+			pchomp(PQcancelErrorMessage(cancel_conn);
 		}
-		PQfreeCancel(cancel);
+		PG_FINALLY();
+		{
+			PQcancelFinish(cancel_conn);
+		}
+		PG_END_TRY();
+		return false;
 	}
 
-	return true;
+	/* In what follows, do not leak any PGcancelConn on an error. */
+	PG_TRY();
+	{
+		while (true)
+		{
+			TimestampTz now = GetCurrentTimestamp();
+			long		cur_timeout;
+			PostgresPollingStatusType pollres = PQcancelPoll(cancel_conn);
+			int			waitEvents = WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH;
+
+			if (pollres == PGRES_POLLING_OK)
+			{
+break;
+			}
+
+			/* If timeout has expired, give up, else get sleep time. */
+			cur_timeout = TimestampDifferenceMilliseconds(now, endtime);
+			if (cur_timeout <= 0)
+			{
+timed_out = true;
+failed = true;
+goto exit;
+			}
+
+			switch (pollres)
+			{
+case PGRES_POLLING_READING:
+	waitEvents |= WL_SOCKET_READABLE;
+	break;
+case 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 10:53, Jelte Fennema-Nio  wrote:
> I'd rather fail as hard as possible when someone is using the API
> wrongly.

To be clear, this is my way of looking at it. If you feel strongly
about that we should not change conn.status, I'm fine with making that
change to the patchset.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 10:19, Alvaro Herrera  wrote:
> If we do this and we see conn.status is not ALLOCATED, meaning a cancel
> is already ongoing, shouldn't we leave conn.status alone instead of
> changing to CONNECTION_BAD?  I mean, we shouldn't be juggling the elbow
> of whoever's doing that, should we?  Maybe just add the error message
> and return 0?

I'd rather fail as hard as possible when someone is using the API
wrongly. Not doing so is bound to cause confusion imho. e.g. if the
state is still CONNECTION_OK because the user forgot to call
PQcancelReset then keeping the connection status "as is" might seem as
if the cancel request succeeded even though nothing happened. So if
the user uses the API incorrectly then I'd rather use all the avenues
possible to indicate that there was an error. Especially since in all
other cases if PQcancelStart returns false CONNECTION_BAD is the
status, and this in turn means that PQconnectPoll will return
PGRES_POLLING_FAILED. So I doubt people will always check the actual
return value of the function to check if an error happened. They might
check PQcancelStatus or PQconnectPoll instead, because that integrates
easier with the rest of their code.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Alvaro Herrera
Here's a last one for the cfbot.

I have a question about this one

int
PQcancelStart(PGcancelConn *cancelConn)
{
[...]
if (cancelConn->conn.status != CONNECTION_ALLOCATED)
{
libpq_append_conn_error(>conn,
"cancel request is already being sent on this 
connection");
cancelConn->conn.status = CONNECTION_BAD;
return 0;
}


If we do this and we see conn.status is not ALLOCATED, meaning a cancel
is already ongoing, shouldn't we leave conn.status alone instead of
changing to CONNECTION_BAD?  I mean, we shouldn't be juggling the elbow
of whoever's doing that, should we?  Maybe just add the error message
and return 0?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"If it is not right, do not do it.
If it is not true, do not say it." (Marcus Aurelius, Meditations)
>From fc0cbf0a6184d374e12e051f88f9f8eef7cc30d9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 12 Mar 2024 10:09:25 +0100
Subject: [PATCH v34 1/2] libpq: Add encrypted and non-blocking query
 cancellation routines

The existing PQcancel API uses blocking IO, which makes PQcancel
impossible to use in an event loop based codebase without blocking the
event loop until the call returns.  It also doesn't encrypt the
connection over which the cancel request is sent, even when the original
connection required encryption.

This commit adds a PQcancelConn struct and assorted functions, which
provide a better mechanism of sending cancel requests; in particular all
the encryption used in the original connection are also used in the
cancel connection.  The main entry points are:

- PQcancelCreate creates the PQcancelConn based on the original
  connection (but does not establish an actual connection).
- PQcancelStart can be used to initiate non-blocking cancel requests,
  using encryption if the original connection did so, which must be
  pumped using
- PQcancelPoll.
- PQcancelReset puts a PQcancelConn back in state so that it can be
  reused to send a new cancel request to the same connection.
- PQcancelBlocking is a simpler-to-use blocking API that still uses
  encryption.

Additional functions are
 - PQcancelStatus, mimicks PQstatus;
 - PQcancelSocket, mimicks PQcancelSocket;
 - PQcancelErrorMessage, mimicks PQerrorMessage;
 - PQcancelFinish, mimicks PQfinish.

Author: Jelte Fennema-Nio 
Reviewed-by: Denis Laxalde 
Discussion: https://postgr.es/m/am5pr83mb0178d3b31ca1b6ec4a8ecc42f7...@am5pr83mb0178.eurprd83.prod.outlook.com
---
 doc/src/sgml/libpq.sgml   | 465 --
 src/interfaces/libpq/exports.txt  |   9 +
 src/interfaces/libpq/fe-cancel.c  | 297 ++-
 src/interfaces/libpq/fe-connect.c | 129 -
 src/interfaces/libpq/libpq-fe.h   |  31 +-
 src/interfaces/libpq/libpq-int.h  |   5 +
 .../modules/libpq_pipeline/libpq_pipeline.c   | 125 +
 src/tools/pgindent/typedefs.list  |   1 +
 8 files changed, 1015 insertions(+), 47 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index a2bbf33d02..373d0dc322 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -265,7 +265,7 @@ PGconn *PQsetdb(char *pghost,
 
  PQconnectStartParamsPQconnectStartParams
  PQconnectStartPQconnectStart
- PQconnectPollPQconnectPoll
+ PQconnectPollPQconnectPoll
  
   
nonblocking connection
@@ -5287,7 +5287,7 @@ int PQisBusy(PGconn *conn);
/
can also attempt to cancel a command that is still being processed
by the server; see .  But regardless of
-   the return value of , the application
+   the return value of , the application
must continue with the normal result-reading sequence using
.  A successful cancellation will
simply cause the command to terminate sooner than it would have
@@ -6034,10 +6034,402 @@ int PQsetSingleRowMode(PGconn *conn);
SQL command
   
 
-  
-   A client application can request cancellation of a command that is
-   still being processed by the server, using the functions described in
-   this section.
+  
+   Functions for Sending Cancel Requests
+   
+
+ PQcancelCreatePQcancelCreate
+
+ 
+  
+   Prepares a connection over which a cancel request can be sent.
+
+PGcancelConn *PQcancelCreate(PGconn *conn);
+
+  
+
+  
+creates a
+   PGcancelConnPGcancelConn
+   object, but it won't instantly start sending a cancel request over this
+   connection. A cancel request can be sent over this connection in a
+   blocking manner using  and in a
+   non-blocking manner using .
+   The return value can be passed to 
+   to check if the PGcancelConn object was
+   created successfully. The PGcancelConn object
+   is an opaque structure that is not meant to be accessed directly by the
+   application. This PGcancelConn object can be
+   used to cancel the query that's 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-07 Thread Jelte Fennema-Nio
Attached is a new patchset with various changes. I created a dedicated
0002 patch to add tests for the already existing cancellation
functions, because that seemed useful for another thread where changes
to the cancellation protocol are being proposed[1].

[1]: 
https://www.postgresql.org/message-id/flat/508d0505-8b7a-4864-a681-e7e5edfe32aa%40iki.fi

On Wed, 6 Mar 2024 at 19:22, Alvaro Herrera  wrote:
>
> Docs: one bogus "that that".

This was already fixed by my previous doc changes in v32, I guess that
email got crossed with this one

> Did we consider having PQcancelConn() instead be called
> PQcancelCreate()?

Done

> "Asynchronously cancel a query on the given connection. This requires
> polling the returned PGcancelConn to actually complete the cancellation
> of the query." but this is no longer a good description of what this
> function does.

Fixed

>  Anyway, maybe there are reasons for this; but in any case we
> should set ->cancelRequest in all cases, not only after the first tests
> for errors.

Done

> I think the extra PGconn inside pg_cancel_conn is useless; it would be
> simpler to typedef PGcancelConn to PGconn in fe-cancel.c, and remove the
> indirection through the extra struct.  You're actually dereferencing the
> object in two ways in the new code, both by casting the outer object
> straight to PGconn (taking advantage that the struct member is first in
> the struct), and by using PGcancelConn->conn.  This seems pointless.  I
> mean, if we're going to cast to "PGconn *" in some places anyway, then
> we may as well access all members directly.  Perhaps, if you want, you
> could add asserts that ->cancelRequest is set true in all the
> fe-cancel.c functions.  Anyway, we'd still have compiler support to tell
> you that you're passing the wrong struct to the function.  (I didn't
> actually try to change the code this way, so I might be wrong.)

Turns out you were wrong about the compiler support to tell us we're
passing the wrong struct: When both the PGconn and PGcancelConn
typedefs refer to the same struct, the compiler allows passing PGconn
to PGcancelConn functions and vice versa without complaining. This
seems enough reason for me to keep indirection through the extra
struct.

So instead of adding the proposed typed this typedef I chose to add a
comment to pg_cancel_conn explaining its purpose, as well as not
casting PGcancelConn to PGconn but always accessing the conn field for
consistency.

> We could move the definition of struct pg_cancel to fe-cancel.c.  Nobody
> outside that needs to know that definition anyway.

Done in 0003


v33-0003-libpq-Move-pg_cancel-to-fe-cancel.c.patch
Description: Binary data


v33-0002-Add-tests-for-libpq-query-cancellation-APIs.patch
Description: Binary data


v33-0001-Add-missing-connection-statuses-to-docs.patch
Description: Binary data


v33-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v33-0004-libpq-Add-encrypted-and-non-blocking-versions-of.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-06 Thread Jelte Fennema-Nio
On Wed, 6 Mar 2024 at 19:22, Alvaro Herrera  wrote:
> Docs: one bogus "that that".

will fix

> Did we consider having PQcancelConn() instead be called
> PQcancelCreate()?

Fine by me

> Also, the comment still says
> "Asynchronously cancel a query on the given connection. This requires
> polling the returned PGcancelConn to actually complete the cancellation
> of the query." but this is no longer a good description of what this
> function does.

will fix

> Why do we return a non-NULL pointer from PQcancelConn in the first three
> cases where we return errors?  (original conn was NULL, original conn is
> PGINVALID_SOCKET, pqCopyPGconn returns failure)  Wouldn't it make more
> sense to free the allocated object and return NULL?  Actually, I wonder
> if there's any reason at all to return a valid pointer in any failure
> cases; I mean, do we really expect that application authors are going to
> read/report the error message from a PGcancelConn that failed to be fully
> created?

I think having a useful error message when possible is quite nice. And
I do think people will read/report this error message. Especially
since many people will simply pass it to PQcancelBlocking, whether
it's NULL or not. And then check the status, and then report the error
if the status was CONNECTION_BAD.

> but in any case we
> should set ->cancelRequest in all cases, not only after the first tests
> for errors.

makes sense

> I think the extra PGconn inside pg_cancel_conn is useless; it would be
> simpler to typedef PGcancelConn to PGconn in fe-cancel.c, and remove the
> indirection through the extra struct.

That sounds nice indeed. I'll try it out.

> We could move the definition of struct pg_cancel to fe-cancel.c.  Nobody
> outside that needs to know that definition anyway.

will do




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-06 Thread Alvaro Herrera
Docs: one bogus "that that".

Did we consider having PQcancelConn() instead be called
PQcancelCreate()?  I think this better conveys that what we're doing is
create an object that can be used to do something, and that nothing else
is done with it by default.  Also, the comment still says
"Asynchronously cancel a query on the given connection. This requires
polling the returned PGcancelConn to actually complete the cancellation
of the query." but this is no longer a good description of what this
function does.

Why do we return a non-NULL pointer from PQcancelConn in the first three
cases where we return errors?  (original conn was NULL, original conn is
PGINVALID_SOCKET, pqCopyPGconn returns failure)  Wouldn't it make more
sense to free the allocated object and return NULL?  Actually, I wonder
if there's any reason at all to return a valid pointer in any failure
cases; I mean, do we really expect that application authors are going to
read/report the error message from a PGcancelConn that failed to be fully
created?  Anyway, maybe there are reasons for this; but in any case we
should set ->cancelRequest in all cases, not only after the first tests
for errors.

I think the extra PGconn inside pg_cancel_conn is useless; it would be
simpler to typedef PGcancelConn to PGconn in fe-cancel.c, and remove the
indirection through the extra struct.  You're actually dereferencing the
object in two ways in the new code, both by casting the outer object
straight to PGconn (taking advantage that the struct member is first in
the struct), and by using PGcancelConn->conn.  This seems pointless.  I
mean, if we're going to cast to "PGconn *" in some places anyway, then
we may as well access all members directly.  Perhaps, if you want, you
could add asserts that ->cancelRequest is set true in all the
fe-cancel.c functions.  Anyway, we'd still have compiler support to tell
you that you're passing the wrong struct to the function.  (I didn't
actually try to change the code this way, so I might be wrong.)

We could move the definition of struct pg_cancel to fe-cancel.c.  Nobody
outside that needs to know that definition anyway.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"XML!" Exclaimed C++.  "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.
https://burningbird.net/the-parable-of-the-languages/




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-06 Thread Jelte Fennema-Nio
On Wed, 6 Mar 2024 at 15:03, Denis Laxalde  wrote:
>
> In patch 0004, I noticed a couple of typos in the documentation; please
> find attached a fixup patch correcting these.

Thanks, applied.

> while not actually listing the "statuses". Should we list them?

I listed the relevant statuses over now and updated the PQcancelStatus
docs to look more like the PQstatus one. I didn't list any statuses
that a cancel connection could never have (but a normal connection
can).

While going over the list of statuses possible for a cancel connection
I realized that the docs for PQconnectStart were not listing all
relevant statuses, so I fixed that in patch 0001.
From 40d3d9b0f4058bcf3041e63f71ce4c56e43e73f2 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 6 Mar 2024 18:33:49 +0100
Subject: [PATCH v32 1/3] Add missing connection statuses to docs

The list of connection statuses that PQstatus might return during an
asynchronous connection attempt was incorrect:

1. CONNECTION_SETENV is never returned anymore and is only part of the
   enum for backwards compatibility. So it's removed from the list.
2. CONNECTION_CHECK_STANDBY and CONNECTION_GSS_STARTUP were not listed.

This addresses those problems. CONNECTION_NEEDED and
CONNECTION_CHECK_TARGET are not listed in the docs on purpose, since
these states are internal states that can never be observed by a caller
of PQstatus.
---
 doc/src/sgml/libpq.sgml | 15 ---
 src/interfaces/libpq/libpq-fe.h |  3 ++-
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 1d8998efb2a..a2bbf33d029 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -428,11 +428,11 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
  
 
 
-
- CONNECTION_SETENV
+
+ CONNECTION_GSS_STARTUP
  
   
-   Negotiating environment-driven parameter settings.
+   Negotiating GSS encryption.
   
  
 
@@ -446,6 +446,15 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
  
 
 
+
+ CONNECTION_CHECK_STANDBY
+ 
+  
+   Checking if connection is to a server in standby mode.
+  
+ 
+
+
 
  CONNECTION_CONSUME
  
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index defc415fa3f..1e5e7481a7c 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -77,7 +77,8 @@ typedef enum
 	CONNECTION_CHECK_WRITABLE,	/* Checking if session is read-write. */
 	CONNECTION_CONSUME,			/* Consuming any extra messages. */
 	CONNECTION_GSS_STARTUP,		/* Negotiating GSSAPI. */
-	CONNECTION_CHECK_TARGET,	/* Checking target server properties. */
+	CONNECTION_CHECK_TARGET,	/* Internal state: Checking target server
+ * properties. */
 	CONNECTION_CHECK_STANDBY	/* Checking if server is in standby mode. */
 } ConnStatusType;
 

base-commit: de7c6fe8347ab726c80ebbfcdb57f4b714d5243d
-- 
2.34.1

From 15d91bb8ca87764eee02d785126495df1f6ffd5f Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Thu, 14 Dec 2023 13:39:09 +0100
Subject: [PATCH v32 3/3] Start using new libpq cancel APIs

A previous commit introduced new APIs to libpq for cancelling queries.
This replaces the usage of the old APIs in most of the codebase with
these newer ones. This specifically leaves out changes to psql and
pgbench as those would need a much larger refactor to be able to call
them, due to the new functions not being signal-safe.
---
 contrib/dblink/dblink.c   |  30 +++--
 contrib/postgres_fdw/connection.c | 105 +++---
 .../postgres_fdw/expected/postgres_fdw.out|  15 +++
 contrib/postgres_fdw/sql/postgres_fdw.sql |   7 ++
 src/fe_utils/connect_utils.c  |  11 +-
 src/test/isolation/isolationtester.c  |  29 ++---
 6 files changed, 145 insertions(+), 52 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 19a362526d2..8b4013c480a 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1346,22 +1346,32 @@ PG_FUNCTION_INFO_V1(dblink_cancel_query);
 Datum
 dblink_cancel_query(PG_FUNCTION_ARGS)
 {
-	int			res;
 	PGconn	   *conn;
-	PGcancel   *cancel;
-	char		errbuf[256];
+	PGcancelConn *cancelConn;
+	char	   *msg;
 
 	dblink_init();
 	conn = dblink_get_named_conn(text_to_cstring(PG_GETARG_TEXT_PP(0)));
-	cancel = PQgetCancel(conn);
+	cancelConn = PQcancelConn(conn);
 
-	res = PQcancel(cancel, errbuf, 256);
-	PQfreeCancel(cancel);
+	PG_TRY();
+	{
+		if (!PQcancelBlocking(cancelConn))
+		{
+			msg = pchomp(PQcancelErrorMessage(cancelConn));
+		}
+		else
+		{
+			msg = "OK";
+		}
+	}
+	PG_FINALLY();
+	{
+		PQcancelFinish(cancelConn);
+	}
+	PG_END_TRY();
 
-	if (res == 1)
-		PG_RETURN_TEXT_P(cstring_to_text("OK"));
-	else
-		PG_RETURN_TEXT_P(cstring_to_text(errbuf));
+	

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-06 Thread Denis Laxalde
In patch 0004, I noticed a couple of typos in the documentation; please 
find attached a fixup patch correcting these.


Still in the documentation, same patch, the last paragraph documenting 
PQcancelPoll() ends as:


+   indicate the current stage of the connection procedure and might 
be useful

+   to provide feedback to the user for example. These statuses are:
+  

while not actually listing the "statuses". Should we list them? Adjust 
the wording? Or refer to PQconnectPoll() documentation (since the 
paragraph is copied from there it seems)?



Otherwise, the feature still works fine as far as I can tell.From 3e04442e3f283829ed38e4a2b435fd182addf87a Mon Sep 17 00:00:00 2001
From: Denis Laxalde 
Date: Wed, 6 Mar 2024 14:55:40 +0100
Subject: [PATCH] fixup! libpq: Add encrypted and non-blocking versions of
 PQcancel

---
 doc/src/sgml/libpq.sgml  | 2 +-
 src/interfaces/libpq/fe-cancel.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 1613fcc7bb..1281cac284 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -6122,7 +6122,7 @@ PostgresPollingStatusType PQcancelPoll(PGcancelConn *cancelConn);
   
The request is made over the given PGcancelConn,
which needs to be created with .
-   The return value of 
+   The return value of 
is 1 if the cancellation request could be started and 0 if not.
If it was unsuccessful, the error message can be
retrieved using .
diff --git a/src/interfaces/libpq/fe-cancel.c b/src/interfaces/libpq/fe-cancel.c
index e66b8819ee..9c9a23bb4d 100644
--- a/src/interfaces/libpq/fe-cancel.c
+++ b/src/interfaces/libpq/fe-cancel.c
@@ -146,7 +146,7 @@ PQcancelBlocking(PGcancelConn *cancelConn)
 /*
  *		PQcancelStart
  *
- * Starts sending a cancellation request in a blocking fashion. Returns
+ * Starts sending a cancellation request in a non-blocking fashion. Returns
  * 1 if successful 0 if not.
  */
 int
-- 
2.39.2



Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-14 Thread Jelte Fennema-Nio
On Wed, 14 Feb 2024 at 18:41, Alvaro Herrera  wrote:
> Hmm, I think the changes to libpq_pipeline in 0005 should be in 0004.

Yeah, you're correct. Fixed that now.


v32-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v32-0004-libpq-Add-encrypted-and-non-blocking-versions-of.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-14 Thread Alvaro Herrera
On 2024-Feb-14, Jelte Fennema-Nio wrote:

> Attached is a new version of the final patches, with much improved
> docs (imho) and the new function names: PQcancelStart and
> PQcancelBlocking.

Hmm, I think the changes to libpq_pipeline in 0005 should be in 0004.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-14 Thread Jelte Fennema-Nio
On Sun, 4 Feb 2024 at 16:39, Alvaro Herrera  wrote:
> Maybe this is okay?  I'll have a look at the whole final situation more
> carefully later; or if somebody else wants to share an opinion, please
> do so.

Attached is a new version of the final patches, with much improved
docs (imho) and the new function names: PQcancelStart and
PQcancelBlocking.


v31-0004-libpq-Add-encrypted-and-non-blocking-versions-of.patch
Description: Binary data


v31-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-04 Thread Alvaro Herrera
On 2024-Feb-02, Jelte Fennema-Nio wrote:

> The only reasonable thing I can think of to make that situation better
> is to move that part of the function outside of PQcancelPoll and
> create a dedicated PQcancelStart function for it. It introduces an
> extra function, but it does seem more in line with how we do the
> regular connection establishment. Basically you would have code like
> this then, which looks quite nice honestly:
> 
> cancelConn = PQcancelConn(conn);
> if (!PQcancelStart(cancelConn))
> pg_fatal("bad cancel connection: %s", 
> PQcancelErrorMessage(cancelConn));
> while (true)
> {
>  // polling using PQcancelPoll here
> }

Maybe this is okay?  I'll have a look at the whole final situation more
carefully later; or if somebody else wants to share an opinion, please
do so.

In the meantime I pushed your 0002 and 0003 patches, so you can take
this as an opportunity to rebase the remaining ones.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The saddest aspect of life right now is that science gathers knowledge faster
 than society gathers wisdom."  (Isaac Asimov)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-02 Thread Jelte Fennema-Nio
On Fri, 2 Feb 2024 at 16:06, Alvaro Herrera  wrote:
> Now, looking at this list, I think it's surprising that the nonblocking
> request for a cancellation is called PQcancelPoll.  PQcancelSend() is at
> odds with the asynchronous query API, which uses the verb "send" for the
> asynchronous variants. This would suggest that PQcancelPoll should
> actually be called PQcancelSend or maybe PQcancelStart (mimicking
> PQconnectStart).  I'm not sure what's a good alternative name for the
> blocking one, which you have called PQcancelSend.

I agree that Send is an unfortunate suffix. I'd love to use PQcancel
for this, but obviously that one is already taken. Some other options
that I can think of are (from favorite to less favorite):
- PQcancelBlocking
- PQcancelAndWait
- PQcancelGo
- PQcancelNow

Finally, another option would be to renome PQcancelConn to
PQgetCancelConn and then rename PQcancelSend to PQcancelConn.

Regarding PQcancelPoll, I think it's a good name for the polling
function, but I agree it's a bit confusing to use it to also start
sending the connection. Even the code of PQcancelPoll basically admits
that this is  confusing behaviour:

/*
 * Before we can call PQconnectPoll we first need to start the connection
 * using pqConnectDBStart. Non-cancel connections already do this whenever
 * the connection is initialized. But cancel connections wait until the
 * caller starts polling, because there might be a large delay between
 * creating a cancel connection and actually wanting to use it.
 */
if (conn->status == CONNECTION_STARTING)
{
if (!pqConnectDBStart(>conn))
{
cancelConn->conn.status = CONNECTION_STARTED;
return PGRES_POLLING_WRITING;
}
}

The only reasonable thing I can think of to make that situation better
is to move that part of the function outside of PQcancelPoll and
create a dedicated PQcancelStart function for it. It introduces an
extra function, but it does seem more in line with how we do the
regular connection establishment. Basically you would have code like
this then, which looks quite nice honestly:

cancelConn = PQcancelConn(conn);
if (!PQcancelStart(cancelConn))
pg_fatal("bad cancel connection: %s", PQcancelErrorMessage(cancelConn));
while (true)
{
 // polling using PQcancelPoll here
}




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-02 Thread Alvaro Herrera
Hello,

The patched docs claim that PQrequestCancel is insecure, but neither the
code nor docs explain why.  The docs for PQcancel on the other hand do
mention that encryption is not used; does that apply to PQrequestCancel
as well and is that the reason?  If so, I think we should copy the
warning and perhaps include a code comment about that.  Also, maybe that
final phrase in PQcancel should be a  box: remove from "So, for
example" and add Because gssencmode and sslencmode are
not preserved from the original connection, the cancel request is not
encrypted. or something like that.


I wonder if Section 33.7 Canceling Queries in Progress should be split
in three subsections, and I propose the following order:

33.7.1 PGcancelConn-based Cancellation API
  PQcancelConn  -- we first document the basics
  PQcancelSend
  PQcancelFinish
  PQcancelPoll  -- the nonblocking interface is documented next
  PQcancelReset -- reuse a cancelconn, later in docs because it's more 
advanced
  PQcancelStatus-- accessors go last
  PQcancelSocket
  PQcancelErrorMessage

33.7.2 Obsolete interface
  PQgetCancel
  PQfreeCancel
  PQcancel

33.7.3 Deprecated and Insecure Methods
  PQrequestCancel

I have a hard time coming up with good subsection titles though.

Now, looking at this list, I think it's surprising that the nonblocking
request for a cancellation is called PQcancelPoll.  PQcancelSend() is at
odds with the asynchronous query API, which uses the verb "send" for the
asynchronous variants.  This would suggest that PQcancelPoll should
actually be called PQcancelSend or maybe PQcancelStart (mimicking
PQconnectStart).  I'm not sure what's a good alternative name for the
blocking one, which you have called PQcancelSend.

I see upthread that the names of these functions were already quite
heavily debated.  Sorry to beat that dead horse some more ... I'm just
not sure it's decided matter.

Lastly -- the doc blurbs that say simply "a version of XYZ that can be
used for cancellation connections" are a bit underwhelming.  Shouldn't
we document these more fully instead of making users go read the docs
for the other functions and wonder what the differences might be, if
any?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-02 Thread Jelte Fennema-Nio
On Fri, 2 Feb 2024 at 13:19, Alvaro Herrera  wrote:
> Thank you, looks good.
>
> I propose the following minor/trivial fixes over your initial 3 patches.

All of those seem good like fixes. Attached is an updated patchset
where they are all applied. As well as adding a missing word ("been")
in a comment that I noticed while reading your fixes.
From 7736e940567878c32355c2143cddba3b13bfa71e Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Fri, 26 Jan 2024 16:47:51 +0100
Subject: [PATCH v30 3/5] libpq: Change some static functions to extern

This is in preparation of a follow up commit that starts using these
functions from fe-cancel.c.
---
 src/interfaces/libpq/fe-connect.c | 87 +++
 src/interfaces/libpq/libpq-int.h  |  6 +++
 2 files changed, 47 insertions(+), 46 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 079abfca9e..7d8616eb6d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -387,15 +387,10 @@ static const char uri_designator[] = "postgresql://";
 static const char short_uri_designator[] = "postgres://";
 
 static bool connectOptions1(PGconn *conn, const char *conninfo);
-static bool connectOptions2(PGconn *conn);
-static int	connectDBStart(PGconn *conn);
-static int	connectDBComplete(PGconn *conn);
 static PGPing internal_ping(PGconn *conn);
-static PGconn *makeEmptyPGconn(void);
 static void pqFreeCommandQueue(PGcmdQueueEntry *queue);
 static bool fillPGconn(PGconn *conn, PQconninfoOption *connOptions);
 static void freePGconn(PGconn *conn);
-static void closePGconn(PGconn *conn);
 static void release_conn_addrinfo(PGconn *conn);
 static int	store_conn_addrinfo(PGconn *conn, struct addrinfo *addrlist);
 static void sendTerminateConn(PGconn *conn);
@@ -644,8 +639,8 @@ pqDropServerData(PGconn *conn)
  * PQconnectStart or PQconnectStartParams (which differ in the same way as
  * PQconnectdb and PQconnectdbParams) and PQconnectPoll.
  *
- * Internally, the static functions connectDBStart, connectDBComplete
- * are part of the connection procedure.
+ * The non-exported functions pqConnectDBStart, pqConnectDBComplete are
+ * part of the connection procedure implementation.
  */
 
 /*
@@ -678,7 +673,7 @@ PQconnectdbParams(const char *const *keywords,
 	PGconn	   *conn = PQconnectStartParams(keywords, values, expand_dbname);
 
 	if (conn && conn->status != CONNECTION_BAD)
-		(void) connectDBComplete(conn);
+		(void) pqConnectDBComplete(conn);
 
 	return conn;
 }
@@ -731,7 +726,7 @@ PQconnectdb(const char *conninfo)
 	PGconn	   *conn = PQconnectStart(conninfo);
 
 	if (conn && conn->status != CONNECTION_BAD)
-		(void) connectDBComplete(conn);
+		(void) pqConnectDBComplete(conn);
 
 	return conn;
 }
@@ -785,7 +780,7 @@ PQconnectStartParams(const char *const *keywords,
 	 * to initialize conn->errorMessage to empty.  All subsequent steps during
 	 * connection initialization will only append to that buffer.
 	 */
-	conn = makeEmptyPGconn();
+	conn = pqMakeEmptyPGconn();
 	if (conn == NULL)
 		return NULL;
 
@@ -819,15 +814,15 @@ PQconnectStartParams(const char *const *keywords,
 	/*
 	 * Compute derived options
 	 */
-	if (!connectOptions2(conn))
+	if (!pqConnectOptions2(conn))
 		return conn;
 
 	/*
 	 * Connect to the database
 	 */
-	if (!connectDBStart(conn))
+	if (!pqConnectDBStart(conn))
 	{
-		/* Just in case we failed to set it in connectDBStart */
+		/* Just in case we failed to set it in pqConnectDBStart */
 		conn->status = CONNECTION_BAD;
 	}
 
@@ -863,7 +858,7 @@ PQconnectStart(const char *conninfo)
 	 * to initialize conn->errorMessage to empty.  All subsequent steps during
 	 * connection initialization will only append to that buffer.
 	 */
-	conn = makeEmptyPGconn();
+	conn = pqMakeEmptyPGconn();
 	if (conn == NULL)
 		return NULL;
 
@@ -876,15 +871,15 @@ PQconnectStart(const char *conninfo)
 	/*
 	 * Compute derived options
 	 */
-	if (!connectOptions2(conn))
+	if (!pqConnectOptions2(conn))
 		return conn;
 
 	/*
 	 * Connect to the database
 	 */
-	if (!connectDBStart(conn))
+	if (!pqConnectDBStart(conn))
 	{
-		/* Just in case we failed to set it in connectDBStart */
+		/* Just in case we failed to set it in pqConnectDBStart */
 		conn->status = CONNECTION_BAD;
 	}
 
@@ -895,7 +890,7 @@ PQconnectStart(const char *conninfo)
  * Move option values into conn structure
  *
  * Don't put anything cute here --- intelligence should be in
- * connectOptions2 ...
+ * pqConnectOptions2 ...
  *
  * Returns true on success. On failure, returns false and sets error message.
  */
@@ -933,7 +928,7 @@ fillPGconn(PGconn *conn, PQconninfoOption *connOptions)
  *
  * Internal subroutine to set up connection parameters given an already-
  * created PGconn and a conninfo string.  Derived settings should be
- * processed by calling connectOptions2 next.  (We split them because
+ * processed by calling pqConnectOptions2 next.  (We split them because
  * PQsetdbLogin overrides 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-02-02 Thread Alvaro Herrera
On 2024-Jan-29, Jelte Fennema-Nio wrote:

> On Mon, 29 Jan 2024 at 12:44, Alvaro Herrera  wrote:
> > Thanks!  I committed 0001 now.  I also renamed the new
> > pq_parse_int_param to pqParseIntParam, for consistency with other
> > routines there.  Please rebase the other patches.
> 
> Awesome! Rebased, and renamed pq_release_conn_hosts to
> pqReleaseConnHosts for the same consistency reasons.

Thank you, looks good.

I propose the following minor/trivial fixes over your initial 3 patches.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"I can't go to a restaurant and order food because I keep looking at the
fonts on the menu.  Five minutes later I realize that it's also talking
about food" (Donald Knuth)
>From 92ca8dc2739a777ff5a0df990d6e9818c5729ac5 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 2 Feb 2024 10:57:02 +0100
Subject: [PATCH 1/4] pgindent

---
 src/interfaces/libpq/fe-cancel.c | 14 +++---
 src/interfaces/libpq/libpq-fe.h  | 17 +
 src/tools/pgindent/typedefs.list |  1 +
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/src/interfaces/libpq/fe-cancel.c b/src/interfaces/libpq/fe-cancel.c
index 7416791d9f..d75c9628e7 100644
--- a/src/interfaces/libpq/fe-cancel.c
+++ b/src/interfaces/libpq/fe-cancel.c
@@ -137,7 +137,7 @@ oom_error:
  * Returns 1 if successful 0 if not.
  */
 int
-PQcancelSend(PGcancelConn * cancelConn)
+PQcancelSend(PGcancelConn *cancelConn)
 {
if (!cancelConn || cancelConn->conn.status == CONNECTION_BAD)
return 1;
@@ -157,7 +157,7 @@ PQcancelSend(PGcancelConn * cancelConn)
  * Poll a cancel connection. For usage details see PQconnectPoll.
  */
 PostgresPollingStatusType
-PQcancelPoll(PGcancelConn * cancelConn)
+PQcancelPoll(PGcancelConn *cancelConn)
 {
PGconn *conn = (PGconn *) cancelConn;
int n;
@@ -249,7 +249,7 @@ PQcancelPoll(PGcancelConn * cancelConn)
  * Get the status of a cancel connection.
  */
 ConnStatusType
-PQcancelStatus(const PGcancelConn * cancelConn)
+PQcancelStatus(const PGcancelConn *cancelConn)
 {
return PQstatus((const PGconn *) cancelConn);
 }
@@ -260,7 +260,7 @@ PQcancelStatus(const PGcancelConn * cancelConn)
  * Get the socket of the cancel connection.
  */
 int
-PQcancelSocket(const PGcancelConn * cancelConn)
+PQcancelSocket(const PGcancelConn *cancelConn)
 {
return PQsocket((const PGconn *) cancelConn);
 }
@@ -271,7 +271,7 @@ PQcancelSocket(const PGcancelConn * cancelConn)
  * Get the socket of the cancel connection.
  */
 char *
-PQcancelErrorMessage(const PGcancelConn * cancelConn)
+PQcancelErrorMessage(const PGcancelConn *cancelConn)
 {
return PQerrorMessage((const PGconn *) cancelConn);
 }
@@ -283,7 +283,7 @@ PQcancelErrorMessage(const PGcancelConn * cancelConn)
  * request.
  */
 void
-PQcancelReset(PGcancelConn * cancelConn)
+PQcancelReset(PGcancelConn *cancelConn)
 {
pqClosePGconn((PGconn *) cancelConn);
cancelConn->conn.status = CONNECTION_STARTING;
@@ -299,7 +299,7 @@ PQcancelReset(PGcancelConn * cancelConn)
  * Closes and frees the cancel connection.
  */
 void
-PQcancelFinish(PGcancelConn * cancelConn)
+PQcancelFinish(PGcancelConn *cancelConn)
 {
PQfinish((PGconn *) cancelConn);
 }
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 857ba54d94..851e549355 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -329,17 +329,18 @@ extern PostgresPollingStatusType PQresetPoll(PGconn 
*conn);
 extern void PQreset(PGconn *conn);
 
 /* Create a PGcancelConn that's used to cancel a query on the given PGconn */
-extern PGcancelConn * PQcancelConn(PGconn *conn);
+extern PGcancelConn *PQcancelConn(PGconn *conn);
+
 /* issue a blocking cancel request */
-extern int PQcancelSend(PGcancelConn * conn);
+extern int PQcancelSend(PGcancelConn *conn);
 
 /* issue or poll a non-blocking cancel request */
-extern PostgresPollingStatusType PQcancelPoll(PGcancelConn * cancelConn);
-extern ConnStatusType PQcancelStatus(const PGcancelConn * cancelConn);
-extern int PQcancelSocket(const PGcancelConn * cancelConn);
-extern char *PQcancelErrorMessage(const PGcancelConn * cancelConn);
-extern void PQcancelReset(PGcancelConn * cancelConn);
-extern void PQcancelFinish(PGcancelConn * cancelConn);
+extern PostgresPollingStatusType PQcancelPoll(PGcancelConn *cancelConn);
+extern ConnStatusType PQcancelStatus(const PGcancelConn *cancelConn);
+extern int PQcancelSocket(const PGcancelConn *cancelConn);
+extern char *PQcancelErrorMessage(const PGcancelConn *cancelConn);
+extern void PQcancelReset(PGcancelConn *cancelConn);
+extern void PQcancelFinish(PGcancelConn *cancelConn);
 
 
 /* request a cancel structure */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 91433d439b..9ffb169e9d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-29 Thread Jelte Fennema-Nio
On Mon, 29 Jan 2024 at 12:44, Alvaro Herrera  wrote:
> Thanks!  I committed 0001 now.  I also renamed the new
> pq_parse_int_param to pqParseIntParam, for consistency with other
> routines there.  Please rebase the other patches.

Awesome! Rebased, and renamed pq_release_conn_hosts to
pqReleaseConnHosts for the same consistency reasons.


v29-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v29-0002-libpq-Add-pqReleaseConnHosts-function.patch
Description: Binary data


v29-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v29-0003-libpq-Change-some-static-functions-to-extern.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-29 Thread Alvaro Herrera
On 2024-Jan-28, Jelte Fennema-Nio wrote:

> On Sun, 28 Jan 2024 at 10:51, Jelte Fennema-Nio  wrote:
> > Both of those are fixed now.
> 
> Okay, there turned out to also be an issue on Windows with
> setKeepalivesWin32 not being available in fe-cancel.c. That's fixed
> now too (as well as some minor formatting issues).

Thanks!  I committed 0001 now.  I also renamed the new
pq_parse_int_param to pqParseIntParam, for consistency with other
routines there.  Please rebase the other patches.

Thanks,

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-28 Thread Jelte Fennema-Nio
On Sun, 28 Jan 2024 at 10:51, Jelte Fennema-Nio  wrote:
> Both of those are fixed now.

Okay, there turned out to also be an issue on Windows with
setKeepalivesWin32 not being available in fe-cancel.c. That's fixed
now too (as well as some minor formatting issues).
From 4efbb0c75341f4612f0c5b8d5d3fe3f8f9c3b43c Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Fri, 26 Jan 2024 17:01:28 +0100
Subject: [PATCH v28 2/5] libpq: Add pq_release_conn_hosts function

In a follow up PR we'll need to free this connhost field in a function
defined in fe-cancel.c

So this extracts the logic to a dedicated extern function.
---
 src/interfaces/libpq/fe-connect.c | 38 ---
 src/interfaces/libpq/libpq-int.h  |  1 +
 2 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 5d08b4904d3..bc1f6521650 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4395,19 +4395,7 @@ freePGconn(PGconn *conn)
 		free(conn->events[i].name);
 	}
 
-	/* clean up pg_conn_host structures */
-	for (int i = 0; i < conn->nconnhost; ++i)
-	{
-		free(conn->connhost[i].host);
-		free(conn->connhost[i].hostaddr);
-		free(conn->connhost[i].port);
-		if (conn->connhost[i].password != NULL)
-		{
-			explicit_bzero(conn->connhost[i].password, strlen(conn->connhost[i].password));
-			free(conn->connhost[i].password);
-		}
-	}
-	free(conn->connhost);
+	pq_release_conn_hosts(conn);
 
 	free(conn->client_encoding_initial);
 	free(conn->events);
@@ -4526,6 +4514,30 @@ release_conn_addrinfo(PGconn *conn)
 	}
 }
 
+/*
+ * pq_release_conn_hosts
+ *	 - Free the host list in the PGconn.
+ */
+void
+pq_release_conn_hosts(PGconn *conn)
+{
+	if (conn->connhost)
+	{
+		for (int i = 0; i < conn->nconnhost; ++i)
+		{
+			free(conn->connhost[i].host);
+			free(conn->connhost[i].hostaddr);
+			free(conn->connhost[i].port);
+			if (conn->connhost[i].password != NULL)
+			{
+explicit_bzero(conn->connhost[i].password, strlen(conn->connhost[i].password));
+free(conn->connhost[i].password);
+			}
+		}
+		free(conn->connhost);
+	}
+}
+
 /*
  * sendTerminateConn
  *	 - Send a terminate message to backend.
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 48c10b474f5..4cbad2c2c83 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -680,6 +680,7 @@ extern int	pqPacketSend(PGconn *conn, char pack_type,
 extern bool pqGetHomeDirectory(char *buf, int bufsize);
 extern bool pq_parse_int_param(const char *value, int *result, PGconn *conn,
 			   const char *context);
+extern void pq_release_conn_hosts(PGconn *conn);
 
 extern pgthreadlock_t pg_g_threadlock;
 
-- 
2.34.1

From f1168ac4c3dd758a77be3ceb8c40bacb9aebef8c Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Fri, 26 Jan 2024 16:47:51 +0100
Subject: [PATCH v28 3/5] libpq: Change some static functions to extern

This is in preparation of a follow up commit that starts using these
functions from fe-cancel.c.
---
 src/interfaces/libpq/fe-connect.c | 85 +++
 src/interfaces/libpq/libpq-int.h  |  6 +++
 2 files changed, 46 insertions(+), 45 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index bc1f6521650..aeb3adc0e31 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -387,15 +387,10 @@ static const char uri_designator[] = "postgresql://";
 static const char short_uri_designator[] = "postgres://";
 
 static bool connectOptions1(PGconn *conn, const char *conninfo);
-static bool connectOptions2(PGconn *conn);
-static int	connectDBStart(PGconn *conn);
-static int	connectDBComplete(PGconn *conn);
 static PGPing internal_ping(PGconn *conn);
-static PGconn *makeEmptyPGconn(void);
 static void pqFreeCommandQueue(PGcmdQueueEntry *queue);
 static bool fillPGconn(PGconn *conn, PQconninfoOption *connOptions);
 static void freePGconn(PGconn *conn);
-static void closePGconn(PGconn *conn);
 static void release_conn_addrinfo(PGconn *conn);
 static int	store_conn_addrinfo(PGconn *conn, struct addrinfo *addrlist);
 static void sendTerminateConn(PGconn *conn);
@@ -644,7 +639,7 @@ pqDropServerData(PGconn *conn)
  * PQconnectStart or PQconnectStartParams (which differ in the same way as
  * PQconnectdb and PQconnectdbParams) and PQconnectPoll.
  *
- * Internally, the static functions connectDBStart, connectDBComplete
+ * Internally, the static functions pqConnectDBStart, pqConnectDBComplete
  * are part of the connection procedure.
  */
 
@@ -678,7 +673,7 @@ PQconnectdbParams(const char *const *keywords,
 	PGconn	   *conn = PQconnectStartParams(keywords, values, expand_dbname);
 
 	if (conn && conn->status != CONNECTION_BAD)
-		(void) connectDBComplete(conn);
+		(void) pqConnectDBComplete(conn);
 
 	return conn;
 }
@@ -731,7 +726,7 @@ PQconnectdb(const char *conninfo)
 	PGconn	   *conn = 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-28 Thread Jelte Fennema-Nio
On Sun, 28 Jan 2024 at 04:15, vignesh C  wrote:
> CFBot shows that the patch has few compilation errors as in [1]:
> [17:07:07.621] /usr/bin/ld:
> ../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
> `handle_sigint':
> [17:07:07.621] cancel.c:(.text+0x50): undefined reference to `PQcancel'

I forgot to update ./configure based builds with the new file, only
meson was working. Also it seems I trimmed the header list fe-cancel.c
a bit too much for OSX, so I added unistd.h back.

Both of those are fixed now.


v27-0002-libpq-Add-pq_release_conn_hosts-function.patch
Description: Binary data


v27-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v27-0003-libpq-Change-some-static-functions-to-extern.patch
Description: Binary data


v27-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v27-0001-libpq-Move-cancellation-related-functions-to-fe-.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-27 Thread vignesh C
On Fri, 26 Jan 2024 at 22:22, Jelte Fennema-Nio  wrote:
>
> On Fri, 26 Jan 2024 at 13:11, Alvaro Herrera  wrote:
> > I wonder, would it make sense to put all these new functions in a
> > separate file fe-cancel.c?
>
>
> Okay I tried doing that. I think the end result is indeed quite nice,
> having all the cancellation related functions together in a file. But
> it did require making a bunch of static functions in fe-connect
> extern, and adding them to libpq-int.h. On one hand that seems fine to
> me, on the other maybe that indicates that this cancellation logic
> makes sense to be in the same file as the other connection functions
> (in a sense, connecting is all that a cancel request does).

CFBot shows that the patch has few compilation errors as in [1]:
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
`handle_sigint':
[17:07:07.621] cancel.c:(.text+0x50): undefined reference to `PQcancel'
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
`SetCancelConn':
[17:07:07.621] cancel.c:(.text+0x10c): undefined reference to `PQfreeCancel'
[17:07:07.621] /usr/bin/ld: cancel.c:(.text+0x114): undefined
reference to `PQgetCancel'
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
`ResetCancelConn':
[17:07:07.621] cancel.c:(.text+0x148): undefined reference to `PQfreeCancel'
[17:07:07.621] /usr/bin/ld:
../../../src/fe_utils/libpgfeutils.a(connect_utils.o): in function
`disconnectDatabase':
[17:07:07.621] connect_utils.c:(.text+0x2fc): undefined reference to
`PQcancelConn'
[17:07:07.621] /usr/bin/ld: connect_utils.c:(.text+0x307): undefined
reference to `PQcancelSend'
[17:07:07.621] /usr/bin/ld: connect_utils.c:(.text+0x30f): undefined
reference to `PQcancelFinish'
[17:07:07.623] /usr/bin/ld: ../../../src/interfaces/libpq/libpq.so:
undefined reference to `PQcancelPoll'
[17:07:07.626] collect2: error: ld returned 1 exit status
[17:07:07.626] make[3]: *** [Makefile:31: pg_amcheck] Error 1
[17:07:07.626] make[2]: *** [Makefile:45: all-pg_amcheck-recurse] Error 2
[17:07:07.626] make[2]: *** Waiting for unfinished jobs
[17:07:08.126] /usr/bin/ld: ../../../src/interfaces/libpq/libpq.so:
undefined reference to `PQcancelPoll'
[17:07:08.130] collect2: error: ld returned 1 exit status
[17:07:08.131] make[3]: *** [Makefile:42: initdb] Error 1
[17:07:08.131] make[2]: *** [Makefile:45: all-initdb-recurse] Error 2
[17:07:08.492] /usr/bin/ld: ../../../src/interfaces/libpq/libpq.so:
undefined reference to `PQcancelPoll'
[17:07:08.495] collect2: error: ld returned 1 exit status
[17:07:08.496] make[3]: *** [Makefile:50: pg_basebackup] Error 1
[17:07:08.496] make[2]: *** [Makefile:45: all-pg_basebackup-recurse] Error 2
[17:07:09.060] /usr/bin/ld: parallel.o: in function `sigTermHandler':
[17:07:09.060] parallel.c:(.text+0x1aa): undefined reference to `PQcancel'

Please post an updated version for the same.

[1] - https://cirrus-ci.com/task/6210637211107328

Regards,
Vignesh




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Jelte Fennema-Nio
On Fri, 26 Jan 2024 at 18:19, Alvaro Herrera  wrote:
> Yeah, I see that point of view as well.  I like the end result; the
> additional protos in libpq-int.h don't bother me.  Does anybody else
> wants to share their opinion on it?  If none, then I'd consider going
> ahead with this version.

To be clear, I'm +1 on the new file structure (although if people feel
strongly against it, I don't care enough to make a big deal out of
it).

@Alvaro did you have any other comments on the contents of the patch btw?




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Alvaro Herrera
On 2024-Jan-26, Jelte Fennema-Nio wrote:

> Okay I tried doing that. I think the end result is indeed quite nice,
> having all the cancellation related functions together in a file. But
> it did require making a bunch of static functions in fe-connect
> extern, and adding them to libpq-int.h. On one hand that seems fine to
> me, on the other maybe that indicates that this cancellation logic
> makes sense to be in the same file as the other connection functions
> (in a sense, connecting is all that a cancel request does).

Yeah, I see that point of view as well.  I like the end result; the
additional protos in libpq-int.h don't bother me.  Does anybody else
wants to share their opinion on it?  If none, then I'd consider going
ahead with this version.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"We’ve narrowed the problem down to the customer’s pants being in a situation
 of vigorous combustion" (Robert Haas, Postgres expert extraordinaire)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Jelte Fennema-Nio
On Fri, 26 Jan 2024 at 13:11, Alvaro Herrera  wrote:
> I wonder, would it make sense to put all these new functions in a
> separate file fe-cancel.c?


Okay I tried doing that. I think the end result is indeed quite nice,
having all the cancellation related functions together in a file. But
it did require making a bunch of static functions in fe-connect
extern, and adding them to libpq-int.h. On one hand that seems fine to
me, on the other maybe that indicates that this cancellation logic
makes sense to be in the same file as the other connection functions
(in a sense, connecting is all that a cancel request does).


v26-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v26-0002-libpq-Add-pq_release_conn_hosts-function.patch
Description: Binary data


v26-0003-libpq-Change-some-static-functions-to-extern.patch
Description: Binary data


v26-0001-libpq-Move-cancellation-related-functions-to-fe-.patch
Description: Binary data


v26-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Alvaro Herrera
Pushed 0001.

I wonder, would it make sense to put all these new functions in a
separate file fe-cancel.c?

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"World domination is proceeding according to plan"(Andrew Morton)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Jelte Fennema-Nio
On Fri, 26 Jan 2024 at 02:59, vignesh C  wrote:
> Please post an updated version for the same.

Done.
From 5a94d610a4fe138365e2e88c5cec72eba53ed036 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Thu, 14 Dec 2023 13:39:04 +0100
Subject: [PATCH v25 2/3] Add non-blocking version of PQcancel

This patch makes the following changes in libpq:
1. Add a new PQcancelSend function, which sends cancellation requests
   using the regular connection establishment code. This makes sure
   that cancel requests support and use all connection options
   including encryption.
2. Add a new PQcancelConn function which allows sending cancellation in
   a non-blocking way by using it together with the newly added
   PQcancelPoll and PQcancelSocket.

The existing PQcancel API is using blocking IO. This makes PQcancel
impossible to use in an event loop based codebase, without blocking the
event loop until the call returns. PQcancelConn can now be used instead,
to have a non-blocking way of sending cancel requests.

This patch also includes a test for all of libpq cancellation APIs. The
test can be easily run like this:

cd src/test/modules/libpq_pipeline
make && ./libpq_pipeline cancel
---
 doc/src/sgml/libpq.sgml   | 280 ++-
 src/interfaces/libpq/exports.txt  |   8 +
 src/interfaces/libpq/fe-connect.c | 451 +-
 src/interfaces/libpq/libpq-fe.h   |  27 +-
 src/interfaces/libpq/libpq-int.h  |   9 +
 .../modules/libpq_pipeline/libpq_pipeline.c   | 263 +-
 6 files changed, 987 insertions(+), 51 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d0d5aefadc0..9808e678650 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -265,7 +265,7 @@ PGconn *PQsetdb(char *pghost,
 
  PQconnectStartParamsPQconnectStartParams
  PQconnectStartPQconnectStart
- PQconnectPollPQconnectPoll
+ PQconnectPollPQconnectPoll
  
   
nonblocking connection
@@ -5281,7 +5281,7 @@ int PQisBusy(PGconn *conn);
/
can also attempt to cancel a command that is still being processed
by the server; see .  But regardless of
-   the return value of , the application
+   the return value of , the application
must continue with the normal result-reading sequence using
.  A successful cancellation will
simply cause the command to terminate sooner than it would have
@@ -6034,13 +6034,223 @@ int PQsetSingleRowMode(PGconn *conn);
this section.
 

+
+ PQcancelConnPQcancelConn
+
+ 
+  
+   Prepares a connection over which a cancel request can be sent.
+
+PGcancelConn *PQcancelConn(PGconn *conn);
+
+  
+
+  
+creates a
+   PGcancelConnPGcancelConn
+   object, but it won't instantly start sending a cancel request over this
+   connection. A cancel request can be sent over this connection in a
+   blocking manner using  and in a
+   non-blocking manner using .
+   The return value should can be passed to ,
+   to check if the PGcancelConn object was
+   created successfully. The PGcancelConn object
+   is an opaque structure that is not meant to be accessed directly by the
+   application. This PGcancelConn object can be
+   used to cancel the query that's running on the original connection in a
+   thread-safe way.
+  
+
+  
+   If the original connection is encrypted (using TLS or GSS), then the
+   connection for the cancel request is encrypted in the same way. Any
+   connection options that are only used during authentication or after
+   authentication of the client are ignored though, because cancellation
+   requests do not require authentication and the connection is closed right
+   after the cancellation request is submitted.
+  
+
+  
+   Note that when PQcancelConn returns a non-null
+   pointer, you must call  when you
+   are finished with it, in order to dispose of the structure and any
+   associated memory blocks. This must be done even if the cancel request
+   failed or was abandoned.
+  
+ 
+
+
+
+ PQcancelSendPQcancelSend
+
+ 
+  
+   Requests that the server abandons processing of the current command in a blocking manner.
+
+int PQcancelSend(PGcancelConn *conn);
+
+  
+
+  
+   The request is made over the given PGcancelConn,
+   which needs to be created with 
+   The return value of 
+   is 1 if the cancel request was successfully
+   dispatched and 0 if not. If it was unsuccessful, the error message can be
+   retrieved using .
+  
+
+  
+   Successful dispatch of the cancellation is no guarantee that the request
+   will have any effect, however. If the cancellation is effective, the
+   command being canceled will terminate early and return an error result.
+   If the cancellation fails (say, because 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-25 Thread vignesh C
On Wed, 20 Dec 2023 at 19:17, Jelte Fennema-Nio  wrote:
>
> On Thu, 14 Dec 2023 at 13:57, Jelte Fennema-Nio  wrote:
> > I changed all the places that were not adhering to those spellings.
>
> It seems I forgot a /g on my sed command to do this so it turned out I
> missed one that caused the test to fail to compile... Attached is a
> fixed version.
>
> I also updated the patchset to use the EOF detection provided by
> 0a5c46a7a488f2f4260a90843bb9de6c584c7f4e instead of introducing a new
> way of EOF detection using a -2 return value.

CFBot shows that the patch does not apply anymore as in [1]:
patching file doc/src/sgml/libpq.sgml
...
patching file src/interfaces/libpq/exports.txt
Hunk #1 FAILED at 191.
1 out of 1 hunk FAILED -- saving rejects to file
src/interfaces/libpq/exports.txt.rej
patching file src/interfaces/libpq/fe-connect.c

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_3511.log

Regards,
Vignesh




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-12-20 Thread Jelte Fennema-Nio
On Thu, 14 Dec 2023 at 13:57, Jelte Fennema-Nio  wrote:
> I changed all the places that were not adhering to those spellings.

It seems I forgot a /g on my sed command to do this so it turned out I
missed one that caused the test to fail to compile... Attached is a
fixed version.

I also updated the patchset to use the EOF detection provided by
0a5c46a7a488f2f4260a90843bb9de6c584c7f4e instead of introducing a new
way of EOF detection using a -2 return value.


v24-0003-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v24-0001-Fix-spelling-of-canceled-cancellation.patch
Description: Binary data


v24-0002-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-12-14 Thread Jelte Fennema-Nio
On Mon, 13 Nov 2023 at 03:39, Thomas Munro  wrote:
> We follow the common en-US usage: "canceled", "canceling" but
> "cancellation".  Blame Webstah et al.

I changed all the places that were not adhering to those spellings.
There were also a few of such places in parts of the codebase that
these changes didn't touch. I included a new 0001 patch to fix those.

I do feel like this patchset is pretty much in a committable state. So
it would be very much appreciated if any comitter could help push it
over the finish line.


v23-0002-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v23-0001-Fix-spelling-of-canceled-cancellation.patch
Description: Binary data


v23-0004-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v23-0003-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-11-12 Thread Thomas Munro
Trivial observation: these patches obviously introduce many instances
of words derived from "cancel", but they don't all conform to
established project decisions (cf 21f1e15a) about how to spell them.
We follow the common en-US usage: "canceled", "canceling" but
"cancellation".  Blame Webstah et al.

https://english.stackexchange.com/questions/176957/cancellation-canceled-canceling-us-usage




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-07-17 Thread Jelte Fennema
Rebased again to resolve some conflicts


v22-0003-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v22-0002-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v22-0004-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-06-19 Thread Jelte Fennema
I noticed that cfbot was unable to run tests due to some rebase
conflict. It seems the pgindent changes from patch 1 have now been
made.
So adding the rebased patches without patch 1 now to unblock cfbot.


v21-0002-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v21-0003-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v21-0004-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-04-21 Thread Jelte Fennema
Okay, I rebased again. Indeed 983ec23007b gave the most problems.

On Fri, 7 Apr 2023 at 10:02, Denis Laxalde  wrote:
>
> The patch set does not apply any more.
>
> I tried to rebase locally; even leaving out 1 ("libpq: Run pgindent
> after a9e9a9f32b3"), patch 4 ("Start using new libpq cancel APIs") is
> harder to resolve following 983ec23007b (I suppose).
>
> Appart from that, the implementation in v19 sounds good to me, and seems
> worthwhile. FWIW, as said before, I also implemented it in Psycopg in a
> sort of an end-to-end validation.


v20-0003-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v20-0004-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v20-0002-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v20-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-04-07 Thread Denis Laxalde

The patch set does not apply any more.

I tried to rebase locally; even leaving out 1 ("libpq: Run pgindent 
after a9e9a9f32b3"), patch 4 ("Start using new libpq cancel APIs") is 
harder to resolve following 983ec23007b (I suppose).


Appart from that, the implementation in v19 sounds good to me, and seems 
worthwhile. FWIW, as said before, I also implemented it in Psycopg in a 
sort of an end-to-end validation.





Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-30 Thread Jelte Fennema
On Thu, 30 Mar 2023 at 10:07, Denis Laxalde  wrote:
> Patch 5 is missing respective changes; please find attached a fixup
> patch for these.

Thanks, attached are newly rebased patches that include this change. I
also cast the result of PQcancelSend to to void in the one case where
it's ignored on purpose. Note that the patchset shrunk by one, since
the original patch 0002 has been committed now.


v19-0002-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v19-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: Binary data


v19-0003-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v19-0004-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-30 Thread Denis Laxalde
Jelte Fennema a écrit :
> On Wed, 29 Mar 2023 at 10:43, Denis Laxalde  wrote:
> > More importantly, not having PQcancelSend() creating the PGcancelConn
> > makes reuse of that value, passing through PQcancelReset(), more
> > intuitive. E.g., in the tests:
> 
> You convinced me. Attached is an updated patch where PQcancelSend
> takes the PGcancelConn and returns 1 or 0.

Patch 5 is missing respective changes; please find attached a fixup
patch for these.
>From c9e59fb3e30db1bfab75be9fdd4afbc227a5270e Mon Sep 17 00:00:00 2001
From: Denis Laxalde 
Date: Thu, 30 Mar 2023 09:19:18 +0200
Subject: [PATCH] fixup! Start using new libpq cancel APIs

---
 contrib/dblink/dblink.c  | 4 ++--
 src/fe_utils/connect_utils.c | 4 +++-
 src/test/isolation/isolationtester.c | 4 ++--
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index e139f66e11..073795f088 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1332,11 +1332,11 @@ dblink_cancel_query(PG_FUNCTION_ARGS)
 
 	dblink_init();
 	conn = dblink_get_named_conn(text_to_cstring(PG_GETARG_TEXT_PP(0)));
-	cancelConn = PQcancelSend(conn);
+	cancelConn = PQcancelConn(conn);
 
 	PG_TRY();
 	{
-		if (PQcancelStatus(cancelConn) == CONNECTION_BAD)
+		if (!PQcancelSend(cancelConn))
 		{
 			msg = pchomp(PQcancelErrorMessage(cancelConn));
 		}
diff --git a/src/fe_utils/connect_utils.c b/src/fe_utils/connect_utils.c
index b32448c010..1cfd717217 100644
--- a/src/fe_utils/connect_utils.c
+++ b/src/fe_utils/connect_utils.c
@@ -161,7 +161,9 @@ disconnectDatabase(PGconn *conn)
 
 	if (PQtransactionStatus(conn) == PQTRANS_ACTIVE)
 	{
-		PQcancelFinish(PQcancelSend(conn));
+		PGcancelConn *cancelConn = PQcancelConn(conn);
+		PQcancelSend(cancelConn);
+		PQcancelFinish(cancelConn);
 	}
 
 	PQfinish(conn);
diff --git a/src/test/isolation/isolationtester.c b/src/test/isolation/isolationtester.c
index 3781f7982b..de31a87571 100644
--- a/src/test/isolation/isolationtester.c
+++ b/src/test/isolation/isolationtester.c
@@ -946,9 +946,9 @@ try_complete_step(TestSpec *testspec, PermutationStep *pstep, int flags)
 			 */
 			if (td > max_step_wait && !canceled)
 			{
-PGcancelConn *cancel_conn = PQcancelSend(conn);
+PGcancelConn *cancel_conn = PQcancelConn(conn);
 
-if (PQcancelStatus(cancel_conn) == CONNECTION_OK)
+if (PQcancelSend(cancel_conn))
 {
 	/*
 	 * print to stdout not stderr, as this should appear in
-- 
2.30.2



Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-29 Thread Jelte Fennema
On Wed, 29 Mar 2023 at 10:43, Denis Laxalde  wrote:
> More importantly, not having PQcancelSend() creating the PGcancelConn
> makes reuse of that value, passing through PQcancelReset(), more
> intuitive. E.g., in the tests:

You convinced me. Attached is an updated patch where PQcancelSend
takes the PGcancelConn and returns 1 or 0.

> The thing is that we need the connection encoding (client_encoding) when
> eventually forwarding the result of PQcancelErrorMessage(), decoded, to
> the user.

Cancel connections don't have an encoding specified. They never
receive an error from the server. All errors come from the machine
that libpq is on. So I think you're making the decoding more
complicated than it needs to be.


v18-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v18-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v18-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: Binary data


v18-0002-Copy-and-store-addrinfo-in-libpq-owned-private-m.patch
Description: Binary data


v18-0003-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-29 Thread Denis Laxalde
Jelte Fennema a écrit :
> > Namely, I wonder why it returns a PGcancelConn and what's the
> > point of requiring the user to call PQcancelStatus() to see if something
> > got wrong. Maybe it could be defined as:
> >
> >   int PQcancelSend(PGcancelConn *cancelConn);
> >
> > where the return value would be status? And the user would only need to
> > call PQcancelErrorMessage() in case of error. This would leave only one
> > single way to create a PGcancelConn value (i.e. PQcancelConn()), which
> > seems less confusing to me.
> 
> To clarify what you mean, the API would then be like this:
> PGcancelConn cancelConn = PQcancelConn(conn);
> if (PQcancelSend(cancelConn) == CONNECTION_BAD) {
>printf("ERROR %s\n", PQcancelErrorMessage(cancelConn))
>exit(1)
> }

I'm not sure it's worth returning the connection status, maybe just an
int value (the return value of connectDBComplete() for instance).

More importantly, not having PQcancelSend() creating the PGcancelConn
makes reuse of that value, passing through PQcancelReset(), more
intuitive. E.g., in the tests:

diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c 
b/src/test/modules/libpq_pipeline/libpq_pipeline.c
index 6764ab513b..91363451af 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -217,17 +217,18 @@ test_cancel(PGconn *conn, const char *conninfo)
pg_fatal("failed to run PQrequestCancel: %s", 
PQerrorMessage(conn));
confirm_query_cancelled(conn);
 
+   cancelConn = PQcancelConn(conn);
+
/* test PQcancelSend */
send_cancellable_query(conn, monitorConn);
-   cancelConn = PQcancelSend(conn);
-   if (PQcancelStatus(cancelConn) == CONNECTION_BAD)
+   if (PQcancelSend(cancelConn) == CONNECTION_BAD)
pg_fatal("failed to run PQcancelSend: %s", 
PQcancelErrorMessage(cancelConn));
confirm_query_cancelled(conn);
-   PQcancelFinish(cancelConn);
+
+   PQcancelReset(cancelConn);
 
/* test PQcancelConn and then polling with PQcancelPoll */
send_cancellable_query(conn, monitorConn);
-   cancelConn = PQcancelConn(conn);
if (PQcancelStatus(cancelConn) == CONNECTION_BAD)
pg_fatal("bad cancel connection: %s", 
PQcancelErrorMessage(cancelConn));
while (true)

Otherwise, it's not clear if the PGcancelConn created by PQcancelSend()
should be reused or not. But maybe that's a matter of documentation?


> > As part of my testing, I've implemented non-blocking cancellation in
> > Psycopg, based on v16 on this patchset. Overall this worked fine and
> > seems useful; if you want to try it:
> >
> >   https://github.com/dlax/psycopg3/tree/pg16/non-blocking-pqcancel
> 
> That's great to hear! I'll try to take a closer look at that change tomorrow.

See also https://github.com/psycopg/psycopg/issues/534 if you want to
discuss about this.

> > (The only thing I found slightly inconvenient is the need to convey the
> > connection encoding (from PGconn) when handling error message from the
> > PGcancelConn.)
> 
> Could you expand a bit more on this? And if you have any idea on how
> to improve the API with regards to this?

The thing is that we need the connection encoding (client_encoding) when
eventually forwarding the result of PQcancelErrorMessage(), decoded, to
the user. More specifically, it seems to me that we'd the encoding of
the *cancel connection*, but since PQparameterStatus() cannot be used
with a PGcancelConn, I use that of the PGconn. Roughly, in Python:

encoding = conn.parameter_status(b"client_encoding")
# i.e, in C: char *encoding PQparameterStatus(conn, "client_encoding");
cancel_conn = conn.cancel_conn()
# i.e., in C: PGcancelConn *cancelConn = PQcancelConn(conn);
# [... then work with with cancel_conn ...]
if cancel_conn.status == ConnStatus.BAD:
raise OperationalError(cancel_conn.error_message().decode(encoding))

This feels a bit non-atomic to me; isn't there a risk that
client_encoding be changed between PQparameterStatus(conn) and
PQcancelConn(conn) calls?

So maybe PQcancelParameterStatus(PGcancelConn *cancelConn, char *name)
is needed?




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-28 Thread Jelte Fennema
On Tue, 28 Mar 2023 at 16:54, Denis Laxalde  wrote:
> I had a look into your patchset (v16), did a quick review and played a
> bit with the feature.
>
> Patch 2 is missing the documentation about PQcancelSocket() and contains
> a few typos; please find attached a (fixup) patch to correct these.

Thanks applied that patch and attached a new patchset

> Namely, I wonder why it returns a PGcancelConn and what's the
> point of requiring the user to call PQcancelStatus() to see if something
> got wrong. Maybe it could be defined as:
>
>   int PQcancelSend(PGcancelConn *cancelConn);
>
> where the return value would be status? And the user would only need to
> call PQcancelErrorMessage() in case of error. This would leave only one
> single way to create a PGcancelConn value (i.e. PQcancelConn()), which
> seems less confusing to me.

To clarify what you mean, the API would then be like this:
PGcancelConn cancelConn = PQcancelConn(conn);
if (PQcancelSend(cancelConn) == CONNECTION_BAD) {
   printf("ERROR %s\n", PQcancelErrorMessage(cancelConn))
   exit(1)
}

Instead of:
PGcancelConn cancelConn = PQcancelSend(conn);
if (PQcancelStatus(cancelConn) == CONNECTION_BAD) {
   printf("ERROR %s\n", PQcancelErrorMessage(cancelConn))
   exit(1)
}

Those are so similar, that I have no preference either way. If more
people prefer one over the other I'm happy to change it, but for now
I'll keep it as is.

> As part of my testing, I've implemented non-blocking cancellation in
> Psycopg, based on v16 on this patchset. Overall this worked fine and
> seems useful; if you want to try it:
>
>   https://github.com/dlax/psycopg3/tree/pg16/non-blocking-pqcancel

That's great to hear! I'll try to take a closer look at that change tomorrow.

> (The only thing I found slightly inconvenient is the need to convey the
> connection encoding (from PGconn) when handling error message from the
> PGcancelConn.)

Could you expand a bit more on this? And if you have any idea on how
to improve the API with regards to this?


v17-0002-Copy-and-store-addrinfo-in-libpq-owned-private-m.patch
Description: Binary data


v17-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v17-0003-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v17-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v17-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-28 Thread Denis Laxalde
Hi Jelte,

I had a look into your patchset (v16), did a quick review and played a
bit with the feature.

Patch 2 is missing the documentation about PQcancelSocket() and contains
a few typos; please find attached a (fixup) patch to correct these.


--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -321,16 +328,28 @@ extern PostgresPollingStatusType PQresetPoll(PGconn 
*conn);
 /* Synchronous (blocking) */
 extern void PQreset(PGconn *conn);
 
+/* issue a cancel request */
+extern PGcancelConn * PQcancelSend(PGconn *conn);
[...]

Maybe I'm missing something, but this function above seems a bit
strange. Namely, I wonder why it returns a PGcancelConn and what's the
point of requiring the user to call PQcancelStatus() to see if something
got wrong. Maybe it could be defined as:

  int PQcancelSend(PGcancelConn *cancelConn);

where the return value would be status? And the user would only need to
call PQcancelErrorMessage() in case of error. This would leave only one
single way to create a PGcancelConn value (i.e. PQcancelConn()), which
seems less confusing to me.

Jelte Fennema wrote:
> Especially since I ran into another use case that I would want to use
> this patch for recently: Adding an async cancel function to Python
> it's psycopg3 library. This library exposes both a Connection class
> and an AsyncConnection class (using python its asyncio feature). But
> one downside of the AsyncConnection type is that it doesn't have a
> cancel method.

As part of my testing, I've implemented non-blocking cancellation in
Psycopg, based on v16 on this patchset. Overall this worked fine and
seems useful; if you want to try it:

  https://github.com/dlax/psycopg3/tree/pg16/non-blocking-pqcancel

(The only thing I found slightly inconvenient is the need to convey the
connection encoding (from PGconn) when handling error message from the
PGcancelConn.)

Cheers,
Denis
>From a5f9cc680ffa520b05fe34b7cac5df2e60a6d4ad Mon Sep 17 00:00:00 2001
From: Denis Laxalde 
Date: Tue, 28 Mar 2023 16:06:42 +0200
Subject: [PATCH] fixup! Add non-blocking version of PQcancel

---
 doc/src/sgml/libpq.sgml  | 16 +++-
 src/interfaces/libpq/fe-connect.c|  2 +-
 src/test/modules/libpq_pipeline/libpq_pipeline.c |  2 +-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 29f08a4317..aa404c4d15 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -5795,7 +5795,7 @@ PGcancelConn *PQcancelSend(PGconn *conn);
options as the the original PGconn. So when the
original connection is encrypted (using TLS or GSS), the connection for
the cancel request is encrypted in the same way. Any connection options
-   that only are only used during authentication or after authentication of
+   that are only used during authentication or after authentication of
the client are ignored though, because cancellation requests do not
require authentication and the connection is closed right after the
cancellation request is submitted.
@@ -5912,6 +5912,20 @@ ConnStatusType PQcancelStatus(const PGcancelConn *conn);
  
 
 
+
+ PQcancelSocketPQcancelSocket
+
+ 
+  
+   A version of  that can be used for
+   cancellation connections.
+
+int PQcancelSocket(PGcancelConn *conn);
+
+  
+ 
+
+
 
  PQcancelPollPQcancelPoll
 
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 74e337fddf..16af7303d4 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -808,7 +808,7 @@ PQcancelConn(PGconn *conn)
 	/*
 	 * Cancel requests should not iterate over all possible hosts. The request
 	 * needs to be sent to the exact host and address that the original
-	 * connection used. So we we manually create the host and address arrays
+	 * connection used. So we manually create the host and address arrays
 	 * with a single element after freeing the host array that we generated
 	 * from the connection options.
 	 */
diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c b/src/test/modules/libpq_pipeline/libpq_pipeline.c
index e8e904892c..6764ab513b 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -89,7 +89,7 @@ pg_fatal_impl(int line, const char *fmt,...)
 /*
  * Check that the query on the given connection got cancelled.
  *
- * This is a function wrapped in a macrco to make the reported line number
+ * This is a function wrapped in a macro to make the reported line number
  * in an error match the line number of the invocation.
  */
 #define confirm_query_cancelled(conn) confirm_query_cancelled_impl(__LINE__, conn)
-- 
2.30.2



Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-22 Thread Jelte Fennema
Rebased after conflicts with bfc9497ece01c7c45437bc36387cb1ebe346f4d2

Also included the fix for feedback from Daniel on patch 2, which he
had shared in the load balancing thread.

On Wed, 15 Mar 2023 at 09:49, Jelte Fennema  wrote:
>
> The rebase was indeed trivial (git handled everything automatically),
> because my first patch was doing a superset of the changes that were
> committed in b6dfee28f. Attached are the new patches.
>
> On Tue, 14 Mar 2023 at 19:04, Greg Stark  wrote:
> >
> > On Tue, 14 Mar 2023 at 13:59, Tom Lane  wrote:
> > >
> > > "Gregory Stark (as CFM)"  writes:
> > > > It looks like this needs a big rebase in fea-uth.c fe-auth-scram.c and
> > > > fe-connect.c. Every hunk is failing which perhaps means the code
> > > > you're patching has been moved or refactored?
> > >
> > > The cfbot is giving up after
> > > v14-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch fails,
> > > but that's been superseded (at least in part) by b6dfee28f.
> >
> > Ah, same with Jelte Fennema's patch for load balancing in libpq.
> >
> > --
> > greg


v16-0003-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v16-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v16-0002-Refactor-libpq-to-store-addrinfo-in-a-libpq-owne.patch
Description: Binary data


v16-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: Binary data


v16-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-15 Thread Jelte Fennema
The rebase was indeed trivial (git handled everything automatically),
because my first patch was doing a superset of the changes that were
committed in b6dfee28f. Attached are the new patches.

On Tue, 14 Mar 2023 at 19:04, Greg Stark  wrote:
>
> On Tue, 14 Mar 2023 at 13:59, Tom Lane  wrote:
> >
> > "Gregory Stark (as CFM)"  writes:
> > > It looks like this needs a big rebase in fea-uth.c fe-auth-scram.c and
> > > fe-connect.c. Every hunk is failing which perhaps means the code
> > > you're patching has been moved or refactored?
> >
> > The cfbot is giving up after
> > v14-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch fails,
> > but that's been superseded (at least in part) by b6dfee28f.
>
> Ah, same with Jelte Fennema's patch for load balancing in libpq.
>
> --
> greg


v15-0003-Return-2-from-pqReadData-on-EOF.patch
Description: Binary data


v15-0002-Refactor-libpq-to-store-addrinfo-in-a-libpq-owne.patch
Description: Binary data


v15-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: Binary data


v15-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v15-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-14 Thread Greg Stark
On Tue, 14 Mar 2023 at 13:59, Tom Lane  wrote:
>
> "Gregory Stark (as CFM)"  writes:
> > It looks like this needs a big rebase in fea-uth.c fe-auth-scram.c and
> > fe-connect.c. Every hunk is failing which perhaps means the code
> > you're patching has been moved or refactored?
>
> The cfbot is giving up after
> v14-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch fails,
> but that's been superseded (at least in part) by b6dfee28f.

Ah, same with Jelte Fennema's patch for load balancing in libpq.

-- 
greg




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-14 Thread Tom Lane
"Gregory Stark (as CFM)"  writes:
> It looks like this needs a big rebase in fea-uth.c fe-auth-scram.c and
> fe-connect.c. Every hunk is failing which perhaps means the code
> you're patching has been moved or refactored?

The cfbot is giving up after
v14-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch fails,
but that's been superseded (at least in part) by b6dfee28f.

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-14 Thread Gregory Stark (as CFM)
It looks like this needs a big rebase in fea-uth.c fe-auth-scram.c and
fe-connect.c. Every hunk is failing which perhaps means the code
you're patching has been moved or refactored?




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-01 Thread Gregory Stark (as CFM)
On Wed, 1 Mar 2023 at 14:48, Jelte Fennema  wrote:

> > > This looks like it needs a rebase.
>
> done

Great. Please update the CF entry to Needs Review or Ready for
Committer as appropriate :)

-- 
Gregory Stark
As Commitfest Manager




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-01 Thread Greg S
On Tue, 28 Feb 2023 at 15:59, Gregory Stark  wrote:
>
> This looks like it needs a rebase.

So I'm updating the patch to Waiting on Author




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-03-01 Thread Gregory Stark
This looks like it needs a rebase.

=== Applying patches on top of PostgreSQL commit ID
71a75626d5271f2bcdbdc43b8c13065c4634fd9f ===
=== applying patch ./v11-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
patching file src/interfaces/libpq/fe-auth-scram.c
patching file src/interfaces/libpq/fe-auth.c
patching file src/interfaces/libpq/fe-connect.c
Hunk #35 FAILED at 3216.
Hunk #36 succeeded at 3732 (offset 27 lines).
Hunk #37 succeeded at 3782 (offset 27 lines).
Hunk #38 succeeded at 3795 (offset 27 lines).
Hunk #39 succeeded at 7175 (offset 27 lines).
1 out of 39 hunks FAILED -- saving rejects to file
src/interfaces/libpq/fe-connect.c.rej
patching file src/interfaces/libpq/fe-exec.c
patching file src/interfaces/libpq/fe-lobj.c
patching file src/interfaces/libpq/fe-misc.c
patching file src/interfaces/libpq/fe-protocol3.c
patching file src/interfaces/libpq/fe-secure-common.c
patching file src/interfaces/libpq/fe-secure-gssapi.c
Hunk #3 succeeded at 590 (offset 2 lines).
patching file src/interfaces/libpq/fe-secure-openssl.c
Hunk #3 succeeded at 415 (offset 5 lines).
Hunk #4 succeeded at 967 (offset 5 lines).
Hunk #5 succeeded at 993 (offset 5 lines).
Hunk #6 succeeded at 1037 (offset 5 lines).
Hunk #7 succeeded at 1089 (offset 5 lines).
Hunk #8 succeeded at 1122 (offset 5 lines).
Hunk #9 succeeded at 1140 (offset 5 lines).
Hunk #10 succeeded at 1239 (offset 5 lines).
Hunk #11 succeeded at 1250 (offset 5 lines).
Hunk #12 succeeded at 1265 (offset 5 lines).
Hunk #13 succeeded at 1278 (offset 5 lines).
Hunk #14 succeeded at 1315 (offset 5 lines).
Hunk #15 succeeded at 1326 (offset 5 lines).
Hunk #16 succeeded at 1383 (offset 5 lines).
Hunk #17 succeeded at 1399 (offset 5 lines).
Hunk #18 succeeded at 1452 (offset 5 lines).
Hunk #19 succeeded at 1494 (offset 5 lines).
patching file src/interfaces/libpq/fe-secure.c
patching file src/interfaces/libpq/libpq-int.h




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-01-19 Thread Jelte Fennema
Is there anything that is currently blocking this patch? I'd quite
like it to get into PG16.

Especially since I ran into another use case that I would want to use
this patch for recently: Adding an async cancel function to Python
it's psycopg3 library. This library exposes both a Connection class
and an AsyncConnection class (using python its asyncio feature). But
one downside of the AsyncConnection type is that it doesn't have a
cancel method.

I ran into this while changing the PgBouncer tests to use python. And
the cancellation tests were the only tests that required me to use a
ThreadPoolExecutor instead of simply being able to use async-await
style programming:
https://github.com/pgbouncer/pgbouncer/blob/master/test/test_cancel.py#LL9C17-L9C17




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2022-11-30 Thread Jelte Fennema
> This version of the patch no longer applies, a rebased version is needed.

Attached is a patch that applies cleanly again and is also changed
to use the recently introduced libpq_append_conn_error.

I also attached a patch that runs pgindent after the introduction of
libpq_append_conn_error. I noticed that this hadn't happened when
trying to run pgindent on my own changes.

v9-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: v9-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch


v9-0002-Add-non-blocking-version-of-PQcancel.patch
Description: v9-0002-Add-non-blocking-version-of-PQcancel.patch


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2022-09-14 Thread Tom Lane
Jelte Fennema  writes:
> [ non-blocking PQcancel ]

I pushed the 0001 patch (libpq_pipeline documentation) with a bit
of further wordsmithing.

As for 0002, I'm not sure that's anywhere near ready.  I doubt it's
a great idea to un-deprecate PQrequestCancel with a major change
in its behavior.  If there is anybody out there still using it,
they're not likely to appreciate that.  Let's leave that alone and
pick some other name.

I'm also finding the entire design of PQrequestCancelStart etc to
be horribly confusing --- it's not *bad* necessarily, but the chosen
function names are seriously misleading.  PQrequestCancelStart doesn't
actually "start" anything, so the apparent parallel with PQconnectStart
is just wrong.  It's also fairly unclear what the state of a cancel
PQconn is after the request cycle is completed, and whether you can
re-use it (especially after a failed request), and whether you have
to dispose of it separately.

On the whole it feels like a mistake to have two separate kinds of
PGconn with fundamentally different behaviors and yet no distinction
in the API.  I think I'd recommend having a separate struct type
(which might internally contain little more than a pointer to a
cloned PGconn), and provide only a limited set of operations on it.
Seems like create, start/continue cancel request, destroy, and
fetch error message ought to be enough.  I don't see a reason why we
need to support all of libpq's inquiry operations on such objects ---
for instance, if you want to know which host is involved, you could
perfectly well query the parent PGconn.  Nor do I want to run around
and add code to every single libpq entry point to make it reject cancel
PGconns if it can't support them, but we'd have to do so if there's
just one struct type.

I'm not seeing the use-case for PQconnectComplete.  If you want
a non-blocking cancel request, why would you then use a blocking
operation to complete the request?  Seems like it'd be better
to have just a monolithic cancel function for those who don't
need non-blocking.

This change:

--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -59,12 +59,15 @@ typedef enum
 {
CONNECTION_OK,
CONNECTION_BAD,
+   CONNECTION_CANCEL_FINISHED,
/* Non-blocking mode only below here */

is an absolute non-starter: it breaks ABI for every libpq client,
even ones that aren't using this facility.  Why do we need a new
ConnStatusType value anyway?  Seems like PostgresPollingStatusType
covers what we need: once you reach PGRES_POLLING_OK, the cancel
request is done.

The test case is still not very bulletproof on slow machines,
as it seems to be assuming that 30 seconds == forever.  It
would be all right to use $PostgreSQL::Test::Utils::timeout_default,
but I'm not sure that that's easily retrievable by C code.
Maybe make the TAP test pass it in with another optional switch
to libpq_pipeline?  Alternatively, we could teach libpq_pipeline
to do getenv("PG_TEST_TIMEOUT_DEFAULT") with a fallback to 180,
but that feels like it might be overly familiar with the innards
of Utils.pm.

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2022-06-27 Thread Jelte Fennema
(resent because it was blocked from the mailing-list due to inclusion of a 
blocked email address in the To line)

From: Andres Freund 
> On 2022-04-04 15:21:54 +, Jelte Fennema wrote:
> > 2. Added some extra sleeps to the cancellation test, to remove random
> > failures on FreeBSD.
> 
> That's extremely extremely rarely the solution to address test reliability
> issues. It'll fail when running test under valgrind etc.
> 
> Why do you need sleeps / can you find another way to make the test reliable?

The problem they are solving is racy behaviour between sending the query
and sending the cancellation. If the cancellation is handled before the query
is started, then the query doesn't get cancelled. To solve this problem I used
the sleeps to wait a bit before sending the cancelation request.

When I wrote this, I couldn't think of a better way to do it then with sleeps.
But I didn't like it either (and I still don't). These emails made me start to 
think
again, about other ways of solving the problem. I think I've found another 
solution (see attached patch). The way I solve it now is by using another 
connection to check the state of the first one.

Jelte

0001-Add-documentation-for-libpq_pipeline-tests.patch
Description: 0001-Add-documentation-for-libpq_pipeline-tests.patch


0002-Add-non-blocking-version-of-PQcancel.patch
Description: 0002-Add-non-blocking-version-of-PQcancel.patch


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2022-03-28 Thread Jelte Fennema
Thanks for all the feedback everyone. I'll try to send a new patch
later this week that includes user facing docs and a simplified API.
For now a few responses:

> Yeah.  We need to choose a name for these new function(s) that is
> sufficiently different from "PQcancel" that people won't expect them
> to behave exactly the same as that does.  I lack any good ideas about
> that, how about you?

So I guess the names I proposed were not great, since everyone seems to be 
falling over them. 
But I'd like to make my intention clear with the current naming. After this 
patch there would be 
four different APIs for starting a cancelation:
1. PQrequestCancel: deprecated+old, not signal-safe function for requesting 
query cancellation, only uses a specific set of connection options
2. PQcancel: Cancel queries in a signal safe way, to be signal-safe it only 
uses a limited set of connection options
3. PQcancelConnect: Cancel queries in a non-signal safe way that uses all 
connection options
4. PQcancelConnectStart: Cancel queries in a non-signal safe and non-blocking 
way that uses all connection options

So the idea was that you should not look at PQcancelConnectStart as the 
non-blocking
version of PQcancel, but as the non-blocking version of PQcancelConnect. I'll 
try to 
think of some different names too, but IMHO these names could be acceptable
when their differences are addressed sufficiently in the documentation. 

One other approach to naming that comes to mind now is repurposing 
PQrequestCancel:
1. PQrequestCancel: Cancel queries in a non-signal safe way that uses all 
connection options
2. PQrequestCancelStart: Cancel queries in a non-signal safe and non-blocking 
way that uses all connection options
3. PQcancel: Cancel queries in a signal safe way, to be signal-safe it only 
uses a limited set of connection options

> I think it's probably a
> bad sign that this function is tinkering with logic in this sort of
> low-level function anyway. pqReadData() is a really general function
> that manages to work with non-blocking I/O already, so why does
> non-blocking query cancellation need to change its return values, or
> whether or not it drops data in certain cases?

The reason for this low level change is that the cancellation part of the
Postgres protocol is following a different, much more simplistic design 
than all the other parts. The client does not expect a response message back 
from the server after sending the cancellation request. The expectation 
is that the server signals completion by closing the connection, i.e. sending 
EOF. 
For all other parts of the protocol, connection termination should be initiated
client side by sending a Terminate message. So the server closing (sending
EOF) is always unexpected and is thus currently considered an error by 
pqReadData.

But since this is not the case for the cancellation protocol, the result is
changed to -2 in case of EOF to make it possible to distinguish between
an EOF and an actual error.

> And it updates that function to return -2 when the is
> ECONNRESET, which seems to fly in the face of the comment's idea that
> this is the "clean connection closure" case. 

The diff sadly does not include the very relevant comment right above these
lines. Pasting the whole case statement here to clear up this confusion:

case SSL_ERROR_ZERO_RETURN:

/*
 * Per OpenSSL documentation, this error code is only returned for
 * a clean connection closure, so we should not report it as a
 * server crash.
 */
appendPQExpBufferStr(>errorMessage,
 libpq_gettext("SSL connection 
has been closed unexpectedly\n"));
result_errno = ECONNRESET;
n = -2;
break;


> For example, it updates the
> comment for pqsecure_read() to say "Returns -1 in case of failures,
> except in the case of clean connection closure then it returns -2."
> But that function calls any of three different implementation
> functions depending on the situation and the patch only updates one of
> them. 

That comment is indeed not describing what is happening correctly and I'll 
try to make it clearer. The main reason for it being incorrect is coming from 
the fact that receiving EOFs is handled in different places based on the 
encryption method:

1. Unencrypted TCP: EOF is not returned as an error by pqsecure_read, but 
detected by pqReadData (see comments related to definitelyEOF)
2. OpenSSL: EOF is returned as an error by pqsecure_read (see copied case 
statement above)
3. GSS: When writing the patch I was not sure how EOF handling worked here, but 
given that the tests passed for Jacob on GSS, I'm guessing it works the same as 
unencrypted TCP.