Re: libpq async duplicate error results

2022-05-06 Thread Tom Lane
I wrote:
> What is happening is that this bit in PQsendQueryStart is deciding
> not to clear the message buffer because conn->cmd_queue_head isn't
> NULL:

> /*
>  * If this is the beginning of a query cycle, reset the error state.
>  * However, in pipeline mode with something already queued, the error
>  * buffer belongs to that command and we shouldn't clear it.
>  */
> if (newQuery && conn->cmd_queue_head == NULL)
> pqClearConnErrorState(conn);

> So apparently the fault is with the pipelined-query logic.  I'm
> not sure what cleanup step is missing though.

I experimented with the attached patch, which is based on the idea
that clearing the command queue is being done in entirely the wrong
place.  It's more appropriate to do it where we drop unsent output
data, ie in pqDropConnection not pqDropServerData.  The fact that
it was jammed into the latter without any commentary doesn't do much
to convince me that that placement was thought about carefully.

This fixes the complained-of symptom and still passes check-world.
However, I wonder whether cutting the queue down to completely empty
is the right thing.  Why is the queue set up like this in the first
place --- that is, why don't we remove an entry the moment the
command is sent?  I also wonder whether pipelineStatus ought to
get reset here.  We aren't resetting asyncStatus here, so maybe it's
fine, but I'm not quite sure.

regards, tom lane

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 4c12f1411f..6e936bbff3 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -377,6 +377,7 @@ static int	connectDBStart(PGconn *conn);
 static int	connectDBComplete(PGconn *conn);
 static PGPing internal_ping(PGconn *conn);
 static PGconn *makeEmptyPGconn(void);
+static void pqFreeCommandQueue(PGcmdQueueEntry *queue);
 static bool fillPGconn(PGconn *conn, PQconninfoOption *connOptions);
 static void freePGconn(PGconn *conn);
 static void closePGconn(PGconn *conn);
@@ -462,6 +463,12 @@ pqDropConnection(PGconn *conn, bool flushInput)
 	/* Always discard any unsent data */
 	conn->outCount = 0;
 
+	/* Likewise, discard any pending pipelined commands */
+	pqFreeCommandQueue(conn->cmd_queue_head);
+	conn->cmd_queue_head = conn->cmd_queue_tail = NULL;
+	pqFreeCommandQueue(conn->cmd_queue_recycle);
+	conn->cmd_queue_recycle = NULL;
+
 	/* Free authentication/encryption state */
 #ifdef ENABLE_GSS
 	{
@@ -569,12 +576,6 @@ pqDropServerData(PGconn *conn)
 	}
 	conn->notifyHead = conn->notifyTail = NULL;
 
-	pqFreeCommandQueue(conn->cmd_queue_head);
-	conn->cmd_queue_head = conn->cmd_queue_tail = NULL;
-
-	pqFreeCommandQueue(conn->cmd_queue_recycle);
-	conn->cmd_queue_recycle = NULL;
-
 	/* Reset ParameterStatus data, as well as variables deduced from it */
 	pstatus = conn->pstatus;
 	while (pstatus != NULL)


Re: libpq async duplicate error results

2022-05-06 Thread Tom Lane
Peter Eisentraut  writes:
> I took another look at this.  The output from the test program at the 
> beginning of the thread is now (master branch):
> ...
> command = SELECT 'after';
> PQsendQuery() error: FATAL:  terminating connection due to administrator 
> command
> server closed the connection unexpectedly
>   This probably means the server terminated abnormally
>   before or while processing the request.
> no connection to the server

> It appears the "after" query is getting the error message from the 
> previous cycle somehow.

What is happening is that this bit in PQsendQueryStart is deciding
not to clear the message buffer because conn->cmd_queue_head isn't
NULL:

/*
 * If this is the beginning of a query cycle, reset the error state.
 * However, in pipeline mode with something already queued, the error
 * buffer belongs to that command and we shouldn't clear it.
 */
if (newQuery && conn->cmd_queue_head == NULL)
pqClearConnErrorState(conn);

So apparently the fault is with the pipelined-query logic.  I'm
not sure what cleanup step is missing though.

regards, tom lane




Re: libpq async duplicate error results

2022-05-06 Thread Peter Eisentraut
I took another look at this.  The output from the test program at the 
beginning of the thread is now (master branch):


command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: FATAL:  terminating connection due to administrator 
command

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
no connection to the server


It appears the "after" query is getting the error message from the 
previous cycle somehow.



The output in PG14 and PG13 is:

command = SELECT 'after';
PQsendQuery() error: no connection to the server


Is the change intended or do we need to think about more tweaking?




Re: libpq async duplicate error results

2022-02-24 Thread Tom Lane
Actually ... it now seems to me that there's live bugs in the
interaction between pipeline mode and error-buffer clearing.
Namely, that places like PQsendQueryStart will unconditionally
clear the buffer even though we may be in the middle of a pipeline
sequence, and the buffer might hold an error that's yet to be
reported to the application.  So I think we need something
more like the attached.

regards, tom lane

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 45dddaf556..0c39bc9abf 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1380,10 +1380,7 @@ pqAppendCmdQueueEntry(PGconn *conn, PGcmdQueueEntry *entry)
 			 * state, we don't have to do anything.
 			 */
 			if (conn->asyncStatus == PGASYNC_IDLE)
-			{
-pqClearConnErrorState(conn);
 pqPipelineProcessQueue(conn);
-			}
 			break;
 	}
 }
@@ -1730,8 +1727,10 @@ PQsendQueryStart(PGconn *conn, bool newQuery)
 
 	/*
 	 * If this is the beginning of a query cycle, reset the error state.
+	 * However, in pipeline mode with something already queued, the error
+	 * buffer belongs to that command and we shouldn't clear it.
 	 */
-	if (newQuery)
+	if (newQuery && conn->cmd_queue_head == NULL)
 		pqClearConnErrorState(conn);
 
 	/* Don't try to send if we know there's no live connection. */
@@ -2149,11 +2148,8 @@ PQgetResult(PGconn *conn)
 /*
  * We're about to return the NULL that terminates the round of
  * results from the current query; prepare to send the results
- * of the next query when we're called next.  Also, since this
- * is the start of the results of the next query, clear any
- * prior error message.
+ * of the next query when we're called next.
  */
-pqClearConnErrorState(conn);
 pqPipelineProcessQueue(conn);
 			}
 			break;
@@ -2362,6 +2358,14 @@ PQexecStart(PGconn *conn)
 	if (!conn)
 		return false;
 
+	/*
+	 * Since this is the beginning of a query cycle, reset the error state.
+	 * However, in pipeline mode with something already queued, the error
+	 * buffer belongs to that command and we shouldn't clear it.
+	 */
+	if (conn->cmd_queue_head == NULL)
+		pqClearConnErrorState(conn);
+
 	if (conn->pipelineStatus != PQ_PIPELINE_OFF)
 	{
 		appendPQExpBufferStr(>errorMessage,
@@ -2369,11 +2373,6 @@ PQexecStart(PGconn *conn)
 		return false;
 	}
 
-	/*
-	 * Since this is the beginning of a query cycle, reset the error state.
-	 */
-	pqClearConnErrorState(conn);
-
 	/*
 	 * Silently discard any prior query result that application didn't eat.
 	 * This is probably poor design, but it's here for backward compatibility.
@@ -2928,8 +2927,11 @@ PQfn(PGconn *conn,
 
 	/*
 	 * Since this is the beginning of a query cycle, reset the error state.
+	 * However, in pipeline mode with something already queued, the error
+	 * buffer belongs to that command and we shouldn't clear it.
 	 */
-	pqClearConnErrorState(conn);
+	if (conn->cmd_queue_head == NULL)
+		pqClearConnErrorState(conn);
 
 	if (conn->pipelineStatus != PQ_PIPELINE_OFF)
 	{
@@ -3099,6 +3101,12 @@ pqPipelineProcessQueue(PGconn *conn)
 		conn->cmd_queue_head == NULL)
 		return;
 
+	/*
+	 * Reset the error state.  This and the next couple of steps correspond to
+	 * what PQsendQueryStart didn't do for this query.
+	 */
+	pqClearConnErrorState(conn);
+
 	/* Initialize async result-accumulation state */
 	pqClearAsyncResult(conn);
 
@@ -3809,9 +3817,11 @@ PQsetnonblocking(PGconn *conn, int arg)
 	 * behavior. this is ok because either they are making a transition _from_
 	 * or _to_ blocking mode, either way we can block them.
 	 *
-	 * Clear error state in case pqFlush adds to it.
+	 * Clear error state in case pqFlush adds to it, unless we're actively
+	 * pipelining, in which case it seems best not to.
 	 */
-	pqClearConnErrorState(conn);
+	if (conn->cmd_queue_head == NULL)
+		pqClearConnErrorState(conn);
 
 	/* if we are going from blocking to non-blocking flush here */
 	if (pqFlush(conn))
@@ -4003,7 +4013,8 @@ PQescapeStringConn(PGconn *conn,
 		return 0;
 	}
 
-	pqClearConnErrorState(conn);
+	if (conn->cmd_queue_head == NULL)
+		pqClearConnErrorState(conn);
 
 	return PQescapeStringInternal(conn, to, from, length, error,
   conn->client_encoding,
@@ -4041,7 +4052,8 @@ PQescapeInternal(PGconn *conn, const char *str, size_t len, bool as_ident)
 	if (!conn)
 		return NULL;
 
-	pqClearConnErrorState(conn);
+	if (conn->cmd_queue_head == NULL)
+		pqClearConnErrorState(conn);
 
 	/* Scan the string for characters that must be escaped. */
 	for (s = str; (s - str) < len && *s != '\0'; ++s)
@@ -4306,7 +4318,8 @@ PQescapeByteaConn(PGconn *conn,
 	if (!conn)
 		return NULL;
 
-	pqClearConnErrorState(conn);
+	if (conn->cmd_queue_head == NULL)
+		pqClearConnErrorState(conn);
 
 	return PQescapeByteaInternal(conn, from, from_length, to_length,
  conn->std_strings,


Re: libpq async duplicate error results

2022-02-23 Thread Tom Lane
I wrote:
> However ... in the wake of 618c16707, I wonder if we should consider
> an alternative definition, which is to NOT clear errorMessage when
> starting a new pipelined query.  (That would be this patch minus
> the addition to pqPipelineProcessQueue.)  Thanks to 618c16707,
> the returned error PGresult(s) should bear exactly the same contents
> either way.  What would change is that PQerrorMessage(conn) would return
> the concatenation of all errors that have occurred during the current
> pipeline sequence.  I think that's arguably a more useful definition
> than what we have now --- though obviously it'd require a docs change.
> It seems to fit with the spirit of the v14 changes to ensure that
> PQerrorMessage tells you about everything that happened during a
> failed connection attempt, not only the last thing.

After studying the pipeline mode some more, I no longer think that's
a great idea.  ISTM we want pipeline mode to act as nearly as possible
the same as issuing the same series of queries non-pipelined.   But
this redefinition would make it different.  So, barring objection,
I'll push that cleanup patch as-posted.

regards, tom lane




Re: libpq async duplicate error results

2022-02-20 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Feb-07, Tom Lane wrote:
>> I see that PQgetResult does a hacky "resetPQExpBuffer(>errorMessage)"
>> in between pipelined queries.  It seems like if there are real usability
>> issues in this area, they probably come from that not being well placed.
>> In particular, I wonder why that's done with the pqPipelineProcessQueue
>> call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
>> call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
>> result).  It kinda looks to me like maybe pqPipelineProcessQueue
>> ought to have the responsibility for doing that, rather than having
>> two out of the three call sites do it.  Also it would seem more natural
>> to associate it with the reinitialization that happens inside
>> pqPipelineProcessQueue.

> Yeah, I agree that the placement of error message reset in pipelined
> libpq is more ad-hoc to the testing I was doing than carefully
> principled.  I didn't test changing it, but on a quick look your
> proposed change seems reasonable to me.

Here's a patch to do that.  It passes check-world (particularly
src/test/modules/libpq_pipeline), for whatever that's worth.

However ... in the wake of 618c16707, I wonder if we should consider
an alternative definition, which is to NOT clear errorMessage when
starting a new pipelined query.  (That would be this patch minus
the addition to pqPipelineProcessQueue.)  Thanks to 618c16707,
the returned error PGresult(s) should bear exactly the same contents
either way.  What would change is that PQerrorMessage(conn) would return
the concatenation of all errors that have occurred during the current
pipeline sequence.  I think that's arguably a more useful definition
than what we have now --- though obviously it'd require a docs change.
It seems to fit with the spirit of the v14 changes to ensure that
PQerrorMessage tells you about everything that happened during a
failed connection attempt, not only the last thing.

I've tested that variant, and it also passes check-world, which may
say something about libpq_pipeline's coverage of error cases ...
I didn't look closely.

regards, tom lane

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 45dddaf556..495d2b4c13 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1380,10 +1380,7 @@ pqAppendCmdQueueEntry(PGconn *conn, PGcmdQueueEntry *entry)
 			 * state, we don't have to do anything.
 			 */
 			if (conn->asyncStatus == PGASYNC_IDLE)
-			{
-pqClearConnErrorState(conn);
 pqPipelineProcessQueue(conn);
-			}
 			break;
 	}
 }
@@ -2149,11 +2146,8 @@ PQgetResult(PGconn *conn)
 /*
  * We're about to return the NULL that terminates the round of
  * results from the current query; prepare to send the results
- * of the next query when we're called next.  Also, since this
- * is the start of the results of the next query, clear any
- * prior error message.
+ * of the next query when we're called next.
  */
-pqClearConnErrorState(conn);
 pqPipelineProcessQueue(conn);
 			}
 			break;
@@ -3099,6 +3093,9 @@ pqPipelineProcessQueue(PGconn *conn)
 		conn->cmd_queue_head == NULL)
 		return;
 
+	/* Reset the error state */
+	pqClearConnErrorState(conn);
+
 	/* Initialize async result-accumulation state */
 	pqClearAsyncResult(conn);
 


Re: libpq async duplicate error results

2022-02-16 Thread Tom Lane
Andres Freund  writes:
> On 2022-02-16 20:28:02 -0500, Tom Lane wrote:
>> Uh ... then we'd have to cast away the const to do free().

> I was just thinking of
> if ((const PGresult *) res == _result)

Oh, I see.  Sure, can do it like that.

regards, tom lane




Re: libpq async duplicate error results

2022-02-16 Thread Andres Freund
Hi,

On 2022-02-16 20:28:02 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-02-16 18:51:37 -0500, Tom Lane wrote:
> >> +  /* Also, do nothing if the argument is OOM_result */
> >> +  if (res == unconstify(PGresult *, _result))
> >> +  return;
>
> > Wouldn't it make more sense to make res const, rather than unconstifying
> > _result?
>
> Uh ... then we'd have to cast away the const to do free().

I was just thinking of

if ((const PGresult *) res == _result)

It's not important, I just find it stylistically nicer (making a pointer const
from an non-const pointer is safe, the other way round not generally).

- Andres




Re: libpq async duplicate error results

2022-02-16 Thread Tom Lane
Andres Freund  writes:
> On 2022-02-16 18:51:37 -0500, Tom Lane wrote:
>> +/* Also, do nothing if the argument is OOM_result */
>> +if (res == unconstify(PGresult *, _result))
>> +return;

> Wouldn't it make more sense to make res const, rather than unconstifying
> _result?

Uh ... then we'd have to cast away the const to do free().

regards, tom lane




Re: libpq async duplicate error results

2022-02-16 Thread Andres Freund
Hi,

On 2022-02-16 18:51:37 -0500, Tom Lane wrote:
> This seems workable, and you'll notice it fixes the duplicated text
> in the test case Andres was worried about.

Cool.

I find it mildly scary that we didn't have any other tests verifying the libpq
side of connection termination. Seems like we we maybe should add a few more?
Some simple cases we can do via isolationtester. But some others would
probably require a C test program to be robust...


> + /* Also, do nothing if the argument is OOM_result */
> + if (res == unconstify(PGresult *, _result))
> + return;

Wouldn't it make more sense to make res const, rather than unconstifying
_result?


Greetings,

Andres Freund




Re: libpq async duplicate error results

2022-02-16 Thread Tom Lane
I wrote:
>>> An idea that I don't have time to pursue right now is to track
>>> how much of conn->errorMessage has been read out by PQgetResult,
>>> and only report newly-added text in later PQgetResult calls of
>>> a series, while PQerrorMessage would continue to report the
>>> whole thing.  Buffer resets would occur where they do now.

> I spent a bit more time poking at this issue.  I no longer like the
> idea of reporting only the tail part of conn->errorMessage; I'm
> afraid that would convert "edge cases sometimes return redundant text"
> into "edge cases sometimes miss returning text at all", which is not
> an improvement.

I figured out what seems like an adequately bulletproof solution to
this problem.  The thing that was scaring me before is that there are
code paths in which we'll discard a pending conn->result PGresult
(which could be an error) and make a new one; it wasn't entirely
clear that we could track which result represents what text.
The solution to that has two components:

1. Never advance the how-much-text-has-been-reported counter until
we are just about to return an error PGresult to the application.
This reduces the risk of missing or duplicated text to what seems
an acceptable level.  I found that pqPrepareAsyncResult can be used
to handle this logic -- it's only called at outer levels, and it
already has some state-updating behavior, so we'd already have bugs
if it were called improperly.

2. Don't make error PGresults at all until we reach pqPrepareAsyncResult.
This saves some cycles in the paths where we previously replaced one
error with another.  Importantly, it also means that we can now have
a bulletproof solution to the problem of how to return an error when
we get OOM on any attempt to make a PGresult.  The attached patch
includes a statically-declared PGresult that we can return in that
case.

It's not very practical to do #2 exactly, because in the case of
an error received from the server, we need to build a PGresult that
includes the broken-down error fields.  But all internally-generated
errors can be handled in this style.

This seems workable, and you'll notice it fixes the duplicated text
in the test case Andres was worried about.

Note that this patch is on top of the event-proc changes I posted
in [1].  Without that change, it's not very clear how to cope
with PGEVT_RESULTCREATE failure after we've already updated the
error status.

regards, tom lane

[1] https://www.postgresql.org/message-id/3390587.1645035110%40sss.pgh.pa.us

diff --git a/contrib/test_decoding/expected/slot_creation_error.out b/contrib/test_decoding/expected/slot_creation_error.out
index 321648c339..043bdae0a2 100644
--- a/contrib/test_decoding/expected/slot_creation_error.out
+++ b/contrib/test_decoding/expected/slot_creation_error.out
@@ -98,7 +98,6 @@ t
 
 step s2_init: <... completed>
 FATAL:  terminating connection due to administrator command
-FATAL:  terminating connection due to administrator command
 server closed the connection unexpectedly
 	This probably means the server terminated abnormally
 	before or while processing the request.
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index f8f4111fef..6fceff561b 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -1237,7 +1237,7 @@ PQencryptPasswordConn(PGconn *conn, const char *passwd, const char *user,
 	if (!conn)
 		return NULL;
 
-	resetPQExpBuffer(>errorMessage);
+	pqClearConnErrorState(conn);
 
 	/* If no algorithm was given, ask the server. */
 	if (algorithm == NULL)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9c9416e8ff..2a3d68b4d1 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3685,7 +3685,7 @@ keep_going:		/* We will come back to here until there is
  * (and it seems some clients expect it to be empty after a
  * successful connection).
  */
-resetPQExpBuffer(>errorMessage);
+pqClearConnErrorState(conn);
 
 /* We are open for business! */
 conn->status = CONNECTION_OK;
@@ -4231,7 +4231,7 @@ closePGconn(PGconn *conn)
 
 	/*
 	 * Close the connection, reset all transient state, flush I/O buffers.
-	 * Note that this includes clearing conn->errorMessage; we're no longer
+	 * Note that this includes clearing conn's error state; we're no longer
 	 * interested in any failures associated with the old connection, and we
 	 * want a clean slate for any new connection attempt.
 	 */
@@ -4241,7 +4241,7 @@ closePGconn(PGconn *conn)
 	conn->xactStatus = PQTRANS_IDLE;
 	conn->pipelineStatus = PQ_PIPELINE_OFF;
 	pqClearAsyncResult(conn);	/* deallocate result */
-	resetPQExpBuffer(>errorMessage);
+	pqClearConnErrorState(conn);
 	release_conn_addrinfo(conn);
 
 	/* Reset all state obtained from server, too */
@@ -5236,7 +5236,7 @@ ldapServiceLookup(const char *purl, PQconninfoOption *options,
  * Returns 0 on success, nonzero on 

Re: libpq async duplicate error results

2022-02-08 Thread Alvaro Herrera
On 2022-Feb-07, Tom Lane wrote:

> In any case, the particular example we're looking at here is connection
> loss, which is not something that should greatly concern us as far as
> pipeline semantics are concerned, because you're not getting any more
> pipelined results anyway if that happens.  In the non-I/O-error case,
> I see that PQgetResult does a hacky "resetPQExpBuffer(>errorMessage)"
> in between pipelined queries.  It seems like if there are real usability
> issues in this area, they probably come from that not being well placed.
> In particular, I wonder why that's done with the pqPipelineProcessQueue
> call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
> call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
> result).  It kinda looks to me like maybe pqPipelineProcessQueue
> ought to have the responsibility for doing that, rather than having
> two out of the three call sites do it.  Also it would seem more natural
> to associate it with the reinitialization that happens inside
> pqPipelineProcessQueue.

Yeah, I agree that the placement of error message reset in pipelined
libpq is more ad-hoc to the testing I was doing than carefully
principled.  I didn't test changing it, but on a quick look your
proposed change seems reasonable to me.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)




Re: libpq async duplicate error results

2022-02-07 Thread Tom Lane
[ cc'ing Alvaro for pipeline questions ]

Fabien COELHO  writes:
>> It is annoying that some of the text is duplicated in the second
>> report, but in the end that's cosmetic, and I'm not sure we can
>> improve it without breaking other things.  In particular, we
>> can't just think about what comes back in the PGgetResult calls,
>> we also have to consider what PQerrorMessage(conn) will report
>> afterwards.  So I don't think that resetting conn->errorMessage
>> during a PQgetResult series would be a good fix.
>> 
>> An idea that I don't have time to pursue right now is to track
>> how much of conn->errorMessage has been read out by PQgetResult,
>> and only report newly-added text in later PQgetResult calls of
>> a series, while PQerrorMessage would continue to report the
>> whole thing.  Buffer resets would occur where they do now.

> I agree that the message reset is not convincing, but it was the only way 
> to get the expected behavior without changing the existing functions.

I spent a bit more time poking at this issue.  I no longer like the
idea of reporting only the tail part of conn->errorMessage; I'm
afraid that would convert "edge cases sometimes return redundant text"
into "edge cases sometimes miss returning text at all", which is not
an improvement.

In any case, the particular example we're looking at here is connection
loss, which is not something that should greatly concern us as far as
pipeline semantics are concerned, because you're not getting any more
pipelined results anyway if that happens.  In the non-I/O-error case,
I see that PQgetResult does a hacky "resetPQExpBuffer(>errorMessage)"
in between pipelined queries.  It seems like if there are real usability
issues in this area, they probably come from that not being well placed.
In particular, I wonder why that's done with the pqPipelineProcessQueue
call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
result).  It kinda looks to me like maybe pqPipelineProcessQueue
ought to have the responsibility for doing that, rather than having
two out of the three call sites do it.  Also it would seem more natural
to associate it with the reinitialization that happens inside
pqPipelineProcessQueue.

regards, tom lane




Re: libpq async duplicate error results

2022-02-05 Thread Fabien COELHO



Hello Tom,


I concur with Fabien's analysis: we report the FATAL message from
the server during the first PQgetResult, and then the second call
discovers that the connection is gone and reports "server closed
the connection unexpectedly".  Those are two independent events
(in libpq's view anyway) so reporting them separately is correct,
or at least necessary unless you want to engage in seriously
major redesign and behavioral changes.

It is annoying that some of the text is duplicated in the second
report, but in the end that's cosmetic, and I'm not sure we can
improve it without breaking other things.  In particular, we
can't just think about what comes back in the PGgetResult calls,
we also have to consider what PQerrorMessage(conn) will report
afterwards.  So I don't think that resetting conn->errorMessage
during a PQgetResult series would be a good fix.

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing.  Buffer resets would occur where they do now.

It'd probably be necessary to make a user-visible PQgetResult
that becomes a wrapper around PQgetResultInternal, because
internal callers such as PQexecFinish will need the old behavior,
or at least some of them will.


I agree that the message reset is not convincing, but it was the only way 
to get the expected behavior without changing the existing functions.


I see two paths:

(1) keep the duplicate message in this corner case.

(2) do the hocus pocus you suggest around PQerrorMessage and PQgetResult 
to work around the duplication, just for this corner case.


I'd tend to choose (1) to keep it simple, even if (2) is feasible.

--
Fabien.




Re: libpq async duplicate error results

2022-02-03 Thread Tom Lane
Peter Eisentraut  writes:
> Tom, you worked on reorganzing the error message handling in libpq in 
> PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f).  Any 
> thoughts on this?

Ah, sorry, I'd not noticed this thread.

I concur with Fabien's analysis: we report the FATAL message from
the server during the first PQgetResult, and then the second call
discovers that the connection is gone and reports "server closed
the connection unexpectedly".  Those are two independent events
(in libpq's view anyway) so reporting them separately is correct,
or at least necessary unless you want to engage in seriously
major redesign and behavioral changes.

It is annoying that some of the text is duplicated in the second
report, but in the end that's cosmetic, and I'm not sure we can
improve it without breaking other things.  In particular, we
can't just think about what comes back in the PGgetResult calls,
we also have to consider what PQerrorMessage(conn) will report
afterwards.  So I don't think that resetting conn->errorMessage
during a PQgetResult series would be a good fix.

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing.  Buffer resets would occur where they do now.

It'd probably be necessary to make a user-visible PQgetResult
that becomes a wrapper around PQgetResultInternal, because
internal callers such as PQexecFinish will need the old behavior,
or at least some of them will.

regards, tom lane




Re: libpq async duplicate error results

2022-02-03 Thread Peter Eisentraut
Tom, you worked on reorganzing the error message handling in libpq in 
PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f).  Any 
thoughts on this?



On 25.01.22 09:32, Peter Eisentraut wrote:


This issue was discovered by Fabien in the SHOW_ALL_RESULTS thread.  I'm 
posting it here separately, because I think it ought to be addressed in 
libpq rather than with a workaround in psql, as proposed over there.


When using PQsendQuery() + PQgetResult() and the server crashes during 
the execution of the command, PQgetResult() then returns two result sets 
with partially duplicated error messages, like this from the attached 
test program:


command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server


This is hidden in normal use because PQexec() throws away all but the 
last result set, but the extra one is still generated internally.


Apparently, this has changed between PG13 and PG14.  In PG13 and 
earlier, the output is


command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

In PG13, PQexec() concatenates all the error messages from multiple 
results, so a user of PQexec() sees the same output before and after. 
But for users of the lower-level APIs, things have become a bit more 
confusing.


Also, why are there multiple results being generated in the first place?


[0]: 
https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo






Re: libpq async duplicate error results

2022-01-27 Thread Peter Eisentraut

On 26.01.22 14:52, Fabien COELHO wrote:



command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"



Also, why are there multiple results being generated in the first place?


My interpretation is that the first message is a close message issued by 
the server before actually severing the connection, and the second 
message is generated by libpq when it notices that the connection has 
been closed, so there is some sense in having to results to report these 
two consecutive errors, and the question might be about when the buffer 
should be reset.


I see.  If we stipulate for a moment that getting two results is what we 
want here, I see a conceptual problem.  Users of PQexec() don't care 
whether the error message they are ultimately getting is a concatenation 
of two messages from two internal results (old behavior) or just an 
appropriately accumulated message from the last internal result (new 
behavior).  But users of PQgetResult() don't have a principled way to 
deal with this anymore.  They can't just ignore one or the other result, 
because they could after all be two separate errors from two separate 
commands.  So the only proper behavior is to show all errors from all 
results.  But that results in this misbehavior because the last error 
message must include the messages from the previous messages in order to 
accommodate PQexec() users.






Re: libpq async duplicate error results

2022-01-26 Thread Fabien COELHO




command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"



Also, why are there multiple results being generated in the first place?


My interpretation is that the first message is a close message issued by 
the server before actually severing the connection, and the second message 
is generated by libpq when it notices that the connection has been closed, 
so there is some sense in having to results to report these two 
consecutive errors, and the question might be about when the buffer should 
be reset.


--
Fabien.




libpq async duplicate error results

2022-01-25 Thread Peter Eisentraut


This issue was discovered by Fabien in the SHOW_ALL_RESULTS thread.  I'm 
posting it here separately, because I think it ought to be addressed in 
libpq rather than with a workaround in psql, as proposed over there.


When using PQsendQuery() + PQgetResult() and the server crashes during 
the execution of the command, PQgetResult() then returns two result sets 
with partially duplicated error messages, like this from the attached 
test program:


command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server


This is hidden in normal use because PQexec() throws away all but the 
last result set, but the extra one is still generated internally.


Apparently, this has changed between PG13 and PG14.  In PG13 and 
earlier, the output is


command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

In PG13, PQexec() concatenates all the error messages from multiple 
results, so a user of PQexec() sees the same output before and after. 
But for users of the lower-level APIs, things have become a bit more 
confusing.


Also, why are there multiple results being generated in the first place?


[0]: 
https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo#include 
#include 

static void
run_query(PGconn *conn, const char *command)
{
int rc;
PGresult *res;
int count = 0;

fprintf(stderr, "command = %s\n", command);
rc = PQsendQuery(conn, command);
if (rc != 1)
{
char *error = PQerrorMessage(conn);

fprintf(stderr, "PQsendQuery() error: %s", error);
return;
}

while ((res = PQgetResult(conn)))
{
ExecStatusType status;

status = PQresultStatus(res);
fprintf(stderr, "result %d status = %s\n", ++count, 
PQresStatus(status));
fprintf(stderr, "error message = \"%s\"\n", 
PQresultErrorMessage(res));
PQclear(res);
}

fprintf(stderr, "\n");
return;
}

int
main()
{
PGconn *conn;

conn = PQconnectdb("");
if (PQstatus(conn) != CONNECTION_OK)
{
fprintf(stderr, "Connection to database failed: %s\n",
PQerrorMessage(conn));
return 1;
}

run_query(conn, "SELECT 'before';");
run_query(conn, "SELECT pg_terminate_backend(pg_backend_pid());");
run_query(conn, "SELECT 'after';");

PQfinish(conn);
return 0;
}