date:20210612

Andrew Dunstan  writes:
> Note, too that the psql and safe_psql methods also pass the query via stdin.

Yeah.  We need all of these to act the same, IMO.  Recall that
the previous patches that introduced the undef hack were changing
callers of those routines, not poll_query_until.

regards, tom lane

Re: recovery test failures on hoverfly



On 6/12/21 5:28 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> I'm a bit dubious about this. It doesn't seem more robust to insist that
>> we pass undef in certain cases.
> True, it'd be nicer if that didn't matter; mainly because people
> will get it wrong in future.


Right, that's what I'm worried about.


>
>> If passing the SQL via stdin is fragile,
>> as we also found to be the case with passing it via the command line,
>> perhaps we should try passing it via a tmp file. Then there would
>> presumably be no SIGPIPE.
> Seems kind of inefficient.  Maybe writing and reading a file would
> be a negligible cost compared to everything else involved, but
> I'm not sure.


Well, in poll_query_until we would of course set up the file outside the
loop. I suspect the cost would in fact be negligible.


Note, too that the psql and safe_psql methods also pass the query via stdin.


>
> Another angle is that the SIGPIPE complaints aren't necessarily
> a bad thing: if psql doesn't read what we send, it's good to
> know about that.  IMO the real problem is that the errors are
> so darn nonrepeatable.  I wonder if there is a way to make them
> more reproducible?
>
>   


I don't know.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Error on pgbench logs

2021-06-12 Thread Fabien COELHO





+   while ((next = agg->start_time + agg_interval * 
INT64CONST(100)) <= now)

I can find the similar code to convert "seconds" to "us" using casting like

end_time = threads[0].create_time + (int64) 100 * duration;

or

next_report = last_report + (int64) 100 * progress;

Is there a reason use INT64CONST instead of (int64)? Do these imply the same 
effect?


I guess that the macros does 100LL or something similar to directly 
create an int64 constant. It is necessary if the constant would overflow a 
usual 32 bits C integer, whereas the cast is sufficient if there is no 
overflow. Maybe I c/should have used the previous approach.



Sorry, if this is a dumb question...


Nope.

--
Fabien.

Re: Race condition in recovery?

On 6/12/21 1:54 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 6/12/21 1:07 PM, Tom Lane wrote:
>>> OK.  But it makes me itch a bit that this one wait-for-wal-to-be-
>>> processed query looks different from all the other ones.
>> I'm happy to bring the other two queries that look like this into line
>> with this one if you like.
> I see a lot more than two --- grepping for poll_query_until with
> a test involving a LSN comparison finds a bunch.  Are we sure that
> there are only three in which the LSN could be null?  

Well, I'm counting places that specifically compare it with
pg_stat_archiver.last_archived_wal.

> How much
> does it really matter if it is?
>
>   

It makes it harder to tell if there was any result at all when there's a
failure. If it bugs you that much I can revert just that line. Now that
I have fixed the immediate issue it matters less. I'm not prepared to
put in a lot of effort here, though.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: recovery test failures on hoverfly

Andrew Dunstan  writes:
> I'm a bit dubious about this. It doesn't seem more robust to insist that
> we pass undef in certain cases.

True, it'd be nicer if that didn't matter; mainly because people
will get it wrong in future.

> If passing the SQL via stdin is fragile,
> as we also found to be the case with passing it via the command line,
> perhaps we should try passing it via a tmp file. Then there would
> presumably be no SIGPIPE.

Seems kind of inefficient.  Maybe writing and reading a file would
be a negligible cost compared to everything else involved, but
I'm not sure.

Another angle is that the SIGPIPE complaints aren't necessarily
a bad thing: if psql doesn't read what we send, it's good to
know about that.  IMO the real problem is that the errors are
so darn nonrepeatable.  I wonder if there is a way to make them
more reproducible?

regards, tom lane

Re: recovery test failures on hoverfly

On 6/12/21 3:15 PM, Tom Lane wrote:
>>> Michael Paquier  writes:
 This is the same problem as c757a3da and 6d41dd0, where we write a
 query to a pipe but the kill, causing a failure, makes the test fail
 with a SIGPIPE in IPC::Run as a query is sent down to a pipe.
> After checking the git logs, I realized that this failure is actually
> new since 11e9caff8: before that, poll_query_until passed the query
> on the command line not stdin, so it wasn't vulnerable to SIGPIPE.
> So that explains why we only recently started to see this.
>
> The fix I proposed seems to work fine in all branches, so I went
> ahead and pushed it.
>
>   

I'm a bit dubious about this. It doesn't seem more robust to insist that
we pass undef in certain cases. If passing the SQL via stdin is fragile,
as we also found to be the case with passing it via the command line,
perhaps we should try passing it via a tmp file. Then there would
presumably be no SIGPIPE.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Character expansion with ICU collations

2021-06-12 Thread Finnerty, Jim

Re: 
>> Can a CI collation be ordered upper case first, or is this a limitation 
of ICU?

> I don't know the authoritative answer to that, but to me it doesn't make
> sense, since the effect of a case-insensitive collation is to throw away
> the third-level weights, so there is nothing left for "upper case first"
> to operate on.

It wouldn't make sense for the ICU sort key of a CI collation itself because 
the sort keys need to be binary equal, but what the collation of interest does 
is equivalent to adding a secondary "C"-collated expression to the ORDER BY 
clause.  For example:

SELECT ... ORDER BY expr COLLATE ci_as;

Is ordered as if the query had been written:

SELECT ... ORDER BY expr COLLATE ci_as, expr COLLATE "C";

Re: 
> tailoring rules
>> yes

It looks like the relevant API call is ucol_openRules(), 
Interface documented here: 
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/ucol_8h.html
example usage from C here: 
https://android.googlesource.com/platform/external/icu/+/db20b09/source/test/cintltst/citertst.c

for example:

/* Test with an expanding character sequence */
u_uastrcpy(rule, "&a < b < c/abd < d");
c2 = ucol_openRules(rule, u_strlen(rule), UCOL_OFF, UCOL_DEFAULT_STRENGTH, 
NULL, &status);

and a reordering rule test:

u_uastrcpy(rule, "&z < AB");
coll = ucol_openRules(rule, u_strlen(rule), UCOL_OFF, 
UCOL_DEFAULT_STRENGTH, NULL, &status);

that looks encouraging.  It returns a UCollator object, like ucol_open(const 
char *localeString, ...), so it's an alternative to ucol_open().  One of the 
parameters is the equivalent of colStrength, so then the question would be, how 
are the other keyword/value pairs like colCaseFirst, colAlternate, etc. 
specified via the rules argument?  In the same way with the exception of 
colStrength?

e.g. is "colAlternate=shifted;&z < AB" a valid rules string?

The ICU documentation says simply:

" rules A string describing the collation rules. For the syntax of the rules 
please see users guide."

Transform rules are documented here: 
http://userguide.icu-project.org/transforms/general/rules

But there are no examples of using the keyword/value pairs that may appear in a 
locale string with the transform rules, and there's no locale argument on 
ucol_openRules.  How can the keyword/value pairs that may appear in the locale 
string be applied in combination with tailoring rules (with the exception of 
colStrength)?

Re: Signed vs Unsigned (take 2) (src/backend/storage/ipc/procarray.c)

2021-06-12 Thread Andres Freund

Hi,

On 2021-06-12 10:55:22 -0300, Ranier Vilela wrote:
> With the recent changes at procarray.c, I take a look in.
> msvc compiler, has some warnings about signed vs unsigned.

> 1. Size_t is weird, because all types are int.

Not sure why I ended up using size_t here. There are cases where using a
natively sized integer can lead to better code being generated, so I'd
want to see some evaluation of the code generation effects.

> 2. Wouldn't it be better to initialize static variables?

No, explicit initialization needs additional space in the binary, and
static variables are always zero initialized.

> 3. There are some shadowing parameters.

Hm, yea, that's not great. Those are from

commit 0e141c0fbb211bdd23783afa731e3eef95c9ad7a
Author: Robert Haas 
Date:   2015-08-06 11:52:51 -0400

Reduce ProcArrayLock contention by removing backends in batches.

Amit, Robert, I assume you don't mind changing this?

> 4. Possible loop beyond numProcs?

What are you referring to here?

Greetings,

Andres Freund

Re: recovery test failures on hoverfly

>> Michael Paquier  writes:
>>> This is the same problem as c757a3da and 6d41dd0, where we write a
>>> query to a pipe but the kill, causing a failure, makes the test fail
>>> with a SIGPIPE in IPC::Run as a query is sent down to a pipe.

After checking the git logs, I realized that this failure is actually
new since 11e9caff8: before that, poll_query_until passed the query
on the command line not stdin, so it wasn't vulnerable to SIGPIPE.
So that explains why we only recently started to see this.

The fix I proposed seems to work fine in all branches, so I went
ahead and pushed it.

regards, tom lane

Avoid stuck of pbgench due to skipped transactions

2021-06-12 Thread Yugo NAGATA

Hi,

I found that pgbench could get stuck when every transaction
come to be skipped and the number of transaction is not limitted
by -t option.

For example, when I usee a large rate (-R) for throttling and a
small latency limit (-L) values with a duration (-T), pbbench
got stuck.

 $ pgbench -T 5 -R 1 -L 1;

When we specify the number of transactions by -t, it doesn't get
stuck because the number of skipped transactions are counted and
checked during the loop. However, the timer expiration is not
checked in the loop although it is checked before and after a
sleep for throttling. 

I think it is better to check the timer expiration even in the loop
of transaction skips and to finish pgbnech successfully because we
should correcly repport how many transactions are proccessed and
skipped also in this case, and getting stuck would not be good
anyway.

I attached a patch for this fix.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index dc84b7b9b7..1aa3e6b7be 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3232,7 +3232,8 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 	pg_time_now_lazy(&now);
 
 	while (thread->throttle_trigger < now - latency_limit &&
-		   (nxacts <= 0 || st->cnt < nxacts))
+		   (nxacts <= 0 || st->cnt < nxacts) &&
+		   !timer_exceeded)
 	{
 		processXactStats(thread, st, &now, true, agg);
 		/* next rendez-vous */

Re: logical replication of truncate command with trigger causes Assert

2021-06-12 Thread Alvaro Herrera

On 2021-Jun-12, Tom Lane wrote:

> Amit Kapila  writes:
> > On Fri, Jun 11, 2021 at 8:56 PM Tom Lane  wrote:
> >> I was thinking maybe we could mark all these replication protocol
> >> violation errors non-translatable.  While we don't want to crash on a
> >> protocol violation, it shouldn't really be a user-facing case either.
> 
> > I don't see any problem with that as these are not directly related to
> > any user operation. So, +1 for making these non-translatable.
> 
> Done that way.

Good call, thanks.  Not only it's not very useful to translate such
messages, but it's also quite a burden because some of them are
difficult to translate.

-- 
Álvaro Herrera   Valdivia, Chile
"Tiene valor aquel que admite que es un cobarde" (Fernandel)

Re: Error on pgbench logs

2021-06-12 Thread Yugo NAGATA

On Thu, 10 Jun 2021 23:29:30 +0200 (CEST)
Fabien COELHO  wrote:

> 
> Bonjour Michaël,
> 
> Here is an updated patch. While having a look at Kyotaro-san patch, I 
> noticed that the aggregate stuff did not print the last aggregate. I think 
> that it is a side effect of switching the precision from per-second to 
> per-µs. I've done an attempt at also fixing that which seems to work.

This is just out of curiosity.

+   while ((next = agg->start_time + agg_interval * 
INT64CONST(100)) <= now)

I can find the similar code to convert "seconds" to "us" using casting like

 end_time = threads[0].create_time + (int64) 100 * duration;

or
 
 next_report = last_report + (int64) 100 * progress;

Is there a reason use INT64CONST instead of (int64)? Do these imply the same 
effect?

Sorry, if this is a dumb question...

Regards,
Yugo Nagata
 
-- 
Yugo NAGATA

Re: A new function to wait for the backend exit after termination

2021-06-12 Thread Justin Pryzby

On Sat, Jun 12, 2021 at 08:21:39AM -0700, Noah Misch wrote:
> On Sat, Jun 12, 2021 at 12:12:12AM -0500, Justin Pryzby wrote:
> > Even if it's not removed, the descriptions should be cleaned up.
> > 
> > | src/include/catalog/pg_proc.dat-  descr => 'terminate a backend process 
> > and if timeout is specified, wait for its exit or until timeout occurs',
> > => I think doesn't need to change or mention the optional timeout at all
> 
> Agreed, these strings generally give less detail.  I can revert that to the
> v13 wording, 'terminate a server process'.

Maybe you'd also update the release notes.

I suggest some edits from the remaining parts of the original patch.

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fbc80c1403..b7383bc8aa 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24998,7 +24998,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
 milliseconds) and greater than zero, the function waits until the
 process is actually terminated or until the given time has passed. If
 the process is terminated, the function
-returns true.  On timeout a warning is emitted and
+returns true.  On timeout, a warning is emitted and
 false is returned.

   
diff --git a/src/backend/storage/ipc/signalfuncs.c 
b/src/backend/storage/ipc/signalfuncs.c
index 837699481c..f12c417854 100644
--- a/src/backend/storage/ipc/signalfuncs.c
+++ b/src/backend/storage/ipc/signalfuncs.c
@@ -187,12 +187,12 @@ pg_wait_until_termination(int pid, int64 timeout)
 }
 
 /*
- * Signal to terminate a backend process. This is allowed if you are a member
- * of the role whose process is being terminated. If timeout input argument is
- * 0 (which is default), then this function just signals the backend and
- * doesn't wait. Otherwise it waits until given the timeout milliseconds or no
- * process has the given PID and returns true. On timeout, a warning is emitted
- * and false is returned.
+ * Send a signal to terminate a backend process. This is allowed if you are a
+ * member of the role whose process is being terminated. If the timeout input
+ * argument is 0, then this function just signals the backend and returns true.
+ * If timeout is nonzero, then it waits until no process has the given PID; if
+ * the process ends within the timeout, true is returned, and if the timeout is
+ * exceeded, a warning is emitted and false is returned.
  *
  * Note that only superusers can signal superuser-owned processes.
  */
@@ -201,7 +201,7 @@ pg_terminate_backend(PG_FUNCTION_ARGS)
 {
int pid;
int r;
-   int timeout;
+   int timeout; /* milliseconds */
 
pid = PG_GETARG_INT32(0);
timeout = PG_GETARG_INT64(1);

Re: Error on pgbench logs

2021-06-12 Thread Yugo NAGATA

On Fri, 11 Jun 2021 16:09:10 +0200 (CEST)
Fabien COELHO  wrote:

> 
> Bonjour Michaël,
> 
> >> +  /* flush remaining stats */
> >> +  if (!logged && latency == 0.0)
> >> +  logAgg(logfile, agg);
> >
> > You are right, this is missing the final stats.  Why the choice of
> > latency here for the check?
> 
> For me zero latency really says that there is no actual transaction to 
> count, so it is a good trigger for the closing call. I did not wish to add 
> a new "flush" parameter, or a specific function. I agree that it looks 
> strange, though.

It will not work if the transaction is skipped, in which case latency is 0.0.
It would work if we check also "skipped" as bellow.

+   if (!logged && !skipped && latency == 0.0)

However, it still might not work if the latency is so small so that  we could
observe latency == 0.0. I observed this when I used a script that contained
only a meta command. This is not usual and would be a corner case, though.

> > Wouldn't it be better to do a final push of the states once a thread 
> > reaches CSTATE_FINISHED or CSTATE_ABORTED instead?
> 
> The call was already in place at the end of threadRun and had just become 
> ineffective. I did not wish to revisit its place and change the overall 
> structure, it is just a bug fix. I agree that it could be done differently 
> with the added logAgg function which could be called directly. Attached 
> another version which does that.

/* log aggregated but not yet reported transactions */
doLog(thread, state, &aggs, false, 0, 0);
+   logAgg(thread->logfile, &aggs);

I think we don't have to call doLog() before logAgg(). If we call doLog(),
we will count an extra transaction that is not actually processed because
accumStats() is called in this.

Regards,
Yugo Nagata

-- 
Yugo NAGATA

Re: Race condition in recovery?

Andrew Dunstan  writes:
> On 6/12/21 1:07 PM, Tom Lane wrote:
>> OK.  But it makes me itch a bit that this one wait-for-wal-to-be-
>> processed query looks different from all the other ones.

> I'm happy to bring the other two queries that look like this into line
> with this one if you like.

I see a lot more than two --- grepping for poll_query_until with
a test involving a LSN comparison finds a bunch.  Are we sure that
there are only three in which the LSN could be null?  How much
does it really matter if it is?

regards, tom lane

Re: Failure in subscription test 004_sync.pl

Amit Kapila  writes:
> On Sat, Jun 12, 2021 at 1:13 PM Michael Paquier  wrote:
>> wrasse has just failed with what looks like a timing error with a
>> replication slot drop:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2021-06-12%2006%3A16%3A30

> If we want to fix this, we might want to wait till the slot is active
> on the publisher before trying to drop it but not sure if it is a good
> idea. In the worst case, if the user retries this operation (Drop
> Subscription), it will succeed.

wrasse's not the only animal reporting this type of failure.
See also

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2021-06-12%2011%3A32%3A04

error running SQL: 'psql::1: ERROR:  could not drop replication slot 
"pg_16387_sync_16384_697288694805957" on publisher: ERROR:  replication 
slot "pg_16387_sync_16384_697288694805957" is active for PID 2971625'
while running 'psql -XAtq -d port=60321 host=/tmp/vdQmH7ijFI dbname='postgres' 
-f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION testsub2' at 
/home/bf/build/buildfarm-komodoensis/HEAD/pgsql.build/../pgsql/src/test/perl/PostgresNode.pm
 line 1771.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2021-06-11%2020%3A30%3A28

error running SQL: 'psql::1: ERROR:  could not drop replication slot 
"testsub2" on publisher: ERROR:  replication slot "testsub2" is active for PID 
27175'
while running 'psql -XAtq -d port=59579 host=/tmp/9Qchjsykek dbname='postgres' 
-f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION testsub2' at 
/home/pgbf/buildroot/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm
 line 1771.

Those are both in the t/100_bugs.pl test script, but otherwise they
look like the exact same thing.

I don't think that it's optional to fix a problem that occurs as
often as three times in 10 days in the buildfarm.

regards, tom lane

Re: Race condition in recovery?



On 6/12/21 1:07 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 6/12/21 10:20 AM, Tom Lane wrote:
>>> I wonder whether that is a workaround for the poll_query_until bug
>>> I proposed to fix at [1].
>> No, it's because I found it annoying and confusing that there was an
>> invisible result when last_archived_wal is null.
> OK.  But it makes me itch a bit that this one wait-for-wal-to-be-
> processed query looks different from all the other ones.
>
>   


I'm happy to bring the other two queries that look like this into line
with this one if you like.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: pg_filenode_relation(0,0) elog

Justin Pryzby  writes:
> On Fri, Jun 11, 2021 at 11:51:35PM -0400, Tom Lane wrote:
>> Likely it should check the reltablespace, too.

> I don't think so.  The docs say:
> https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-DBLOCATION
> |For a relation in the database's default tablespace, the tablespace can be 
> specified as zero.

Right, my mistake.  Pushed.

regards, tom lane

Re: Race condition in recovery?

Andrew Dunstan  writes:
> On 6/12/21 10:20 AM, Tom Lane wrote:
>> I wonder whether that is a workaround for the poll_query_until bug
>> I proposed to fix at [1].

> No, it's because I found it annoying and confusing that there was an
> invisible result when last_archived_wal is null.

OK.  But it makes me itch a bit that this one wait-for-wal-to-be-
processed query looks different from all the other ones.

regards, tom lane

Re: logical replication of truncate command with trigger causes Assert

Amit Kapila  writes:
> On Fri, Jun 11, 2021 at 8:56 PM Tom Lane  wrote:
>> I was thinking maybe we could mark all these replication protocol
>> violation errors non-translatable.  While we don't want to crash on a
>> protocol violation, it shouldn't really be a user-facing case either.

> I don't see any problem with that as these are not directly related to
> any user operation. So, +1 for making these non-translatable.

Done that way.  On re-reading the code, there were a bunch more
Asserts that could be triggered by bad input data, so the committed
patch has rather more corrections than I posted before.

regards, tom lane

[PATCH] check_random_seed: use a boolean not an int..

2021-06-12 Thread Justin Pryzby

This confused me for a minute so took the opportunity to clean it up.

Since: 2594cf0e8c0440619b1651c5a406d376657c
---
 src/backend/commands/variable.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/variable.c b/src/backend/commands/variable.c
index 0c85679420..56f15d2e37 100644
--- a/src/backend/commands/variable.c
+++ b/src/backend/commands/variable.c
@@ -589,11 +589,12 @@ check_transaction_deferrable(bool *newval, void **extra, 
GucSource source)
 bool
 check_random_seed(double *newval, void **extra, GucSource source)
 {
-   *extra = malloc(sizeof(int));
-   if (!*extra)
+   bool *doit = *extra = malloc(sizeof(bool));
+   if (doit == NULL)
return false;
+
/* Arm the assign only if source of value is an interactive SET */
-   *((int *) *extra) = (source >= PGC_S_INTERACTIVE);
+   *doit = (source >= PGC_S_INTERACTIVE);
 
return true;
 }
@@ -601,10 +602,11 @@ check_random_seed(double *newval, void **extra, GucSource 
source)
 void
 assign_random_seed(double newval, void *extra)
 {
+   bool *doit = (bool *)extra;
/* We'll do this at most once for any setting of the GUC variable */
-   if (*((int *) extra))
+   if (*doit)
DirectFunctionCall1(setseed, Float8GetDatum(newval));
-   *((int *) extra) = 0;
+   *doit = false;
 }
 
 const char *
-- 
2.17.0

Re: Race condition in recovery?



On 6/12/21 10:20 AM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> I have pushed a fix, tested on a replica of fairywren/drongo,
> This bit seems a bit random:
>
>  # WAL segment, this is enough to guarantee that the history file was
>  # archived.
>  my $archive_wait_query =
> -  "SELECT '$walfile_to_be_archived' <= last_archived_wal FROM 
> pg_stat_archiver;";
> +  "SELECT coalesce('$walfile_to_be_archived' <= last_archived_wal, false) " .
> +  "FROM pg_stat_archiver";
>  $node_standby->poll_query_until('postgres', $archive_wait_query)
>or die "Timed out while waiting for WAL segment to be archived";
>  my $last_archived_wal_file = $walfile_to_be_archived;
>
> I wonder whether that is a workaround for the poll_query_until bug
> I proposed to fix at [1].
>
>   regards, tom lane
>
> [1] https://www.postgresql.org/message-id/2130215.1623450521%40sss.pgh.pa.us



No, it's because I found it annoying and confusing that there was an
invisible result when last_archived_wal is null.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

SQLSTATE for replication connection failures

So far as I can find, just about everyplace that deals with replication
connections has slipshod error reporting.  An example from worker.c is

LogRepWorkerWalRcvConn = walrcv_connect(MySubscription->conninfo, true,
MySubscription->name, &err);
if (LogRepWorkerWalRcvConn == NULL)
ereport(ERROR,
(errmsg("could not connect to the publisher: %s", err)));

Because of the lack of any errcode() call, this failure will be reported
as XX000 ERRCODE_INTERNAL_ERROR, which is surely not appropriate.
worker.c is in good company though, because EVERY caller of walrcv_connect
is equally slipshod.

Shall we just use ERRCODE_CONNECTION_FAILURE for these failures, or
would it be better to invent another SQLSTATE code?  Arguably,
ERRCODE_CONNECTION_FAILURE is meant for failures of client connections;
but on the other hand, a replication connection is a sort of client.

regards, tom lane

Re: A new function to wait for the backend exit after termination

2021-06-12 Thread Noah Misch

On Sat, Jun 12, 2021 at 12:12:12AM -0500, Justin Pryzby wrote:
> Even if it's not removed, the descriptions should be cleaned up.
> 
> | src/include/catalog/pg_proc.dat-  descr => 'terminate a backend process and 
> if timeout is specified, wait for its exit or until timeout occurs',
> => I think doesn't need to change or mention the optional timeout at all

Agreed, these strings generally give less detail.  I can revert that to the
v13 wording, 'terminate a server process'.

Re: pg_filenode_relation(0,0) elog

2021-06-12 Thread Justin Pryzby

On Fri, Jun 11, 2021 at 11:51:35PM -0400, Tom Lane wrote:
> Justin Pryzby  writes:
> > Per sqlsmith.
> > postgres=# SELECT pg_filenode_relation(0,0);
> > ERROR:  unexpected duplicate for tablespace 0, relfilenode 0
> 
> Ugh.
> 
> > The usual expectation is that sql callable functions should return null 
> > rather
> > than hitting elog().
> 
> Agreed, but you should put the short-circuit into the SQL-callable
> function, ie pg_filenode_relation.  Lower-level callers ought not be
> passing junk data.

Right.  I spent inadequate time reading output of git grep.

> Likely it should check the reltablespace, too.

I don't think so.  The docs say:
https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-DBLOCATION
|For a relation in the database's default tablespace, the tablespace can be 
specified as zero.

Also, that would breaks expected/alter_table.out for the same reason.

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 3c70bb5943..144aca1099 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -905,6 +905,9 @@ pg_filenode_relation(PG_FUNCTION_ARGS)
Oid relfilenode = PG_GETARG_OID(1);
Oid heaprel = InvalidOid;
 
+   if (!OidIsValid(relfilenode))
+   PG_RETURN_NULL();
+
heaprel = RelidByRelfilenode(reltablespace, relfilenode);
 
if (!OidIsValid(heaprel))

Add proper planner support for ORDER BY / DISTINCT aggregates

A few years ago I wrote a patch to implement the missing aggregate
combine functions for array_agg and string_agg [1]. In the end, the
patch was rejected due to some concern [2] that if we allow these
aggregates to run in parallel then it might mess up the order in which
values are being aggregated by some unsuspecting user who didn't
include an ORDER BY clause in the aggregate. It was mentioned at the
time that due to nodeAgg.c performing so terribly with ORDER BY
aggregates that we couldn't possibly ask users who were upset by this
change to include an ORDER BY in their aggregate functions.

I'd still quite like to finish off the remaining aggregate functions
that still don't have a combine function, so to get that going again
I've written some code that gets the query planner onboard with
picking a plan that allows nodeAgg.c to skip doing any internal sorts
when the input results are already sorted according to what's required
by the aggregate function.

Of course, the query could have many aggregates all with differing
ORDER BY clauses. Since we reuse the same input for each, it might not
be possible to feed each aggregate with suitably sorted input. When
the input is not sorted, nodeAgg.c still must perform the sort as it
does today. So unfortunately we can't remove the nodeAgg.c code for
that.

The best I could come up with is just picking a sort order that suits
the first ORDER BY aggregate, then spin through the remaining ones to
see if there's any with a more strict order requirement that we can
use to suit that one and the first one. The planner does something
similar for window functions already, although it's not quite as
comprehensive to look beyond the first window for windows with a more
strict sort requirement.

This allows us to give presorted input to both aggregates in the following case:

SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ...

but just the first agg in this one:

SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ...

In order to make DISTINCT work, I had to add a new expression
evaluation operator to allow filtering if the current value is the
same as the last value. The current unpatched code implements
distinct when reading back the sorted tuplestore data. Since I don't
have a tuplestore with the pre-sorted version, I needed to cache the
last Datum, or last tuple somewhere. I ended up putting that in the
AggStatePerTransData struct. I'm not quite sure if I like that, but I
don't really see where else it could go.

When testing the performance of all this I found that when a suitable
index exists to provide pre-sorted input for the aggregation that the
performance does improve. Unfortunately, it looks like things get more
complex when no index exists. In this case, since we're setting
pathkeys to tell the planner we need a plan that provides pre-sorted
input to the aggregates, the planner will add a sort below the
aggregate node. I initially didn't see any problem with that as it
just moves the sort to a Sort node rather than having it done
implicitly inside nodeAgg.c. The problem is, it just does not perform
as well. I guess this is because when the sort is done inside
nodeAgg.c that the transition function is called in a tight loop while
reading records back from the tuplestore. In the patched version,
there's an additional node transition in between nodeAgg and nodeSort
and that causes slower performance. For now, I'm not quite sure what
to do about that. We set the plan pathkeys well before we could
possibly decide if asking for pre-sorted input for the aggregates
would be a good idea or not.

Please find attached my WIP patch. It's WIP due to what I mentioned
in the above paragraph and also because I've not bothered to add JIT
support for the new expression evaluation steps.

I'll add this to the July commitfest.

David

[1]
https://www.postgresql.org/message-id/CAKJS1f9sx_6GTcvd6TMuZnNtCh0VhBzhX6FZqw17TgVFH-ga_A%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/flat/6538.1522096067%40sss.pgh.pa.us#c228ed67026faa15209c76dada035da4

wip_planner_support_for_orderby_distinct_aggs_v0.patch
Description: Binary data

Re: Use pg_nextpower2_* in a few more places

On Sat, Jun 12, 2021 at 7:35 AM David Rowley  wrote:

> On Sun, 13 Jun 2021 at 02:08, Zhihong Yu  wrote:
> > Maybe add an assertion after the assignment, that newalloc >=
> LWLockTrancheNamesAllocated.
>
> I don't quite see how it would be possible for that to ever fail.  I
> could understand adding an Assert() if some logic was outside the
> function and we wanted to catch something outside of the function's
> control, but that's not the case here.  All the logic is within a few
> lines.
>
> Maybe it would help if we look at the if condition that this code
> executes under:
>
> /* If necessary, create or enlarge array. */
> if (tranche_id >= LWLockTrancheNamesAllocated)
>
> So since we're doing:
>
> +   newalloc = pg_nextpower2_32(Max(8, tranche_id + 1));
>
> assuming pg_nextpower2_32 does not give us something incorrect, then I
> don't quite see why Assert(newalloc >=  LWLockTrancheNamesAllocated)
> could ever fail.
>
> Can you explain why you think it might?
>
> David
>
Hi,
Interesting, the quoted if () line was not shown in the patch.
Pardon my not checking this line.

In that case, the assertion is not needed.

Re: Use pg_nextpower2_* in a few more places

On Sun, 13 Jun 2021 at 02:08, Zhihong Yu  wrote:
> Maybe add an assertion after the assignment, that newalloc >=  
> LWLockTrancheNamesAllocated.

I don't quite see how it would be possible for that to ever fail.  I
could understand adding an Assert() if some logic was outside the
function and we wanted to catch something outside of the function's
control, but that's not the case here.  All the logic is within a few
lines.

Maybe it would help if we look at the if condition that this code
executes under:

/* If necessary, create or enlarge array. */
if (tranche_id >= LWLockTrancheNamesAllocated)

So since we're doing:

+   newalloc = pg_nextpower2_32(Max(8, tranche_id + 1));

assuming pg_nextpower2_32 does not give us something incorrect, then I
don't quite see why Assert(newalloc >=  LWLockTrancheNamesAllocated)
could ever fail.

Can you explain why you think it might?

David

Re: Race condition in recovery?

Andrew Dunstan  writes:
> I have pushed a fix, tested on a replica of fairywren/drongo,

This bit seems a bit random:

 # WAL segment, this is enough to guarantee that the history file was
 # archived.
 my $archive_wait_query =
-  "SELECT '$walfile_to_be_archived' <= last_archived_wal FROM 
pg_stat_archiver;";
+  "SELECT coalesce('$walfile_to_be_archived' <= last_archived_wal, false) " .
+  "FROM pg_stat_archiver";
 $node_standby->poll_query_until('postgres', $archive_wait_query)
   or die "Timed out while waiting for WAL segment to be archived";
 my $last_archived_wal_file = $walfile_to_be_archived;

I wonder whether that is a workaround for the poll_query_until bug
I proposed to fix at [1].

regards, tom lane

[1] https://www.postgresql.org/message-id/2130215.1623450521%40sss.pgh.pa.us

Re: Use pg_nextpower2_* in a few more places

On Sat, Jun 12, 2021 at 6:40 AM David Rowley  wrote:

> Thanks for having a look.
>
> On Sun, 13 Jun 2021 at 00:50, Zhihong Yu  wrote:
> > -   newalloc = Max(LWLockTrancheNamesAllocated, 8);
> > -   while (newalloc <= tranche_id)
> > -   newalloc *= 2;
> > +   newalloc = pg_nextpower2_32(Max(8, tranche_id + 1));
> >
> > Should LWLockTrancheNamesAllocated be included in the Max() expression
> (in case it gets to a high value) ?
>
> I think the new code will produce the same result as the old code in all
> cases.
>
> All the old code did was finding the next power of 2 that's >= 8 and
> larger than tranche_id.  LWLockTrancheNamesAllocated is just a hint at
> where the old code should start searching from.  The new code does not
> need that hint. All it seems to do is save the old code from having to
> start the loop at 8 each time we need more space.
>
> David
>
Hi,
Maybe add an assertion after the assignment, that newalloc >=
 LWLockTrancheNamesAllocated.

Cheers

Signed vs Unsigned (take 2) (src/backend/storage/ipc/procarray.c)

2021-06-12 Thread Ranier Vilela

Hi,

With the recent changes at procarray.c, I take a look in.
msvc compiler, has some warnings about signed vs unsigned.

So.

1. Size_t is weird, because all types are int.
2. Wouldn't it be better to initialize static variables?
3. There are some shadowing parameters.
4. Possible loop beyond numProcs?

- for (size_t pgxactoff = 0; pgxactoff < numProcs; pgxactoff++)
+ for (int pgxactoff = 0; pgxactoff < numProcs; pgxactoff++)

I think no functional behavior changed.
Patch attached.

best regards,
Ranier Vilela


reduce_unmatechs_types_procarray.patch
Description: Binary data

Re: Use pg_nextpower2_* in a few more places

Thanks for having a look.

On Sun, 13 Jun 2021 at 00:50, Zhihong Yu  wrote:
> -   newalloc = Max(LWLockTrancheNamesAllocated, 8);
> -   while (newalloc <= tranche_id)
> -   newalloc *= 2;
> +   newalloc = pg_nextpower2_32(Max(8, tranche_id + 1));
>
> Should LWLockTrancheNamesAllocated be included in the Max() expression (in 
> case it gets to a high value) ?

I think the new code will produce the same result as the old code in all cases.

All the old code did was finding the next power of 2 that's >= 8 and
larger than tranche_id.  LWLockTrancheNamesAllocated is just a hint at
where the old code should start searching from.  The new code does not
need that hint. All it seems to do is save the old code from having to
start the loop at 8 each time we need more space.

David

Re: Race condition in recovery?



On 6/12/21 7:31 AM, Andrew Dunstan wrote:
> On 6/12/21 3:48 AM, Michael Paquier wrote:
>> On Fri, Jun 11, 2021 at 10:46:45AM -0400, Tom Lane wrote:
>>> I think jacana uses msys[2?], so this likely indicates a problem
>>> in path sanitization for the archive command.  Andrew, any advice?
>> Err, something around TestLib::perl2host()?
>
> I'm working on a fix for this. Yes it includes perl2host, but that's not
> enough :-)
>
>

I have pushed a fix, tested on a replica of fairywren/drongo,


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Failure in subscription test 004_sync.pl

On Sat, Jun 12, 2021 at 1:13 PM Michael Paquier  wrote:
>
> wrasse has just failed with what looks like a timing error with a
> replication slot drop:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2021-06-12%2006%3A16%3A30
>
> Here is the error:
> error running SQL: 'psql::1: ERROR:  could not drop replication
> slot "tap_sub" on publisher: ERROR:  replication slot "tap_sub" is
> active for PID 1641'
>
> It seems to me that this just lacks a poll_query_until() doing some
> slot monitoring?
>

I think it is showing a race condition issue in the code. In
DropSubscription, we first stop the worker that is receiving the WAL,
and then in a separate connection with the publisher, it tries to drop
the slot which leads to this error. The reason is that walsender is
still active as we just wait for wal receiver (or apply worker) to
stop. Normally, as soon as the apply worker is stopped the walsender
detects it and exits but in this case, it took some time to exit, and
in the meantime, we tried to drop the slot which is still in use by
walsender.

If we want to fix this, we might want to wait till the slot is active
on the publisher before trying to drop it but not sure if it is a good
idea. In the worst case, if the user retries this operation (Drop
Subscription), it will succeed.

-- 
With Regards,
Amit Kapila.

Re: Use pg_nextpower2_* in a few more places

On Sat, Jun 12, 2021 at 5:32 AM David Rowley  wrote:

> Back in f0705bb62, we added pg_nextpower2_32 and pg_nextpower2_64 to
> efficiently obtain the next power of 2 of a given number using an
> intrinsic function to find the left-most 1 bit.
>
> In d025cf88b and 02a2e8b44, I added some usages of these new functions
> but I didn't quite get all of them done.   The attached replaces all
> of the remaining ones that I'm happy enough to go near.
>
> The ones that I left behind are ones in the form of:
>
> while (reqsize >= buflen)
> {
>buflen *= 2;
>buf = repalloc(buf, buflen);
> }
>
> The reason I left those behind is that I was too scared that I might
> introduce an opportunity to wrap buflen back around to zero again.  At
> the moment the repalloc() would prevent that as we'd go above
> MaxAllocSize before we wrapped buflen back to zero again.  All the
> other places I touched does not change the risk of that happening.
>
> It would be nice to get rid of doing that repalloc() in a loop, but it
> would need a bit more study to ensure we couldn't wrap or we'd need to
> add some error checking code that raised an ERROR if it did wrap.  I
> don't want to touch those as part of this effort.
>
> I've also fixed up a few places that were just doubling the size of a
> buffer but used a "while" loop to do this when a simple "if" would
> have done.  Using an "if" is ever so slightly more optimal since the
> condition will be checked once rather than twice when the buffer needs
> to increase in size.
>
> I'd like to fix these for PG15.
>
> David
>
Hi,

-   newalloc = Max(LWLockTrancheNamesAllocated, 8);
-   while (newalloc <= tranche_id)
-   newalloc *= 2;
+   newalloc = pg_nextpower2_32(Max(8, tranche_id + 1));

Should LWLockTrancheNamesAllocated be included in the Max() expression (in
case it gets to a high value) ?

Cheers

Use pg_nextpower2_* in a few more places

Back in f0705bb62, we added pg_nextpower2_32 and pg_nextpower2_64 to
efficiently obtain the next power of 2 of a given number using an
intrinsic function to find the left-most 1 bit.

In d025cf88b and 02a2e8b44, I added some usages of these new functions
but I didn't quite get all of them done.   The attached replaces all
of the remaining ones that I'm happy enough to go near.

The ones that I left behind are ones in the form of:

while (reqsize >= buflen)
{
   buflen *= 2;
   buf = repalloc(buf, buflen);
}

The reason I left those behind is that I was too scared that I might
introduce an opportunity to wrap buflen back around to zero again.  At
the moment the repalloc() would prevent that as we'd go above
MaxAllocSize before we wrapped buflen back to zero again.  All the
other places I touched does not change the risk of that happening.

It would be nice to get rid of doing that repalloc() in a loop, but it
would need a bit more study to ensure we couldn't wrap or we'd need to
add some error checking code that raised an ERROR if it did wrap.  I
don't want to touch those as part of this effort.

I've also fixed up a few places that were just doubling the size of a
buffer but used a "while" loop to do this when a simple "if" would
have done.  Using an "if" is ever so slightly more optimal since the
condition will be checked once rather than twice when the buffer needs
to increase in size.

I'd like to fix these for PG15.

David


v1-0001-Improve-various-places-that-double-the-size-of-a-.patch
Description: Binary data

Re: Logical replication keepalive flood

On Fri, Jun 11, 2021 at 7:07 AM Kyotaro Horiguchi
 wrote:
>
> At Thu, 10 Jun 2021 12:18:00 +0530, Amit Kapila  
> wrote in
> > Good analysis. I think this analysis has shown that walsender is
> > sending messages at top speed as soon as they are generated. So, I am
> > wondering why there is any need to wait/sleep in such a workload. One
> > possibility that occurred to me RecentFlushPtr is not updated and or
> > we are not checking it aggressively. To investigate on that lines, can
> > you check the behavior with the attached patch? This is just a quick
> > hack patch to test whether we need to really wait for WAL a bit
> > aggressively.
>
> Yeah, anyway the comment for the caller site of WalSndKeepalive tells
> that exiting out of the function *after* there is somewhat wrong.
>
> > * possibly are waiting for a later location. So, before sleeping, we
> > * send a ping containing the flush location. If the receiver is
>
> But I nothing changed by moving the keepalive check to after the exit
> check. (loc <= RecentFlushPtr).
>
> And the patch also doesn't change the situation so much. The average
> number of loops is reduced from 3 to 2 per call but the ratio between
> total records and keepalives doesn't change.
>
> previsous: A=#total-rec = 19476, B=#keepalive=3006, B/A = 0.154
> this time: A=#total-rec = 13208, B=#keepalive=1988, B/A = 0.151
>
> Total records: 13208
> reqsz: #sent/ #!sent/ #call: wr lag  / fl lag
>8: 4 / 4 / 4:6448 /  268148
>   16: 1 / 1 / 1:8688 /  387320
>   24:  1988 /  1987 /  1999:6357 /  226163
>  195: 1 / 0 /20: 408 /1647
> 7477: 2 / 0 /   244:  68 / 847
> 8225: 1 / 1 / 1:7208 /7208
>
> So I checked how many bytes RecentFlushPtr is behind requested loc if
> it is not advanced enough.
>
> Total records: 15128
> reqsz:  #sent/ #!sent/ #call: wr lag  / fl lag  / RecentFlushPtr lag
> 8: 2 / 2 / 2: 520 /   60640 /   8
>16: 1 / 1 / 1:8664 /   89336 /  16
>24:  2290 /  2274 /  2302:5677 /  230583 /  23
>   187: 1 / 0 /40:   1 /6118 /   1
>  7577: 1 / 0 /69: 120 /3733 /  65
>  8177: 1 / 1 / 1:8288 /8288 /2673
>

Does this data indicate that when the request_size is 187 or 7577,
even though we have called WalSndWaitForWal() 40 and 69 times
respectively but keepalive is sent just once? Why such a behavior
should depend upon request size?

> So it's not a matter of RecentFlushPtr check. (Almost) Always when
> WalSndWakeupRequest feels to need to send a keepalive, the function is
> called before the record begins to be written.
>

I think we always wake up walsender after we have flushed the WAL via
WalSndWakeupProcessRequests(). I think here the reason why we are
seeing keepalives is that we always send it before sleeping. So, it
seems each time we try to read a new page, we call WalSndWaitForWal
which sends at least one keepalive message. I am not sure what is an
appropriate way to reduce the frequency of these keepalive messages.
Andres might have some ideas?

-- 
With Regards,
Amit Kapila.

Re: Race condition in recovery?



On 6/12/21 3:48 AM, Michael Paquier wrote:
> On Fri, Jun 11, 2021 at 10:46:45AM -0400, Tom Lane wrote:
>> I think jacana uses msys[2?], so this likely indicates a problem
>> in path sanitization for the archive command.  Andrew, any advice?
> Err, something around TestLib::perl2host()?


I'm working on a fix for this. Yes it includes perl2host, but that's not
enough :-)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Small clean up in nodeAgg.c

Much of nodeAgg.c does not really know the difference between the
aggregate's combine function and the aggregate's transition function.
This was done on purpose so that we can keep as much code the same as
possible between partial aggregate and finalize aggregate.

We can take this a bit further with the attached patch which managed a
net reduction of about 3 dozen lines of code.

3 files changed, 118 insertions(+), 155 deletions(-)

I also did some renaming to try and make it more clear about when
we're talking about aggtransfn and when we're just talking about the
transition function that's being used, which in the finalize aggregate
case will be the combine function.

I proposed this a few years ago in [1], but at the time we went with a
more minimal patch to fix the bug being reported there with plans to
come back and do a bit more once we branched.

I've rebased this and I'd like to propose this small cleanup for pg15.

The patch is basically making build_pertrans_for_aggref() oblivious to
if it's working with the aggtransfn or the aggcombinefn and all the
code that needs to treat them differently is moved up into
ExecInitAgg().  This allows us to just completely get rid of
build_aggregate_combinefn_expr() and just use
build_aggregate_transfn_expr() instead.

I feel this is worth doing as nodeAgg.c has grown quite a bit over the
years. Shrinking it down a bit and maybe making it a bit more readable
seems like a worthy goal.  Heikki took a big step forward towards that
goal in 0a2bc5d61e. This, arguably, helps a little more.

I've included Andres and Horiguchi-san because they were part of the
discussion on the original thread.

David

[1] 
https://www.postgresql.org/message-id/CAKJS1f-Qu2Q9g6Xfcf5dctg99oDkbV9LyW4Lym9C4L1vEHTN%3Dg%40mail.gmail.com


v2-0001-Cleanup-some-aggregate-code-in-the-executor.patch
Description: Binary data

Re: Question about StartLogicalReplication() error path

On Fri, Jun 11, 2021 at 11:52 AM Jeff Davis  wrote:
>
> On Fri, 2021-06-11 at 10:13 +0530, Amit Kapila wrote:
> > I think because there is no need to process the WAL that has been
> > confirmed by the client. Do you see any problems with this scheme?
>
> Several:
>
> * Replication setups are complex, and it can be easy to misconfigure
> something or have a bug in some control code. An error is valuable to
> detect the problem closer to the source.
>
> * There are plausible configurations where things could go badly wrong.
> For instance, if you are storing the decoded data in another postgres
> server with syncrhonous_commit=off, and acknowledging LSNs before they
> are durable. A crash of the destination system would be consistent, but
> it would be missing some data earlier than the confirmed_flush_lsn. The
> client would then request the data starting at its stored lsn progress
> value, but the server would skip ahead to the confirmed_flush_lsn;
> silently missing data.
>

AFAIU, currently, in such a case, the subscriber (client) won't
advance the flush location (confirmed_flush_lsn). So, we won't lose
any data.

-- 
With Regards,
Amit Kapila.

Re: psql - add SHOW_ALL_RESULTS option

2021-06-12 Thread Fabien COELHO



Hello Peter,

My overly naive trust in non regression test to catch any issues has been 
largely proven wrong. Three key features do not have a single tests. Sigh.


I'll have some time to look at it over next week-end, but not before.


I have reverted the patch and moved the commit fest entry to CF 2021-07.


Attached a v7 which fixes known issues.

I've tried to simplify the code and added a few comments. I've moved query 
cancellation reset in one place in SendQuery. I've switched to an array of 
buffers for notices, as suggested by Tom.


The patch includes basic AUTOCOMMIT and ON_ERROR_ROLLBACK tests, which did 
not exist before, at all. I tried cancelling queries manually, but did not 
develop a test for this, mostly because last time I submitted a TAP test 
about psql to raise its coverage it was rejected.


As usual, what is not tested does not work…

--
Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 40b5109b55..ccce72fb85 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -50,8 +50,28 @@ BEGIN \;
 SELECT 2.0 AS "float" \;
 SELECT 'world' AS "text" \;
 COMMIT;
+ float 
+---
+   2.0
+(1 row)
+
+ text  
+---
+ world
+(1 row)
+
 -- compound with empty statements and spurious leading spacing
 \;\;   SELECT 3 + 3 \;\;\;   SELECT ' ' || ' !' \;\;   SELECT 1 + 4 \;;
+ ?column? 
+--
+6
+(1 row)
+
+ ?column? 
+--
+   !
+(1 row)
+
  ?column? 
 --
 5
@@ -61,6 +81,11 @@ COMMIT;
 SELECT 1 + 1 + 1 AS "add" \gset
 SELECT :add + 1 + 1 AS "add" \;
 SELECT :add + 1 + 1 AS "add" \gset
+ add 
+-
+   5
+(1 row)
+
 -- set operator
 SELECT 1 AS i UNION SELECT 2 ORDER BY i;
  i 
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index a8dfc41e40..1dbfb6a52a 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql
commands included in the string to divide it into multiple
transactions.  (See 
for more details about how the server handles multi-query strings.)
-   Also, psql only prints the
-   result of the last SQL command in the string.
-   This is different from the behavior when the same string is read from
-   a file or fed to psql's standard input,
-   because then psql sends
-   each SQL command separately.
   
   
-   Because of this behavior, putting more than one SQL command in a
-   single -c string often has unexpected results.
-   It's better to use repeated -c commands or feed
-   multiple commands to psql's standard input,
+   If having several commands executed in one transaction is not desired, 
+   use repeated -c commands or feed multiple commands to
+   psql's standard input,
either using echo as illustrated above, or
via a shell here-document, for example:
 
@@ -3532,10 +3525,6 @@ select 1\; select 2\; select 3;
 commands included in the string to divide it into multiple
 transactions.  (See 
 for more details about how the server handles multi-query strings.)
-psql prints only the last query result
-it receives for each request; in this example, although all
-three SELECTs are indeed executed, psql
-only prints the 3.
 
 
   
@@ -4122,6 +4111,18 @@ bar
   
 
   
+SHOW_ALL_RESULTS
+
+
+When this variable is set to off, only the last
+result of a combined query (\;) is shown instead of
+all of them.  The default is on.  The off behavior
+is for compatibility with older versions of psql.
+
+
+  
+
+   
 SHOW_CONTEXT
 
 
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 9a00499510..54a097a493 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -33,6 +33,7 @@ static bool DescribeQuery(const char *query, double *elapsed_msec);
 static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec);
 static bool command_no_begin(const char *query);
 static bool is_select_command(const char *query);
+static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, bool is_watch);
 
 
 /*
@@ -353,7 +354,7 @@ CheckConnection(void)
  * Returns true for valid result, false for error state.
  */
 static bool
-AcceptResult(const PGresult *result)
+AcceptResult(const PGresult *result, bool show_error)
 {
 	bool		OK;
 
@@ -384,7 +385,7 @@ AcceptResult(const PGresult *result)
 break;
 		}
 
-	if (!OK)
+	if (!OK && show_error)
 	{
 		const char *error = PQerrorMessage(pset.db);
 
@@ -472,6 +473,18 @@ ClearOrSaveResult(PGresult *result)
 	}
 }
 
+/*
+ * Consume all results
+ */
+static void
+ClearOrSaveAllResults()
+{
+	PGresult	*result;
+

Re: Add client connection check during the execution of the query

On Fri, Jun 11, 2021 at 9:24 PM Thomas Munro  wrote:

> On Fri, Apr 30, 2021 at 2:23 PM Thomas Munro 
> wrote:
> > Here's something I wanted to park here to look into for the next
> > cycle:  it turns out that kqueue's EV_EOF flag also has the right
> > semantics for this.  That leads to the idea of exposing the event via
> > the WaitEventSet API, and would the bring
> > client_connection_check_interval feature to 6/10 of our OSes, up from
> > 2/10.  Maybe Windows' FD_CLOSE event could get us up to 7/10, not
> > sure.
>
> Rebased.  Added documentation tweak and a check to reject the GUC on
> unsupported OSes.
>
Hi,

-   Assert(count > 0);
+   /* For WL_SOCKET_READ -> WL_SOCKET_CLOSED, no change needed. */
+   if (count == 0)
+   return;
+
Assert(count <= 2);

It seems that the remaining Assert() should say 1 <= count && count <= 2

+#ifdef POLLRDHUP
+   if ((cur_event->events & WL_SOCKET_CLOSED) &&
+   (cur_pollfd->revents & (POLLRDHUP | errflags)))

It seems the last condition above should be written as:

((cur_pollfd->revents & POLLRDHUP) | (cur_pollfd->revents & errflags))

Cheers

Re: Hook for extensible parsing.

2021-06-12 Thread Julien Rouhaud

On Tue, Jun 08, 2021 at 12:16:48PM +0800, Julien Rouhaud wrote:
> On Sun, Jun 06, 2021 at 02:50:19PM +0800, Julien Rouhaud wrote:
> > On Sat, May 01, 2021 at 03:24:58PM +0800, Julien Rouhaud wrote:
> > > 
> > > I'm attaching some POC patches that implement this approach to start a
> > > discussion.

The regression tests weren't stable, v4 fixes that.
>From 236c61e4f26b5ef2dedb9ecb7efacb175777fba8 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Wed, 21 Apr 2021 22:47:18 +0800
Subject: [PATCH v4 1/4] Add a parser_hook hook.

This does nothing but allow third-party plugins to implement a different
syntax, and fallback on the core parser if they don't implement a superset of
the supported core syntax.
---
 src/backend/tcop/postgres.c | 16 ++--
 src/include/tcop/tcopprot.h |  5 +
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8cea10c901..e941b59b85 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -99,6 +99,9 @@ int			log_statement = LOGSTMT_NONE;
 /* GUC variable for maximum stack depth (measured in kilobytes) */
 int			max_stack_depth = 100;
 
+/* Hook for plugins to get control in pg_parse_query() */
+parser_hook_type parser_hook = NULL;
+
 /* wait N seconds to allow attach from a debugger */
 int			PostAuthDelay = 0;
 
@@ -589,18 +592,27 @@ ProcessClientWriteInterrupt(bool blocked)
  * database tables.  So, we rely on the raw parser to determine whether
  * we've seen a COMMIT or ABORT command; when we are in abort state, other
  * commands are not processed any further than the raw parse stage.
+ *
+ * To support loadable plugins that monitor the parsing or implements SQL
+ * syntactic sugar we provide a hook variable that lets a plugin get control
+ * before and after the standard parsing process.  If the plugin only implement
+ * a subset of postgres supported syntax, it's its duty to call raw_parser (or
+ * the previous hook if any) for the statements it doesn't understand.
  */
 List *
 pg_parse_query(const char *query_string)
 {
-	List	   *raw_parsetree_list;
+	List	   *raw_parsetree_list = NIL;
 
 	TRACE_POSTGRESQL_QUERY_PARSE_START(query_string);
 
 	if (log_parser_stats)
 		ResetUsage();
 
-	raw_parsetree_list = raw_parser(query_string, RAW_PARSE_DEFAULT);
+	if (parser_hook)
+		raw_parsetree_list = (*parser_hook) (query_string, RAW_PARSE_DEFAULT);
+	else
+		raw_parsetree_list = raw_parser(query_string, RAW_PARSE_DEFAULT);
 
 	if (log_parser_stats)
 		ShowUsage("PARSER STATISTICS");
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 968345404e..131dc2b22e 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -17,6 +17,7 @@
 #include "nodes/params.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
+#include "parser/parser.h"
 #include "storage/procsignal.h"
 #include "utils/guc.h"
 #include "utils/queryenvironment.h"
@@ -43,6 +44,10 @@ typedef enum
 
 extern PGDLLIMPORT int log_statement;
 
+/* Hook for plugins to get control in pg_parse_query() */
+typedef List *(*parser_hook_type) (const char *str, RawParseMode mode);
+extern PGDLLIMPORT parser_hook_type parser_hook;
+
 extern List *pg_parse_query(const char *query_string);
 extern List *pg_rewrite_query(Query *query);
 extern List *pg_analyze_and_rewrite(RawStmt *parsetree,
-- 
2.31.1

>From dcd5a2b45fcc65724ac81e78afb0611e310f15e7 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Wed, 21 Apr 2021 23:54:02 +0800
Subject: [PATCH v4 2/4] Add a sqlol parser.

This is a toy example of alternative grammar that only accept a LOLCODE
compatible version of a

SELECT [column, ] column FROM tablename

and fallback on the core parser for everything else.
---
 contrib/Makefile|   1 +
 contrib/sqlol/.gitignore|   7 +
 contrib/sqlol/Makefile  |  33 ++
 contrib/sqlol/sqlol.c   | 107 +++
 contrib/sqlol/sqlol_gram.y  | 440 ++
 contrib/sqlol/sqlol_gramparse.h |  61 
 contrib/sqlol/sqlol_keywords.c  |  98 ++
 contrib/sqlol/sqlol_keywords.h  |  38 +++
 contrib/sqlol/sqlol_kwlist.h|  21 ++
 contrib/sqlol/sqlol_scan.l  | 544 
 contrib/sqlol/sqlol_scanner.h   | 118 +++
 11 files changed, 1468 insertions(+)
 create mode 100644 contrib/sqlol/.gitignore
 create mode 100644 contrib/sqlol/Makefile
 create mode 100644 contrib/sqlol/sqlol.c
 create mode 100644 contrib/sqlol/sqlol_gram.y
 create mode 100644 contrib/sqlol/sqlol_gramparse.h
 create mode 100644 contrib/sqlol/sqlol_keywords.c
 create mode 100644 contrib/sqlol/sqlol_keywords.h
 create mode 100644 contrib/sqlol/sqlol_kwlist.h
 create mode 100644 contrib/sqlol/sqlol_scan.l
 create mode 100644 contrib/sqlol/sqlol_scanner.h

diff --git a/contrib/Makefile b/contrib/Makefile
index f27e458482..2a80cd137b 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -43,6 +43,7 @@ SUBDIRS = \
 		postgres_fdw	\
 		s

Re: Race condition in recovery?

2021-06-12 Thread Michael Paquier

On Fri, Jun 11, 2021 at 10:46:45AM -0400, Tom Lane wrote:
> I think jacana uses msys[2?], so this likely indicates a problem
> in path sanitization for the archive command.  Andrew, any advice?

Err, something around TestLib::perl2host()?
--
Michael


signature.asc
Description: PGP signature

Failure in subscription test 004_sync.pl

2021-06-12 Thread Michael Paquier

Hi all,

wrasse has just failed with what looks like a timing error with a
replication slot drop:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2021-06-12%2006%3A16%3A30

Here is the error:
error running SQL: 'psql::1: ERROR:  could not drop replication
slot "tap_sub" on publisher: ERROR:  replication slot "tap_sub" is
active for PID 1641'

It seems to me that this just lacks a poll_query_until() doing some
slot monitoring?  I have not checked in details if we need to do that
in more places.  The code path that failed has been added in 7c4f524
from 2017.

Thanks,
--
Michael


signature.asc
Description: PGP signature

Re: Questions about support function and abbreviate

2021-06-12 Thread Han Wang

Hi Darafei,

Thanks for your reply.

However, I still don't get the full picture of this.  Let me make my
question more clear.

First of all, in the *`gistproc.c
`*
of Postgres, it shows that the `abbreviate` attributes should be set before
the `abbrev_converter` defined. So I would like to know where to define a
`SortSupport` structure with `abbreviate` is `true`.

Secondly, in the support functions of internal data type `Point`, the
`abbrev_full_copmarator` just z-order hash the point first like the
`abbrev_converter` doing and then compare the hash value. So I don't know
the difference between `full_comparator` and `comparator` after
`abbrev_converter`.

Best regards,
Han

On Sat, Jun 12, 2021 at 2:55 PM Darafei "Komяpa" Praliaskouski <
m...@komzpa.net> wrote:

> Hello,
>
> the abbrev_converter is applied whenever it is defined. The values are
> sorted using the abbreviated comparator first using the shortened version,
> and if there is a tie the system asks the real full comparator to resolve
> it.
>
> This article seems to be rather comprehensive:
> https://brandur.org/sortsupport
>
> On Sat, Jun 12, 2021 at 9:51 AM Han Wang  wrote:
>
>> Hi all,
>>
>> I am trying to implement a sort support function for geometry data types
>> in PostGIS with the new feature `SortSupport`. However, I have a question
>> about this.
>>
>> I think it is hardly to apply a sort support function to a complex data
>> type without the `abbrev_converter` to simply the data structure into a
>> single `Datum`. However, I do not know how the system determines when to
>> apply the converter.
>>
>> I appreciate any answers or suggestions. I am looking forward to hearing
>> from you.
>>
>> Best regards,
>> Han
>>
>
>
> --
> Darafei "Komяpa" Praliaskouski
> OSM BY Team - http://openstreetmap.by/
>

Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command

On Sat, Jun 12, 2021 at 12:56 AM Jeff Davis  wrote:
>
> On Fri, 2021-06-11 at 15:43 +0530, Amit Kapila wrote:
> > The new patches look mostly good apart from the below cosmetic
> > issues.
> > I think the question is whether we want to do these for PG-14 or
> > postpone them till PG-15. I think these don't appear to be risky
> > changes so we can get them in PG-14 as that might help some outside
> > core solutions as appears to be the case for Jeff.
>
> My main interest here is that I'm working on replication protocol
> support in a rust library. While doing that, I've encountered a lot of
> rough edges (as you may have seen in my recent posts), and this patch
> fixes one of them.
>
> But at the same time, several small changes to the protocol spread
> across several releases introduces more opportunity for confusion.
>
> If we are confident this is the right direction, then v14 makes sense
> for consistency. But if waiting for v15 might result in a better
> change, then we should probably just get it into the July CF for v15.
>

If that is the case, I would prefer July CF v15. The patch is almost
ready, so I'll try to get it early in the July CF. Ajin, feel free to
post the patch after addressing some minor comments raised by me
yesterday.

-- 
With Regards,
Amit Kapila.

Re: logical replication of truncate command with trigger causes Assert