Re: Emit a warning if the extension's GUC is set incorrectly

2021-12-15 Thread Shinya Kato

Thank you for the review and sorry for the late reply.

On 2021-11-16 19:25, Bharath Rupireddy wrote:

> I observed an odd behaviour:
> 1) I set postgres_fdw.XXX = 'I_messed_up_conf_file' in postgresql.conf
> 2) With EmitWarningsOnPlaceholders("postgres_fdw"); in postgres_fdw
> contrib module, I created the extension, have seen the following
> warning:
> 2021-11-15 06:02:31.198 UTC [2018111] WARNING:  unrecognized
> configuration parameter "postgres_fdw.XXX"
> 3) I further did, "alter system set
> postgres_fdw.XXX='I_further_messed_up_conf_file';" and also "select
> pg_reload_conf();", it silently accepts.
>
> postgres=# create extension postgres_fdw ;
> WARNING:  unrecognized configuration parameter "postgres_fdw.XXX"
> CREATE EXTENSION
> postgres=# alter system set
> postgres_fdw.XXX='I_further_messed_up_conf_file';
> ALTER SYSTEM
> postgres=# select pg_reload_conf();
>  pg_reload_conf
> ----------------
>  t
> (1 row)


I have made changes to achieve the above.
However, it also gives me an error in the following case:
---
postgres=# SET abc.a TO on;
SET
postgres=# SET abc.b TO on;
2021-12-16 16:22:20.351 JST [2489236] ERROR:  unrecognized configuration parameter "abc.a"
2021-12-16 16:22:20.351 JST [2489236] STATEMENT:  SET abc.b TO on;
ERROR:  unrecognized configuration parameter "abc.a"
---

I know that some people do not think this is good.
So, can I leave this problem as it is, or is there a better way?




For backward compatibility you can retain the function
EmitWarningsOnPlaceholders as-is and have another similar function
that emits an error:

In guc.c, have something like below:
static void
check_conf_params(const char *className, bool emit_error)
{
   /* have the existing code of EmitWarningsOnPlaceholders here */
	ereport(emit_error ? ERROR : WARNING,
			(errcode(ERRCODE_UNDEFINED_OBJECT),
			 errmsg("unrecognized configuration parameter \"%s\"",
					var->name)));
}

void
EmitErrorOnPlaceholders(const char *className)
{
check_conf_params(className, true);
}

void
EmitWarningsOnPlaceholders(const char *className)
{
check_conf_params(className, false);
}
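
(For context, here is a self-contained sketch of the usual extension-side
pattern these functions back up; the extension name "my_ext" and its GUC are
made up for illustration and not taken from any patch in this thread:)

#include "postgres.h"
#include "fmgr.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

static bool my_ext_enabled = true;

void
_PG_init(void)
{
	/* register the GUCs the extension actually knows about */
	DefineCustomBoolVariable("my_ext.enabled",
							 "Enables my_ext.",
							 NULL,
							 &my_ext_enabled,
							 true,
							 PGC_SUSET,
							 0,
							 NULL, NULL, NULL);

	/* then complain about any leftover my_ext.* placeholders */
	EmitWarningsOnPlaceholders("my_ext");
}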



Thank you for your advice.
According to [1], the same function name was used even though the message 
level was INFO at that time.

Therefore, I think it is OK to keep using the same function name.

[1] 
https://www.postgresql.org/message-id/flat/200901051634.n05GYNr06169%40momjian.us#1d045374f014494e4b40a4862a000723


--
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c
index 5820ac328d..d11dd1e416 100644
--- a/contrib/auth_delay/auth_delay.c
+++ b/contrib/auth_delay/auth_delay.c
@@ -67,6 +67,9 @@ _PG_init(void)
 			NULL,
 			NULL,
 			NULL);
+
+	EmitWarningsOnPlaceholders("auth_delay");
+
 	/* Install Hooks */
 	original_client_auth_hook = ClientAuthentication_hook;
 	ClientAuthentication_hook = auth_delay_checks;
diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index fb38135f7a..0407c7dd64 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -100,6 +100,8 @@ _PG_init(void)
 			 NULL,
 			 NULL,
 			 NULL);
+
+	EmitWarningsOnPlaceholders("pg_trgm");
 }
 
 /*
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 48c7417e6e..36555398ec 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -469,4 +469,6 @@ _PG_init(void)
 			   NULL,
 			   NULL,
 			   NULL);
+
+	EmitWarningsOnPlaceholders("postgres_fdw");
 }
diff --git a/contrib/sepgsql/hooks.c b/contrib/sepgsql/hooks.c
index 19a3ffb7ff..44a09c35a7 100644
--- a/contrib/sepgsql/hooks.c
+++ b/contrib/sepgsql/hooks.c
@@ -455,6 +455,8 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	EmitWarningsOnPlaceholders("sepgsql");
+
 	/* Initialize userspace access vector cache */
 	sepgsql_avc_init();
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e91d5a3cfd..d136ca12ea 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -5431,11 +5431,19 @@ valid_custom_variable_name(const char *name)
 {
 	bool		saw_sep = false;
 	bool		name_start = true;
+	int 		placeholder_length = 0;
 
 	for (const char *p = name; *p; p++)
 	{
+		placeholder_length++;
 		if (*p == GUC_QUALIFIER_SEPARATOR)
 		{
+			/* Check invalid placeholder and custom parameter */
+			char *placeholder = malloc(sizeof(char) * placeholder_length);
+			strncpy(placeholder, name, placeholder_length - 1);
+			placeholder[placeholder_length - 1] = '\0';
+			EmitErrorsOnPlaceholders(placeholder);
+
 			if (name_start)
				return false;	/* empty name component */
 			saw_sep = true;
@@ -9322,6 +9330,13 @@ DefineCustomEnumVariable(const char *name,
 	define_custom_variable(&var->gen);
 }
 
+/*
+ * For historical context, this function name is EmitWarningsOnPlaceholders,

RE: parallel vacuum comments

2021-12-15 Thread houzj.f...@fujitsu.com
On Wed, Dec 15, 2021 4:03 PM Masahiko Sawada  wrote:
> On Tue, Dec 14, 2021 at 12:03 PM Amit Kapila  wrote:
> > There is still pending
> > work related to moving parallel vacuum code to a separate file and a
> > few other pending comments that are still under discussion. We can
> > take care of those in subsequent patches. Do, let me know if you or
> > others think differently?
> 
> I'm on the same page.
> 
> I've attached an updated patch. The patch incorporated several changes from
> the last version:
> 
> * Rename parallel_vacuum_begin() to parallel_vacuum_init()
> * Unify the terminology; use "index bulk-deletion" and "index cleanup"
> instead of "index vacuum" and "index cleanup".
> * Fix the comment of parallel_vacuum_init() pointed out by Andres
> * Fix a typo that is left in commit 22bd3cbe0c (pointed out by Hou)
> 
> Please review it.

Thanks for updating the patch.
Here are a few comments:

1)
+   case PARALLEL_INDVAC_STATUS_NEED_CLEANUP:
+   errcontext("while cleanup index \"%s\" of relation \"%s.%s\"",

I noticed the current code uses the error message "While cleaning up index xxx", 
which seems a little different from the patch's wording; maybe we can use the 
previous one?

2)
static inline Size max_items_to_alloc_size(int max_items);

This old function declaration can be deleted.

3)
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list

I think we need to remove LVShared, LVSharedIndStats, LVDeadItems and
LVParallelState from typedefs.list and add PVShared and PVIndStats to the file.

Best regards,
Hou zj


Re: Failed transaction statistics to measure the logical replication progress

2021-12-15 Thread Greg Nancarrow
On Tue, Dec 14, 2021 at 1:28 PM Amit Kapila  wrote:
>
> If we fear a large number of entries for such workers then won't it be
> better to show the value of these stats only for apply workers. I
> think normally the table sync workers perform only copy operation or
> maybe a fixed number of xacts, so, one might not be interested in the
> transaction stats of these workers. I find merging only specific stats
> of two different types of workers confusing.
>
> What do others think about this?
>

I think it might be OK to NOT include the transaction stats of the
tablesync workers, but my understanding (and slight concern) is that
currently there is potentially some overlap in the work done by the
tablesync and apply workers - perhaps the small patch (see [1]) proposed by
Peter Smith could also be considered, in order to make that distinction of
work clearer, and the stats more meaningful?


[1]
https://www.postgresql.org/message-id/flat/cahut+pt39pbqs0sxt9rmm89ayizoq0kw46yzskkzwk8z5ho...@mail.gmail.com


Regards,
Greg Nancarrow
Fujitsu Australia


RE: Failed transaction statistics to measure the logical replication progress

2021-12-15 Thread osumi.takami...@fujitsu.com
On Wednesday, December 15, 2021 9:52 PM vignesh C  wrote:
> On Tue, Dec 14, 2021 at 7:58 AM Amit Kapila 
> wrote:
> >
> > On Mon, Dec 13, 2021 at 5:48 PM osumi.takami...@fujitsu.com
> >  wrote:
> > >
> > > On Monday, December 13, 2021 6:19 PM Amit Kapila
>  wrote:
> > > > On Tue, Dec 7, 2021 at 3:12 PM osumi.takami...@fujitsu.com
> > > >  wrote:
> > > >
> > > > Few questions and comments:
> > > Thank you for your comments !
> > >
> > > > 
> > > > 1.
> > > > The pg_stat_subscription_workers view
> > > > will contain
> > > > one row per subscription worker on which errors have occurred, for
> workers
> > > > applying logical replication changes and workers handling the 
> > > > initial
> data
> > > > -   copy of the subscribed tables.  The statistics entry is removed
> when the
> > > > -   corresponding subscription is dropped.
> > > > +   copy of the subscribed tables. Also, the row corresponding to the
> apply
> > > > +   worker shows all transaction statistics of both types of workers on
> the
> > > > +   subscription. The statistics entry is removed when the
> corresponding
> > > > +   subscription is dropped.
> > > >
> > > > Why did you choose to show stats for both types of workers in one row?
> > > This is because if we have hundreds or thousands of tables for table
> > > sync, we need to create many entries to cover them and store the entries 
> > > for
> all tables.
> > >
> >
> > If we fear a large number of entries for such workers then won't it be
> > better to show the value of these stats only for apply workers. I
> > think normally the table sync workers perform only copy operation or
> > maybe a fixed number of xacts, so, one might not be interested in the
> > transaction stats of these workers. I find merging only specific stats
> > of two different types of workers confusing.
> >
> > What do others think about this?
> 
> We can remove the table sync workers transaction stats count to avoid
> confusion, take care of the documentation changes too accordingly.
Hi, apologies for my late reply.

Thank you, everyone for confirming the direction.
I'll follow the consensus of the community
and fix the patch, including other comments.
I'll treat only the stats for apply workers.


Best Regards,
Takamichi Osumi




Addition of --no-sync to pg_upgrade for test speedup

2021-12-15 Thread Michael Paquier
Hi all,

As per $subject, avoiding the flush of the new cluster's data
directory shortens the runtime of the test a bit.  In some of my slow
VMs, aka Windows, this shaves a couple of seconds even if the bulk of
the time is still spent on the main regression test suite.

In pg_upgrade, we let the flush happen with initdb --sync-only, based
on the binary path of the new cluster, so I think that we are not
going to miss any test coverage by skipping that.

Thoughts or opinions?
--
Michael
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 64bbda5650..5c6b6d2411 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -43,6 +43,7 @@ parseCommandLine(int argc, char *argv[])
 		{"new-datadir", required_argument, NULL, 'D'},
 		{"old-bindir", required_argument, NULL, 'b'},
 		{"new-bindir", required_argument, NULL, 'B'},
+		{"no-sync", no_argument, NULL, 'N'},
 		{"old-options", required_argument, NULL, 'o'},
 		{"new-options", required_argument, NULL, 'O'},
 		{"old-port", required_argument, NULL, 'p'},
@@ -67,6 +68,7 @@ parseCommandLine(int argc, char *argv[])
 	time_t		run_time = time(NULL);
 
 	user_opts.transfer_mode = TRANSFER_MODE_COPY;
+	user_opts.do_sync = true;
 
 	os_info.progname = get_progname(argv[0]);
 
@@ -101,7 +103,7 @@ parseCommandLine(int argc, char *argv[])
 	if (os_user_effective_id == 0)
 		pg_fatal("%s: cannot be run as root\n", os_info.progname);
 
-	while ((option = getopt_long(argc, argv, "d:D:b:B:cj:ko:O:p:P:rs:U:v",
+	while ((option = getopt_long(argc, argv, "d:D:b:B:cj:kNo:O:p:P:rs:U:v",
 								 long_options, &optindex)) != -1)
 	{
 		switch (option)
@@ -134,6 +136,10 @@ parseCommandLine(int argc, char *argv[])
 user_opts.transfer_mode = TRANSFER_MODE_LINK;
 break;
 
+			case 'N':
+user_opts.do_sync = false;
+break;
+
 			case 'o':
 /* append option? */
 if (!old_cluster.pgopts)
@@ -297,6 +303,7 @@ usage(void)
 	printf(_("  -D, --new-datadir=DATADIR new cluster data directory\n"));
 	printf(_("  -j, --jobs=NUMnumber of simultaneous processes or threads to use\n"));
 	printf(_("  -k, --linklink instead of copying files to new cluster\n"));
+	printf(_("  -N, --no-sync do not wait for changes to be written safely to disk\n"));
 	printf(_("  -o, --old-options=OPTIONS old cluster options to pass to the server\n"));
 	printf(_("  -O, --new-options=OPTIONS new cluster options to pass to the server\n"));
 	printf(_("  -p, --old-port=PORT   old cluster port number (default %d)\n"), old_cluster.port);
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 3628bd74a7..f85cb2e262 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -169,11 +169,14 @@ main(int argc, char **argv)
 			  new_cluster.pgdata);
 	check_ok();
 
-	prep_status("Sync data directory to disk");
-	exec_prog(UTILITY_LOG_FILE, NULL, true, true,
-			  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
-			  new_cluster.pgdata);
-	check_ok();
+	if (user_opts.do_sync)
+	{
+		prep_status("Sync data directory to disk");
+		exec_prog(UTILITY_LOG_FILE, NULL, true, true,
+  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
+  new_cluster.pgdata);
+		check_ok();
+	}
 
 	create_script_for_old_cluster_deletion(&deletion_script_file_name);
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 235a770026..22169f1002 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -279,6 +279,7 @@ typedef struct
 {
 	bool		check;			/* true -> ask user for permission to make
  * changes */
+	bool		do_sync;		/* flush changes to disk */
 	transferMode transfer_mode; /* copy files or link them? */
 	int			jobs;			/* number of processes/threads to use */
 	char	   *socketdir;		/* directory to use for Unix sockets */
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index 32d186d897..d6a318367a 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -233,7 +233,7 @@ PGDATA="$BASE_PGDATA"
 
 standard_initdb 'initdb'
 
-pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT"
+pg_upgrade $PG_UPGRADE_OPTS --no-sync -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT"
 
 # make sure all directories and files have group permissions, on Unix hosts
 # Windows hosts don't support Unix-y permissions.
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 1289123129..c5ce732ee9 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -130,6 +130,22 @@ PostgreSQL documentation
   cluster
  
 
+ 
+  -N
+  --no-sync
+  
+   
+By default, pg_upgrade will wait for all files
+of the upgraded cluster to be written safely to disk.  This option
+causes pg_upgrade to return without 

Re: Allow escape in application_name

2021-12-15 Thread Fujii Masao




On 2021/12/16 11:53, kuroda.hay...@fujitsu.com wrote:

Dear Fujii-san,

Thank you for updating! I read your patches and I have
only one comment.


if (strcmp(keywords[i], "application_name") == 0 &&
values[i] != NULL && *(values[i]) != '\0')


I'm not sure but do we have a case that values[i] becomes NULL
even if keywords[i] is "application_name"?


Not for now, I guess. But isn't it safer to check that, too? I also could not 
find a strong reason why that check should be dropped. Would you prefer to 
drop it?


I think other parts are perfect.


Thanks for the review! I have pushed the 0001 patch first.

BTW, the 0002 patch adds a regression test that checks 
pg_stat_activity.application_name. But three months ago, we added a 
similar test when introducing the postgres_fdw.application_name GUC and 
reverted/removed it because it was not stable [1]. So we should review carefully 
whether the test that the 0002 patch adds may have the same issue. As far as I 
read the patch, ISTM that it does not have the same issue. But could you double 
check that?

[1]
https://postgr.es/m/848ff477-effd-42b9-8b25-3f7cfe289...@oss.nttdata.com

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Documenting when to retry on serialization failure

2021-12-15 Thread Greg Stark
Fwiw I think the real problem with automatic retries is that the SQL
interface doesn't lend itself to it because the server never really
knows if the command is going to be followed by a commit or more
commands.

I actually think if that problem were tackled it would very likely be
a highly appreciated option. Because I think there's a big overlap
between the set of users interested in higher isolation levels and the
set of users writing stored procedures defining their business logic.
They're both kind of "traditional" SQL engine approaches and both lend
themselves to the environment where you have a lot of programmers
working on a system and you're not able to do things like define
strict locking and update orderings.

So a lot of users are probably looking at something like "BEGIN;
SELECT create_customer_order(); COMMIT" and wondering why the
server can't automatically retry the transaction if they get an
isolation failure.
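
As a rough illustration of what such users end up writing on the client side
today, here is a minimal retry loop in libpq (a sketch only; the SQL string,
function name and retry limit are made up):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

/* Retry a whole transaction on serialization failure (SQLSTATE 40001). */
static int
run_with_retry(PGconn *conn, const char *sql, int max_retries)
{
	for (int attempt = 0; attempt <= max_retries; attempt++)
	{
		PGresult   *res = PQexec(conn, sql);
		ExecStatusType st = PQresultStatus(res);
		const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);
		int			retryable = (sqlstate != NULL &&
								 strcmp(sqlstate, "40001") == 0);

		PQclear(res);
		if (st == PGRES_COMMAND_OK || st == PGRES_TUPLES_OK)
			return 0;			/* transaction committed */

		/* clean up a possibly aborted transaction block before retrying */
		PQclear(PQexec(conn, "ROLLBACK"));
		if (!retryable)
			return -1;			/* not a serialization failure: give up */
	}
	return -1;					/* retries exhausted */
}

/*
 * Usage, mirroring the example above:
 *	run_with_retry(conn, "BEGIN ISOLATION LEVEL SERIALIZABLE; "
 *				   "SELECT create_customer_order(); COMMIT", 3);
 */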

There are actually other reasons why providing the whole logic for the
transaction up front with a promise that it'll be the whole
transaction is attractive. E.g. vacuum could ignore a transaction if
it knows the transaction will never look at the table it's
processing... Or automatic deadlock testing tools could extract the
list of tables being accessed and suggest "lock table" commands to put
at the head of the transaction sorted in a canonical order.

These things may not be easy but they're currently impossible for the
same reasons automatically retrying is. The executor doesn't know what
subsequent commands will be coming after the current one and doesn't
know whether it has the whole transaction.




Re: [PATCH] Accept IP addresses in server certificate SANs

2021-12-15 Thread Kyotaro Horiguchi
At Thu, 16 Dec 2021 01:13:57 +, Jacob Champion  wrote 
in 
> This patch arose because I was writing tests for the NSS implementation
> that used a server cert with both DNS names and IP addresses, and then
> they failed when I ran those tests against the OpenSSL implementation.
> NSS supports this functionality natively. Anecdotally, I've heard from
> at least one client group who is utilizing IP-based certificates in
> their cloud deployments. It seems uncommon but still useful.
> 
> There are two open questions I have; they're based on NSS
> implementation details that I did not port here:
> 
> - NSS allows an IPv4 SAN to match an IPv6 mapping of that same address,
>   and vice-versa. I chose not to implement that behavior, figuring it
>   is easy enough for people to issue a certificate with both addresses.
>   Is that okay?

> - If a certificate contains only iPAddress SANs, and none of them
>   match, I fall back to check the certificate Common Name. OpenSSL will
>   not do this (its X509_check_ip considers only SANs). NSS will only do
>   this if the client's address is itself a DNS name. The spec says that
>   we can't fall back to Common Name if the SANs contain any DNS
>   entries, but it's silent on the subject of IP addresses. What should
>   the behavior be?
> 
> The patchset roadmap:
> 
> - 0001 moves inet_net_pton() to src/port, since libpq will need it.
> - 0002 implements the new functionality and adds tests.
> 
> WDYT?

In RFC 2818 and 6125,

>   In some cases, the URI is specified as an IP address rather than a
>   hostname.  In this case, the iPAddress subjectAltName must be present
>   in the certificate and must exactly match the IP in the URI.

It seems to say that we must search for iPAddress, and must not use the
CN nor dNSName, if the client connected using an IP address. Otherwise, if
the host name is a domain name, we use only dNSName if present, and
use the CN otherwise.  That behavior seems to agree with what you wrote as
NSS's behavior.  That being said, it seems to me we should preserve
the existing behavior at least for OpenSSL, as it is an established behavior.

In short, I think the current behavior of the patch is the direction
we should go, but some documentation may be needed.

I'm not sure about IPv4-compatible addresses.  However, I think we can
identify an IPv4-compatible address easily.

+	 * pg_inet_net_pton() will accept CIDR masks, which we don't want to
+	 * match, so skip the comparison if the host string contains a slash.
+	 */
+	if (!strchr(host, '/')
+		&& pg_inet_net_pton(PGSQL_AF_INET6, host, addr, -1) == 128)

If a CIDR is given, pg_inet_net_pton returns a number less than 128, so
we don't need to check for '/' explicitly?  (I'm not sure '/128' is
sensible, but it doesn't harm..)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Skipping logical replication transactions on subscriber side

2021-12-15 Thread Masahiko Sawada
On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila  wrote:
>
> On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada  
> wrote:
> >
> > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila  
> > wrote:
> > >
> > > I thought we just want to lock before clearing the skip_xid something
> > > like take the lock, check if the skip_xid in the catalog is the same
> > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > it as it is. How will that disallow users to change skip_xid when we
> > > are skipping changes?
> >
> > Oh I thought we wanted to keep holding the lock while skipping changes
> > (changing skip_xid requires acquiring the lock).
> >
> > So if skip_xid is already changed, the apply worker would do
> > replorigin_advance() with WAL logging, instead of committing the
> > catalog change?
> >
>
> Right. BTW, how are you planning to advance the origin? Normally, a
> commit transaction would do it but when we are skipping all changes,
> the commit might not do it as there won't be any transaction id
> assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.
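
For reference, the call under discussion would look roughly like this (a
sketch only; originid, skipped_end_lsn and local_lsn stand in for whatever the
apply worker actually tracks):

#include "replication/origin.h"

	/*
	 * Advance the worker's origin past the skipped transaction, WAL-logging
	 * the advance so it survives a crash even though no catalog change is
	 * committed.
	 */
	replorigin_advance(originid, skipped_end_lsn, local_lsn,
					   false,	/* go_backward */
					   true);	/* wal_log */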

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Optionally automatically disable logical replication subscriptions on error

2021-12-15 Thread Greg Nancarrow
On Tue, Dec 14, 2021 at 4:34 PM osumi.takami...@fujitsu.com
 wrote:
>
> Besides all of those changes, I've removed the obsolete
> comment of DisableSubscriptionOnError in v12.
>

I have a few minor comments, otherwise the patch LGTM at this point:

doc/src/sgml/catalogs.sgml
(1)
Current comment says:

+   If true, the subscription will be disabled when subscription's
+   worker detects any errors

However, in create_subscription.sgml, it says "disabled if any errors
are detected by subscription workers ..."

For consistency, I think it should be:

+   If true, the subscription will be disabled when subscription
+   workers detect any errors

src/bin/psql/describe.c
(2)
I think that:

+  gettext_noop("Disable On Error"));

should be:

+  gettext_noop("Disable on error"));

for consistency with the uppercase/lowercase usage on other similar entries?
(e.g. "Two phase commit")


src/include/catalog/pg_subscription.h
(3)

+  bool subdisableonerr; /* True if apply errors should disable the
+   * subscription upon error */

The comment should just say "True if occurrence of apply errors should
disable the subscription"


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Skipping logical replication transactions on subscriber side

2021-12-15 Thread Amit Kapila
On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada  wrote:
>
> On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila  wrote:
> >
> > I thought we just want to lock before clearing the skip_xid something
> > like take the lock, check if the skip_xid in the catalog is the same
> > as we have skipped, if it is the same then clear it, otherwise, leave
> > it as it is. How will that disallow users to change skip_xid when we
> > are skipping changes?
>
> Oh I thought we wanted to keep holding the lock while skipping changes
> (changing skip_xid requires acquiring the lock).
>
> So if skip_xid is already changed, the apply worker would do
> replorigin_advance() with WAL logging, instead of committing the
> catalog change?
>

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.

-- 
With Regards,
Amit Kapila.




Re: Skipping logical replication transactions on subscriber side

2021-12-15 Thread Masahiko Sawada
On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila  wrote:
>
> On Wed, Dec 15, 2021 at 8:19 PM Masahiko Sawada  wrote:
> >
> > On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila  wrote:
> > >
> > > On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow  
> > > > wrote:
> > > > >
> > > > > On Tue, Dec 14, 2021 at 3:23 PM vignesh C  wrote:
> > > > > >
> > > > > > While the worker is skipping one of the skip transactions specified 
> > > > > > by
> > > > > > the user and immediately if the user specifies another skip
> > > > > > transaction while the skipping of the transaction is in progress 
> > > > > > this
> > > > > > new value will be reset by the worker while clearing the skip xid. I
> > > > > > felt once the worker has identified the skip xid and is about to 
> > > > > > skip
> > > > > > the xid, the worker can acquire a lock to prevent concurrency 
> > > > > > issues:
> > > > >
> > > > > That's a good point.
> > > > > If only the last_error_xid could be skipped, then this wouldn't be an
> > > > > issue, right?
> > > > > If a different xid to skip is specified while the worker is currently
> > > > > skipping a transaction, should that even be allowed?
> > > > >
> > > >
> > > > We don't expect such usage but yes, it could happen and seems not
> > > > good. I thought we can acquire Share lock on pg_subscription during
> > > > the skip but not sure it's a good idea. It would be better if we can
> > > > find a way to allow users to specify only XID that has failed.
> > > >
> > >
> > > Yeah, but as we don't have a definite way to allow specifying only
> > > failed XID, I think it is better to use share lock on that particular
> > > subscription. We are already using it for add/update rel state (see,
> > > AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
> > > another place to use a similar technique.
> >
> > Yes, but it seems to mean that we disallow users to change skip_xid
> > while the apply worker is skipping changes so we will end up having
> > the same problem we discussed so far;
> >
>
> I thought we just want to lock before clearing the skip_xid something
> like take the lock, check if the skip_xid in the catalog is the same
> as we have skipped, if it is the same then clear it, otherwise, leave
> it as it is. How will that disallow users to change skip_xid when we
> are skipping changes?

Oh I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid is already changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: pg_dump versus ancient server versions

2021-12-15 Thread Tom Lane
Justin Pryzby  writes:
> I think you missed a few parts though ?

Um.  I think those are leftover from when I was intending the
cutoff to be 9.0 not 9.2.  I'll take a fresh look tomorrow.

regards, tom lane




Re: pg_dump versus ancient server versions

2021-12-15 Thread Justin Pryzby
On Wed, Dec 15, 2021 at 10:08:07PM -0600, Justin Pryzby wrote:
> Is it possible to clean up pg_upgrade, too ?

Nevermind - I found yesterday's e469f0aaf3 after git-fetch.

I think you missed a few parts though ?

src/bin/pg_upgrade/function.c
if (GET_MAJOR_VERSION(old_cluster.major_version) <= 900)
...
		if (GET_MAJOR_VERSION(old_cluster.major_version) <= 900 &&
			strcmp(lib, "$libdir/plpython") == 0)

src/bin/pg_upgrade/option.c
				 * Someday, the port number option could be removed and passed
				 * using -o/-O, but that requires postmaster -C to be
				 * supported on all old/new versions (added in PG 9.2).
...
if (GET_MAJOR_VERSION(cluster->major_version) >= 901)




Re: parallel vacuum comments

2021-12-15 Thread Amit Kapila
On Wed, Dec 15, 2021 at 1:33 PM Masahiko Sawada  wrote:
>
> I've attached an updated patch. The patch incorporated several changes
> from the last version:
>
> * Rename parallel_vacuum_begin() to parallel_vacuum_init()
> * Unify the terminology; use "index bulk-deletion" and "index cleanup"
> instead of "index vacuum" and "index cleanup".
>

I am not sure it is a good idea to do this as part of the main patch
as the intention of that is to just refactor parallel vacuum code. I
suggest doing this as a separate patch. Also, can we move the common
code to be shared between vacuumparallel.c and vacuumlazy.c as a
separate patch?

Few other comments and questions:

1. /* Outsource everything to parallel variant */
- parallel_vacuum_process_all_indexes(vacrel, true);
+ LVSavedErrInfo saved_err_info;
+
+ /*
+ * Outsource everything to parallel variant. Since parallel vacuum will
+ * set the error context on an error we temporarily disable setting our
+ * error context.
+ */
+ update_vacuum_error_info(vacrel, &saved_err_info,
+ VACUUM_ERRCB_PHASE_UNKNOWN,
+ InvalidBlockNumber, InvalidOffsetNumber);
+
+ parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, vacrel->old_live_tuples);
+
+ /* Revert to the previous phase information for error traceback */
+ restore_vacuum_error_info(vacrel, &saved_err_info);

Is this change because you want a separate error callback for parallel
vacuum? If so, I suggest we can discuss this as a separate patch from
the refactoring patch.

2. Is introducing bulkdel_one_index/cleanup_one_index related to new
error context, or "Unify the terminology" task? Is there any other
reason for the same?

3. Why did you introduce
parallel_vacuum_bulkdel_all_indexes()/parallel_vacuum_cleanup_all_indexes()?
Is it because of your task "Unify the terminology"?

4.
@@ -3086,7 +2592,6 @@ lazy_cleanup_one_index(Relation indrel,
IndexBulkDeleteResult *istat,
  ivinfo.report_progress = false;
  ivinfo.estimated_count = estimated_count;
  ivinfo.message_level = elevel;
-
  ivinfo.num_heap_tuples = reltuples;

This seems like an unrelated change.

-- 
With Regards,
Amit Kapila.




Re: pg_dump versus ancient server versions

2021-12-15 Thread Tom Lane
Justin Pryzby  writes:
> On Tue, Dec 14, 2021 at 05:18:44PM -0500, Tom Lane wrote:
>> I've completed the pg_dump/pg_dumpall part of that, but while

> Is it possible to clean up pg_upgrade, too ?

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e469f0aaf3c586c8390bd65923f97d4b1683cd9f

I'm still working on psql.

regards, tom lane




Re: Transparent column encryption

2021-12-15 Thread Greg Stark
> In the server, the encrypted datums are stored in types called
> encryptedr and encryptedd (for randomized and deterministic
> encryption).  These are essentially cousins of bytea.

Does that mean someone could go in with psql and select out the data
without any keys and just get a raw bytea-like representation? That
seems like a natural and useful thing to be able to do. For example to
allow dumping a table and loading it elsewhere and transferring keys
through some other channel (perhaps only as needed).




Re: [PATCH] Document heuristics for parameterized paths

2021-12-15 Thread Greg Stark
On Wed, 15 Dec 2021 at 23:22, Greg Stark  wrote:
>
> On Mon, 6 Dec 2021 at 13:01, Steinar H. Gunderson
>  wrote:
> >
> > +one that must cannot be delayed right away (because of outer join
>
> must cannot?

Actually on further reading... "delayed right away"?

-- 
greg




Re: [PATCH] Document heuristics for parameterized paths

2021-12-15 Thread Greg Stark
On Mon, 6 Dec 2021 at 13:01, Steinar H. Gunderson
 wrote:
>
> +one that must cannot be delayed right away (because of outer join

must cannot?

-- 
greg




Re: pg_dump versus ancient server versions

2021-12-15 Thread Justin Pryzby
On Tue, Dec 14, 2021 at 05:18:44PM -0500, Tom Lane wrote:
> I wrote:
> > Anyway, it seems like there's some consensus that 9.2 is a good
> > stopping place for today.  I'll push forward with
> > (1) back-patching as necessary to make 9.2 and up build cleanly
> > on the platforms I have handy;
> > (2) ripping out pg_dump's support for pre-9.2 servers;
> > (3) ripping out psql's support for pre-9.2 servers.
> 
> I've completed the pg_dump/pg_dumpall part of that, but while

Is it possible to clean up pg_upgrade, too ?

-- 
Justin




Re: COPY IN/BOTH vs. extended query mode

2021-12-15 Thread Jeff Davis
On Mon, 2017-01-23 at 21:12 -0500, Robert Haas wrote:
> According to the documentation for COPY IN mode, "If the COPY command
> was issued via an extended-query message, the backend will now
> discard
> frontend messages until a Sync message is received, then it will
> issue
> ReadyForQuery and return to normal processing."  I added a similar
> note to the documentation for COPY BOTH mode in
> 91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
> accurately describes the behavior of the server.  However, this seems
> to make fully correct error handling for clients using libpq almost
> impossible, because PQsendQueryGuts() sends
> Parse-Bind-Describe-Execute-Sync in one shot without regard to
> whether
> the command that was just sent invoked COPY mode (cf. the note in
> CopyGetData about why we ignore Flush and Sync in that function).
> 
> So imagine that the client uses libpq to send (via the extended query
> protocol) a COPY IN command (or some hypothetical command that starts
> COPY BOTH mode to begin).  If the server throws an error before the
> Sync message is consumed, it will bounce back to PostgresMain which
> will set doing_extended_query_message = true after which it will
> consume messages, find the Sync, reset that flag, and send
> ReadyForQuery.  On the other hand, if the server enters CopyBoth
> mode,
> consumes the Sync message in CopyGetData (or a similar function), and
> *then* throws an ERROR, the server will wait for a second Sync
> message
> from the client before issuing ReadyForQuery.  There is no sensible
> way of coping with this problem in libpq, because there is no way for
> the client to know which part of the server code consumed the Sync
> message that it already sent.  In short, from the client's point of
> view, if it enters COPY IN or COPY BOTH mode via the extend query
> protocol, and an error occurs on the server, the server MAY OR MAY
> NOT
> expect a further Sync message before issuing ReadyForQuery, and the
> client has no way of knowing -- except maybe waiting for a while to
> see what happens.

I investigated a bit deeper here, and I'm not sure this is a real
problem (aside from ambiguity in the protocol docs).

If you send "COPY ... FROM STDIN" using the extended query protocol in
libpq, the non-error message flow is something like:

  -> Parse + Bind + Describe + Execute + Sync
  [ server processes Parse + Bind + Describe + Execute ]
  [ server enters copy-in mode ]
  <- CopyInResponse
  [ server ignores Sync ]
  -> CopyData
  [ server processes CopyData ]
  -> CopyDone
  [ server processes CopyDone ]
  [ server exits copy-in mode ]
  -> Sync
  [ server processes Sync ]
  <- ReadyForQuery

If an error happens before the server enters copy-in mode (e.g. syntax
error), then you get something like:

  -> Parse + Bind + Describe + Execute + Sync
  [ server processes Parse, encounters error ]
  <- ErrorResponse
  [ server ignores Bind + Describe + Execute ]
  [ server processes Sync ]
  <- ReadyForQuery
  [ client never got CopyInResponse, so never sent copy messages ]

If an error happens after the CopyInResponse is sent (e.g. malformed
data), you get something like:

  -> Parse + Bind + Describe + Execute + Sync
  [ server processes Bind + Describe + Execute ]
  [ server enters copy-in mode ]
  <- CopyInResponse
  [ server ignores Sync ]
  -> CopyData
  [ server processes CopyData, encounters error ]
  [ server exits copy-in mode ]
  <- ErrorResponse
  -> CopyDone
  [ server ignores CopyDone ]
  -> Sync
  [ server processes Sync ]
  <- ReadyForQuery

If the backend is canceled after the server sends CopyInResponse but
before it consumes (and ignores) the Sync, you get something like:

  -> Parse + Bind + Describe + Execute + Sync
  [ server processes Bind + Describe + Execute ]
  [ server enters copy-in mode ]
  <- CopyInResponse
  [ SIGINT, server encounters error ]
  <- ErrorResponse
  [ server exits copy-in mode ]
  [ server processes Sync ]
  <- ReadyForQuery
  -> CopyData
  [ server ignores CopyData ]
  -> CopyDone
  [ server ignores CopyDone ]
  -> Sync
  [ server processes Sync ]
  <- ReadyForQuery

The last case is not great, because I could imagine it confusing a
client, though I'm not sure exactly how; maybe it's something we
can document around?
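
For what it's worth, the non-error flow above maps onto libpq calls roughly as
follows (a self-contained sketch; the table name "t", the row contents and the
connection string are made up):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;
	const char *row = "1\tfoo\n";

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "%s", PQerrorMessage(conn));
		return 1;
	}

	/* PQsendQueryParams uses Parse/Bind/Describe/Execute/Sync */
	if (!PQsendQueryParams(conn, "COPY t FROM STDIN", 0,
						   NULL, NULL, NULL, NULL, 0))
		fprintf(stderr, "%s", PQerrorMessage(conn));

	res = PQgetResult(conn);
	if (PQresultStatus(res) == PGRES_COPY_IN)
	{
		/* server sent CopyInResponse and ignored the first Sync */
		PQputCopyData(conn, row, (int) strlen(row));	/* CopyData */
		PQputCopyEnd(conn, NULL);						/* CopyDone, then Sync */
		PQclear(res);
		res = PQgetResult(conn);						/* final COPY status */
	}
	if (PQresultStatus(res) != PGRES_COMMAND_OK)
		fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
	PQclear(res);

	/* drain remaining results until libpq has seen ReadyForQuery */
	while ((res = PQgetResult(conn)) != NULL)
		PQclear(res);

	PQfinish(conn);
	return 0;
}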

Regards,
Jeff Davis








Re: A micro-optimisation for ProcSendSignal()

2021-12-15 Thread Thomas Munro
On Tue, Aug 3, 2021 at 2:56 PM Andres Freund  wrote:
> On 2021-08-03 13:44:58 +1200, Thomas Munro wrote:
> > In the case of buffer pin waits, we switch to storing the pgprocno of
> > the waiter.  In the case of SERIALIZABLE READ ONLY DEFERRABLE waits, we
> > derive the pgprocno from the vxid (though that's not yet as efficient as
> > it could be).
>
> That's kind of an understatement :)

I abandoned the vxid part for now and went back to v3.  If/when
BackendId is replaced with or becomes synonymous with pgprocno, we can
make this change and drop the pgprocno member from SERIALIZABLEXACT.

> > + SetLatch(&ProcGlobal->allProcs[pgprocno].procLatch);

> I think some validation of the pgprocno here would be a good idea. I'm worried
> that something could cause e.g. INVALID_PGPROCNO to be passed in, and suddenly
> we're modifying random memory. That could end up being a pretty hard bug to
> catch, because we might not even notice that the right latch isn't set...

Added.
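
(For illustration, the added check might look roughly like the following; this
is a sketch, not necessarily the exact code that was committed.)

void
ProcSendSignal(int pgprocno)
{
	/* guard against garbage such as INVALID_PGPROCNO before touching shared memory */
	Assert(pgprocno >= 0 && pgprocno < (int) ProcGlobal->allProcCount);

	SetLatch(&ProcGlobal->allProcs[pgprocno].procLatch);
}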

> >  /* Accessor for PGPROC given a pgprocno. */
> >  #define GetPGProcByNumber(n) (&ProcGlobal->allProcs[(n)])
> > +/* Accessor for pgprocno given a pointer to PGPROC. */
> > +#define GetPGProcNumber(proc) ((proc) - ProcGlobal->allProcs)
>
> I'm not sure this is a good idea. There's no really cheap way for the compiler
> to compute this, because sizeof(PGPROC) isn't a power of two. Given that
> PGPROC is ~880 bytes, I don't see all that much gain in getting rid of
> ->pgprocno.

Yeah, that would need some examination; 0002 patch abandoned for now.

Pushed.




RE: Allow escape in application_name

2021-12-15 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

Thank you for updating! I read your patches and I have
only one comment.

>   if (strcmp(keywords[i], "application_name") == 0 &&
>   values[i] != NULL && *(values[i]) != '\0')

I'm not sure but do we have a case that values[i] becomes NULL
even if keywords[i] is "application_name"?

I think other parts are perfect.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2021-12-15 Thread Michael Paquier
On Wed, Dec 15, 2021 at 10:47:24AM +0900, Michael Paquier wrote:
> Missed that, thanks!  I'll think about all that a bit more before
> sending a long-overdue rebased version.

Okay, here is finally a rebase of this patch, where I have fixed a
couple of existing issues, and I have extended the patch to the point
where the support range is what I expect should be.  In short:
- Added support for MSVC for the TAP test.  I have considered making
upgradecheck silent, but after thinking about it I have just filtered
pg_upgrade from bincheck, and simplified upgradecheck to launch the
new test.  It is simple to switch from one approach to another.  This
shaves some code in vcregress.pl.
- Fixed a set of issues with various chdir commands in the previous
patches.  The command of pg_regress has been tweaked so as all results
are part of src/bin/pg_upgrade/.  Any logs generated by pg_upgrade
stay in this location, same way as HEAD.
- Adapted to the new modules of src/test/perl/.
- Support for cross-upgrades now uses upgrade_adapt.sql (I have sent a
patch for the buildfarm client about that yesterday actually), same
way as test.sh on HEAD.  Like HEAD, attempting to use the
cross-version HEAD causes diffs between the old and the new dumps.
But there is nothing new here.  This could be improved more but the
attached does already a lot. 
- Like the previous versions, this supports two modes when setting up
the to-be-upgraded cluster: setup things from an old dump or use
pg_regress.  The buildfarm does the former for upgrades down to 9.2.
The core code does the latter.

I may have missed a thing or two, but I think that's pretty much
what we should be looking for to do the switch to TAP in terms of
coverage.
--
Michael
From 47fcfac534f947595a880eacdbcfe4f8bd6af516 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 16 Dec 2021 11:50:54 +0900
Subject: [PATCH v4] Switch tests of pg_upgrade to use TAP

---
 src/bin/pg_upgrade/.gitignore|   5 +
 src/bin/pg_upgrade/Makefile  |  23 +-
 src/bin/pg_upgrade/TESTING   |  33 ++-
 src/bin/pg_upgrade/t/001_basic.pl|   9 +
 src/bin/pg_upgrade/t/002_pg_upgrade.pl   | 275 ++
 src/bin/pg_upgrade/test.sh   | 279 ---
 src/test/perl/PostgreSQL/Test/Cluster.pm |  25 ++
 src/tools/msvc/vcregress.pl  |  91 +---
 8 files changed, 355 insertions(+), 385 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/001_basic.pl
 create mode 100644 src/bin/pg_upgrade/t/002_pg_upgrade.pl
 delete mode 100644 src/bin/pg_upgrade/test.sh

diff --git a/src/bin/pg_upgrade/.gitignore b/src/bin/pg_upgrade/.gitignore
index 2d3bfeaa50..3b64522ab6 100644
--- a/src/bin/pg_upgrade/.gitignore
+++ b/src/bin/pg_upgrade/.gitignore
@@ -7,3 +7,8 @@
 /loadable_libraries.txt
 /log/
 /tmp_check/
+
+# Generated by pg_regress
+/sql/
+/expected/
+/results/
diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 44d06be5a6..35b6c123a5 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -28,6 +28,12 @@ OBJS = \
 override CPPFLAGS := -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
 
+# required for 002_pg_upgrade.pl
+REGRESS_SHLIB=$(abs_top_builddir)/src/test/regress/regress$(DLSUFFIX)
+export REGRESS_SHLIB
+REGRESS_OUTPUTDIR=$(abs_top_builddir)/src/bin/pg_upgrade
+export REGRESS_OUTPUTDIR
+
 all: pg_upgrade
 
 pg_upgrade: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
@@ -49,17 +55,8 @@ clean distclean maintainer-clean:
 	   pg_upgrade_dump_globals.sql \
 	   pg_upgrade_dump_*.custom pg_upgrade_*.log
 
-# When $(MAKE) is present, make automatically infers that this is a
-# recursive make. which is not actually what we want here, as that
-# e.g. prevents output synchronization from working (as make thinks
-# that the subsidiary make knows how to deal with that itself, but
-# we're invoking a shell script that doesn't know). Referencing
-# $(MAKE) indirectly avoids that behaviour.
-# See https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html#MAKE-Variable
-NOTSUBMAKEMAKE=$(MAKE)
+check:
+	$(prove_check)
 
-check: test.sh all temp-install
-	MAKE=$(NOTSUBMAKEMAKE) $(with_temp_install) bindir=$(abs_top_builddir)/tmp_install/$(bindir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $<
-
-# installcheck is not supported because there's no meaningful way to test
-# pg_upgrade against a single already-running server
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING
index e69874b42d..acf769386a 100644
--- a/src/bin/pg_upgrade/TESTING
+++ b/src/bin/pg_upgrade/TESTING
@@ -2,21 +2,30 @@ THE SHORT VERSION
 -
 
 On non-Windows machines, you can execute the testing process
-described below by running
+described below by running the following command in this directory:
 	

Apple's ranlib warns about protocol_openssl.c

2021-12-15 Thread Thomas Munro
Hi,

Apple's ranlib doesn't like empty translation units[1], but
protocol_openssl.c doesn't define any symbols (unless you have an
ancient EOL'd openssl), so longfin and CI say:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
file: libpgcommon.a(protocol_openssl.o) has no symbols

I guess we still can't switch to (Apple) libtool.  Maybe configure
should be doing a test and adding it to LIBOBJS or a similar variable
only if necessary, or something like that?

[1] https://www.postgresql.org/message-id/flat/28521.1426352337%40sss.pgh.pa.us




Re: Skipping logical replication transactions on subscriber side

2021-12-15 Thread Amit Kapila
On Wed, Dec 15, 2021 at 8:19 PM Masahiko Sawada  wrote:
>
> On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila  wrote:
> >
> > On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow  
> > > wrote:
> > > >
> > > > On Tue, Dec 14, 2021 at 3:23 PM vignesh C  wrote:
> > > > >
> > > > > While the worker is skipping one of the skip transactions specified by
> > > > > the user and immediately if the user specifies another skip
> > > > > transaction while the skipping of the transaction is in progress this
> > > > > new value will be reset by the worker while clearing the skip xid. I
> > > > > felt once the worker has identified the skip xid and is about to skip
> > > > > the xid, the worker can acquire a lock to prevent concurrency issues:
> > > >
> > > > That's a good point.
> > > > If only the last_error_xid could be skipped, then this wouldn't be an
> > > > issue, right?
> > > > If a different xid to skip is specified while the worker is currently
> > > > skipping a transaction, should that even be allowed?
> > > >
> > >
> > > We don't expect such usage but yes, it could happen and seems not
> > > good. I thought we can acquire Share lock on pg_subscription during
> > > the skip but not sure it's a good idea. It would be better if we can
> > > find a way to allow users to specify only XID that has failed.
> > >
> >
> > Yeah, but as we don't have a definite way to allow specifying only
> > failed XID, I think it is better to use share lock on that particular
> > subscription. We are already using it for add/update rel state (see,
> > AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
> > another place to use a similar technique.
>
> Yes, but it seems to mean that we disallow users to change skip_xid
> while the apply worker is skipping changes so we will end up having
> the same problem we discussed so far;
>

I thought we just want to lock before clearing the skip_xid something
like take the lock, check if the skip_xid in the catalog is the same
as we have skipped, if it is the same then clear it, otherwise, leave
it as it is. How will that disallow users to change skip_xid when we
are skipping changes?

-- 
With Regards,
Amit Kapila.




Re: Buildfarm support for older versions

2021-12-15 Thread Larry Rosenman

On 12/15/2021 11:15 am, Andrew Dunstan wrote:
OK, old_branches_of_interest.txt now exists on the buildfarm server, and
the code has been modified to take notice of it (i.e. to accept builds
for branches listed there). The contents are the non-live versions from
9.2 on.

I have set up a test buildfarm client (which will eventually report
under the name 'godwit') alongside crake (Fedora 34). So far testing has
run smoothly, there are only two glitches:

  * 9.3 and 9.2 don't have a show_dl_suffix make target. This would
require backpatching b40cb99b85 and d9cdb1ba9e. That's a tiny
change, and I propose to do it shortly unless there's an objection.
  * I need to undo the removal of client logic that supported 9.2's
unix_socket_directory setting as opposed to the later
unix_socket_directories.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Would a FreeBSD head (peripatus or a new animal) help?
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106




RE: Confused comment about drop replica identity index

2021-12-15 Thread wangw.f...@fujitsu.com
On Tue, Dec 16, 2021 at 06:40AM, Michael Paquier wrote:
> Would you like to write a patch to address all that?

OK, I will push it soon.


Regards,
Wang wei




Re: Synchronizing slots from primary to standby

2021-12-15 Thread Hsu, John
Hello,

I started taking a brief look at the v2 patch, and it does appear to work for 
the basic case. Logical slot is synchronized across and I can connect to the 
promoted standby and stream changes afterwards.

It's not clear to me what the correct behavior is when a logical slot that has 
been synced to the replica is then deleted on the writer. Would we expect this 
to be propagated, or is it left up to the end user to manage?

> +   rawname = pstrdup(standby_slot_names);
> +   SplitIdentifierString(rawname, ',', &namelist);
> +
> +   while (true)
> +   {
> +   int wait_slots_remaining;
> +   XLogRecPtr  oldest_flush_pos = InvalidXLogRecPtr;
> +   int rc;
> +
> +   wait_slots_remaining = list_length(namelist);
> +
> +   LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> +   for (int i = 0; i < max_replication_slots; i++)
> +   {

Even though standby_slot_names is PGC_SIGHUP, we never reload/re-process the 
value. If we have a wrong entry in there, the backend becomes stuck until we 
re-establish the logical connection. Adding "postmaster/interrupt.h" with 
ConfigReloadPending / ProcessConfigFile does seem to work.
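
(For illustration, the reload check being suggested might look something like
this inside the wait loop; a sketch only, with the exact placement assumed:)

#include "postmaster/interrupt.h"
#include "utils/guc.h"

		/* pick up SIGHUP so changes to standby_slot_names take effect */
		if (ConfigReloadPending)
		{
			ConfigReloadPending = false;
			ProcessConfigFile(PGC_SIGHUP);
			/* re-split standby_slot_names here so the new list is used */
		}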

Another thing I noticed is that once it starts waiting in this block, Ctrl+C 
doesn't seem to terminate the backend?

pg_recvlogical -d postgres -p 5432 --slot regression_slot --start -f -
..
^Cpg_recvlogical: error: unexpected termination of replication stream: 

The logical backend connection is still present:

ps aux | grep 51263
   hsuchen 51263 80.7  0.0 320180 14304 ?Rs   01:11   3:04 postgres: 
walsender hsuchen [local] START_REPLICATION

pstack 51263
#0  0x7ffee99e79a5 in clock_gettime ()
#1  0x7f8705e88246 in clock_gettime () from /lib64/libc.so.6
#2  0x0075f141 in WaitEventSetWait ()
#3  0x0075f565 in WaitLatch ()
#4  0x00720aea in ReorderBufferProcessTXN ()
#5  0x007142a6 in DecodeXactOp ()
#6  0x0071460f in LogicalDecodingProcessRecord ()

It can be terminated with a pg_terminate_backend though.

If we have a physical slot named foo on the standby, and then a logical slot 
is created on the writer with the same slot_name, it does error out on the 
replica, although it prevents other slots from being synchronized, which is 
probably fine.

2021-12-16 02:10:29.709 UTC [73788] LOG:  replication slot synchronization 
worker for database "postgres" has started
2021-12-16 02:10:29.713 UTC [73788] ERROR:  cannot use physical replication 
slot for logical decoding
2021-12-16 02:10:29.714 UTC [73037] DEBUG:  unregistering background worker 
"replication slot synchronization worker"

On 12/14/21, 2:26 PM, "Peter Eisentraut"  
wrote:


On 28.11.21 07:52, Bharath Rupireddy wrote:
> 1) Instead of a new LIST_SLOT command, can't we use
> READ_REPLICATION_SLOT (slight modifications needs to be done to make
> it support logical replication slots and to get more information from
> the subscriber).

I looked at that but didn't see an obvious way to consolidate them.
This is something we could look at again later.

> 2) How frequently the new bg worker is going to sync the slot info?
> How can it ensure that the latest information exists say when the
> subscriber is down/crashed before it picks up the latest slot
> information?

The interval is currently hardcoded, but could be a configuration
setting.  In the v2 patch, there is a new setting that orders physical
replication before logical so that the logical subscribers cannot get
ahead of the physical standby.

> 3) Instead of the subscriber pulling the slot info, why can't the
> publisher (via the walsender or a new bg worker maybe?) push the
> latest slot info? I'm not sure we want to add more functionality to
> the walsender, if yes, isn't it going to be much simpler?

This sounds like the failover slot feature, which was rejected.





Re: pg_upgrade should truncate/remove its logs before running

2021-12-15 Thread Michael Paquier
On Wed, Dec 15, 2021 at 04:13:10PM -0600, Justin Pryzby wrote:
> On Wed, Dec 15, 2021 at 05:04:54PM -0500, Andrew Dunstan wrote:
>> The directory name needs to be predictable somehow, or maybe optionally
>> set as a parameter. Having just a timestamped directory name would make
>> life annoying for a poor buildfarm maintainer. Also, please don't change
>> anything before I have a chance to adjust the buildfarm code to what is
>> going to be done.
> 
> Feel free to suggest the desirable behavior.
> It could write to pg_upgrade.log/* and refuse to run if the dir already 
> exists.

Andrew's point looks rather sensible to me.  So, this stuff should
have a predictable name (pg_upgrade.log, pg_upgrade_log or upgrade_log
would be fine).  But I would also add an option to be able to define a
custom log path.  The latter would be useful for the regression tests,
so that everything could get redirected to a path that is already filtered
out.
--
Michael




[PATCH] Accept IP addresses in server certificate SANs

2021-12-15 Thread Jacob Champion
Hello all,

libpq currently supports server certificates with a single IP address
in the Common Name. It's fairly brittle; as far as I can tell, the
single name you choose has to match the client's address exactly.

Attached is a patch for libpq to support IP addresses in the server's
Subject Alternative Names, which would allow admins to issue certs for
multiple IP addresses, both IPv4 and IPv6, and mix them with
alternative DNS hostnames. These addresses are compared bytewise
instead of stringwise, so the client can contact the server via
alternative spellings of the same IP address.

This patch arose because I was writing tests for the NSS implementation
that used a server cert with both DNS names and IP addresses, and then
they failed when I ran those tests against the OpenSSL implementation.
NSS supports this functionality natively. Anecdotally, I've heard from
at least one client group who is utilizing IP-based certificates in
their cloud deployments. It seems uncommon but still useful.

There are two open questions I have; they're based on NSS
implementation details that I did not port here:

- NSS allows an IPv4 SAN to match an IPv6 mapping of that same address,
  and vice-versa. I chose not to implement that behavior, figuring it
  is easy enough for people to issue a certificate with both addresses.
  Is that okay?

- If a certificate contains only iPAddress SANs, and none of them
  match, I fall back to check the certificate Common Name. OpenSSL will
  not do this (its X509_check_ip considers only SANs). NSS will only do
  this if the client's address is itself a DNS name. The spec says that
  we can't fall back to Common Name if the SANs contain any DNS
  entries, but it's silent on the subject of IP addresses. What should
  the behavior be?

The patchset roadmap:

- 0001 moves inet_net_pton() to src/port, since libpq will need it.
- 0002 implements the new functionality and adds tests.

WDYT?

Thanks,
--Jacob
From db56d6593a4574c243ccee1dbb4c8c1f1c712795 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Wed, 24 Nov 2021 14:46:11 -0800
Subject: [PATCH 1/2] Move inet_net_pton() to src/port

This will be helpful for IP address verification in libpq.
---
 src/backend/utils/adt/Makefile  |  1 -
 src/include/port.h  |  3 +++
 src/include/utils/builtins.h|  4 
 src/port/Makefile   |  1 +
 src/{backend/utils/adt => port}/inet_net_pton.c | 16 +++-
 5 files changed, 19 insertions(+), 6 deletions(-)
 rename src/{backend/utils/adt => port}/inet_net_pton.c (96%)

diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 41b486bcef..d173d52157 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -43,7 +43,6 @@ OBJS = \
 	geo_selfuncs.o \
 	geo_spgist.o \
 	inet_cidr_ntop.o \
-	inet_net_pton.o \
 	int.o \
 	int8.o \
 	json.o \
diff --git a/src/include/port.h b/src/include/port.h
index fd9c9d6f94..cca29839bc 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -518,6 +518,9 @@ extern int	pg_codepage_to_encoding(UINT cp);
 extern char *pg_inet_net_ntop(int af, const void *src, int bits,
 			  char *dst, size_t size);
 
+/* port/inet_net_pton.c */
+extern int	pg_inet_net_pton(int af, const char *src, void *dst, size_t size);
+
 /* port/pg_strong_random.c */
 extern void pg_strong_random_init(void);
 extern bool pg_strong_random(void *buf, size_t len);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 40fcb0ab6d..d063a86260 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -93,10 +93,6 @@ extern int	xidComparator(const void *arg1, const void *arg2);
 extern char *pg_inet_cidr_ntop(int af, const void *src, int bits,
 			   char *dst, size_t size);
 
-/* inet_net_pton.c */
-extern int	pg_inet_net_pton(int af, const char *src,
-			 void *dst, size_t size);
-
 /* network.c */
 extern double convert_network_to_scalar(Datum value, Oid typid, bool *failure);
 extern Datum network_scan_first(Datum in);
diff --git a/src/port/Makefile b/src/port/Makefile
index b3754d8940..7191cd6633 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -43,6 +43,7 @@ OBJS = \
 	bsearch_arg.o \
 	chklocale.o \
 	inet_net_ntop.o \
+	inet_net_pton.o \
 	noblock.o \
 	path.o \
 	pg_bitutils.o \
diff --git a/src/backend/utils/adt/inet_net_pton.c b/src/port/inet_net_pton.c
similarity index 96%
rename from src/backend/utils/adt/inet_net_pton.c
rename to src/port/inet_net_pton.c
index d3221a1313..bae50ba67e 100644
--- a/src/backend/utils/adt/inet_net_pton.c
+++ b/src/port/inet_net_pton.c
@@ -14,14 +14,18 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
  * OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  *
- *	  src/backend/utils/adt/inet_net_pton.c
+ *	  src/port/inet_net_pton.c
  */
 
 #if defined(LIBC_SCCS) && !defined(lint)
 static const char rcsid[] = 

Re: Unnecessary delay in streaming replication due to replay lag

2021-12-15 Thread Soumyadeep Chakraborty
Hi Bharath,

Thanks for the review!

On Sat, Nov 27, 2021 at 6:36 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> 1) A memory leak: add FreeDir(dir); before returning.
> + ereport(LOG,
> + (errmsg("Could not start streaming WAL eagerly"),
> + errdetail("There are timeline changes in the locally available WAL
files."),
> + errhint("WAL streaming will begin once all local WAL and archives
> are exhausted.")));
> + return;
> + }
>

Thanks for catching that. Fixed.

>
>
> 2) Is there a guarantee that while we traverse the pg_wal directory to
> find startsegno and endsegno, the new wal files arrive from the
> primary or archive location or old wal files get removed/recycled by
> the standby? Especially when wal_receiver_start_condition=consistency?
> + startsegno = (startsegno == -1) ? logSegNo : Min(startsegno, logSegNo);
> + endsegno = (endsegno == -1) ? logSegNo : Max(endsegno, logSegNo);
> + }
>

Even if newer wal files arrive after the snapshot of the dir listing
taken by AllocateDir()/ReadDir(), we will in effect start from a
slightly older location, which should be fine. It shouldn't matter if
an older file is recycled. If the last valid WAL segment is recycled,
we will ERROR out in StartWALReceiverEagerlyIfPossible() and the eager
start can be retried by the startup process when
CheckRecoveryConsistency() is called again.
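
Roughly, the scan amounts to the sketch below (illustrative only, assuming
the usual backend includes; parse_wal_segno() is a made-up stand-in for the
real WAL file name parsing):

/*
 * Scan pg_wal once and record the lowest and highest segment numbers seen.
 * Uses the backend's AllocateDir/ReadDir/FreeDir wrappers.
 */
static void
find_wal_segno_range(XLogSegNo *startsegno, XLogSegNo *endsegno)
{
    DIR        *dir = AllocateDir(XLOGDIR);
    struct dirent *de;

    *startsegno = *endsegno = 0;

    while ((de = ReadDir(dir, XLOGDIR)) != NULL)
    {
        XLogSegNo   segno;

        if (!parse_wal_segno(de->d_name, &segno))
            continue;           /* not a WAL segment file */

        if (*startsegno == 0 || segno < *startsegno)
            *startsegno = segno;
        if (segno > *endsegno)
            *endsegno = segno;
    }

    FreeDir(dir);               /* don't leak the directory handle */
}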

>
>
> 3) I think the errmsg text format isn't correct. Note that the errmsg
> text starts with lowercase and doesn't end with "." whereas errdetail
> or errhint starts with uppercase and ends with ".". Please check other
> messages for reference.
> The following should be changed.
> + errmsg("Requesting stream from beginning of: %s",
> + errmsg("Invalid WAL segment found while calculating stream start:
> %s. Skipping.",
> + (errmsg("Could not start streaming WAL eagerly"),

Fixed.

> 4) I think you also need to have wal files names in double quotes,
> something like below:
> errmsg("could not close file \"%s\": %m", xlogfname)));

Fixed.

>
> 5) It is ".stream start: \"%s\", skipping..",
> + errmsg("Invalid WAL segment found while calculating stream start:
> %s. Skipping.",

Fixed.

> 4) I think the patch can make the startup process significantly slow,
> especially when there are lots of wal files that exist in the standby
> pg_wal directory. This is because of the overhead
> StartWALReceiverEagerlyIfPossible adds i.e. doing two while loops to
> figure out the start position of the
> streaming in advance. This might end up the startup process doing the
> loop over in the directory rather than the important thing of doing
> crash recovery or standby recovery.

Well, 99% of the time we can expect that the second loop finishes after
1 or 2 iterations, as the last valid WAL segment would most likely be
the highest numbered WAL file or thereabouts. I don't think that the
overhead will be significant as we are just looking up a directory
listing and not reading any files.

> 5) What happens if this new GUC is enabled in case of a synchronous
standby?
> What happens if this new GUC is enabled in case of a crash recovery?
> What happens if this new GUC is enabled in case a restore command is
> set i.e. standby performing archive recovery?

The GUC would behave the same way for all of these cases. If we have
chosen 'startup'/'consistency', we would be starting the WAL receiver
eagerly. There might be certain race conditions when one combines this
GUC with archive recovery, which was discussed upthread. [1]

> 6) How about bgwriter/checkpointer which gets started even before the
> startup process (or a new bg worker? of course it's going to be an
> overkill) finding out the new start pos for the startup process and
> then we could get rid of startup behaviour of the
> patch? This avoids an extra burden on the startup process. Many times,
> users will be complaining about why recovery is taking more time now,
> after the GUC wal_receiver_start_condition=startup.

Hmm, then we would be needing additional synchronization. There will
also be an added dependency on checkpoint_timeout. I don't think that
the performance hit is significant enough to warrant this change.

> 8) Can we have a better GUC name than wal_receiver_start_condition?
> Something like wal_receiver_start_at or wal_receiver_start or some
> other?

Sure, that makes more sense. Fixed.

Regards,
Soumyadeep (VMware)

[1]
https://www.postgresql.org/message-id/CAE-ML%2B-8KnuJqXKHz0mrC7-qFMQJ3ArDC78X3-AjGKos7Ceocw%40mail.gmail.com
From e6ffb5400fd841e4285939133739692c9cb4ba17 Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty 
Date: Fri, 19 Nov 2021 00:33:17 -0800
Subject: [PATCH v5 1/1] Introduce feature to start WAL receiver eagerly

This commit introduces a new GUC wal_receiver_start_condition which can
enable the standby to start its WAL receiver at an earlier stage. The
GUC will default to starting the WAL receiver after WAL from archives
and pg_wal have been exhausted, designated by the value 

Re: A test for replay of regression tests

2021-12-15 Thread Michael Paquier
On Wed, Dec 15, 2021 at 05:50:45PM +0900, Michael Paquier wrote:
> Hmm.  FWIW, I am working on doing similar for pg_upgrade to switch to
> TAP there, and we share a lot in terms of running pg_regress on an
> existing cluster.  Wouldn't it be better to move this definition to
> src/Makefile.global.in rather than src/test/recovery/?
> 
> My pg_regress command is actually very similar to yours, so I am
> wondering if this would be better if somehow centralized, perhaps in
> Cluster.pm.

By the way, while I was sorting out my things, I have noticed that v4
does not handle EXTRA_REGRESS_OPT.  Is that wanted?  You could just
add that into your patch set and push the extra options to the
pg_regress command:
my $extra_opts_val = $ENV{EXTRA_REGRESS_OPT} || "";
my @extra_opts = split(/\s+/, $extra_opts_val);
--
Michael




Re: Life cycles of tuple descriptors

2021-12-15 Thread Tom Lane
Thomas Munro  writes:
> On Thu, Dec 16, 2021 at 11:51 AM Tom Lane  wrote:
>> Here's a draft patch for this.

> LGTM.

Pushed, thanks for looking.

regards, tom lane




Re: Life cycles of tuple descriptors

2021-12-15 Thread Chapman Flack
On 12/15/21 17:50, Tom Lane wrote:

> Here's a draft patch for this.  There are several places that are
> directly using DecrTupleDescRefCount after lookup_rowtype_tupdesc
> or equivalent, which'd now be forbidden.  I think they are all safe
> given the assumption that the typcache's tupdescs for named composites
> are refcounted.  (The calls in expandedrecord.c could be working
> with RECORD, but those code paths just checked that the tupdesc
> is refcounted.)  So there's no actual bug here, and no reason to
> back-patch, but this seems like a good idea to decouple callers
> a bit more from typcache's internal logic.

I agree with the analysis at each of those sites, and the new comment
clears up everything that had puzzled me before.

Regards,
-Chap




Re: Life cycles of tuple descriptors

2021-12-15 Thread Thomas Munro
On Thu, Dec 16, 2021 at 11:51 AM Tom Lane  wrote:
> Here's a draft patch for this.  There are several places that are
> directly using DecrTupleDescRefCount after lookup_rowtype_tupdesc
> or equivalent, which'd now be forbidden.  I think they are all safe
> given the assumption that the typcache's tupdescs for named composites
> are refcounted.  (The calls in expandedrecord.c could be working
> with RECORD, but those code paths just checked that the tupdesc
> is refcounted.)  So there's no actual bug here, and no reason to
> back-patch, but this seems like a good idea to decouple callers
> a bit more from typcache's internal logic.  None of these call
> sites are so performance-critical that one extra test will hurt.

LGTM.




Re: Life cycles of tuple descriptors

2021-12-15 Thread Tom Lane
I wrote:
> Chapman Flack  writes:
>> Oh, hmm, maybe one thing in that API comment ought to be changed. It says
>> I must call ReleaseTupleDesc *or* DecrTupleDescRefCount. Maybe that dates
>> from before the shared registry? ReleaseTupleDesc is safe, but anybody who
>> uses DecrTupleDescRefCount on a lookup_rowtype_tupdesc result could be
>> in for an assertion failure if a non-refcounted tupdesc is returned.

> Yeah, I was just wondering the same.  I think DecrTupleDescRefCount
> is safe if you know you are looking up a named composite type, but
> maybe that's still too much familiarity with typcache innards.

Here's a draft patch for this.  There are several places that are
directly using DecrTupleDescRefCount after lookup_rowtype_tupdesc
or equivalent, which'd now be forbidden.  I think they are all safe
given the assumption that the typcache's tupdescs for named composites
are refcounted.  (The calls in expandedrecord.c could be working
with RECORD, but those code paths just checked that the tupdesc
is refcounted.)  So there's no actual bug here, and no reason to
back-patch, but this seems like a good idea to decouple callers
a bit more from typcache's internal logic.  None of these call
sites are so performance-critical that one extra test will hurt.

regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 47b29001d5..bf42587e38 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15401,7 +15401,7 @@ ATExecAddOf(Relation rel, const TypeName *ofTypename, LOCKMODE lockmode)
 	 errmsg("table \"%s\" has different type for column \"%s\"",
 			RelationGetRelationName(rel), type_attname)));
 	}
-	DecrTupleDescRefCount(typeTupleDesc);
+	ReleaseTupleDesc(typeTupleDesc);
 
 	/* Any remaining columns at the end of the table had better be dropped. */
 	for (; table_attno <= tableTupleDesc->natts; table_attno++)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 892b4e17e0..7d343f0678 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -1465,7 +1465,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
 /* find out the number of columns in the composite type */
 tupDesc = lookup_rowtype_tupdesc(fstore->resulttype, -1);
 ncolumns = tupDesc->natts;
-DecrTupleDescRefCount(tupDesc);
+ReleaseTupleDesc(tupDesc);
 
 /* create workspace for column values */
 values = (Datum *) palloc(sizeof(Datum) * ncolumns);
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 313d7b6ff0..2d857a301b 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1484,7 +1484,7 @@ transformOfType(CreateStmtContext *cxt, TypeName *ofTypename)
 		n->location = -1;
 		cxt->columns = lappend(cxt->columns, n);
 	}
-	DecrTupleDescRefCount(tupdesc);
+	ReleaseTupleDesc(tupdesc);
 
 	ReleaseSysCache(tuple);
 }
diff --git a/src/backend/utils/adt/expandedrecord.c b/src/backend/utils/adt/expandedrecord.c
index e19491ecf7..38d5384c00 100644
--- a/src/backend/utils/adt/expandedrecord.c
+++ b/src/backend/utils/adt/expandedrecord.c
@@ -171,7 +171,7 @@ make_expanded_record_from_typeid(Oid type_id, int32 typmod,
 
 		/* If we called lookup_rowtype_tupdesc, release the pin it took */
 		if (type_id == RECORDOID)
-			DecrTupleDescRefCount(tupdesc);
+			ReleaseTupleDesc(tupdesc);
 	}
 	else
 	{
@@ -854,7 +854,7 @@ expanded_record_fetch_tupdesc(ExpandedRecordHeader *erh)
 		tupdesc->tdrefcount++;
 
 		/* Release the pin lookup_rowtype_tupdesc acquired */
-		DecrTupleDescRefCount(tupdesc);
+		ReleaseTupleDesc(tupdesc);
 	}
 	else
 	{
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 70e5c51297..d140ef6655 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -1820,8 +1820,11 @@ lookup_rowtype_tupdesc_internal(Oid type_id, int32 typmod, bool noError)
  * for example from record_in().)
  *
  * Note: on success, we increment the refcount of the returned TupleDesc,
- * and log the reference in CurrentResourceOwner.  Caller should call
- * ReleaseTupleDesc or DecrTupleDescRefCount when done using the tupdesc.
+ * and log the reference in CurrentResourceOwner.  Caller must call
+ * ReleaseTupleDesc when done using the tupdesc.  (There are some
+ * cases in which the returned tupdesc is not refcounted, in which
+ * case PinTupleDesc/ReleaseTupleDesc are no-ops; but in these cases
+ * the tupdesc is guaranteed to live till process exit.)
  */
 TupleDesc
 lookup_rowtype_tupdesc(Oid type_id, int32 typmod)


Re: Confused comment about drop replica identity index

2021-12-15 Thread Michael Paquier
On Wed, Dec 15, 2021 at 09:18:26AM +, wangw.f...@fujitsu.com wrote:
> Yeah, if we can add some details to pg-doc and code comments, I think it will
> be more friendly to PG users and developers.

Would you like to write a patch to address all that?
Thanks,
--
Michael




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-15 Thread Justin Pryzby
On Fri, Dec 03, 2021 at 03:02:24PM -0500, Melanie Plageman wrote:
> Thanks again! I really appreciate the thorough review.
> 
> I have combined responses to all three of your emails below.
> Let me know if it is more confusing to do it this way.

One email is better than three - I'm just not a model citizen ;)

Thanks for updating the patch.  I checked that all my previous review comments
were addressed (except for the part about passing the 3D array to a function -
I know that technically the pointer is being passed).

+int backend_type_get_idx(BackendType backend_type)
+BackendType idx_get_backend_type(int idx)

=> I think it'd be desirable for these to be either static functions (which
won't work for your needs) or macros, or inline functions in the header.
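
Something along these lines, as a sketch only (the exact BackendType-to-slot
mapping here is an assumption, not necessarily what the patch does):

/* In the header (BackendType comes from miscadmin.h). */
static inline int
backend_type_get_idx(BackendType backend_type)
{
    /* assumes slot 0 is unused because B_INVALID == 0 */
    return (int) backend_type - 1;
}

static inline BackendType
idx_get_backend_type(int idx)
{
    return (BackendType) (idx + 1);
}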

-   if (strcmp(target, "archiver") == 0)
+   pgstat_setheader(_hdr, PGSTAT_MTYPE_RESETSHAREDCOUNTER);
+   if (strcmp(target, "buffers") == 0)

=> This should be added in alphabetical order.  Which is unimportant, but it
will also make the patch 2 lines shorter.  The doc patch should also be in
order.

+* Don't count dead backends. They will be added below There are no

=> Missing a period.

-- 
Justin




Re: pg_upgrade should truncate/remove its logs before running

2021-12-15 Thread Justin Pryzby
On Wed, Dec 15, 2021 at 05:04:54PM -0500, Andrew Dunstan wrote:
> On 12/15/21 16:23, Bruce Momjian wrote:
> > On Wed, Dec 15, 2021 at 04:17:23PM -0500, Tom Lane wrote:
> >> Bruce Momjian  writes:
> >>> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
>  If pg_upgrade fails and is re-run, it appends to its logfiles, which is
>  confusing since, if it fails again, it then looks like the original error
>  recurred and wasn't fixed.  The "append" behavior dates back to 
>  717f6d608.
> >>> Uh, the database server doesn't erase its logs on crash/failure, so why
> >>> should pg_upgrade do that?
> >> The server emits enough information so that it's not confusing:
> >> there are timestamps, and there's an identifiable startup line.
> >> pg_upgrade does neither.  If you don't want to truncate as
> >> Justin suggests, you should do that instead.
> >>
> >> Personally I like the idea of making a timestamped subdirectory
> >> and dropping all the files in that, because the thing that most
> >> annoys *me* about pg_upgrade is the litter it leaves behind in
> >> $CWD.  A subdirectory would make it far easier to mop up the mess.
> > Yes, lot of litter.  Putting it in a subdirectory makes a lot of sense.
> > Justin, do you want to work on that patch, since you had an earlier
> > version to fix this?
> 
> The directory name needs to be predictable somehow, or maybe optionally
> set as a parameter. Having just a timestamped directory name would make
> life annoying for a poor buildfarm maintainer. Also, please don't change
> anything before I have a chance to adjust the buildfarm code to what is
> going to be done.

Feel free to suggest the desirable behavior.
It could write to pg_upgrade.log/* and refuse to run if the dir already exists.

-- 
Justin




Re: Add index scan progress to pg_stat_progress_vacuum

2021-12-15 Thread Bossart, Nathan
On 12/1/21, 3:02 PM, "Imseih (AWS), Sami"  wrote:
> The current implementation of pg_stat_progress_vacuum does not
> provide progress on which index is being vacuumed making it
> difficult for a user to determine if the "vacuuming indexes" phase
> is making progress. By exposing which index is being scanned as well
> as the total progress the scan has made for the current cycle, a
> user can make better estimations on when the vacuum will complete.

+1

> The proposed patch adds 4 new columns to pg_stat_progress_vacuum:
>
> 1. indrelid - the relid of the index being vacuumed
> 2. index_blks_total - total number of blocks to be scanned in the
> current cycle
> 3. index_blks_scanned - number of blocks scanned in the current
> cycle
> 4. leader_pid - if the pid for the pg_stat_progress_vacuum entry is
> a leader or a vacuum worker. This patch places an entry for every
> worker pid ( if parallel ) as well as the leader pid

nitpick: Shouldn't index_blks_scanned be index_blks_vacuumed?  IMO it
is more analogous to heap_blks_vacuumed.

This will tell us which indexes are currently being vacuumed and the
current progress of those operations, but it doesn't tell us which
indexes have already been vacuumed or which ones are pending vacuum.
I think such information is necessary to truly understand the current
progress of vacuuming indexes, and I can think of a couple of ways we
might provide it:

  1. Make the new columns you've proposed return arrays.  This isn't
 very clean, but it would keep all the information for a given
 vacuum operation in a single row.  The indrelids column would be
 populated with all the indexes that have been vacuumed, need to
 be vacuumed, or are presently being vacuumed.  The other index-
 related columns would then have the associated stats and the
 worker PID (which might be the same as the pid column depending
 on whether parallel index vacuum was being done).  Alternatively,
 the index column could have an array of records, each containing
 all the information for a given index.
  2. Create a new view for just index vacuum progress information.
 This would have similar information as 1.  There would be an
 entry for each index that has been vacuumed, needs to be
 vacuumed, or is currently being vacuumed.  And there would be an
 easy way to join with pg_stat_progress_vacuum (e.g., leader_pid,
 which again might be the same as our index vacuum PID depending
 on whether we were doing parallel index vacuum).  Note that it
 would be possible for the PID of these entries to be null before
 and after we process the index.
  3. Instead of adding columns to pg_stat_progress_vacuum, adjust the
 current ones to be more general, and then add new entries for
 each of the indexes that have been, need to be, or currently are
 being vacuumed.  This is the most similar option to your current
 proposal, but instead of introducing a column like
 index_blks_total, we'd rename heap_blks_total to blks_total and
 use that for both the heap and indexes.  I think we'd still want
 to add a leader_pid column.  Again, we have to be prepared for
 the PID to be null in this case.  Or we could just make the pid
 column always refer to the leader, and we could introduce a
 worker_pid column.  That might create confusion, though.

I wish option #1 was cleaner, because I think it would be really nice
to have all this information in a single row.  However, I don't expect
much support for a 3-dimensional view, so I suspect option #2
(creating a separate view for index vacuum progress) is the way to go.
The other benefit of option #2 versus option #3 or your original
proposal is that it cleanly separates the top-level vacuum operations
and the index vacuum operations, which are related at the moment, but
which might not always be tied so closely together.

Nathan



Re: Support for NSS as a libpq TLS backend

2021-12-15 Thread Daniel Gustafsson
> On 25 Nov 2021, at 14:39, Joshua Brindle  
> wrote:
> On Wed, Nov 24, 2021 at 8:49 AM Joshua Brindle
>  wrote:
>> 
>> On Wed, Nov 24, 2021 at 8:46 AM Joshua Brindle
>>  wrote:

>> I don't know enough about NSS to know if this is problematic or not
>> but if I try verify-full without having the root CA in the certificate
>> store I get:
>> 
>> $ /usr/pgsql-15/bin/psql "host=localhost sslmode=verify-full user=postgres"
>> psql: error: SSL error: Issuer certificate is invalid.
>> unable to shut down NSS context: NSS could not shutdown. Objects are
>> still in use.

Fixed.

> Something is strange with ssl downgrading and a bad ssldatabase
> [postgres@11cdfa30f763 ~]$ /usr/pgsql-15/bin/psql "ssldatabase=oops
> sslcert=client_cert host=localhost"
> Password for user postgres:
> 
> 

Also fixed.

> On the server side:
> 2021-11-25 01:52:01.984 UTC [269] LOG:  unable to handshake:
> Encountered end of file (PR_END_OF_FILE_ERROR)

This is normal and expected, but to make it easier on users I've changed this
error message to be aligned with the OpenSSL implementation.

> Other than that and I still haven't tested --with-llvm I've gotten
> everything working, including with an openssl client. Attached is a
> dockerfile that gets to the point where a client can connect with
> clientcert=verify-full. I've removed some of the old cruft and
> debugging from the previous versions.

Very cool, thanks!  I've been unable to reproduce any issues with llvm but I'll
keep poking at that.  A new version will be posted shortly with the above and a
few more fixes.

--
Daniel Gustafsson   https://vmware.com/





Re: pg_upgrade should truncate/remove its logs before running

2021-12-15 Thread Andrew Dunstan


On 12/15/21 16:23, Bruce Momjian wrote:
> On Wed, Dec 15, 2021 at 04:17:23PM -0500, Tom Lane wrote:
>> Bruce Momjian  writes:
>>> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
 If pg_upgrade fails and is re-run, it appends to its logfiles, which is
 confusing since, if it fails again, it then looks like the original error
 recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.
>>> Uh, the database server doesn't erase its logs on crash/failure, so why
>>> should pg_upgrade do that?
>> The server emits enough information so that it's not confusing:
>> there are timestamps, and there's an identifiable startup line.
>> pg_upgrade does neither.  If you don't want to truncate as
>> Justin suggests, you should do that instead.
>>
>> Personally I like the idea of making a timestamped subdirectory
>> and dropping all the files in that, because the thing that most
>> annoys *me* about pg_upgrade is the litter it leaves behind in
>> $CWD.  A subdirectory would make it far easier to mop up the mess.
> Yes, lot of litter.  Putting it in a subdirectory makes a lot of sense.
> Justin, do you want to work on that patch, since you had an earlier
> version to fix this?
>



The directory name needs to be predictable somehow, or maybe optionally
set as a parameter. Having just a timestamped directory name would make
life annoying for a poor buildfarm maintainer. Also, please don't change
anything before I have a chance to adjust the buildfarm code to what is
going to be done.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: SQL/JSON: functions

2021-12-15 Thread Andrew Dunstan


On 12/9/21 09:04, Himanshu Upadhyaya wrote:
>
>
>
> 2)
> I am not sure if below is required as per SQL standard, ORACLE is
> allowing to construct JSON_OBJECT bases on the records in the table as
> below, but postgres parser is not allowing:
> create table test (id varchar(10), value int);
> insert into test values ('a',1);
> insert into test values ('b',2);
> insert into test values ('c',3);
> select json_object(*) from test; --postgres does not support
> postgres=# select json_object(*) from test;
> ERROR:  syntax error at or near "*"
> LINE 1: select json_object(*) from test;
>
>

You can spell that a bit differently today, e.g.

    select to_json(r) from test r;

I don't know either if it's in the spec, but building in support for *
in this context seems likely to be fairly complex and have very little
added utility.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-15 Thread Melanie Plageman
v18 attached.

On Thu, Dec 9, 2021 at 2:17 PM Andres Freund  wrote:
>
> Hi,
>
> On 2021-12-03 15:02:24 -0500, Melanie Plageman wrote:
> > From e0f7f3dfd60a68fa01f3c023bcdb69305ade3738 Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman 
> > Date: Mon, 11 Oct 2021 16:15:06 -0400
> > Subject: [PATCH v17 1/7] Read-only atomic backend write function
> >
> > For counters in shared memory which can be read by any backend but only
> > written to by one backend, an atomic is still needed to protect against
> > torn values, however, pg_atomic_fetch_add_u64() is overkill for
> > incrementing the counter. pg_atomic_inc_counter() is a helper function
> > which can be used to increment these values safely but without
> > unnecessary overhead.
> >
> > Author: Thomas Munro
> > ---
> >  src/include/port/atomics.h | 11 +++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
> > index 856338f161..3924dd 100644
> > --- a/src/include/port/atomics.h
> > +++ b/src/include/port/atomics.h
> > @@ -519,6 +519,17 @@ pg_atomic_sub_fetch_u64(volatile pg_atomic_uint64 *ptr, int64 sub_)
> >   return pg_atomic_sub_fetch_u64_impl(ptr, sub_);
> >  }
> >
> > +/*
> > + * On modern systems this is really just *counter++.  On some older systems
> > + * there might be more to it, due to inability to read and write 64 bit values
> > + * atomically.
> > + */
> > +static inline void
> > +pg_atomic_inc_counter(pg_atomic_uint64 *counter)
> > +{
> > + pg_atomic_write_u64(counter, pg_atomic_read_u64(counter) + 1);
> > +}
>
> I wonder if it's worth putting something in the name indicating that this is
> not actual atomic RMW operation. Perhaps adding _unlocked?
>

Done.

>
> > From b0e193cfa08f0b8cf1be929f26fe38f06a39aeae Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman 
> > Date: Wed, 24 Nov 2021 10:32:56 -0500
> > Subject: [PATCH v17 2/7] Add IO operation counters to PgBackendStatus
> >
> > Add an array of counters in PgBackendStatus which count the buffers
> > allocated, extended, fsynced, and written by a given backend. Each "IO
> > Op" (alloc, fsync, extend, write) is counted per "IO Path" (direct,
> > local, shared, or strategy). "local" and "shared" IO Path counters count
> > operations on local and shared buffers. The "strategy" IO Path counts
> > buffers alloc'd/written/read/fsync'd as part of a BufferAccessStrategy.
> > The "direct" IO Path counts blocks of IO which are read, written, or
> > fsync'd using smgrwrite/extend/immedsync directly (as opposed to through
> > [Local]BufferAlloc()).
> >
> > With this commit, all backends increment a counter in their
> > PgBackendStatus when performing an IO operation. This is in preparation
> > for future commits which will persist these stats upon backend exit and
> > use the counters to provide observability of database IO operations.
> >
> > Note that this commit does not add code to increment the "direct" path.
> > A separate proposed patch [1] which would add wrappers for smgrwrite(),
> > smgrextend(), and smgrimmedsync() would provide a good location to call
> > pgstat_inc_ioop() for unbuffered IO and avoid regressions for future
> > users of these functions.
> >
> > [1] 
> > https://www.postgresql.org/message-id/CAAKRu_aw72w70X1P%3Dba20K8iGUvSkyz7Yk03wPPh3f9WgmcJ3g%40mail.gmail.com
>
> On longer thread it's nice for committers to already have Reviewed-By: in the
> commit message.

Done.

> > diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
> > index 7229598822..413cc605f8 100644
> > --- a/src/backend/utils/activity/backend_status.c
> > +++ b/src/backend/utils/activity/backend_status.c
> > @@ -399,6 +399,15 @@ pgstat_bestart(void)
> >   lbeentry.st_progress_command = PROGRESS_COMMAND_INVALID;
> >   lbeentry.st_progress_command_target = InvalidOid;
> >   lbeentry.st_query_id = UINT64CONST(0);
> > + for (int io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++)
> > + {
> > + IOOps  *io_ops = _path_stats[io_path];
> > +
> > + pg_atomic_init_u64(_ops->allocs, 0);
> > + pg_atomic_init_u64(_ops->extends, 0);
> > + pg_atomic_init_u64(_ops->fsyncs, 0);
> > + pg_atomic_init_u64(_ops->writes, 0);
> > + }
> >
> >   /*
> >* we don't zero st_progress_param here to save cycles; nobody should
>
> nit: I think we nearly always have a blank line before loops

Done.

> > diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
> > index 646126edee..93f1b4bcfc 100644
> > --- a/src/backend/utils/init/postinit.c
> > +++ b/src/backend/utils/init/postinit.c
> > @@ -623,6 +623,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
> >   RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
> >   }
> >
> > + pgstat_beinit();
> >   /*
> >* Initialize local 

Re: SQL/JSON: functions

2021-12-15 Thread Andrew Dunstan


On 12/9/21 09:04, Himanshu Upadhyaya wrote:
>
>
>
> 4)
> Are we intentionally allowing numeric keys in JSON_OBJECT but somehow
> these are not allowed in ORACLE?
> ‘postgres[151876]=#’select JSON_OBJECT( 3+1:2, 2+2:1);
>     json_object
> 
>  {"4" : 2, "4" : 1}
> (1 row)
>
> In ORACLE we are getting error("ORA-00932: inconsistent datatypes:
> expected CHAR got NUMBER") which seems to be more reasonable.
> "ORA-00932: inconsistent datatypes: expected CHAR got NUMBER"
>
> Postgres is also dis-allowing below then why allow numeric keys in
> JSON_OBJECT?
> ‘postgres[151876]=#’select '{
>   "track": {
>     "segments": [
>       {
>         "location":   [ 47.763, 13.4034 ],
>         "start time": "2018-10-14 10:05:14",
>         "HR": 73
>       },
>       {
>         "location":   [ 47.706, 13.2635 ],
>         "start time": "2018-10-14 10:39:21",
>         3: 135
>       }
>     ]
>   }
> }'::jsonb;
> ERROR:  22P02: invalid input syntax for type json
> LINE 1: select '{
>                ^
> DETAIL:  Expected string, but found "3".
> CONTEXT:  JSON data, line 12:         3...
> LOCATION:  json_ereport_error, jsonfuncs.c:621
>
> Also, JSON_OBJECTAGG is failing if we have any numeric key, however,
> the message is not very appropriate.
> SELECT JSON_OBJECTAGG(k: v ABSENT ON NULL) AS apt
> FROM (VALUES ('no', 5), ('area', 50), ('rooms', 2), ('foo', NULL),
> (5,5)) kv(k, v);
> ERROR:  22P02: invalid input syntax for type integer: "no"
> LINE 2: FROM (VALUES ('no', 5), ('area', 50), ('rooms', 2), ('foo', ...
>                       ^
> LOCATION:  pg_strtoint32, numutils.c:320
>
>
>

The literal above is simply not legal json, so the json parser is going
to reject it outright. However it is quite reasonable for JSON
constructors to convert non-string key values to strings. Otherwise we'd
be rejecting not just numbers but for example dates as key values. c.f.
json_build_object(), the documentation for which says "Key arguments are
coerced to text."


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pg_upgrade should truncate/remove its logs before running

2021-12-15 Thread Bruce Momjian
On Wed, Dec 15, 2021 at 04:17:23PM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
> >> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
> >> confusing since, if it fails again, it then looks like the original error
> >> recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.
> 
> > Uh, the database server doesn't erase its logs on crash/failure, so why
> > should pg_upgrade do that?
> 
> The server emits enough information so that it's not confusing:
> there are timestamps, and there's an identifiable startup line.
> pg_upgrade does neither.  If you don't want to truncate as
> Justin suggests, you should do that instead.
> 
> Personally I like the idea of making a timestamped subdirectory
> and dropping all the files in that, because the thing that most
> annoys *me* about pg_upgrade is the litter it leaves behind in
> $CWD.  A subdirectory would make it far easier to mop up the mess.

Yes, lot of litter.  Putting it in a subdirectory makes a lot of sense.
Justin, do you want to work on that patch, since you had an earlier
version to fix this?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: pg_upgrade should truncate/remove its logs before running

2021-12-15 Thread Tom Lane
Bruce Momjian  writes:
> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
>> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
>> confusing since, if it fails again, it then looks like the original error
>> recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.

> Uh, the database server doesn't erase its logs on crash/failure, so why
> should pg_upgrade do that?

The server emits enough information so that it's not confusing:
there are timestamps, and there's an identifiable startup line.
pg_upgrade does neither.  If you don't want to truncate as
Justin suggests, you should do that instead.

Personally I like the idea of making a timestamped subdirectory
and dropping all the files in that, because the thing that most
annoys *me* about pg_upgrade is the litter it leaves behind in
$CWD.  A subdirectory would make it far easier to mop up the mess.

regards, tom lane




Re: pg_upgrade should truncate/remove its logs before running

2021-12-15 Thread Justin Pryzby
On Wed, Dec 15, 2021 at 04:09:16PM -0500, Bruce Momjian wrote:
> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
> > I have seen this numerous times but had not dug into it, until now.
> > 
> > If pg_upgrade fails and is re-run, it appends to its logfiles, which is
> > confusing since, if it fails again, it then looks like the original error
> > recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.
> > 
> > I think it should either truncate the logfiles, or error early if any of the
> > files exist.  Or it could put all its output files into a newly-created
> > subdirectory.  Or this message could be output to the per-db logfiles, and 
> > not
> > just the static ones:
> > | "pg_upgrade run on %s".
> > 
> > For the per-db logfiles with OIDs in their name, changing open() from 
> > "append"
> > mode to truncate mode doesn't work, since they're written to in parallel.
> > They have to be removed/truncated in advance.
> > 
> > This is one possible fix.  You can test its effect by deliberately breaking 
> > one
> > of the calls to exec_progs(), like this.
> > 
> > -  "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
> > +  "\"%s/pg_restore\" %s %s --exit-on-error --verboose "
> 
> Uh, the database server doesn't erase its logs on crash/failure, so why
> should pg_upgrade do that?

To avoid the presence of irrelevant errors from the previous invocation of
pg_upgrade.

Maybe you would prefer one of my other ideas , like "put all its output files
into a newly-created subdirectory" ?

-- 
Justin




Re: Buildfarm support for older versions

2021-12-15 Thread Tom Lane
Andrew Dunstan  writes:
> I have set up a test buildfarm client (which will eventually report
> under the name 'godwit') alongside crake (Fedora 34). So far testing has
> run smoothly, there are only two glitches:

>   * 9.3 and 9.2 don't have a show_dl_suffix make target. This would
> require backpatching b40cb99b85 and d9cdb1ba9e. That's a tiny
> change, and I propose to do it shortly unless there's an objection.

Not really user-visible, so I can't see a problem with it.

regards, tom lane




Re: pg_upgrade should truncate/remove its logs before running

2021-12-15 Thread Bruce Momjian
On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
> I have seen this numerous times but had not dug into it, until now.
> 
> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
> confusing since, if it fails again, it then looks like the original error
> recurred and wasn't fixed.  The "append" behavior dates back to 717f6d608.
> 
> I think it should either truncate the logfiles, or error early if any of the
> files exist.  Or it could put all its output files into a newly-created
> subdirectory.  Or this message could be output to the per-db logfiles, and not
> just the static ones:
> | "pg_upgrade run on %s".
> 
> For the per-db logfiles with OIDs in their name, changing open() from "append"
> mode to truncate mode doesn't work, since they're written to in parallel.
> They have to be removed/truncated in advance.
> 
> This is one possible fix.  You can test its effect by deliberately breaking 
> one
> of the calls to exec_progs(), like this.
> 
> -  "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
> +  "\"%s/pg_restore\" %s %s --exit-on-error --verboose "

Uh, the database server doesn't erase its logs on crash/failure, so why
should pg_upgrade do that?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: psql format output

2021-12-15 Thread Pavel Stehule
Hi

On Wed, Dec 15, 2021 at 21:16, Florian Koch <florian.murat.k...@gmail.com> wrote:

> Hello,
>
> I realized that the output of "\df+ func_name" has a formatting problem
> when a lot of arguments are used. The field 'Arguments data types' gets
> very long and destroys the whole formatting in the console. The field
> 'Source code' is most of the time multi-line and I thought that the
> output for the field 'Arguments data types' could also be multiline
> with one line for each argument.
>

try to use pager

https://github.com/okbob/pspg

Regards

Pavel


> Regards,
> Florian Koch
>


Re: Non-superuser subscription owners

2021-12-15 Thread Mark Dilger


> On Nov 24, 2021, at 4:30 PM, Jeff Davis  wrote:
> 
> We need to do permission checking for WITH CHECK OPTION and RLS. The
> patch right now allows the subscription to write data that an RLS
> policy forbids.

Version 4 of the patch, attached, no longer allows RLS to be circumvented, but 
does so in a coarse-grained fashion.  If the target table has row-level 
security policies which are enforced against the subscription owner, the 
replication draws an error, much as with a permissions failure.  This seems 
sufficient for now, as superusers, roles with bypassrls, and target table 
owners should be able to replicate as before.  We may want to revisit this 
later, perhaps if/when we address your ExecInsert question, below.

> 
> A couple other points:
> 
> * We shouldn't refer to the behavior of previous versions in the docs
> unless there's a compelling reason

Fixed.

> * Do we need to be smarter about partitioned tables, where an insert
> can turn into an update?

Indeed, the logic of apply_handle_tuple_routing() required a bit of 
refactoring.  Fixed in v4.

> * Should we refactor to borrow logic from ExecInsert so that it's less
> likely that we miss something in the future?

Let's just punt on this for now.



v4-0001-Respect-permissions-within-logical-replication.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





psql format output

2021-12-15 Thread Florian Koch
Hello,

I realized that the output of "\df+ func_name" has a formatting problem
when a lot of arguments are used. The field 'Arguments data types' gets
very long and destroys the whole formatting in the console. The field
'Source code' is most of the time multi-line and I thought that the
output for the field 'Arguments data types' could also be multiline with
one line for each argument.

Regards,
Florian Koch


Re: Privilege required for IF EXISTS event if the object already exists

2021-12-15 Thread David G. Johnston
On Wednesday, December 15, 2021, Shay Rojansky  wrote:
>
> . Now, before creating tables, the ORM generates CREATE SCHEMA IF NOT
> EXISTS, to ensure that the schema exists before CREATE TABLE; that's
> reasonable general-purpose behavior.
>

If the user hasn't specified they want the schema created, it's questionable
whether executing CREATE SCHEMA anyway is reasonable.  The ORM user should
know whether they are expected/able to create the schema as part of their
responsibilities, instruct the ORM accordingly, and expect it to only
create what it has been explicitly directed to create.

David J.


Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-12-15 Thread Bruce Momjian
On Thu, Dec  2, 2021 at 07:19:50PM +0530, Dilip Kumar wrote:
From the patch:

> Currently, CREATE DATABASE forces a checkpoint, then copies all the files,
> then forces another checkpoint. The comments in the createdb() function
> explain the reasons for this. The attached patch fixes this problem by making
> create database completely WAL logged so that we can avoid the checkpoints.
> 
> This can also be useful for supporting the TDE. For example, if we need 
> different
> encryption for the source and the target database then we can not re-encrypt 
> the
> page data if we copy the whole directory.  But with this patch, we are copying
> page by page so we have an opportunity to re-encrypt the page before copying 
> that
> to the target database.

Uh, why is this true?  Why can't we just copy the heap/index files 8k at
a time and reencrypt them during the file copy, rather than using shared
buffers?
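
For illustration, that alternative would look roughly like the sketch below;
reencrypt_page() is a hypothetical TDE hook and error handling is pared down:

#include <stddef.h>
#include <unistd.h>

#define COPY_BLCKSZ 8192        /* standard heap/index block size */

/* Hypothetical hook: re-encrypt one page in place for the target database. */
extern void reencrypt_page(char *page, size_t len);

static int
copy_relfile_reencrypt(int src_fd, int dst_fd)
{
    char        buf[COPY_BLCKSZ];
    ssize_t     nread;

    while ((nread = read(src_fd, buf, COPY_BLCKSZ)) > 0)
    {
        reencrypt_page(buf, (size_t) nread);
        if (write(dst_fd, buf, (size_t) nread) != nread)
            return -1;          /* write failed */
    }

    return (nread < 0) ? -1 : 0;    /* -1 on read error, 0 on success */
}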

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: archive modules

2021-12-15 Thread Bossart, Nathan
On 11/2/21, 8:07 AM, "Bossart, Nathan"  wrote:
> The main motivation is provide a way to archive without shelling out.
> This reduces the amount of overhead, which can improve archival rate
> significantly.  It should also make it easier to archive more safely.
> For example, many of the common shell commands used for archiving
> won't fsync the data, but it isn't too hard to do so via C.  The
> current proposal doesn't introduce any extra infrastructure for
> batching or parallelism, but it is probably still possible.  I would
> like to eventually add batching, but for now I'm only focused on
> introducing basic archive module support.

As noted above, the latest patch set (v11) doesn't add any batching or
parallelism.  Now that beb4e9b is committed (which causes the archiver
to gather multiple files to archive in each scan of archive_status),
it seems like a good time to discuss this a bit further.  I think
there are some interesting design considerations.

As is, the current archive module infrastructure in the v11 patch set
should help reduce the amount of overhead per-file quite a bit, and I
observed a noticeable speedup with a basic file-copying archive
strategy (although this is likely not representative of real-world
workloads).  I believe it would be possible for archive module authors
to implement batching/parallelism, but AFAICT it would still require
hacks similar to what folks do today with archive_command.  For
example, you could look ahead in archive_status, archive a bunch of
files in a batch or in parallel with background workers, and then
quickly return true when the archive_library is called for later files
in the batch.

Alternatively, we could offer some kind of built-in batching support
in the archive module infrastructure.  One simple approach would be to
just have pgarch_readyXlog() optionally return the entire list of
files gathered from the directory scan of archive_status (presently up
to 64 files).  Or we could provide a GUC like archive_batch_size that
would allow users to limit how many files are sent to the
archive_library each time.  This list would be given to
pgarch_archiveXlog(), which would return which files were successfully
archived and which failed.  I think this could be done for
archive_command as well, although it might be tricky to determine
which files were archived successfully.  To handle that, we might just
need to fail the whole batch if the archive_command return value
indicates failure.
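
For illustration only, one possible shape for such a batched callback
(nothing like this exists today; the names and types are assumptions):

/*
 * Hypothetical batched archiving callback: receives the files gathered in
 * one scan of archive_status and reports per-file success, so the archiver
 * can flip .ready to .done individually.
 */
typedef bool (*ArchiveFileBatchCB) (const char **files,   /* WAL file names */
                                    int nfiles,           /* <= archive_batch_size */
                                    bool *archived);      /* out: one flag per file */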

Another interesting change is that the special timeline file handling
added in beb4e9b becomes less useful.  Presently, if a timeline
history file is marked ready for archival, we force pgarch_readyXlog()
to do a new scan of archive_status the next time it is called in order
to pick it up as soon as possible (ordinarily it just returns the
files gathered in a previous scan until it runs out).  If we are
sending a list of files to the archive module, it will be more
difficult to ensure timeline history files are picked up so quickly.
Perhaps this is a reasonable tradeoff to make when archive batching is
enabled.

I think the retry logic can stay roughly the same.  If any files in a
batch cannot be archived, wait a second before retrying.  If that
happens a few times in a row, stop archiving for a bit.  It wouldn't
be quite as precise as what's there today because the failures could
be for different files each time, but I don't know if that is terribly
important.

Finally, I wonder if batching support is something we should bother
with at all for the first round of archive module support.  I believe
it is something that could be easily added later, although it might
require archive modules to adjust the archiving callback to accept and
return a list of files.  IMO the archive modules infrastructure is
still an improvement even without batching, and it seems to fit nicely
into the existing behavior of the archiver process.  I'm curious what
others think about all this.

Nathan



Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-12-15 Thread Mark Dilger



> On Dec 15, 2021, at 10:02 AM, Joshua Brindle  
> wrote:
> 
> Ah, I was actually requesting a hook where the acl check was done for
> setting a GUC, such that we could deny setting them in a hook,
> something that would be useful for the set_user extension
> (github.com/pgaudit/set_user)

Hmm, this seems orthogonal to the patch under discussion.  This patch only adds 
a pg_setting_acl_aclcheck in ExecSetVariableStmt() for settings which have been 
explicitly granted, otherwise it works the traditional way (checking whether 
the setting is suset/userset).  I don't think you'd get MAC support without 
finding a way to fire the hook for all settings, regardless of their presence 
in the new pg_setting_acl table.  That is hard, because 
InvokeObjectPostAlterHook expects the classId (SettingAclRelationId) and the 
objectId (pg_setting_acl.oid), but you don't have those for many (most?) 
settings.  As discussed upthread, we *do not* want to force an entry into the 
table for all settings, only for ones that have been explicitly granted.

Do you agree?  I'm happy to support MAC in this patch if can explain a simple 
way of doing so.

> but having a hook for grant/revoke is
> also helpful.

Yes, I see no reason to rip this out.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Privilege required for IF EXISTS event if the object already exists

2021-12-15 Thread Shay Rojansky
> I would say it is reasonable in theory.  But I cannot think of an actual
scenario that would benefit from such a change.  Your stated use case is
rejected since you explicitly do not want tenants to be able to create
schemas - so the simple act of issuing "CREATE SCHEMA" is disallowed.
> [...]
> Because tenants are not allowed to CREATE SCHEMA you should replace
"CREATE SCHEMA" in the body of that DO block with "RAISE ERROR 'Schema foo
required but not present!';"  Or, just tell them to create objects in the
presumed present schema and let them see the "schema not found" error that
would occur in rare case the schema didn't exist.

The point here is when layers/ORMs are used, and are not necessarily aware
of the multi-tenant scenario. In my concrete real-world complaints here,
users instruct the ORM to generate the database schema for them. Now,
before creating tables, the ORM generates CREATE SCHEMA IF NOT EXISTS, to
ensure that the schema exists before CREATE TABLE; that's reasonable
general-purpose behavior (again, it does not know about multi-tenancy).
It's the user's responsibility to have already created the schema and
assigned rights to the right PG user, at which point everything could work
transparently (schema creation is skipped because it already exists, CREATE
TABLE succeeds).


Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-12-15 Thread Joshua Brindle
On Wed, Dec 15, 2021 at 10:56 AM Mark Dilger
 wrote:
>
>
>
> > On Dec 14, 2021, at 2:26 PM, Joshua Brindle 
> >  wrote:
> >
> > currently there is a failure in check-world (not sure if it's known):
>
> That one is definitely my fault.  'en_US.UTF-8' exists on my platform, so I 
> hadn't noticed.  I've changed it to use 'C', which should be portable.
>
> > One thing that seems like an omission to me is the absence of a
> > InvokeObjectPostAlterHook in pg_setting_acl_aclcheck or
> > pg_setting_acl_aclmask so that MAC extensions can also block this,
> > InvokeObjectPostCreateHook is already in the create path so a
> > PostAlter hook seems appropriate.
>
> Good catch, but that seems like a strange place to put a PostAlterHook, so I 
> added it to ExecGrant_Setting for v6, instead.  This seems more consistent 
> with the hook in SetDefaultACL.

Ah, I was actually requesting a hook where the acl check was done for
setting a GUC, such that we could deny setting them in a hook,
something that would be useful for the set_user extension
(github.com/pgaudit/set_user) but having a hook for grant/revoke is
also helpful.

> (If you are really trying to do Managed Access Control (MAC), wouldn't that 
> be a separate patch which adds security hooks into all *_aclcheck functions?)

MAC is mandatory access control, so something that can be layered on
top of DAC and is enforced even on object owners and
superusers. sepgsql is a MAC extension, for example. It uses the object
access hooks to enforce SELinux policy on PG objects.




Re: Allow escape in application_name

2021-12-15 Thread Fujii Masao



On 2021/12/10 16:35, kuroda.hay...@fujitsu.com wrote:

How about adding that new paragraph just after the existing one, instead?


Fixed.


Thanks for the fix! Attached is the updated version of 0001 patch.
I added "See  for details."
into the description. Barring any objection, I will commit
this patch at first.


Thanks! But, since the term "local server" is already used in the docs, we can 
use
"the setting value of application_name in local server" etc?


I like the word "local server," so I reworte descriptions.


Thanks! Attached is the updated version of 0002 patch. I applied
some cosmetic changes, improved comments and docs. Could you review
this version?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 41c952fbe3..f4210c878f 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -936,6 +936,16 @@ postgres=# SELECT postgres_fdw_disconnect_all();
   Note that change of this parameter doesn't affect any existing
   connections until they are re-established.
  
+ 
+ postgres_fdw.application_name can be any string
+ of any length and contain even non-ASCII characters.  However when
+ it's passed to and used as application_name
+ in a foreign server, note that it will be truncated to less than
+ NAMEDATALEN characters and any characters other
+ than printable ASCII ones in it will be replaced with question
+ marks (?).
+ See  for details.
+ 
 

   
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 6bac4ad23e..664487b013 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -348,6 +348,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
{
const char **keywords;
const char **values;
+   char   *appname = NULL;
int n;
 
/*
@@ -383,6 +384,39 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
n++;
}
 
+   /*
+* Search the parameter arrays to find application_name setting, and
+* replace escape sequences in it with status information if found.
+* The arrays are searched backwards because the last value is used if
+* application_name is repeatedly set.
+*/
+   for (int i = n - 1; i >= 0; i--)
+   {
+   if (strcmp(keywords[i], "application_name") == 0 &&
+   values[i] != NULL && *(values[i]) != '\0')
+   {
+   /*
+* Use this application_name setting if it's not empty string
+* even after any escape sequences in it are replaced.
+*/
+   appname = process_pgfdw_appname(values[i]);
+   if (appname[0] != '\0')
+   {
+   values[i] = appname;
+   break;
+   }
+
+   /*
+* This empty application_name is not used, so we set
+* values[i] to NULL and keep searching the array to find the
+* next one.
+*/
+   values[i] = NULL;
+   pfree(appname);
+   appname = NULL;
+   }
+   }
+
/* Use "postgres_fdw" as fallback_application_name */
keywords[n] = "fallback_application_name";
values[n] = "postgres_fdw";
@@ -452,6 +486,8 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
/* Prepare new session for use */
configure_remote_session(conn);
 
+   if (appname != NULL)
+   pfree(appname);
pfree(keywords);
pfree(values);
}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 5196e4797a..ca6ae6749b 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -10822,3 +10822,35 @@ ERROR:  invalid value for integer option "batch_size": 100$%$#$#
 ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
 ERROR:  invalid option "nonexistent"
 HINT:  There are no valid options in this context.
+-- ===
+-- test postgres_fdw.application_name GUC

Buildfarm support for older versions

2021-12-15 Thread Andrew Dunstan
OK, old_branches_of_interest.txt now exists on the buildfarm server, and
the code has been modified to take notice of it (i.e. to accept builds
for branches listed there). The contents are the non-live versions from
9.2 on.

I have set up a test buildfarm client (which will eventually report
under the name 'godwit') alongside crake (Fedora 34). So far testing has
run smoothly; there are only two glitches:

  * 9.3 and 9.2 don't have a show_dl_suffix make target. This would
require backpatching b40cb99b85 and d9cdb1ba9e. That's a tiny
change, and I propose to do it shortly unless there's an objection.
  * I need to undo the removal of client logic that supported 9.2's
unix_socket_directory setting as opposed to the later
unix_socket_directories.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)

2021-12-15 Thread tushar

On 12/15/21 12:09 AM, tushar wrote:

I spent much of today reviewing 0001. Here's an updated version, so
far only lightly tested. Please check whether I've broken anything.

Thanks Robert, I tested from v9.6/12/13/14 -> v15 (with patch);
things are working fine, i.e. table/index relfilenodes are preserved
and do not change after pg_upgrade.
I also covered tablespace OID testing scenarios, and those are preserved
after pg_upgrade as well.


--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company



Re: logical decoding and replication of sequences

2021-12-15 Thread Tomas Vondra



On 12/15/21 17:42, Alvaro Herrera wrote:
> Looking at 0003,
> 
> On 2021-Dec-14, Tomas Vondra wrote:
> 
>> diff --git a/doc/src/sgml/ref/alter_publication.sgml 
>> b/doc/src/sgml/ref/alter_publication.sgml
>> index bb4ef5e5e22..4d166ad3f9c 100644
>> --- a/doc/src/sgml/ref/alter_publication.sgml
>> +++ b/doc/src/sgml/ref/alter_publication.sgml
>> @@ -31,7 +31,9 @@ ALTER PUBLICATION > class="parameter">name RENAME TO >  where > class="parameter">publication_object is one of:
>>  
>>  TABLE [ ONLY ] table_name 
>> [ * ] [, ... ]
>> +SEQUENCE sequence_name [ * 
>> ] [, ... ]
>>  ALL TABLES IN SCHEMA { > class="parameter">schema_name | CURRENT_SCHEMA } [, ... ]
>> +ALL SEQUENCE IN SCHEMA { > class="parameter">schema_name | CURRENT_SCHEMA } [, ... ]
> 
> Note that this says ALL SEQUENCE; I think it should be ALL SEQUENCES.
> 
>> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
>> index 3d4dd43e47b..f037c17985b 100644
>> --- a/src/backend/parser/gram.y
>> +++ b/src/backend/parser/gram.y
>> @@ -9762,6 +9762,26 @@ PublicationObjSpec:
> ...
>> +| ALL SEQUENCE IN_P SCHEMA ColId
>> +{
>> +$$ = makeNode(PublicationObjSpec);
>> +$$->pubobjtype = 
>> PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA;
>> +$$->name = $5;
>> +$$->location = @5;
>> +}
>> +| ALL SEQUENCES IN_P SCHEMA CURRENT_SCHEMA
>> +{
>> +$$ = makeNode(PublicationObjSpec);
>> +$$->pubobjtype = 
>> PUBLICATIONOBJ_SEQUENCE_IN_CUR_SCHEMA;
>> +$$->location = @5;
>> +}
> 
> And here you have ALL SEQUENCE in one spot and ALL SEQUENCES in the
> other.
> 
> BTW I think these enum values should use the plural too,
> PUBLICATIONOBJ_SEQUENCES_IN_CUR_SCHEMA (not SEQUENCE).  I suppose you
> copied from PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA, but that too seems to be
> a mistake: should be PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA.
> 
> 
>> @@ -10097,6 +10117,12 @@ UnlistenStmt:
>>  }
>>  ;
>>  
>> +/*
>> + * FIXME
>> + *
>> + * opt_publication_for_sequences and publication_for_sequences should be
>> + * copies for sequences
>> + */
> 
> Not sure if this FIXME is relevant or should just be removed.
> 
>> @@ -10105,6 +10131,12 @@ UnlistenStmt:
>>   *  BEGIN / COMMIT / ROLLBACK
>>   *  (also older versions END / ABORT)
>>   *
>> + * ALTER PUBLICATION name ADD SEQUENCE sequence [, sequence2]
>> + *
>> + * ALTER PUBLICATION name DROP SEQUENCE sequence [, sequence2]
>> + *
>> + * ALTER PUBLICATION name SET SEQUENCE sequence [, sequence2]
>> + *
> 
> This comment addition seems misplaced?
> 
>> diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
>> index 2f412ca3db3..e30bf7b1b55 100644
>> --- a/src/bin/psql/tab-complete.c
>> +++ b/src/bin/psql/tab-complete.c
>> @@ -1647,13 +1647,13 @@ psql_completion(const char *text, int start, int end)
>>  COMPLETE_WITH("ADD", "DROP", "OWNER TO", "RENAME TO", "SET");
>>  /* ALTER PUBLICATION  ADD */
>>  else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD"))
>> -COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE");
>> +COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE|SEQUENCE");
>>  /* ALTER PUBLICATION  DROP */
>>  else if (Matches("ALTER", "PUBLICATION", MatchAny, "DROP"))
>> -COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE");
>> +COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE|SEQUENCE");
>>  /* ALTER PUBLICATION  SET */
>>  else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
>> -COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
>> +COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE|SEQUENCE");
> 
> I think you should also add "ALL SEQUENCES IN SCHEMA" to these lists.
> 
> 
>>  else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", 
>> "ALL", "TABLES", "IN", "SCHEMA"))
> 
> ... and perhaps make this "ALL", "TABLES|SEQUENCES", "IN", "SCHEMA".
> 

Thanks for the review. I'm aware 0003 is still incomplete and subject to
change - it's certainly not meant for commit yet. The current 0003 patch
is sufficient for testing the infrastructure, but we need to figure out
how to make it easier to use, what to do with implicit sequences and
similar things. Peter had some ideas in [1].

[1]
https://www.postgresql.org/message-id/359bf6d0-413d-292a-4305-e99eeafead39%40enterprisedb.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Privilege required for IF EXISTS even if the object already exists

2021-12-15 Thread Chapman Flack
On 12/15/21 11:10, David G. Johnston wrote:
>> IF NOT EXISTS(SELECT 1 FROM pg_namespace WHERE nspname = 'foo') THEN

Orthogonally to any other comments,

IF pg_catalog.to_regnamespace('foo') IS NULL THEN

might be tidier, if you don't need to support PG < 9.5.
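
As a minimal sketch, assuming the same hypothetical schema name "foo" as in
the quoted block and PG >= 9.5:

DO $$
BEGIN
    -- to_regnamespace() returns NULL when no such schema exists
    IF pg_catalog.to_regnamespace('foo') IS NULL THEN
        CREATE SCHEMA "foo";
    END IF;
END $$;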

Regards,
-Chap




Re: speed up text_position() for utf-8

2021-12-15 Thread John Naylor
I wrote:

> The test is attached and the test function is part of the patch. It's
> based on the test used in the commit above. The test searches for a
> string that's at the end of a ~1 million byte string. This is on gcc
> 11 with 2-3 runs to ensure repeatability, but I didn't bother with
> statistics because the differences are pretty big:
>
>   patch  | no match | ascii | multibyte
> -+--+---+---
>  PG11| 1120 |  1100 |   900
>  master  |  381 |  2350 |  1900
>  DFA |  386 |  1640 |  1640
>  branchless utf mblen|  387 |  4100 |  2600
>  inline pg_utf_mblen()   |  380 |  1080 |   920
>  inline pg_utf_mblen() + ascii fast path |  382 |   470 |   918

I failed to mention that the above numbers are milliseconds, so
smaller is better.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: logical decoding and replication of sequences

2021-12-15 Thread Alvaro Herrera
Looking at 0003,

On 2021-Dec-14, Tomas Vondra wrote:

> diff --git a/doc/src/sgml/ref/alter_publication.sgml 
> b/doc/src/sgml/ref/alter_publication.sgml
> index bb4ef5e5e22..4d166ad3f9c 100644
> --- a/doc/src/sgml/ref/alter_publication.sgml
> +++ b/doc/src/sgml/ref/alter_publication.sgml
> @@ -31,7 +31,9 @@ ALTER PUBLICATION  class="parameter">name RENAME TO   where  class="parameter">publication_object is one of:
>  
>  TABLE [ ONLY ] table_name [ 
> * ] [, ... ]
> +SEQUENCE sequence_name [ * 
> ] [, ... ]
>  ALL TABLES IN SCHEMA {  class="parameter">schema_name | CURRENT_SCHEMA } [, ... ]
> +ALL SEQUENCE IN SCHEMA {  class="parameter">schema_name | CURRENT_SCHEMA } [, ... ]

Note that this says ALL SEQUENCE; I think it should be ALL SEQUENCES.

> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index 3d4dd43e47b..f037c17985b 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y
> @@ -9762,6 +9762,26 @@ PublicationObjSpec:
...
> + | ALL SEQUENCE IN_P SCHEMA ColId
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->pubobjtype = 
> PUBLICATIONOBJ_SEQUENCE_IN_SCHEMA;
> + $$->name = $5;
> + $$->location = @5;
> + }
> + | ALL SEQUENCES IN_P SCHEMA CURRENT_SCHEMA
> + {
> + $$ = makeNode(PublicationObjSpec);
> + $$->pubobjtype = 
> PUBLICATIONOBJ_SEQUENCE_IN_CUR_SCHEMA;
> + $$->location = @5;
> + }

And here you have ALL SEQUENCE in one spot and ALL SEQUENCES in the
other.

BTW I think these enum values should use the plural too,
PUBLICATIONOBJ_SEQUENCES_IN_CUR_SCHEMA (not SEQUENCE).  I suppose you
copied from PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA, but that too seems to be
a mistake: should be PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA.


> @@ -10097,6 +10117,12 @@ UnlistenStmt:
>   }
>   ;
>  
> +/*
> + * FIXME
> + *
> + * opt_publication_for_sequences and publication_for_sequences should be
> + * copies for sequences
> + */

Not sure if this FIXME is relevant or should just be removed.

> @@ -10105,6 +10131,12 @@ UnlistenStmt:
>   *   BEGIN / COMMIT / ROLLBACK
>   *   (also older versions END / ABORT)
>   *
> + * ALTER PUBLICATION name ADD SEQUENCE sequence [, sequence2]
> + *
> + * ALTER PUBLICATION name DROP SEQUENCE sequence [, sequence2]
> + *
> + * ALTER PUBLICATION name SET SEQUENCE sequence [, sequence2]
> + *

This comment addition seems misplaced?

> diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
> index 2f412ca3db3..e30bf7b1b55 100644
> --- a/src/bin/psql/tab-complete.c
> +++ b/src/bin/psql/tab-complete.c
> @@ -1647,13 +1647,13 @@ psql_completion(const char *text, int start, int end)
>   COMPLETE_WITH("ADD", "DROP", "OWNER TO", "RENAME TO", "SET");
>   /* ALTER PUBLICATION  ADD */
>   else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD"))
> - COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE");
> + COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE|SEQUENCE");
>   /* ALTER PUBLICATION  DROP */
>   else if (Matches("ALTER", "PUBLICATION", MatchAny, "DROP"))
> - COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE");
> + COMPLETE_WITH("ALL TABLES IN SCHEMA", "TABLE|SEQUENCE");
>   /* ALTER PUBLICATION  SET */
>   else if (Matches("ALTER", "PUBLICATION", MatchAny, "SET"))
> - COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE");
> + COMPLETE_WITH("(", "ALL TABLES IN SCHEMA", "TABLE|SEQUENCE");

I think you should also add "ALL SEQUENCES IN SCHEMA" to these lists.


>   else if (Matches("ALTER", "PUBLICATION", MatchAny, "ADD|DROP|SET", 
> "ALL", "TABLES", "IN", "SCHEMA"))

... and perhaps make this "ALL", "TABLES|SEQUENCES", "IN", "SCHEMA".
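
For reference, with the plural form the grammar being reviewed would accept
statements along these lines (publication, sequence, and schema names are
hypothetical):

ALTER PUBLICATION my_pub ADD SEQUENCE public.my_seq;
ALTER PUBLICATION my_pub ADD ALL SEQUENCES IN SCHEMA public;
ALTER PUBLICATION my_pub DROP SEQUENCE public.my_seq;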


-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Nunca confiaré en un traidor.  Ni siquiera si el traidor lo he creado yo"
(Barón Vladimir Harkonnen)




Re: Privilege required for IF EXISTS even if the object already exists

2021-12-15 Thread David G. Johnston
On Wed, Dec 15, 2021 at 5:35 AM Shay Rojansky  wrote:

> Hi all,
>
> I've received numerous complaints about CREATE SCHEMA IF NOT EXISTS
> failing when the user lacks CREATE privileges on the database - even if the
> schema already exists. A typical scenario would be a multi-tenant
> schema-per-tenant setup, where the schema and tenant user are created
> beforehand, but then some database layer or ORM wants to ensure that the
> schema is there so the above is issued.
>
> Would it be reasonable to have the above no error if the schema already
> exists?
>

I would say it is reasonable in theory.  But I cannot think of an actual
scenario that would benefit from such a change.  Your stated use case is
rejected since you explicitly do not want tenants to be able to create
schemas - so the simple act of issuing "CREATE SCHEMA" is disallowed.

> That would make it similar to the following (which I'm switching to in the
> Entity Framework Core ORM):
>
> DO $$
> BEGIN
> IF NOT EXISTS(SELECT 1 FROM pg_namespace WHERE nspname = 'foo') THEN
> CREATE SCHEMA "foo";
> END IF;
> END $$;
>
>
Because tenants are not allowed to CREATE SCHEMA, you should replace "CREATE
SCHEMA" in the body of that DO block with "RAISE EXCEPTION 'Schema foo required
but not present!';".  Or, just tell them to create objects in the presumed
present schema and let them see the "schema not found" error that would
occur in the rare case the schema didn't exist.
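
A minimal sketch of that alternative, reusing the hypothetical schema name
"foo" from the quoted block:

DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_namespace WHERE nspname = 'foo') THEN
        RAISE EXCEPTION 'Schema foo required but not present!';
    END IF;
END $$;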

David J.


Re: Add id's to various elements in protocol.sgml

2021-12-15 Thread Brar Piening

On Dec 15, 2021 at 15:49, Alvaro Herrera wrote:

On 2021-Dec-15, Brar Piening wrote:

Since I can't argue towards some general utility for the xreflabels
and don't have any other solid argument in favor of adding more, I
will remove them from my current patch but leave the existing ones
intact.

Yeah, I think not adding them until we have a use for them might be
wisest.

A new version of the patch that doesn't add xreflabels is attached.
Thanks for your support.
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 34a7034282..0bca5831b1 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1810,7 +1810,7 @@ Replication commands are logged in the server log when
 
 The commands accepted in replication mode are:
 
-  
+  
 IDENTIFY_SYSTEM
  IDENTIFY_SYSTEM
 
@@ -1875,7 +1875,7 @@ The commands accepted in replication mode are:
 
   
 
-  
+  
 SHOW name
  SHOW
 
@@ -1899,7 +1899,7 @@ The commands accepted in replication mode are:
 
   
 
-  
+  
 TIMELINE_HISTORY tli
  TIMELINE_HISTORY
 
@@ -2084,7 +2084,7 @@ The commands accepted in replication mode are:
 
   
 
-  
+  
 CREATE_REPLICATION_SLOT slot_name [ TEMPORARY ] { 
PHYSICAL [ RESERVE_WAL ] | 
LOGICAL output_plugin [ 
EXPORT_SNAPSHOT | NOEXPORT_SNAPSHOT | 
USE_SNAPSHOT | TWO_PHASE ] }
 
 
@@ -2095,7 +2095,7 @@ The commands accepted in replication mode are:
 
   
 
-  
+  
 READ_REPLICATION_SLOT slot_name
   READ_REPLICATION_SLOT
 
@@ -2143,7 +2143,7 @@ The commands accepted in replication mode are:
 
   
 
-  
+  
 START_REPLICATION [ SLOT 
slot_name ] [ 
PHYSICAL ] XXX/XXX [ TIMELINE 
tli ]
  START_REPLICATION
 
@@ -2201,7 +2201,7 @@ The commands accepted in replication mode are:
 
  
   
-  
+  
   
   XLogData (B)
   
@@ -2270,7 +2270,7 @@ The commands accepted in replication mode are:
   
   
   
-  
+  
   
   Primary keepalive message (B)
   
@@ -2334,7 +2334,7 @@ The commands accepted in replication mode are:
 
  
   
-  
+  
   
   Standby status update (F)
   
@@ -2415,7 +2415,7 @@ The commands accepted in replication mode are:
 
  
   
-  
+  
   
   Hot Standby feedback message (F)
   
@@ -2497,7 +2497,7 @@ The commands accepted in replication mode are:
  
 
   
-  
+  
 START_REPLICATION SLOT 
slot_name 
LOGICAL XXX/XXX 
[ ( option_name [ 
option_value ] [, ...] ) ]
 
  
@@ -2572,7 +2572,7 @@ The commands accepted in replication mode are:
 
   
 
-  
+  
 
  DROP_REPLICATION_SLOT slot_name  WAIT 

  DROP_REPLICATION_SLOT
@@ -2886,7 +2886,7 @@ The commands accepted in replication mode are:
 
   
 
-  
+  
 BASE_BACKUP [ LABEL 
'label' ] [ PROGRESS ] [ 
FAST ] [ WAL ] [ 
NOWAIT ] [ MAX_RATE 
rate ] [ TABLESPACE_MAP ] [ 
NOVERIFY_CHECKSUMS ] [ MANIFEST 
manifest_option ] [ 
MANIFEST_CHECKSUMS 
checksum_algorithm ]
 
 
@@ -3138,7 +3138,7 @@ of any individual CopyData message cannot be 
interpretable on their own.)
 
 
 
-
+
 
 AuthenticationOk (B)
 
@@ -3183,7 +3183,7 @@ AuthenticationOk (B)
 
 
 
-
+
 
 AuthenticationKerberosV5 (B)
 
@@ -3227,7 +3227,7 @@ AuthenticationKerberosV5 (B)
 
 
 
-
+
 
 AuthenticationCleartextPassword (B)
 
@@ -3271,7 +3271,7 @@ AuthenticationCleartextPassword (B)
 
 
 
-
+
 
 AuthenticationMD5Password (B)
 
@@ -3326,7 +3326,7 @@ AuthenticationMD5Password (B)
 
 
 
-
+
 
 AuthenticationSCMCredential (B)
 
@@ -3371,7 +3371,7 @@ AuthenticationSCMCredential (B)
 
 
 
-
+
 
 AuthenticationGSS (B)
 
@@ -3416,7 +3416,7 @@ AuthenticationGSS (B)
 
 
 
-
+
 
 AuthenticationGSSContinue (B)
 
@@ -3471,7 +3471,7 @@ AuthenticationGSSContinue (B)
 
 
 
-
+
 
 AuthenticationSSPI (B)
 
@@ -3516,7 +3516,7 @@ AuthenticationSSPI (B)
 
 
 
-
+
 
 AuthenticationSASL (B)
 
@@ -3577,7 +3577,7 @@ following:
 
 
 
-
+
 
 AuthenticationSASLContinue (B)
 
@@ -3632,7 +3632,7 @@ AuthenticationSASLContinue (B)
 
 
 
-
+
 
 AuthenticationSASLFinal (B)
 
@@ -3688,7 +3688,7 @@ AuthenticationSASLFinal (B)
 
 
 
-
+
 
 BackendKeyData (B)
 
@@ -3745,7 +3745,7 @@ BackendKeyData (B)
 
 
 
-
+
 
 Bind (F)
 
@@ -3898,7 +3898,7 @@ Bind (F)
 
 
 
-
+
 
 BindComplete (B)
 
@@ -3933,7 +3933,7 @@ BindComplete (B)
 
 
 
-
+
 
 CancelRequest (F)
 
@@ -3991,7 +3991,7 @@ CancelRequest (F)
 
 
 
-
+
 
 Close (F)
 
@@ -4048,7 +4048,7 @@ Close (F)
 
 
 
-
+
 
 CloseComplete (B)
 
@@ -4083,7 +4083,7 @@ CloseComplete (B)
 
 
 
-
+
 
 CommandComplete (B)
 
@@ -4182,7 +4182,7 @@ CommandComplete (B)
 
 
 
-
+
 
 CopyData (F  B)
 
@@ -4228,7 +4228,7 @@ CopyData (F  B)
 
 
 
-
+
 
 CopyDone (F  B)
 
@@ -4263,7 +4263,7 @@ CopyDone (F  B)
 
 
 
-
+
 
 CopyFail (F)
 
@@ -4308,7 +4308,7 @@ CopyFail (F)
 
 
 
-
+
 
 CopyInResponse (B)
 
@@ -4384,7 +4384,7 @@ CopyInResponse (B)
 
 
 
-
+
 
 CopyOutResponse (B)
 
@@ -4457,7 +4457,7 @@ CopyOutResponse (B)
 
 
 
-
+
 

Re: conchuela has some SSL issues

2021-12-15 Thread Mikael Kjellström




On 2021-12-14 15:05, Mikael Kjellström wrote:

I will take a look at it.

At the moment I have disabled new builds until I have fixed it.

Sorry for the inconvenience.


Should be fixed now.

/Mikael





Re: Adding CI to our tree

2021-12-15 Thread Daniel Gustafsson
> On 15 Dec 2021, at 16:21, Tom Lane  wrote:
> 
> Justin Pryzby  writes:
>> On Mon, Dec 13, 2021 at 03:45:23PM -0800, Andres Freund wrote:
>>> On 2021-12-13 18:14:52 -0500, Tom Lane wrote:
 I'm with Justin on this one.  I would view a script trying to
 mess with /cores as a hostile act.
> 
>>> I'm not quite following. This is an ephemeral CI instance?
> 
>> As for myself, all I meant is that it's better to write it with zero sudos 
>> than
>> one (for the same reason that it's better to write with one than with two).
> 
> What I'm concerned about is that it's unsafe to run the script in
> any non-throwaway environment.  That doesn't seem desirable.

I don't think anyone should expect any part of the .cirrus.yml script to be
safe or applicable to a local environment, so I think this is something we can
solve with documentation.

--
Daniel Gustafsson   https://vmware.com/





Re: Privilege required for IF EXISTS even if the object already exists

2021-12-15 Thread Tom Lane
Shay Rojansky  writes:
> I've received numerous complaints about CREATE SCHEMA IF NOT EXISTS failing
> when the user lacks CREATE privileges on the database - even if the schema
> already exists. A typical scenario would be a multi-tenant
> schema-per-tenant setup, where the schema and tenant user are created
> beforehand, but then some database layer or ORM wants to ensure that the
> schema is there so the above is issued.

> Would it be reasonable to have the above no error if the schema already
> exists?

Ummm ... why?  What's the point of issuing such a command from a role
that lacks the privileges to actually do the creation?  It seems to
me that you're asking us to design around very-badly-written apps.

> The same could apply to other CREATE ... IF NOT EXISTS variations.

Yeah, it would only make sense if we did it across the board.
For all of them, though, this seems like it'd just move the needle
even further in terms of not having certainty about the properties
of the object.  I'll spare you my customary rant about that, and
just note that not knowing who owns a schema you're using is a
large security hazard.

regards, tom lane




Re: generalized conveyor belt storage

2021-12-15 Thread Robert Haas
On Wed, Dec 15, 2021 at 10:03 AM Matthias van de Meent
 wrote:
> > +Conceptually, a relation fork organized as a conveyor belt has three parts:
> > +
> > +- Payload. The payload is whatever data the user of this module wishes
> > +  to store. The conveyor belt doesn't care what you store in a payload 
> > page,
> > +  but it does require that you store something: each time a payload page is
> > +  initialized, it must end up with either pd_lower > SizeOfPageHeaderData,
> > +  or pd_lower < BLCKSZ.
>
> As SizeOfPageHeaderData < BLCKSZ, isn't this condition always true? Or
> at least, this currently allows for either any value of pd_lower, or
> the (much clearer) 'pd_lower <= SizeOfPageHeaderData or pd_lower >=
> BLCKSZ', depending on exclusiveness of the either_or clause.

The second part of the condition should say pd_upper < BLCKSZ, not
pd_lower. Woops. It's intended to be false for uninitialized pages and
also for pages that are initialized but completely empty with no line
pointers and no special space.

> You mentioned that this is meant to be used as a "relation fork", but
> I couldn't find new code in relpath.h (where ForkNumber etc. are
> defined) that allows one more fork per relation. Is that too on the
> missing features list, or did I misunderstand what you meant with
> "relation fork"?

It's up to whoever is using the code to provide the relation fork -
and it could be the main fork of some new kind of relation, or it
could use some other existing fork number, or it could be an entirely
new fork. But it would be up to the calling code to figure that stuff
out.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Adding CI to our tree

2021-12-15 Thread Tom Lane
Justin Pryzby  writes:
> On Mon, Dec 13, 2021 at 03:45:23PM -0800, Andres Freund wrote:
>> On 2021-12-13 18:14:52 -0500, Tom Lane wrote:
>>> I'm with Justin on this one.  I would view a script trying to
>>> mess with /cores as a hostile act.

>> I'm not quite following. This is an ephemeral CI instance?

> As for myself, all I meant is that it's better to write it with zero sudos 
> than
> one (for the same reason that it's better to write with one than with two).

What I'm concerned about is that it's unsafe to run the script in
any non-throwaway environment.  That doesn't seem desirable.

regards, tom lane




Re: Probable memory leak with ECPG and AIX

2021-12-15 Thread Benoit Lobréau
Our client confirmed that the more he fetches, the more memory is consumed.
The segfault was indeed caused by the absence of LDR_CNTRL.

The tests show that:

* without LDR_CNTRL, we reach 256MB and segfault;
* with LDR_CNTRL=MAXDATA=0x1000, we reach 256MB but there is no
segfault, the program just continues running;
* with LDR_CNTRL=MAXDATA=0x8000, we reach 2GB and there is no segfault
either, the program just continues running.


Re: generalized conveyor belt storage

2021-12-15 Thread Matthias van de Meent
On Wed, 15 Dec 2021 at 00:01, Robert Haas  wrote:
>
> Hi!
[...]
> So here's a patch. Basically, it lets you initialize a relation fork
> as a "conveyor belt," and then you can add pages of basically
> arbitrary data to the conveyor belt and then throw away old ones and,
> modulo bugs, it will take care of recycling space for you. There's a
> fairly detailed README in the patch if you want a more detailed
> description of how the whole thing works.

I was reading through this README when I hit the following:

> +Conceptually, a relation fork organized as a conveyor belt has three parts:
> +
> +- Payload. The payload is whatever data the user of this module wishes
> +  to store. The conveyor belt doesn't care what you store in a payload page,
> +  but it does require that you store something: each time a payload page is
> +  initialized, it must end up with either pd_lower > SizeOfPageHeaderData,
> +  or pd_lower < BLCKSZ.

As SizeOfPageHeaderData < BLCKSZ, isn't this condition always true? Or
at least, this currently allows for either any value of pd_lower, or
the (much clearer) 'pd_lower <= SizeOfPageHeaderData or pd_lower >=
BLCKSZ', depending on exclusiveness of the either_or clause.

>  It's missing some features
> that I want it to have: for example, I'd like to have on-line
> compaction, where whatever logical page numbers of data currently
> exist can be relocated to lower physical page numbers thus allowing
> you to return space to the operating system, hopefully without
> requiring a strong heavyweight lock. But that's not implemented yet,
> and it's also missing a few other things, like test cases, performance
> results, more thorough debugging, better write-ahead logging
> integration, and some code to use it to do something useful. But
> there's enough here, I think, for you to form an opinion about whether
> you think this is a reasonable direction, and give any design-level
> feedback that you'd like to give. My colleagues Dilip Kumar and Mark
> Dilger have contributed to this effort with some testing help, but all
> the code in this patch is mine.

You mentioned that this is meant to be used as a "relation fork", but
I couldn't find new code in relpath.h (where ForkNumber etc. are
defined) that allows one more fork per relation. Is that too on the
missing features list, or did I misunderstand what you meant with
"relation fork"?

Kind regards,

Matthias van de Meent




Re: Skipping logical replication transactions on subscriber side

2021-12-15 Thread Masahiko Sawada
On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila  wrote:
>
> On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada  wrote:
> >
> > On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow  wrote:
> > >
> > > On Tue, Dec 14, 2021 at 3:23 PM vignesh C  wrote:
> > > >
> > > > While the worker is skipping one of the skip transactions specified by
> > > > the user and immediately if the user specifies another skip
> > > > transaction while the skipping of the transaction is in progress this
> > > > new value will be reset by the worker while clearing the skip xid. I
> > > > felt once the worker has identified the skip xid and is about to skip
> > > > the xid, the worker can acquire a lock to prevent concurrency issues:
> > >
> > > That's a good point.
> > > If only the last_error_xid could be skipped, then this wouldn't be an
> > > issue, right?
> > > If a different xid to skip is specified while the worker is currently
> > > skipping a transaction, should that even be allowed?
> > >
> >
> > We don't expect such usage but yes, it could happen and seems not
> > good. I thought we can acquire Share lock on pg_subscription during
> > the skip but not sure it's a good idea. It would be better if we can
> > find a way to allow users to specify only XID that has failed.
> >
>
> Yeah, but as we don't have a definite way to allow specifying only
> failed XID, I think it is better to use share lock on that particular
> subscription. We are already using it for add/update rel state (see,
> AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
> another place to use a similar technique.

Yes, but it seems to mean that we disallow users from changing skip_xid
while the apply worker is skipping changes, so we will end up having
the same problem we have discussed so far.

In the current patch, we don't clear skip_xid at prepare time but do
that at commit-prepare time. But we cannot keep holding the lock until
commit-prepared comes because we don’t know when commit-prepared
comes. It’s possible that another conflict occurs before the
commit-prepared comes. We also cannot only clear skip_xid at prepare
time because it doesn’t solve the concurrency problem at
commit-prepared time. So if my understanding is correct, we need to
both clear skip_xid and unlock the lock at prepare time, and commit
the prepared (empty) transaction at commit-prepared time (I assume
that we prepare even empty transactions).

Suppose that at prepare time, we clear skip_xid (and release the lock)
and then prepare the transaction, if the server crashes right after
clearing skip_xid, skip_xid is already cleared but the transaction
will be sent again. The user has to specify skip_xid again. So let’s
change the order; we prepare the transaction and then clear skip_xid.
But if the server crashes between them, the transaction won’t be sent
again, but skip_xid is left. The user has to clear it. The left
skip_xid can automatically be cleared at commit-prepared time if XID
in the commit-prepared message matches skip_xid, but this actually
doesn’t solve the concurrency problem. If the user changed skip_xid
before commit-prepared, we would end up clearing the value. So we
might want to hold the lock until we clear skip_xid but we want to
avoid that as I explained first. It seems like we entered a loop.

It sounds better among these ideas that we clear skip_xid and then
prepare the transaction. Or we might want to revisit the idea of
storing skip_xid on shmem (e.g., ReplicationState) instead of the
catalog.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add id's to various elements in protocol.sgml

2021-12-15 Thread Alvaro Herrera
On 2021-Dec-15, Brar Piening wrote:

> On Dec 14, 2021 at 20:47, Alvaro Herrera wrote:
> > 
> > Hmm, I think we tend to avoid xreflabels; see
> > https://www.postgresql.org/message-id/8315c0ca-7758-8823-fcb6-f37f9413e...@2ndquadrant.com
> 
> Ok, thank you for the hint.
> I added them because  doesn't automatically generate
> labels and they were present in the current docs for
> CREATE_REPLICATION_SLOT
> (https://github.com/postgres/postgres/blob/22bd3cbe0c284758d7174321f5596763095cdd55/doc/src/sgml/protocol.sgml#L1944).

Hmm, now that you mention it, we do have xreflabels for varlistentrys in
quite a few places.  Maybe we need to update README.links to mention
this.

> Since I can't argue towards some general utility for the xreflabels and
> don't have any other solid argument in favor of adding more, I will
> remove them from my current patch but leave the existing ones intact.

Yeah, I think not adding them until we have a use for them might be
wisest.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"La vida es para el que se aventura"




[PATCH] sort leaf pages by ctid for gist indexes built using sorted method

2021-12-15 Thread Aliaksandr Kalenik
Hi!

With the current implementation, for GiST indexes created by doing multiple
inserts, index tuples follow heap tuple order, but it doesn't work that way
for the sorted method, where index tuples on all levels are ordered using the
comparator provided in sortsupport (z-order for the geometry type, for
example). This means two tuples that are on the same heap page can be far
apart from one another on an index page, so the heap page may be read
twice and prefetch performance will degrade.
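
As a hypothetical illustration of when the sorted build path is taken, here is
a sketch using the in-core point type (which has GiST sortsupport since PG 14;
table and index names are made up, and the example above uses the PostGIS
geometry type instead):

CREATE TABLE pts AS
    SELECT point(random() * 1000, random() * 1000) AS p
    FROM generate_series(1, 1000000);
CREATE INDEX pts_gist_idx ON pts USING gist (p);  -- built via the sorted method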

I've created a patch intended to improve that by sorting index tuples by
heap tuples TID order on leaf pages.


gist_sortsupport_sort_leaf_pages_by_ctid.patch
Description: Binary data


Re: Adding CI to our tree

2021-12-15 Thread Justin Pryzby
On Mon, Dec 13, 2021 at 03:45:23PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2021-12-13 18:14:52 -0500, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2021-12-13 16:02:50 -0600, Justin Pryzby wrote:
> > >> sudo is used exactly twice; maybe it's not needed at all ?
> >
> > > The macos one is needed, but the freebsd one indeed isn't.
> >
> > I'm with Justin on this one.  I would view a script trying to
> > mess with /cores as a hostile act.  PG cores on macOS tend to
> > be extremely large and can fill up your disk fairly quickly
> > if you don't know they're being accumulated.  I think it's okay
> > to suggest in the documentation that people might want to allow
> > cores to be dropped, but the script has NO business trying to
> > force that.
> 
> I'm not quite following. This is an ephemeral CI instance?

As for myself, all I meant is that it's better to write it with zero sudos than
one (for the same reason that it's better to write with one than with two).

-- 
Justin




Re: port conflicts when running tests concurrently on windows.

2021-12-15 Thread Peter Eisentraut

On 13.12.21 20:33, Andres Freund wrote:

pg_regress.c at remove_temp() is still there.  These things should probably
be addressed before we can consider making this the default.


Hm, not immediately obvious what to do about this. Do you know if windows has
restrictions around the length of unix domain sockets? If not, I wonder if it
could be worth using the data directory as the socket path on windows instead
of the separate temp directory?


According to src/include/port/win32.h, it's the same 108 or so bytes 
that everyone else uses.  That might not be the kernel limit, however, 
just the way the external API is defined.


After reading the material again, I think that comment might be 
overly cautious.  The list of things not to do in a signal handler on 
Windows is not significantly different than those for say Linux, yet we 
do a lot of them anyway.  I'm tempted to just ignore the advice and do 
it anyway, while ignoring errors, which is what it already does.





Re: logical decoding and replication of sequences

2021-12-15 Thread Tomas Vondra



On 12/15/21 14:51, Tomas Vondra wrote:
> On 12/15/21 14:20, Amit Kapila wrote:
>> On Tue, Dec 14, 2021 at 7:02 AM Tomas Vondra
>>  wrote:
>>>
>>> Hi,
>>>
>>> here's an updated version of the patches, dealing with almost all of the
>>> issues (at least in the 0001 and 0002 parts). The main changes:
>>>
>>> 1) I've removed the  'created' flag from fill_seq_with_data, as
>>> discussed. I don't think it's needed by any of the parts (not even 0003,
>>> AFAICS). We still need it in xl_seq_rec, though.
>>>
>>> 2) GetCurrentTransactionId() added to sequence.c are called only with
>>> wal_level=logical, to minimize the overhead.
>>>
>>>
>>> There's still one remaining problem, that I already explained in [1].
>>> The problem is that with this:
>>>
>>>   BEGIN;
>>>   SELECT nextval('s') FROM generate_series(1,100);
>>>   ROLLBACK;
>>>
>>>
>>> The root cause is that pg_current_wal_lsn() uses the LogwrtResult.Write,
>>> which is updated by XLogFlush() - but only in RecordTransactionCommit.
>>> Which makes sense, because only the committed stuff is "visible".
>>>
>>> But the non-transactional behavior of sequence decoding disagrees with
>>> this, because now some of the changes from aborted transactions may be
>>> replicated. Which means the wait_for_catchup() ends up not waiting for
>>> the sequence change to be replicated. This is an issue for tests in
>>> patch 0003, at least.
>>>
>>> My concern is this actually affects other places waiting for things
>>> getting replicated :-/
>>>
>>
>> By any chance, will this impact synchronous replication as well which
>> waits for commits to be replicated?
>>
> 
> Physical or logical replication? Physical is certainly not replicated.
> 
> For logical replication, it's more complicated.
> 

Apologies, sent too early ... I think it's more complicated for logical
sync replication, because of a scenario like this:

  BEGIN;
  SELECT nextval('s') FROM generate_series(1,100);  <-- writes WAL
  ROLLBACK;

  SELECT nextval('s');

The first transaction advances the sequence enough to generate a WAL,
which we do every 32 values. But it's rolled back, so it does not update
LogwrtResult.Write, because that happens only at commit.

And then the nextval() generates a value from the sequence without
generating WAL, so it doesn't update the LSN either (IIRC). That'd mean
a sync replication may not wait for this change to reach the subscriber.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: logical decoding and replication of sequences

2021-12-15 Thread Tomas Vondra
On 12/15/21 14:20, Amit Kapila wrote:
> On Tue, Dec 14, 2021 at 7:02 AM Tomas Vondra
>  wrote:
>>
>> Hi,
>>
>> here's an updated version of the patches, dealing with almost all of the
>> issues (at least in the 0001 and 0002 parts). The main changes:
>>
>> 1) I've removed the  'created' flag from fill_seq_with_data, as
>> discussed. I don't think it's needed by any of the parts (not even 0003,
>> AFAICS). We still need it in xl_seq_rec, though.
>>
>> 2) GetCurrentTransactionId() added to sequence.c are called only with
>> wal_level=logical, to minimize the overhead.
>>
>>
>> There's still one remaining problem, that I already explained in [1].
>> The problem is that with this:
>>
>>   BEGIN;
>>   SELECT nextval('s') FROM generate_series(1,100);
>>   ROLLBACK;
>>
>>
>> The root cause is that pg_current_wal_lsn() uses the LogwrtResult.Write,
>> which is updated by XLogFlush() - but only in RecordTransactionCommit.
>> Which makes sense, because only the committed stuff is "visible".
>>
>> But the non-transactional behavior of sequence decoding disagrees with
>> this, because now some of the changes from aborted transactions may be
>> replicated. Which means the wait_for_catchup() ends up not waiting for
>> the sequence change to be replicated. This is an issue for tests in
>> patch 0003, at least.
>>
>> My concern is this actually affects other places waiting for things
>> getting replicated :-/
>>
> 
> By any chance, will this impact synchronous replication as well which
> waits for commits to be replicated?
> 

Physical or logical replication? Physical is certainly not replicated.

For logical replication, it's more complicated.

> How is this patch dealing with prepared transaction case where at
> prepare we will send transactional changes and then later if rollback
> prepared happens then the publisher will rollback changes related to
> new relfilenode but subscriber would have already replayed the updated
> seqval change which won't be rolled back?
> 

I'm not sure what exact scenario you are describing, but in general we
don't rollback sequence changes anyway, so this should not cause any
divergence between the publisher and subscriber.

Or are you talking about something like ALTER SEQUENCE? I think that
should trigger the transactional behavior for the new relfilenode, so
the subscriber won't see that changes.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: logical decoding and replication of sequences

2021-12-15 Thread Peter Eisentraut

On 14.12.21 02:31, Tomas Vondra wrote:

There's still one remaining problem, that I already explained in [1].
The problem is that with this:

   BEGIN;
   SELECT nextval('s') FROM generate_series(1,100);
   ROLLBACK;


The root cause is that pg_current_wal_lsn() uses the LogwrtResult.Write,
which is updated by XLogFlush() - but only in RecordTransactionCommit.
Which makes sense, because only the committed stuff is "visible".

But the non-transactional behavior of sequence decoding disagrees with
this, because now some of the changes from aborted transactions may be
replicated. Which means the wait_for_catchup() ends up not waiting for
the sequence change to be replicated.


I can't think of a reason why this might be a problem.




Re: logical decoding and replication of sequences

2021-12-15 Thread Amit Kapila
On Tue, Dec 14, 2021 at 7:02 AM Tomas Vondra
 wrote:
>
> Hi,
>
> here's an updated version of the patches, dealing with almost all of the
> issues (at least in the 0001 and 0002 parts). The main changes:
>
> 1) I've removed the  'created' flag from fill_seq_with_data, as
> discussed. I don't think it's needed by any of the parts (not even 0003,
> AFAICS). We still need it in xl_seq_rec, though.
>
> 2) GetCurrentTransactionId() added to sequence.c are called only with
> wal_level=logical, to minimize the overhead.
>
>
> There's still one remaining problem, that I already explained in [1].
> The problem is that with this:
>
>   BEGIN;
>   SELECT nextval('s') FROM generate_series(1,100);
>   ROLLBACK;
>
>
> The root cause is that pg_current_wal_lsn() uses the LogwrtResult.Write,
> which is updated by XLogFlush() - but only in RecordTransactionCommit.
> Which makes sense, because only the committed stuff is "visible".
>
> But the non-transactional behavior of sequence decoding disagrees with
> this, because now some of the changes from aborted transactions may be
> replicated. Which means the wait_for_catchup() ends up not waiting for
> the sequence change to be replicated. This is an issue for tests in
> patch 0003, at least.
>
> My concern is this actually affects other places waiting for things
> getting replicated :-/
>

By any chance, will this impact synchronous replication as well which
waits for commits to be replicated?

How is this patch dealing with prepared transaction case where at
prepare we will send transactional changes and then later if rollback
prepared happens then the publisher will rollback changes related to
new relfilenode but subscriber would have already replayed the updated
seqval change which won't be rolled back?

--
With Regards,
Amit Kapila.




Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace

2021-12-15 Thread Gilles Darold

Le 15/12/2021 à 13:41, Peter Eisentraut a écrit :

On 03.08.21 19:10, Tom Lane wrote:

Gilles Darold  writes:

Sorry I have missed that, but I'm fine with this implementation so let's
keep the v6 version of the patch and drop this one.


Pushed, then.  There's still lots of time to tweak the behavior of 
course.


I have a documentation follow-up to this.  It seems that these new 
functions are almost a de facto standard, whereas the SQL-standard 
functions are not implemented anywhere.  I propose the attached patch 
to update the subsection in the pattern-matching section to give more 
detail on this and suggest equivalent functions among these newly 
added ones.  What do you think?



I'm in favor of applying your changes to the documentation. It is a good thing
to clarify the relation between this implementation of the regexp_*
functions and the SQL standard.
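
For reference, a few quick usage sketches of the newly added functions that the
proposed equivalence table points to (outputs shown in comments are the
expected results, not re-verified here):

SELECT regexp_like('hello world', '^hello');          -- true
SELECT regexp_count('hello world', 'o');              -- 2
SELECT regexp_instr('hello world', 'o');              -- 5
SELECT regexp_substr('hello world', 'w.*');           -- world
SELECT regexp_replace('hello world', 'o', '0', 'g');  -- hell0 w0rld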


--
Gilles Darold






Re: Failed transaction statistics to measure the logical replication progress

2021-12-15 Thread vignesh C
On Tue, Dec 14, 2021 at 7:58 AM Amit Kapila  wrote:
>
> On Mon, Dec 13, 2021 at 5:48 PM osumi.takami...@fujitsu.com
>  wrote:
> >
> > On Monday, December 13, 2021 6:19 PM Amit Kapila  
> > wrote:
> > > On Tue, Dec 7, 2021 at 3:12 PM osumi.takami...@fujitsu.com
> > >  wrote:
> > >
> > > Few questions and comments:
> > Thank you for your comments !
> >
> > > 
> > > 1.
> > > The pg_stat_subscription_workers view will
> > > contain
> > > one row per subscription worker on which errors have occurred, for 
> > > workers
> > > applying logical replication changes and workers handling the initial 
> > > data
> > > -   copy of the subscribed tables.  The statistics entry is removed when 
> > > the
> > > -   corresponding subscription is dropped.
> > > +   copy of the subscribed tables. Also, the row corresponding to the 
> > > apply
> > > +   worker shows all transaction statistics of both types of workers on 
> > > the
> > > +   subscription. The statistics entry is removed when the corresponding
> > > +   subscription is dropped.
> > >
> > > Why did you choose to show stats for both types of workers in one row?
> > This is because if we have hundreds or thousands of tables for table sync,
> > we need to create many entries to cover them and store the entries for all 
> > tables.
> >
>
> If we fear a large number of entries for such workers then won't it be
> better to show the value of these stats only for apply workers. I
> think normally the table sync workers perform only copy operation or
> maybe a fixed number of xacts, so, one might not be interested in the
> transaction stats of these workers. I find merging only specific stats
> of two different types of workers confusing.
>
> What do others think about this?

We can remove the table sync workers' transaction stats count to avoid
confusion; please take care of the documentation changes accordingly as well.

Regards,
Vignesh




Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace

2021-12-15 Thread Peter Eisentraut

On 03.08.21 19:10, Tom Lane wrote:

Gilles Darold  writes:

Sorry I have missed that, but I'm fine with this implementation so let's
keep the v6 version of the patch and drop this one.


Pushed, then.  There's still lots of time to tweak the behavior of course.


I have a documentation follow-up to this.  It seems that these new 
functions are almost a de facto standard, whereas the SQL-standard 
functions are not implemented anywhere.  I propose the attached patch to 
update the subsection in the pattern-matching section to give more 
detail on this and suggest equivalent functions among these newly added 
ones.  What do you think?

From a2dbd0e24a30b945a5d641ed773dc44f5e6b50c1 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 15 Dec 2021 11:02:59 +0100
Subject: [PATCH] doc: More documentation on regular expressions and SQL
 standard

---
 doc/src/sgml/func.sgml   | 91 +---
 src/backend/catalog/sql_features.txt | 10 +--
 2 files changed, 88 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5801299b27..e58efce586 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -7353,10 +7353,26 @@ Basic Regular Expressions
 
 

-   Differences from XQuery (LIKE_REGEX)
+   Differences from SQL Standard and XQuery
 

-LIKE_REGEX
+LIKE_REGEX
+   
+
+   
+OCCURRENCES_REGEX
+   
+
+   
+POSITION_REGEX
+   
+
+   
+SUBSTRING_REGEX
+   
+
+   
+TRANSLATE_REGEX

 

@@ -7364,16 +7380,75 @@ Differences from XQuery 
(LIKE_REGEX)

 
 
- Since SQL:2008, the SQL standard includes
- a LIKE_REGEX operator that performs pattern
+ Since SQL:2008, the SQL standard includes regular expression operators
+ and functions that performs pattern
  matching according to the XQuery regular expression
- standard.  PostgreSQL does not yet
- implement this operator, but you can get very similar behavior using
- the regexp_match() function, since XQuery
- regular expressions are quite close to the ARE syntax described above.
+ standard:
+ 
+  LIKE_REGEX
+  OCCURRENCES_REGEX
+  POSITION_REGEX
+  SUBSTRING_REGEX
+  TRANSLATE_REGEX
+ 
+ PostgreSQL does not currently implement these
+ operators and functions.  You can get approximately equivalent
+ functionality in each case as shown in .  (Various optional clauses on
+ both sides have been omitted in this table.)
+
+
+
+ Regular Expression Functions Equivalencies
+
+ 
+  
+   
+SQL standard
+PostgreSQL
+   
+  
+
+  
+   
+string LIKE_REGEX 
pattern
+regexp_like(string, 
pattern) or 
string ~ 
pattern
+   
+
+   
+OCCURRENCES_REGEX(pattern 
IN string
+regexp_count(string, 
pattern)
+   
+
+   
+POSITION_REGEX(pattern IN 
string
+regexp_instr(string, 
pattern)
+   
+
+   
+SUBSTRING_REGEX(pattern IN 
string
+regexp_substr(string, 
pattern)
+   
+
+   
+TRANSLATE_REGEX(pattern IN 
string WITH 
replacement
+regexp_replace(string, 
pattern, 
replacement)
+   
+  
+ 
+
+
+
+ Regular expression functions similar to those provided by PostgreSQL are
+ also available in a number of other SQL implementations, whereas the
+ SQL-standard functions are not as widely implemented.  Some of the
+ details of the regular expression syntax will likely differ in each
+ implementation.
 
 
 
+ The SQL-standard operators and functions use XQuery regular expressions,
+ which are quite close to the ARE syntax described above.
  Notable differences between the existing POSIX-based
  regular-expression feature and XQuery regular expressions include:
 
diff --git a/src/backend/catalog/sql_features.txt 
b/src/backend/catalog/sql_features.txt
index 9f424216e2..b8a78f4d41 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -323,11 +323,11 @@ F821  Local table references  NO
 F831   Full cursor update  NO  
 F831   Full cursor update  01  Updatable scrollable cursorsNO  
 F831   Full cursor update  02  Updatable ordered cursors   NO  
-F841   LIKE_REGEX predicateNO  
-F842   OCCURRENCES_REGEX function  NO  
-F843   POSITION_REGEX function NO  
-F844   SUBSTRING_REGEX functionNO  
-F845   TRANSLATE_REGEX functionNO  
+F841   LIKE_REGEX predicateNO  consider regexp_like()
+F842   OCCURRENCES_REGEX function  NO  consider 
regexp_matches()
+F843   POSITION_REGEX function NO  consider regexp_instr()
+F844   SUBSTRING_REGEX functionNO  

Privilege required for IF EXISTS even if the object already exists

2021-12-15 Thread Shay Rojansky
Hi all,

I've received numerous complaints about CREATE SCHEMA IF NOT EXISTS failing
when the user lacks CREATE privileges on the database - even if the schema
already exists. A typical scenario would be a multi-tenant
schema-per-tenant setup, where the schema and tenant user are created
beforehand, but then some database layer or ORM wants to ensure that the
schema is there so the above is issued.

Would it be reasonable to have the above no error if the schema already
exists? That would make it similar to the following (which I'm switching to
in the Entity Framework Core ORM):

DO $$
BEGIN
IF NOT EXISTS(SELECT 1 FROM pg_namespace WHERE nspname = 'foo') THEN
CREATE SCHEMA "foo";
END IF;
END $$;

The same could apply to other CREATE ... IF NOT EXISTS variations.

Shay


Re: Add id's to various elements in protocol.sgml

2021-12-15 Thread Brar Piening

On Dec 14, 2021 at 20:47, Alvaro Herrera wrote:


Hmm, I think we tend to avoid xreflabels; see
https://www.postgresql.org/message-id/8315c0ca-7758-8823-fcb6-f37f9413e...@2ndquadrant.com


Ok, thank you for the hint.
I added them because  doesn't automatically generate
labels and they were present in the current docs for
CREATE_REPLICATION_SLOT
(https://github.com/postgres/postgres/blob/22bd3cbe0c284758d7174321f5596763095cdd55/doc/src/sgml/protocol.sgml#L1944).

After reading the aforementioned thread to
https://www.postgresql.org/message-id/20200611223836.GA2507%40momjian.us
I infer the following conclusions:
a) Do *not* include xreflabel for elements that get numbered.
b) There should be some general utility for the xreflabel, not just the
linking needs of one particular source location.
c) Generally, xreflabels are a bit of antipattern, so there need to be
solid arguments in favor of adding more.

Since I can't argue towards some general utility for the xreflabels and
don't have any other solid argument in favor of adding more, I will
remove them from my current patch but leave the existing ones intact.

Objections?







Re: Windows default locale vs initdb

2021-12-15 Thread Juan José Santamaría Flecha
On Sun, May 16, 2021 at 6:29 AM Noah Misch  wrote:

> On Mon, Apr 19, 2021 at 05:42:51PM +1200, Thomas Munro wrote:
>
> > The question we asked ourselves
> > multiple times in the other thread was how we're supposed to get to
> > the modern BCP 47 form when creating the template databases.  It looks
> > like one possibility, since Vista, is to call
> > GetUserDefaultLocaleName()[2]
>
> > No patch, but I wondered if any Windows hackers have any feedback on
> > relative sanity of trying to fix all these problems this way.
>
> Sounds reasonable.  If PostgreSQL v15 would otherwise run on Windows Server
> 2003 R2, this is a good time to let that support end.
>
The value returned by GetUserDefaultLocaleName() is a system configured
parameter, independent of what you set with setlocale(). It might be
reasonable for initdb but not for a backend in most cases.

You can get the locale POSIX-ish name using GetLocaleInfoEx(), but this is
no longer recommended, because using LCIDs is no longer recommended [1].
Although, this would work for legacy locales. Please find attached a POC
patch showing this approach.

[1] https://docs.microsoft.com/en-us/globalization/locale/locale-names

Regards,

Juan José Santamaría Flecha


0001-POC-Make-Windows-locale-POSIX-looking.patch
Description: Binary data


Re: row filtering for logical replication

2021-12-15 Thread Amit Kapila
On Wed, Dec 15, 2021 at 1:52 PM Greg Nancarrow  wrote:
>
> On Wed, Dec 15, 2021 at 5:25 PM Amit Kapila  wrote:
> >
> > > "If a subscriber is a pre-15 version, the initial table
> > > synchronization won't use row filters even if they are defined in the
> > > publisher."
> > >
> > > Won't this lead to data inconsistencies or errors that otherwise
> > > wouldn't happen?
> > >
> >
> > How? The subscribers will get all the initial data.
> >
>
> But couldn't getting all the initial data (i.e. not filtering) break
> the rules used by the old/new row processing (see v46-0003 patch)?
> Those rules effectively assume rows have been previously published
> with filtering.
> So, for example, for the following case for UPDATE:
> old-row (no match)  new row (match)  -> INSERT
> the old-row check (no match) infers that the old row was never
> published, but that row could in fact have been in the initial
> unfiltered rows, so in that case an INSERT gets erroneously published
> instead of an UPDATE, doesn't it?
>

But this can happen even when both the publisher and subscriber are
on v15, say if the user defines a filter at some later point or changes
the filter conditions via ALTER PUBLICATION. So, I am not sure we need to
invent something new for this.

> > > Should such subscriptions be allowed?
> > >
> >
> > I am not sure what you have in mind here? How can we change the
> > already released code pre-15 for this new feature?
> >
>
> I was thinking such subscription requests could be rejected by the
> server, based on the subscriber version and whether the publications
> use filtering etc.
>

Normally, the client sends some parameters to the server (streaming,
two_pc, etc.) based on which the server can take such decisions. We may
need to include something like that, but I am not sure it is required for
this particular case, especially because that can happen otherwise as
well.

-- 
With Regards,
Amit Kapila.




RE: Confused comment about drop replica identity index

2021-12-15 Thread wangw.f...@fujitsu.com
On Tue, Dec 15, 2021 at 11:25AM, Michael Paquier wrote:
> Yeah, the comment is wrong.  If the index of a REPLICA_IDENTITY_INDEX is
> dropped, I recall that the behavior is the same as REPLICA_IDENTITY_NOTHING.

Thank you for your response.
I agreed that the comment is wrong.
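
For illustration, a minimal sketch of the behavior being discussed (table and
index names are hypothetical):

CREATE TABLE t (id int NOT NULL, val text);
CREATE UNIQUE INDEX t_id_idx ON t (id);
ALTER TABLE t REPLICA IDENTITY USING INDEX t_id_idx;
DROP INDEX t_id_idx;
-- pg_class.relreplident for t still reads 'i', but with the index gone
-- the table effectively behaves as if REPLICA IDENTITY NOTHING were set.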


> Not sure about the DROP INDEX page, but I'd be fine with mentioning that in 
> the
> ALTER TABLE page in the paragraph related to REPLICA IDENTITY.  While on it, I
> would be tempted to switch this stuff to use a list of  for all 
> the option
> values.  That would be much easier to read.

Yeah, if we can add some details to pg-doc and code comments, I think it will
be more friendly to PG users and developers.

Regards,
Wang wei




Re: A test for replay of regression tests

2021-12-15 Thread Michael Paquier
On Fri, Dec 10, 2021 at 12:58:01PM +1300, Thomas Munro wrote:
> -# required for 017_shm.pl
> +# required for 017_shm.pl and 027_stream_regress.pl
>  REGRESS_SHLIB=$(abs_top_builddir)/src/test/regress/regress$(DLSUFFIX)
>  export REGRESS_SHLIB

Hmm.  FWIW, I am working on doing something similar for pg_upgrade to switch
to TAP there, and we share a lot in terms of running pg_regress on an
existing cluster.  Wouldn't it be better to move this definition to
src/Makefile.global.in rather than src/test/recovery/?

My pg_regress command is actually very similar to yours, so I am
wondering if this would be better if somehow centralized, perhaps in
Cluster.pm.
--
Michael


signature.asc
Description: PGP signature


Re: Strange behavior with polygon and NaN

2021-12-15 Thread Kyotaro Horiguchi
At Wed, 15 Dec 2021 16:20:55 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> adjusted so that it treats null as false. On the way doing this, the
> bug #17334 [2] and another bug raised earlier [3] are naturally fixed.

That being said, even if this patch were committed to the master
branch, we won't apply the whole patch set for backbranches.

I guess applying the computeDistance part in the patch works.

# But I found that the part contains a bugX(

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: row filtering for logical replication

2021-12-15 Thread Greg Nancarrow
On Wed, Dec 15, 2021 at 5:25 PM Amit Kapila  wrote:
>
> > "If a subscriber is a pre-15 version, the initial table
> > synchronization won't use row filters even if they are defined in the
> > publisher."
> >
> > Won't this lead to data inconsistencies or errors that otherwise
> > wouldn't happen?
> >
>
> How? The subscribers will get all the initial data.
>

But couldn't getting all the initial data (i.e. not filtering) break
the rules used by the old/new row processing (see v46-0003 patch)?
Those rules effectively assume rows have been previously published
with filtering.
So, for example, for the following case for UPDATE:
old-row (no match)  new row (match)  -> INSERT
the old-row check (no match) infers that the old row was never
published, but that row could in fact have been in the initial
unfiltered rows, so in that case an INSERT gets erroneously published
instead of an UPDATE, doesn't it?
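
A concrete sketch of that case, using the row-filter syntax from the patch
(table, publication, and filter values are hypothetical):

-- publisher
CREATE TABLE t (id int PRIMARY KEY, val int);
INSERT INTO t VALUES (1, 5);                         -- does not match the filter
CREATE PUBLICATION pub FOR TABLE t WHERE (val > 10);
-- a pre-15 subscriber copies the initial data unfiltered, so it already has (1, 5)
UPDATE t SET val = 20 WHERE id = 1;                  -- old row: no match, new row: match
-- per the old/new row rules this is published as an INSERT of (1, 20),
-- which collides with the row the subscriber already copied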

> > Should such subscriptions be allowed?
> >
>
> I am not sure what you have in mind here? How can we change the
> already released code pre-15 for this new feature?
>

I was thinking such subscription requests could be rejected by the
server, based on the subscriber version and whether the publications
use filtering etc.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: parallel vacuum comments

2021-12-15 Thread Masahiko Sawada
On Tue, Dec 14, 2021 at 12:03 PM Amit Kapila  wrote:
>
> On Tue, Dec 14, 2021 at 7:40 AM tanghy.f...@fujitsu.com
>  wrote:
> >
> > On Monday, December 13, 2021 2:12 PM Masahiko Sawada 
> >  wrote:
> > >
> > > On Mon, Dec 13, 2021 at 2:09 PM Amit Kapila  
> > > wrote:
> > > >
> > > > > On Mon, Dec 13, 2021 at 10:33 AM Masahiko Sawada  wrote:
> > > > >
> > > > > > On Fri, Dec 10, 2021 at 9:08 PM Amit Kapila  wrote:
> > > > > >
> > > > > > > On Thu, Dec 9, 2021 at 6:05 PM Masahiko Sawada  wrote:
> > > > > > >
> > > > > > > > On Thu, Dec 9, 2021 at 7:44 PM Amit Kapila  wrote:
> > > > > > >
> > > > > > > Agreed with the above two points.
> > > > > > >
> > > > > > > I've attached updated patches that incorporated the above comments
> > > > > > > too. Please review them.
> > > > > > >
> > > > > >
> > > > > > I have made the following minor changes to the 0001 patch: (a) An
> > > > > > assert was removed from dead_items_max_items(), which I added back.
> > > > > > (b) Removed an unnecessary semicolon from one of the statements in
> > > > > > compute_parallel_vacuum_workers(). (c) Changed comments at a few
> > > > > > places. (d) Moved all parallel_vacuum_* related functions together.
> > > > > > (e) Ran pgindent and slightly modified the commit message.
> > > > > >
> > > > > > Let me know what you think of the attached?
> > > > >
> > > > > Thank you for updating the patch!
> > > > >
> > > > > The patch also moves some functions, e.g., update_index_statistics()
> > > > > is moved without code changes. I agree to move functions for
> > > > > consistency but that makes the review hard and the patch complicated.
> > > > > I think it's better to do the parallel vacuum improvements and the
> > > > > function moves in separate patches.
> > > > >
> > > >
> > > > Okay, I thought it might be better to keep all parallel_vacuum_*
> > > > related functions together, but we can keep that in a separate patch.
> > > > Feel free to submit without those changes.
> > >
> > > I've attached the patch. I've just moved some functions back but not
> > > done other changes.
> > >
> >
> > Thanks for your patch.
> >
> > I tested your patch and tried some cases, like large indexes and
> > different types of indexes; it worked well.
> >
> > Besides, I noticed a typo as follows:
> >
> > +   /* Estimate size for index vacuum stats -- PARALLEL_VACUUM_KEY_STATS */
> >
> > "PARALLEL_VACUUM_KEY_STATS" should be "PARALLEL_VACUUM_KEY_INDEX_STATS".
> >
>
> Thanks, I can take care of this before committing. The v9-0001* looks
> good to me as well, so I am planning to commit that tomorrow unless I
> see more comments or any objection to that.

Thanks!

> There is still pending
> work related to moving parallel vacuum code to a separate file and a
> few other pending comments that are still under discussion. We can
> take care of those in subsequent patches. Do let me know if you or
> others think differently.

I'm on the same page.

I've attached an updated patch. It incorporates several changes
relative to the last version:

* Rename parallel_vacuum_begin() to parallel_vacuum_init()
* Unify the terminology; use "index bulk-deletion" and "index cleanup"
instead of "index vacuum" and "index cleanup".
* Fix the comment of parallel_vacuum_init() pointed out by Andres
* Fix a typo left behind by commit 22bd3cbe0c (pointed out by Hou)

Please review it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


v8-0001-Move-parallel-vacuum-code-to-vacuumparallel.c.patch
Description: Binary data