Remove extra Logging_collector check before calling SysLogger_Start()

2021-12-02 Thread Bharath Rupireddy
Hi,

It seems there's a redundant Logging_collector check before calling
SysLogger_Start(). Note that SysLogger_Start() already returns 0 if
Logging_collector is off. Removing the extra check is consistent with
the other usage of SysLogger_Start().

/* If we have lost the log collector, try to start a new one */
-   if (SysLoggerPID == 0 && Logging_collector)
+   if (SysLoggerPID == 0)
SysLoggerPID = SysLogger_Start();

Attaching a tiny patch to fix. Thoughts?
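The pattern can be illustrated outside PostgreSQL. Below is a minimal Python sketch (hypothetical names, not the server code) of why the caller's flag check is redundant when the callee already guards on the same flag:

```python
# Hypothetical stand-ins for Logging_collector / SysLoggerPID / SysLogger_Start();
# the callee guards on the flag itself, so callers need not repeat the check.

logging_collector = False
syslogger_pid = 0

def syslogger_start():
    """Mirrors the guard in SysLogger_Start(): returns 0 when the collector is off."""
    if not logging_collector:
        return 0
    return 4242  # pretend we forked a syslogger with this PID

def maybe_restart_syslogger():
    global syslogger_pid
    # No need for "and logging_collector" here -- the callee already checks.
    if syslogger_pid == 0:
        syslogger_pid = syslogger_start()

maybe_restart_syslogger()
print(syslogger_pid)  # 0: the collector is disabled, the callee's guard fires
```

The caller stays simpler and the two call sites cannot drift apart on the flag check.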

Regards,
Bharath Rupireddy.


v1-0001-remove-extra-Logging_collector-check-before-SysLo.patch
Description: Binary data


RE: Alter all tables in schema owner fix

2021-12-02 Thread tanghy.f...@fujitsu.com
On Friday, December 3, 2021 1:31 PM vignesh C  wrote:
> 
> On Fri, Dec 3, 2021 at 9:53 AM Greg Nancarrow  wrote:
> >
> > On Fri, Dec 3, 2021 at 2:06 PM vignesh C  wrote:
> > >
> > > Currently while changing the owner of ALL TABLES IN SCHEMA
> > > publication, it is not checked if the new owner has superuser
> > > permission or not. Added a check to throw an error if the new owner
> > > does not have superuser permission.
> > > Attached patch has the changes for the same. Thoughts?
> > >
> >
> > It looks OK to me, but just two things:
> >
> > 1) Isn't it better to name "CheckSchemaPublication" as
> > "IsSchemaPublication", since it returns a boolean, and CheckXXX-style
> > functions typically do checking and error out if they find a problem.
> 
> Modified
> 
> > 2) Since superuser_arg() caches previous input arg (last_roleid) and
> > has a fast-exit, and has been called immediately before for the FOR
> > ALL TABLES case, it would be better to write:
> >
> > + if (CheckSchemaPublication(form->oid) && !superuser_arg(newOwnerId))
> >
> > as:
> >
> > + if (!superuser_arg(newOwnerId) && IsSchemaPublication(form->oid))
> 
> Modified
> 
> Thanks for the comments, the attached v2 patch has the changes for the same.
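The reordering in (2) pays off because of short-circuit evaluation plus superuser_arg()'s one-entry cache. A small Python simulation (the names and mechanics are illustrative stand-ins, not the server's implementation):

```python
# Python's "and", like C's "&&", short-circuits, so putting the cheap cached
# check first can skip the catalog scan entirely.

calls = {"superuser": 0, "scan": 0}
_last_roleid = None
_last_result = None

def superuser_arg(roleid, superusers={10}):
    """Caches the previous roleid, like the real function's fast exit."""
    global _last_roleid, _last_result
    if roleid == _last_roleid:
        return _last_result          # fast exit, no lookup
    calls["superuser"] += 1
    _last_roleid, _last_result = roleid, roleid in superusers
    return _last_result

def is_schema_publication(pubid):
    calls["scan"] += 1               # stands in for a pg_publication_namespace scan
    return True

new_owner = 10                       # a superuser in this toy setup
superuser_arg(new_owner)             # the earlier FOR ALL TABLES check primes the cache
if not superuser_arg(new_owner) and is_schema_publication(1):
    raise PermissionError
print(calls)  # {'superuser': 1, 'scan': 0} -- the scan is avoided for a superuser
```

With the original ordering, the scan would run even when the cached superuser check alone could decide the outcome.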
> 

Thanks for your patch.
I tested it and it fixed this problem as expected. It also passed
"make check-world".

Regards,
Tang


Re: PoC: using sampling to estimate joins / complex conditions

2021-12-02 Thread Michael Paquier
On Sun, Jun 27, 2021 at 07:55:24PM +0200, Tomas Vondra wrote:
> estimating joins is one of the significant gaps related to extended
> statistics, and I've been regularly asked about what we might do about
> that. This is an early experimental patch that I think might help us
> with improving this, possibly even in PG15.

The patch does not apply, so a rebase would be in order.  I have
switched it to waiting on author for now, moving it to the next CF.
--
Michael


signature.asc
Description: PGP signature


Re: Logical replication - schema change not invalidating the relation cache

2021-12-02 Thread Michael Paquier
On Thu, Aug 26, 2021 at 09:00:39PM +0530, vignesh C wrote:
> The previous patch was failing because of the recent test changes made
> by commit 201a76183e2 which unified new and get_new_node, attached
> patch has the changes to handle the changes accordingly.

Please note that the CF app is complaining about this patch, so a
rebase is required.  I have moved it to the next CF, waiting on author,
for now.
--
Michael




Re: Add missing function abs (interval)

2021-12-02 Thread Michael Paquier
On Fri, Nov 05, 2021 at 12:06:05AM -0400, Isaac Morland wrote:
> Not yet, but thanks for the reminder. I will try to get this done on the
> weekend.

Seeing no updates, this has been switched to returned with feedback in
the CF app.
--
Michael




Re: [PROPOSAL] new diagnostic items for the dynamic sql

2021-12-02 Thread Michael Paquier
On Wed, Nov 10, 2021 at 01:58:58PM +0530, Dinesh Chemuduru wrote:
> The proposed statements are much clear, but will wait for other’s
> suggestion, and will fix it accordingly.

This update was three weeks ago, and no new version has been
provided, so I am marking this as returned with feedback in the CF
app.  If you can work more on this proposal and send an updated patch,
please feel free to resubmit.
--
Michael




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2021-12-02 Thread Michael Paquier
On Fri, Oct 01, 2021 at 04:03:09PM +0900, Michael Paquier wrote:
> This last update was two months ago, and the patch has not moved
> since:
> https://commitfest.postgresql.org/34/3220/
> 
> Do you have plans to work more on that or perhaps the CF entry should
> be withdrawn or RwF'd?

Two months later, this has been switched to RwF.
--
Michael




Re: Parallelize correlated subqueries that execute within each worker

2021-12-02 Thread Michael Paquier
On Mon, Nov 15, 2021 at 10:01:37AM -0500, Robert Haas wrote:
> On Wed, Nov 3, 2021 at 1:34 PM James Coleman  wrote:
>> As I understand the current code, parallel plans are largely chosen
>> based not on where it's safe to insert a Gather node but rather by
>> determining if a given path is parallel safe. Through that lens params
>> are a bit of an odd man out -- they aren't inherently unsafe in the
>> way a parallel-unsafe function is, but they can only be used in
>> parallel plans under certain conditions (whether because of project
>> policy, performance, or missing infrastructure).
> 
> Right.

Please note that the CF bot is complaining here, so I have moved this
patch to the next CF and changed the status to waiting on author.
--
Michael




Re: libpq compression

2021-12-02 Thread Michael Paquier
On Fri, Oct 01, 2021 at 11:20:09PM +0200, Daniel Gustafsson wrote:
> To keep this thread from stalling, attached is a rebased patchset with the
> above mentioned fix to try and get this working on Windows.

This patch has been waiting on author for two months now, so I have
marked it as RwF in the CF app.
--
Michael




Re: row filtering for logical replication

2021-12-02 Thread Greg Nancarrow
On Thu, Dec 2, 2021 at 6:18 PM Peter Smith  wrote:
>
> PSA a new v44* patch set.
>

Some initial comments:

0001

src/backend/replication/logical/tablesync.c
(1) In fetch_remote_table_info update, "list_free(*qual);" should be
"list_free_deep(*qual);"

doc/src/sgml/ref/create_subscription.sgml
(2) Refer to Notes

Perhaps a link to the Notes section should be used here, as follows:

-  copied. Refer to the Notes section below.
+  copied. Refer to the  section below.

- 
+ 


0002

1) Typo in patch comment
"Specifially"

src/backend/catalog/pg_publication.c
2) bms_replident comment
Member "Bitmapset  *bms_replident;" in rf_context should have a
comment, maybe something like "set of replica identity col indexes".

3) errdetail message
In rowfilter_walker(), the "forbidden" errdetail message is loaded
using gettext() in one instance, but is just a raw formatted string in
the other cases. Shouldn't they all consistently be translated strings?


0003

src/backend/replication/logical/proto.c
1) logicalrep_write_tuple

(i)
if (slot == NULL || TTS_EMPTY(slot))
can be replaced with:
if (TupIsNull(slot))

(ii) In the above case (where values and nulls are palloc'd),
shouldn't the values and nulls be pfree()d at the end of the function?


0005

src/backend/utils/cache/relcache.c
(1) RelationGetInvalRowFilterCol
Shouldn't "rfnode" be pfree()d after use?


Regards,
Greg Nancarrow
Fujitsu Australia




Do sys logger and stats collector need wait events WAIT_EVENT_SYSLOGGER_MAIN/_PGSTAT_MAIN?

2021-12-02 Thread Bharath Rupireddy
Hi,

Although pg_stat_activity has no entry for the sys logger and
stats collector (because they have no shared memory access), the wait
events WAIT_EVENT_SYSLOGGER_MAIN and WAIT_EVENT_PGSTAT_MAIN are defined.
They seem to be unnecessary. Passing 0 or some other undefined wait
event value to the existing calls of WaitLatch and WaitLatchOrSocket
instead of WAIT_EVENT_SYSLOGGER_MAIN/WAIT_EVENT_PGSTAT_MAIN would work.
We could then delete these wait events and their info from pgstat.c.

I'm sure this is not so critical, but I'm just checking if someone
feels that they should be removed or have some other reasons for
keeping them.

Thoughts?

Regards,
Bharath Rupireddy.




Re: Use simplehash.h instead of dynahash in SMgr

2021-12-02 Thread Michael Paquier
On Tue, Oct 05, 2021 at 11:07:48AM +1300, David Rowley wrote:
> I think I'd much rather talk about the concerns here than just
> withdraw this. Even if what I have today just serves as something to
> aid discussion.

Hmm.  This last update was two months ago, and the patch does not
apply anymore.  I am marking it as RwF for now.
--
Michael




Re: Commitfest 2021-11 Patch Triage - Part 1

2021-12-02 Thread Yugo NAGATA
On Wed, 1 Dec 2021 09:14:31 -0300
Marcos Pegoraro  wrote:

> >
> > I think the reason why we can't update a materialized view directly is
> > that it is basically a "view" and should not contain any data irrelevant
> > to its definition and underlying tables. If we had a feature to update a
> > materialized view directly, maybe it should behave as an updatable view,
> > like normal (virtual) views, although I am not sure.
> >
> 
> Well, I didn't find any place where it is detailed why those tables are
> not updatable.
> And it would be fine for them to be updated through triggers or cron jobs
> until IVM is available.
> CheckValidRowMarkRel just gives the error "cannot lock rows in
> materialized view ...", but why?
> What are the differences between materialized views and tables?

It would be that materialized views are tied to their definition query
and are expected to have contents consistent with it, at least at some
point in time, I think.

In order to allow triggers to update materialized views, we'd need to make
OpenMatViewIncrementalMaintenance and CloseMatViewIncrementalMaintenance
public, since these are static functions in matview.c. However, there is a
concern that this would make the contents of a materialized view completely
unreliable [1]. Therefore, if we do it, we would need privilege management
in some way.

[1] 
https://www.postgresql.org/message-id/flat/CACjxUsP8J6bA4RKxbmwujTVMwMZrgR3AZ7yP5F2XkB-f9w7K7Q%40mail.gmail.com#efbee336d7651ce39bc5ff9574f92002

Regards,
Yugo Nagata
-- 
Yugo NAGATA 




Re: Skipping logical replication transactions on subscriber side

2021-12-02 Thread Masahiko Sawada
On Fri, Dec 3, 2021 at 11:53 AM Amit Kapila  wrote:
>
> On Thu, Dec 2, 2021 at 8:38 PM Peter Eisentraut
>  wrote:
> >
> > On 02.12.21 07:48, Amit Kapila wrote:
> > > a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
> > > b. Alter Subscription  SET ( subscription_parameter [=value]
> > > [, ... ] );
> > > c. Alter Subscription  On Error ( subscription_parameter
> > > [=value] [, ... ] );
> > > d. Alter Subscription  SKIP ( subscription_parameter
> > > [=value] [, ... ] );
> > > where subscription_parameter can be one of:
> > > xid = 
> > > lsn = 
> > > ...
> >
> > > As per discussion till now, option (d) seems preferable.
> >
> > I agree.

+1

> >
> > > In this, we
> > > need to see how and what to allow as options. The simplest way for the
> > > first version is to just allow one xid to be specified at a time, which
> > > would mean that specifying multiple xids should error out. We can also
> > > additionally allow specifying operations like 'insert', 'update',
> > > etc., and then a relation list (a list of OIDs). What that would mean
> > > is that, for a transaction, we can specify which particular operations
> > > and relations we want to skip.
> >
> > I don't know how difficult it would be, but allowing multiple xids might
> > be desirable.
> >
>
> Are there many cases where there could be multiple xid failures that
> the user can skip? The apply worker always keeps looping on the same
> failure, so the user wouldn't know of a second xid failure (if any)
> till the first failure is resolved. I can think of one such case
> during the initial synchronization phase, where the apply worker went
> ahead of the tablesync worker by skipping applying the changes on the
> corresponding table. After that, it is possible that the tablesync
> worker fails during the catch-up phase while the apply worker fails
> during the processing of some other rel.
>
> >  But this syntax gives you flexibility, so we can also
> > start with a simple implementation.
> >
>
> Yeah, I also think so. BTW, what do you think of providing extra
> flexibility of giving other options like 'operation', 'rel' along with
> xid? I think such options could be useful for large transactions that
> operate on multiple tables as it is quite possible that only a
> particular operation from the entire transaction is the cause of
> failure. Now, on one side, we can argue that skipping the entire
> transaction is better from the consistency point of view but I think
> it is already possible that we just skip a particular update/delete
> (if the corresponding tuple doesn't exist on the subscriber). For the
> sake of simplicity, we can just allow providing xid at this stage and
> then extend it later as required but I am not very sure of that point.

+1

Skipping a whole transaction by specifying its xid would be a good start.
Ideally, we'd like to automatically skip only the operations within the
transaction that fail, but that seems not easy to achieve. If we allow
specifying operations and/or relations, multiple operations or relations
probably need to be specified in some cases. Otherwise, the subscriber
cannot continue logical replication if the transaction has multiple
operations on different relations that fail. But, similar to the idea of
specifying multiple xids, we need to note that the user wouldn't know of
the second operation failure until the apply worker applies the change.
So I'm not sure there are many use cases in practice where users would
specify multiple operations and relations in order to skip applies that
fail.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: suboverflowed subtransactions concurrency performance optimize

2021-12-02 Thread Dilip Kumar
On Tue, Nov 30, 2021 at 5:49 PM Simon Riggs
 wrote:

> transam.c uses a single-item cache to prevent thrashing from repeated
> lookups, which reduces problems with shared access to SLRUs.
> multixact.c also has something similar.
>
> I notice that subtrans.c doesn't have this, but could easily do so.
> Patch attached, which seems separate from the other attempts at tuning.

Yeah, this definitely makes sense.

> On review, I think it is also possible that we update subtrans ONLY if
> someone uses more than PGPROC_MAX_CACHED_SUBXIDS subtransactions.
> This would make subtrans much smaller and avoid one-entry-per-page
> access, which is a major source of caching.
> This would mean some light changes in GetSnapshotData().
> Let me know if that seems interesting also?

Do you mean to say we should avoid setting a subtransaction's parent if
the number of subtransactions does not cross PGPROC_MAX_CACHED_SUBXIDS?
But TransactionIdDidCommit() might need to fetch the parent if the
transaction status is TRANSACTION_STATUS_SUB_COMMITTED, so how
would we handle that?
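The single-item cache Simon describes can be sketched in a few lines of Python (toy data and names; the real code caches around shared SLRU access paths in C):

```python
# Repeated lookups of the same xid's parent hit the one-entry cache instead
# of the shared SLRU page -- the pattern transam.c already uses.

slru_reads = 0
_cached_xid = None
_cached_parent = None

PARENTS = {1001: 1000, 1002: 1000}   # subxid -> parent xid (toy data)

def subtrans_get_parent(xid):
    global slru_reads, _cached_xid, _cached_parent
    if xid == _cached_xid:
        return _cached_parent        # no shared-memory access needed
    slru_reads += 1                  # stands in for an SLRU page read + lock
    _cached_xid, _cached_parent = xid, PARENTS.get(xid)
    return _cached_parent

for _ in range(1000):                # a hot loop, e.g. repeated visibility checks
    subtrans_get_parent(1001)
print(slru_reads)  # 1 -- every repeat lookup was served from the cache
```

The win comes precisely from the "thrashing" case: the same xid looked up many times in a row, which otherwise contends on the shared SLRU.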

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] support tab-completion for single quote input with equal sign

2021-12-02 Thread Michael Paquier
On Fri, Sep 17, 2021 at 02:45:57AM +0900, Kyotaro Horiguchi wrote:
> This test fails for the same reason, but after fixing it the result
> contains \a (BEL) in the output on my CentOS8. I'm not sure what is
> happening here..

The patch is still failing under the CF bot, and this last update was
two months ago.  I am marking it as returned with feedback.
--
Michael




Re: Failed transaction statistics to measure the logical replication progress

2021-12-02 Thread vignesh C
On Wed, Dec 1, 2021 at 3:04 PM osumi.takami...@fujitsu.com
 wrote:
>
> On Friday, November 19, 2021 11:11 PM Masahiko Sawada wrote:
> > Besides that, I'm not sure how useful commit_bytes, abort_bytes, and
> > error_bytes are. I originally thought these statistics track the size
> > of received data, i.e., how much data is transferred from the publisher
> > and processed on the subscriber. But what the view currently has is how
> > much memory is used in the subscription worker. The subscription worker
> > emulates ReorderBufferChangeSize() on the subscriber side but, as the
> > comment of update_apply_change_size() mentions, the size in the view is
> > not accurate:
> ...
> > I guess that the purpose of these values is to compare them to
> > total_bytes, stream_bytes, and spill_bytes, but if the calculation is
> > not accurate, does it mean that the more the stats are updated, the
> > more inaccurate they will get?
> Thanks for your comment!
>
> I tried to solve your concerns about the byte columns but there are
> really difficult issues to solve. For example, to begin with, the
> messages of the apply worker are different from those of the reorder
> buffer.
>
> Therefore, I decided to split the previous patch and make the counter
> columns go first. v14 was checked by pgperltidy and pgindent.
>
> This patch can be applied to PG whose commit id is after 8d74fc9
> (introduction of pg_stat_subscription_workers).

Thanks for the updated patch.
Currently we are storing the commit count, error count and abort count
for each table of the table sync operation. If we have thousands of
tables, we will be storing the information for each of the tables.
Shouldn't we be storing consolidated information in this case?
diff --git a/src/backend/replication/logical/tablesync.c
b/src/backend/replication/logical/tablesync.c
index f07983a..02e9486 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1149,6 +1149,11 @@ copy_table_done:
 	MyLogicalRepWorker->relstate_lsn = *origin_startpos;
 	SpinLockRelease(&MyLogicalRepWorker->relmutex);
 
+	/* Report the success of table sync. */
+	pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
+									 MyLogicalRepWorker->relid,
+									 0 /* no logical message type */ );
 
postgres=# select * from pg_stat_subscription_workers ;
 subid | subname | subrelid | commit_count | error_count | abort_count | last_error_relid | last_error_command | last_error_xid | last_error_count | last_error_message | last_error_time
-------+---------+----------+--------------+-------------+-------------+------------------+--------------------+----------------+------------------+--------------------+-----------------
 16411 | sub1    |    16387 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16396 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16390 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16393 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16402 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16408 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16384 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16399 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16405 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
(9 rows)
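The consolidation being suggested is easy to picture: sum the per-table counters into one per-subscription row. A hypothetical Python sketch using counts like those in the output above (not the view's real storage):

```python
# Instead of one (commit, error, abort) row per synced table, sum the
# counters into a single per-subscription row.

from collections import Counter

per_table = {                        # subrelid -> counters, as in the view output
    16387: {"commit": 1, "error": 0, "abort": 0},
    16396: {"commit": 1, "error": 0, "abort": 0},
    16390: {"commit": 1, "error": 0, "abort": 0},
}

consolidated = Counter()
for counters in per_table.values():
    consolidated.update(counters)    # element-wise addition of the counts

print(dict(consolidated))  # {'commit': 3, 'error': 0, 'abort': 0}
```

With thousands of tables this turns thousands of rows into one, at the cost of losing the per-relation breakdown.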

Regards,
Vignesh




RE: Data is copied twice when specifying both child and parent table in publication

2021-12-02 Thread houzj.f...@fujitsu.com
On Thursday, December 2, 2021 4:54 PM houzj.f...@fujitsu.com 
 wrote:
> On Thursday, December 2, 2021 12:50 PM Amit Kapila
>  wrote:
> > On Thu, Dec 2, 2021 at 9:41 AM Greg Nancarrow 
> > wrote:
> > >
> > > On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow 
> > wrote:
> > > >
> > > > If you updated my original description to say "(instead of just
> > > > the individual partitions)", it would imply the same I think.
> > > > But I don't mind if you want to explicitly state both cases to make it 
> > > > clear.
> > > >
> > >
> > > For example, something like:
> > >
> > > For publications of partitioned tables with
> > > publish_via_partition_root set to true, only the partitioned table
> > > (and not its partitions) is included in the view, whereas if
> > > publish_via_partition_root is set to false, only the individual
> > > partitions are included in the view.
> > >
> >
> > Yeah, that sounds good to me.
> 
> It looks good to me as well.
> Attached are the patches (HEAD~13) which merge the suggested doc change.
> I prepared the code patch and test patch separately to make it easier
> for the committer to confirm.

It seems we might not need to backpatch the doc change, so attached is
another version which removes the doc changes from the back-branch patches.

Best regards,
Hou zj



HEAD-0001-Fix-double-publish-of-child-table-s-data.patch
Description: HEAD-0001-Fix-double-publish-of-child-table-s-data.patch


PG14-0001-Fix-double-publish-of-child-table-s-data.patch
Description: PG14-0001-Fix-double-publish-of-child-table-s-data.patch


PG13-0001-Fix-double-publish-of-child-table-s-data.patch
Description: PG13-0001-Fix-double-publish-of-child-table-s-data.patch


HEAD-0002-testcases.patch
Description: HEAD-0002-testcases.patch


PG13-0002-testcases.patch
Description: PG13-0002-testcases.patch


PG14-0002-testcases.patch
Description: PG14-0002-testcases.patch


HEAD-0003-improve-the-doc-for-pg_publication_tables.patch
Description: HEAD-0003-improve-the-doc-for-pg_publication_tables.patch


Re: Alter all tables in schema owner fix

2021-12-02 Thread vignesh C
On Fri, Dec 3, 2021 at 9:53 AM Greg Nancarrow  wrote:
>
> On Fri, Dec 3, 2021 at 2:06 PM vignesh C  wrote:
> >
> > Currently while changing the owner of ALL TABLES IN SCHEMA
> > publication, it is not checked if the new owner has superuser
> > permission or not. Added a check to throw an error if the new owner
> > does not have superuser permission.
> > Attached patch has the changes for the same. Thoughts?
> >
>
> It looks OK to me, but just two things:
>
> 1) Isn't it better to name "CheckSchemaPublication" as
> "IsSchemaPublication", since it returns a boolean, and CheckXXX-style
> functions typically do checking and error out if they find a problem.

Modified

> 2) Since superuser_arg() caches previous input arg (last_roleid) and
> has a fast-exit, and has been called immediately before for the FOR
> ALL TABLES case, it would be better to write:
>
> + if (CheckSchemaPublication(form->oid) && !superuser_arg(newOwnerId))
>
> as:
>
> + if (!superuser_arg(newOwnerId) && IsSchemaPublication(form->oid))

Modified

Thanks for the comments, the attached v2 patch has the changes for the same.

Regards,
Vignesh
From c75ce78efc80e86bcc62f8560a767cc73666a7d2 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Thu, 2 Dec 2021 19:44:15 +0530
Subject: [PATCH v2] Fix for new owner of ALL TABLES IN SCHEMA publication
 should be superuser.

Currently while changing the owner of ALL TABLES IN SCHEMA publication, it is
not checked if the new owner has superuser permission or not. Added a check to
throw an error if the new owner does not have superuser permission.
---
 src/backend/commands/publicationcmds.c| 39 +++
 src/test/regress/expected/publication.out | 15 +
 src/test/regress/sql/publication.sql  | 15 +
 3 files changed, 69 insertions(+)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e95f6..e709ddb589 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -234,6 +234,38 @@ CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist,
 	}
 }
 
+/*
+ * Returns true if any schema is associated with the publication, false if no
+ * schema is associated with the publication.
+ */
+static bool
+IsSchemaPublication(Oid pubid)
+{
+	Relation	pubschsrel;
+	ScanKeyData scankey;
+	SysScanDesc scan;
+	HeapTuple	tup;
+	bool 		result = false;
+
+	pubschsrel = table_open(PublicationNamespaceRelationId, AccessShareLock);
+	ScanKeyInit(&scankey,
+				Anum_pg_publication_namespace_pnpubid,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(pubid));
+
+	scan = systable_beginscan(pubschsrel,
+							  PublicationNamespacePnnspidPnpubidIndexId,
+							  true, NULL, 1, &scankey);
+	tup = systable_getnext(scan);
+	if (HeapTupleIsValid(tup))
+		result = true;
+
+	systable_endscan(scan);
+	table_close(pubschsrel, AccessShareLock);
+
+	return result;
+}
+
 /*
  * Create new publication.
  */
@@ -1192,6 +1224,13 @@ AlterPublicationOwner_internal(Relation rel, HeapTuple tup, Oid newOwnerId)
 	 errmsg("permission denied to change owner of publication \"%s\"",
 			NameStr(form->pubname)),
 	 errhint("The owner of a FOR ALL TABLES publication must be a superuser.")));
+
+		if (!superuser_arg(newOwnerId) && IsSchemaPublication(form->oid))
+			ereport(ERROR,
+	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	 errmsg("permission denied to change owner of publication \"%s\"",
+			NameStr(form->pubname)),
+	 errhint("The owner of a FOR ALL TABLES IN SCHEMA publication must be a superuser.")));
 	}
 
 	form->pubowner = newOwnerId;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558968..b7df48e87c 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -373,6 +373,21 @@ ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- ok
 DROP PUBLICATION testpub2;
 DROP PUBLICATION testpub3;
 SET ROLE regress_publication_user;
+CREATE ROLE regress_publication_user3;
+GRANT regress_publication_user2 TO regress_publication_user3;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub4 FOR ALL TABLES IN SCHEMA pub_test;
+RESET client_min_messages;
+ALTER PUBLICATION testpub4 OWNER TO regress_publication_user3;
+SET ROLE regress_publication_user3;
+-- fail - new owner must be superuser
+ALTER PUBLICATION testpub4 owner to regress_publication_user2; -- fail
+ERROR:  permission denied to change owner of publication "testpub4"
+HINT:  The owner of a FOR ALL TABLES IN SCHEMA publication must be a superuser.
+ALTER PUBLICATION testpub4 owner to regress_publication_user; -- ok
+SET ROLE regress_publication_user;
+DROP PUBLICATION testpub4;
+DROP ROLE regress_publication_user3;
 REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;
 DROP TABLE testpub_parted;
 DROP TABLE testpub_tbl1;
diff --git a/src/test/regress/sql/publication.sql 

Re: extended stats on partitioned tables

2021-12-02 Thread Justin Pryzby
On Thu, Nov 04, 2021 at 12:44:45AM +0100, Tomas Vondra wrote:
> >> And I'm not sure we do the right thing after removing children, for example
> >> (that should drop the inheritance stats, I guess).

> > Do you mean for inheritance only ?  Or partitions too ?
> > I think for partitions, the stats should stay.
> > And for inheritence, they can stay, for consistency with partitions, and 
> > since
> > it does no harm.
> 
> I think the behavior should be the same as for data in pg_statistic,
> i.e. if we keep/remove those, we should do the same thing for extended
> statistics.

Column stats already work the way I proposed for extended stats: child
stats are never removed, neither when the only child is dropped, nor when
re-running analyze (that part is actually a bit odd).

Rebased, fixing an intermediate compile error, and typos in the commit message.

-- 
Justin
>From 7d752af4bbbded86f3c9d4b3571bdb4b8bdf5ef2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH 01/15] Do not use extended statistics on inheritance trees

Since 859b3003de, inherited ext stats are not built.
However, the non-inherited stats were incorrectly used during planning of
queries with inheritance hierarchies.

Since the ext stats do not include child tables, they can lead to worse
estimates.  This is remarkably similar to 427c6b5b9, which affected column
statistics 15 years ago.

choose_best_statistics is handled a bit differently (in the calling function),
because it isn't passed rel nor rel->inh, and it's an exported function, so
avoid changing its signature in back branches.

https://www.postgresql.org/message-id/flat/20210925223152.ga7...@telsasoft.com

Backpatch to v10
---
 src/backend/statistics/dependencies.c   |  5 +
 src/backend/statistics/extended_stats.c |  5 +
 src/backend/utils/adt/selfuncs.c|  9 +
 src/test/regress/expected/stats_ext.out | 24 
 src/test/regress/sql/stats_ext.sql  | 15 +++
 5 files changed, 58 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..015b9e28f6 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1410,11 +1410,16 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
+	/* If it's an inheritance tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return 1.0;
+
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..b9e755f44e 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1694,6 +1694,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritance tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..a0932e39e1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritance tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no extended statistics */
 	if (!rel->statlist)
@@ -5232,6 +5237,10 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid,
 			if (vardata->statsTuple)
 break;
 
+			/* If it's an inheritance tree, skip statistics (which do not include child stats) */
+			if (planner_rt_fetch(onerel->relid, root)->inh)
+break;
+
 			/* skip stats without per-expression stats */
 			if (info->kind != STATS_EXT_EXPRESSIONS)
 continue;
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index c60ba45aba..5410f58f7f 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -176,6 +176,30 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1;
 ANALYZE ab1;
 DROP TABLE ab1 CASCADE;
 NOTICE:  drop cascades to table ab1c

Re: Alter all tables in schema owner fix

2021-12-02 Thread vignesh C
On Fri, Dec 3, 2021 at 9:58 AM Bossart, Nathan  wrote:
>
> On 12/2/21, 7:07 PM, "vignesh C"  wrote:
> > Currently while changing the owner of ALL TABLES IN SCHEMA
> > publication, it is not checked if the new owner has superuser
> > permission or not. Added a check to throw an error if the new owner
> > does not have superuser permission.
> > Attached patch has the changes for the same. Thoughts?
>
> Yeah, the documentation clearly states that "the new owner of a FOR
> ALL TABLES or FOR ALL TABLES IN SCHEMA publication must be a
> superuser" [0].
>
> +/*
> + * Check if any schema is associated with the publication.
> + */
> +static bool
> +CheckSchemaPublication(Oid pubid)
>
> I don't think the name CheckSchemaPublication() accurately describes
> what this function is doing.  I would suggest something like
> PublicationHasSchema() or PublicationContainsSchema().  Also, much of
> this new function appears to be copied from GetPublicationSchemas().
> Should we just use that instead?

I was thinking of changing it to IsSchemaPublication as suggested by
Greg, unless you feel differently. I did not use the
GetPublicationSchemas function because in this case we only need to
check whether any schema is associated with the publication; we don't
need the schema list to be prepared. That is why I wrote a new
function that just checks whether any schema is present for the
publication. I'm planning to keep CheckSchemaPublication (renamed to
IsSchemaPublication) so that the list need not be prepared.

> +CREATE ROLE regress_publication_user3 LOGIN SUPERUSER;
> +GRANT regress_publication_user2 TO regress_publication_user3;
> +SET ROLE regress_publication_user3;
> +SET client_min_messages = 'ERROR';
> +CREATE PUBLICATION testpub4 FOR ALL TABLES IN SCHEMA pub_test;
> +RESET client_min_messages;
> +SET ROLE regress_publication_user;
> +ALTER ROLE regress_publication_user3 NOSUPERUSER;
> +SET ROLE regress_publication_user3;
>
> I think this test setup can be simplified a bit:
>
> CREATE ROLE regress_publication_user3 LOGIN;
> GRANT regress_publication_user2 TO regress_publication_user3;
> SET client_min_messages = 'ERROR';
> CREATE PUBLICATION testpub4 FOR ALL TABLES IN SCHEMA pub_test;
> RESET client_min_messages;
> ALTER PUBLICATION testpub4 OWNER TO regress_publication_user3;
> SET ROLE regress_publication_user3;

I will make this change in the next version.

Regards,
Vignesh




Re: Alter all tables in schema owner fix

2021-12-02 Thread Bossart, Nathan
On 12/2/21, 7:07 PM, "vignesh C"  wrote:
> Currently while changing the owner of ALL TABLES IN SCHEMA
> publication, it is not checked if the new owner has superuser
> permission or not. Added a check to throw an error if the new owner
> does not have superuser permission.
> Attached patch has the changes for the same. Thoughts?

Yeah, the documentation clearly states that "the new owner of a FOR
ALL TABLES or FOR ALL TABLES IN SCHEMA publication must be a
superuser" [0].

+/*
+ * Check if any schema is associated with the publication.
+ */
+static bool
+CheckSchemaPublication(Oid pubid)

I don't think the name CheckSchemaPublication() accurately describes
what this function is doing.  I would suggest something like
PublicationHasSchema() or PublicationContainsSchema().  Also, much of
this new function appears to be copied from GetPublicationSchemas().
Should we just use that instead?

+CREATE ROLE regress_publication_user3 LOGIN SUPERUSER;
+GRANT regress_publication_user2 TO regress_publication_user3;
+SET ROLE regress_publication_user3;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub4 FOR ALL TABLES IN SCHEMA pub_test;
+RESET client_min_messages;
+SET ROLE regress_publication_user;
+ALTER ROLE regress_publication_user3 NOSUPERUSER;
+SET ROLE regress_publication_user3;

I think this test setup can be simplified a bit:

CREATE ROLE regress_publication_user3 LOGIN;
GRANT regress_publication_user2 TO regress_publication_user3;
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub4 FOR ALL TABLES IN SCHEMA pub_test;
RESET client_min_messages;
ALTER PUBLICATION testpub4 OWNER TO regress_publication_user3;
SET ROLE regress_publication_user3;

Nathan

[0] https://www.postgresql.org/docs/devel/sql-alterpublication.html



Re: Alter all tables in schema owner fix

2021-12-02 Thread Greg Nancarrow
On Fri, Dec 3, 2021 at 2:06 PM vignesh C  wrote:
>
> Currently while changing the owner of ALL TABLES IN SCHEMA
> publication, it is not checked if the new owner has superuser
> permission or not. Added a check to throw an error if the new owner
> does not have superuser permission.
> Attached patch has the changes for the same. Thoughts?
>

It looks OK to me, but just two things:

1) Isn't it better to name "CheckSchemaPublication" as
"IsSchemaPublication", since it returns a boolean, and CheckXXX-type
functions typically do checking and error out if they find a problem?

2) Since superuser_arg() caches previous input arg (last_roleid) and
has a fast-exit, and has been called immediately before for the FOR
ALL TABLES case, it would be better to write:

+ if (CheckSchemaPublication(form->oid) && !superuser_arg(newOwnerId))

as:

+ if (!superuser_arg(newOwnerId) && IsSchemaPublication(form->oid))
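
The reasoning in point 2 can be demonstrated with a tiny self-contained C sketch (all names here are simplified stand-ins invented for illustration, not the real superuser_arg()/IsSchemaPublication() code): because && short-circuits left to right, putting the cheap, cached check first means the expensive catalog-style check is often never executed at all.

```c
#include <stdbool.h>

/* Stand-in for superuser_arg(): caches its last input, so repeated
 * calls with the same role id take a fast exit. */
static int  last_roleid = -1;
static bool last_result = false;

static bool cheap_is_superuser(int roleid)
{
    if (roleid == last_roleid)
        return last_result;         /* fast exit, mirrors the last_roleid cache */
    last_roleid = roleid;
    last_result = (roleid == 10);   /* pretend role 10 is the superuser */
    return last_result;
}

/* Stand-in for IsSchemaPublication(): pretend this does a catalog scan,
 * and count how often we actually pay for it. */
static int expensive_calls = 0;

static bool expensive_has_schema(int pubid)
{
    expensive_calls++;
    return (pubid % 2) == 0;
}

/* Cheap check first: for a superuser new owner, && short-circuits and
 * the "catalog scan" is never run. */
static bool must_error(int new_owner, int pubid)
{
    return !cheap_is_superuser(new_owner) && expensive_has_schema(pubid);
}
```

With the operands in the other order, every call would pay for the scan regardless of whether the new owner is a superuser.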


Regards,
Greg Nancarrow
Fujitsu Australia




Re: pg_get_publication_tables() output duplicate relid

2021-12-02 Thread Amit Kapila
On Thu, Dec 2, 2021 at 7:18 PM Amit Langote  wrote:
>
> On Thu, Dec 2, 2021 at 2:27 PM Amit Kapila  wrote:
> > On Thu, Dec 2, 2021 at 8:42 AM Amit Langote  wrote:
> > > Reading Alvaro's email at the top again gave me a pause to reconsider
> > > what I had said in reply.  It might indeed have been nice if the
> > > publication DDL itself had prevented the cases where a partition and
> > > its ancestor are added to a publication, though as Hou-san mentioned,
> > > partition ATTACHes make that a bit tricky to enforce at all times,
> > > though of course not impossible.  Maybe it's okay to just de-duplicate
> > > pg_publication_tables output as the patch does, even though it may
> > > appear to be a band-aid solution if we one considers Alvaro's point
> > > about consistency.
> >
> > True, I think even if we consider that idea it will be a much bigger
> > change, and also as it will be a behavioral change so we might want to
> > keep it just for HEAD and some of these fixes need to be backpatched.
> > Having said that, if you or someone want to pursue that idea and come
> > up with a better solution than what we have currently it is certainly
> > welcome.
>
> Okay, I did write a PoC patch this morning after sending out my
> earlier email.  I polished it a bit, which is attached.
>

I see multiple problems with this patch and idea. (a) I think you
forgot to deal with "All Tables In Schema" Publication which will be
quite tricky to deal with during attach operation. How will you remove
a particular relation from such a publication if there is a need to do
so? (b) Also, how will we prohibit adding partition and its root in
the case of "All Tables In Schema" or "All Tables" publication? If
there is no way to do that, then that would mean we anyway need to
deal with parent and child tables as part of the same publication and
this new behavior will add an exception to it. (c) I think this can
make switching "publish_via_partition_root" inconvenient for users.
Say all or some of the partitions are part of the publication, and
then the user decides to add the root table and change the publish
option "publish_via_partition_root" to "true"; at that moment she
needs to remove all the partitions first.

-- 
With Regards,
Amit Kapila.




Alter all tables in schema owner fix

2021-12-02 Thread vignesh C
Hi,

Currently while changing the owner of ALL TABLES IN SCHEMA
publication, it is not checked if the new owner has superuser
permission or not. Added a check to throw an error if the new owner
does not have superuser permission.
Attached patch has the changes for the same. Thoughts?

Regards,
Vignesh
From d0d0d2d8944a66f28b239da5fb80c15415a016e6 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Thu, 2 Dec 2021 19:44:15 +0530
Subject: [PATCH v1] Fix for new owner of ALL TABLES IN SCHEMA publication
 should be superuser.

Currently while changing the owner of ALL TABLES IN SCHEMA publication, it is
not checked if the new owner has superuser permission or not. Added a check to
throw an error if the new owner does not have superuser permission.
---
 src/backend/commands/publicationcmds.c| 38 +++
 src/test/regress/expected/publication.out | 17 ++
 src/test/regress/sql/publication.sql  | 17 ++
 3 files changed, 72 insertions(+)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7d4a0e95f6..7081919ffc 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -234,6 +234,37 @@ CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist,
 	}
 }
 
+/*
+ * Check if any schema is associated with the publication.
+ */
+static bool
+CheckSchemaPublication(Oid pubid)
+{
+	Relation	pubschsrel;
+	ScanKeyData scankey;
+	SysScanDesc scan;
+	HeapTuple	tup;
+	bool 		result = false;
+
+	pubschsrel = table_open(PublicationNamespaceRelationId, AccessShareLock);
+	ScanKeyInit(&scankey,
+Anum_pg_publication_namespace_pnpubid,
+BTEqualStrategyNumber, F_OIDEQ,
+ObjectIdGetDatum(pubid));
+
+	scan = systable_beginscan(pubschsrel,
+			  PublicationNamespacePnnspidPnpubidIndexId,
			  true, NULL, 1, &scankey);
+	tup = systable_getnext(scan);
+	if (HeapTupleIsValid(tup))
+		result = true;
+
+	systable_endscan(scan);
+	table_close(pubschsrel, AccessShareLock);
+
+	return result;
+}
+
 /*
  * Create new publication.
  */
@@ -1192,6 +1223,13 @@ AlterPublicationOwner_internal(Relation rel, HeapTuple tup, Oid newOwnerId)
 	 errmsg("permission denied to change owner of publication \"%s\"",
 			NameStr(form->pubname)),
 	 errhint("The owner of a FOR ALL TABLES publication must be a superuser.")));
+
+		if (CheckSchemaPublication(form->oid) && !superuser_arg(newOwnerId))
+			ereport(ERROR,
+	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	 errmsg("permission denied to change owner of publication \"%s\"",
+			NameStr(form->pubname)),
+	 errhint("The owner of a FOR ALL TABLES IN SCHEMA publication must be a superuser.")));
 	}
 
 	form->pubowner = newOwnerId;
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 1feb558968..4c74488113 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -373,6 +373,23 @@ ALTER PUBLICATION testpub2 ADD TABLE testpub_tbl1;  -- ok
 DROP PUBLICATION testpub2;
 DROP PUBLICATION testpub3;
 SET ROLE regress_publication_user;
+CREATE ROLE regress_publication_user3 LOGIN SUPERUSER;
+GRANT regress_publication_user2 TO regress_publication_user3;
+SET ROLE regress_publication_user3;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub4 FOR ALL TABLES IN SCHEMA pub_test;
+RESET client_min_messages;
+SET ROLE regress_publication_user;
+ALTER ROLE regress_publication_user3 NOSUPERUSER;
+SET ROLE regress_publication_user3;
+-- fail - new owner must be superuser
+ALTER PUBLICATION testpub4 owner to regress_publication_user2; -- fail
+ERROR:  permission denied to change owner of publication "testpub4"
+HINT:  The owner of a FOR ALL TABLES IN SCHEMA publication must be a superuser.
+ALTER PUBLICATION testpub4 owner to regress_publication_user; -- ok
+SET ROLE regress_publication_user;
+DROP PUBLICATION testpub4;
+DROP ROLE regress_publication_user3;
 REVOKE CREATE ON DATABASE regression FROM regress_publication_user2;
 DROP TABLE testpub_parted;
 DROP TABLE testpub_tbl1;
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index 8fa0435c32..880f0caa1d 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -218,6 +218,23 @@ DROP PUBLICATION testpub2;
 DROP PUBLICATION testpub3;
 
 SET ROLE regress_publication_user;
+CREATE ROLE regress_publication_user3 LOGIN SUPERUSER;
+GRANT regress_publication_user2 TO regress_publication_user3;
+SET ROLE regress_publication_user3;
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub4 FOR ALL TABLES IN SCHEMA pub_test;
+RESET client_min_messages;
+SET ROLE regress_publication_user;
+ALTER ROLE regress_publication_user3 NOSUPERUSER;
+SET ROLE regress_publication_user3;
+-- fail - new owner must be superuser
+ALTER PUBLICATION testpub4 owner to regress_publication_user2; -- fail
+ALTER PUBLICATION testpub4 owner to 

Re: More time spending with "delete pending"

2021-12-02 Thread Alexander Lakhin
Hello Daniel and Michael,
02.12.2021 08:03, Michael Paquier wrote:
> On Wed, Dec 01, 2021 at 11:51:58AM +0100, Daniel Gustafsson wrote:
>> Michael, have you had a chance to look at the updated patch here?  I'm moving
>> this patch entry from Waiting on Author to Needs Review.
> No, I haven't.  Before being touched more, this requires first a bit
> more of testing infrastructure based on MinGW, and I don't have that
> in place yet :/
I think that my patch should be withdrawn in favor of
https://commitfest.postgresql.org/34/3326/ ("Check for
STATUS_DELETE_PENDING on Windows" by Thomas Munro), which uses a
deeper understanding of the problematic status rather than an indirect
indication of it.

Barring objections, I'll withdraw my commitfest entry.

Best regards,
Alexander




Re: Skipping logical replication transactions on subscriber side

2021-12-02 Thread Amit Kapila
On Thu, Dec 2, 2021 at 8:38 PM Peter Eisentraut
 wrote:
>
> On 02.12.21 07:48, Amit Kapila wrote:
> > a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
> > b. Alter Subscription <name> SET ( subscription_parameter [=value]
> > [, ... ] );
> > c. Alter Subscription <name> On Error ( subscription_parameter
> > [=value] [, ... ] );
> > d. Alter Subscription <name> SKIP ( subscription_parameter
> > [=value] [, ... ] );
> > where subscription_parameter can be one of:
> > xid = <xid>
> > lsn = <lsn>
> > ...
>
> > As per discussion till now, option (d) seems preferable.
>
> I agree.
>
> > In this, we
> > need to see how and what to allow as options. The simplest way for the
> > first version is to just allow one xid to be specified at a time which
> > would mean that specifying multiple xids should error out. We can also
> > additionally allow specifying operations like 'insert', 'update',
> > etc., and then relation list (list of oids). What that would mean is
> > that for a transaction we can allow which particular operations and
> > relations we want to skip.
>
> I don't know how difficult it would be, but allowing multiple xids might
> be desirable.
>

Are there many cases where there could be multiple xid failures that
the user can skip? The apply worker keeps retrying on the same error,
so the user wouldn't know of a second xid failure (if any) till the
first failure is resolved. I could think of one such case during the
initial synchronization phase: the apply worker gets ahead of the
tablesync worker by skipping the changes on the corresponding table.
After that, it is possible that the tablesync worker fails during the
catch-up phase while the apply worker fails during the processing of
some other rel.

>  But this syntax gives you flexibility, so we can also
> start with a simple implementation.
>

Yeah, I also think so. BTW, what do you think of providing extra
flexibility of giving other options like 'operation', 'rel' along with
xid? I think such options could be useful for large transactions that
operate on multiple tables as it is quite possible that only a
particular operation from the entire transaction is the cause of
failure. Now, on one side, we can argue that skipping the entire
transaction is better from the consistency point of view but I think
it is already possible that we just skip a particular update/delete
(if the corresponding tuple doesn't exist on the subscriber). For the
sake of simplicity, we can just allow providing xid at this stage and
then extend it later as required but I am not very sure of that point.

> > I am not sure what exactly we can provide to users to allow skipping
> > initial table sync as we can't specify XID there. One option that
> > comes to mind is to allow specifying a combination of copy_data and
> > relid to skip table sync for a particular relation. We might think of
> > not doing anything for table sync workers but not sure if that is a
> > good option.
>
> I don't think this feature should affect tablesync.  The semantics are
> not clear, and it's not really needed.  If the tablesync doesn't work,
> you can try the setup again from scratch.
>

Okay, that makes sense. But note it is possible that tablesync workers
might also need to skip some xids during the catchup phase to complete
the sync.

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2021-12-02 Thread Peter Smith
On Thu, Dec 2, 2021 at 2:32 PM tanghy.f...@fujitsu.com
 wrote:
>
> On Thursday, December 2, 2021 5:21 AM Peter Smith  
> wrote:
> >
> > PSA the v44* set of patches.
> >
>
> Thanks for the new patch. Few comments:
>
> 1. This is an example in publication doc, but in fact it's not allowed. 
> Should we
> change this example?
>
> +CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS 
> TRUE);
>
> postgres=# CREATE PUBLICATION active_departments FOR TABLE departments WHERE 
> (active IS TRUE);
> ERROR:  invalid publication WHERE expression for relation "departments"
> HINT:  only simple expressions using columns, constants and immutable system 
> functions are allowed
>

Thanks for finding this. Actually, the documentation looks correct to
me. The problem was that the validation walker of patch 0002 was being
overly restrictive. It needed to also allow a BooleanTest node.

Now it works (locally) for me. For example.

test_pub=# create table departments(depno int primary key, active boolean);
CREATE TABLE
test_pub=# create publication pdept for table departments where
(active is true) with (publish="insert");
CREATE PUBLICATION
test_pub=# create publication pdept2 for table departments where
(active is false) with (publish="insert");
CREATE PUBLICATION

This fix will be available in v45*.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: keepalives etc. as environment variables

2021-12-02 Thread Tatsuo Ishii
> Hi,
> 
> On 2021-12-03 10:28:34 +0900, Tatsuo Ishii wrote:
>> It seems there are no environment variables corresponding to keepalives
>> etc. connection parameters in libpq. Is there any reason for this?
> 
> PGOPTIONS='-c tcp_keepalive_*=foo' should work.

Sorry, I was not clear. I wanted to know why there is no specific
environment variable for keepalives etc., like PGCONNECT_TIMEOUT.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: keepalives etc. as environment variables

2021-12-02 Thread Andres Freund
Hi,

On 2021-12-03 10:28:34 +0900, Tatsuo Ishii wrote:
> It seems there are no environment variables corresponding to keepalives
> etc. connection parameters in libpq. Is there any reason for this?

PGOPTIONS='-c tcp_keepalive_*=foo' should work.

Greetings,

Andres Freund




keepalives etc. as environment variables

2021-12-02 Thread Tatsuo Ishii
It seems there are no environment variables corresponding to keepalives
etc. connection parameters in libpq. Is there any reason for this?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Tab completion for CREATE SCHEMAAUTHORIZATION

2021-12-02 Thread Michael Paquier
On Wed, Aug 11, 2021 at 10:16:15AM +0900, Michael Paquier wrote:
> +   else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION"))
> +   COMPLETE_WITH_QUERY(Query_for_list_of_owner_roles);
> +   else if (Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION"))
> +   COMPLETE_WITH_QUERY(Query_for_list_of_owner_roles);
> +   else if (Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION", 
> MatchAny))
> +   COMPLETE_WITH("CREATE", "GRANT");
> +   else if (Matches("CREATE", "SCHEMA", MatchAny))
> +   COMPLETE_WITH("AUTHORIZATION", "CREATE", "GRANT");
> Looks like you forgot the case "CREATE SCHEMA AUTHORIZATION MatchAny"
> that should be completed by GRANT and CREATE.

This patch has been waiting on author for more than a couple of weeks,
so I have marked it as returned with feedback in the CF app.  Please
feel free to resubmit if you are able to work more on that.
--
Michael




Re: CLUSTER on partitioned index

2021-12-02 Thread Michael Paquier
On Thu, Sep 23, 2021 at 06:56:26PM -0500, Justin Pryzby wrote:
> On Thu, Sep 23, 2021 at 08:18:41PM +0200, Matthias van de Meent wrote:
>> Note: The following review is based on the assumption that this v11
>> revision was meant to contain only one patch. I put this up as a note,
>> because it seemed quite limited when compared to earlier versions of
>> the patchset.
> 
> Alvaro's critique was that the patchset was too complicated for what was
> claimed to be a marginal feature.  My response was to rearrange the patchset 
> to
> its minimal form, supporting CLUSTER without marking the index as clustered.
>
> My goal is to present a minimal patch and avoid any nonessential complexity.

FWIW, my opinion on the matter is similar to Alvaro's, and an extra
read of the patch gives me the same impression.  Let's see if others
have an opinion on the matter.

> Thanks for looking.  I'm going to see about updating comments based on
> corresponding parts of vacuum and on this message itself.

It doesn't feel right to just discard the patch at this stage, and it
needs an update, so I have moved it to the next CF for now, waiting on
author.  If this does not really move on, my suggestion is to discard
the patch at the end of next CF, aka 2022-01.
--
Michael




Re: Skip vacuum log report code in lazy_scan_heap() if possible

2021-12-02 Thread Andres Freund
Hi,

On 2021-12-03 09:53:22 +0900, Michael Paquier wrote:
> On Thu, Dec 02, 2021 at 10:22:25PM +, Bossart, Nathan wrote:
> > Since I have no further comments, I went ahead and marked this one as
> > ready-for-committer.
>
> Well, as you say, lazy_scan_heap() is only run once per relation, so
> that's not a hot code path.

Yea, it seems like a premature optimization.


> Looking at the callers of
> message_level_is_interesting(), we apply that also in areas where a
> lot of unnecessary work would be involved, like the drop of object
> dependencies or ProcSleep() (I recall that there were profiles where
> standby replies in walsender.c could show up).  And based on the
> amount of unnecessary work done at the end of lazy_scan_heap(), I'd
> say that this is worth skipping, so let's do it.

I think this mostly reduces the coverage of the relevant code without any
measurable speed gain. -0.5 from here.

Greetings,

Andres Freund




Re: Temporary tables versus wraparound... again

2021-12-02 Thread Bossart, Nathan
On 10/12/21, 3:07 PM, "Greg Stark"  wrote:
> Here's an updated patch. I added some warning messages to autovacuum.

I think this is useful optimization, and I intend to review the patch
more closely soon.  It looks reasonable to me after a quick glance.

> One thing I learned trying to debug this situation in production is
> that it's nigh impossible to find the pid of the session using a
> temporary schema. The number in the schema refers to the backendId in
> the sinval stuff for which there's no external way to look up the
> corresponding pid. It would have been very helpful if autovacuum had
> just told me which backend pid to kill.

I certainly think it would be good to have autovacuum log the PID, but
IIUC a query like this would help you map the backend ID to the PID:

SELECT bid, pg_stat_get_backend_pid(bid) AS pid FROM pg_stat_get_backend_idset() bid;

Nathan



Re: GUC flags

2021-12-02 Thread Michael Paquier
On Wed, Dec 01, 2021 at 11:17:34PM -0600, Justin Pryzby wrote:
> I find it easier to read "wait before authentication ..." than "wait ... 
> before
> authentication".

I have a hard time seeing a strong difference here.  At the end, I
have used what you suggested, adjusted the rest based on your set of
comments, and applied the patch.
--
Michael




Re: should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-02 Thread Michael Paquier
On Fri, Dec 03, 2021 at 12:45:56AM +, Bossart, Nathan wrote:
> I think the problems you noted upthread are shared for all GUCs with
> type GUC_LIST_QUOTE (e.g., search_path, temp_tablespaces).  Perhaps
> the documentation for each of these GUCs should contain a short blurb
> about how to properly SET a list of values.

Yeah, the approach taken by the proposed patch is not going to scale
or age well.

It seems to me that we should have something dedicated to lists around
the section for "Parameter Names and Values", and add a link in the
description of each parameter concerned back to the generic
description.

> Also upthread, I see that you gave the following example for an
> incorrect way to set shared_preload_libraries:
> 
> ALTER SYSTEM SET shared_preload_libraries =
> auth_delay,pg_stat_statements,sepgsql;  --> wrong
> 
> Why is this wrong?  It seems to work okay for me.

Yep.
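
To make the GUC_LIST_QUOTE distinction concrete, here is a hedged sketch (behavior assumed from the list-quoting rules under discussion, not asserted by anyone in this thread): the bare comma list and individually quoted elements both parse as multiple list elements, while quoting the entire list yields a single element whose name contains a comma, which then cannot be resolved as a library at server start.

```sql
-- Parsed as two list elements (both forms are fine):
ALTER SYSTEM SET shared_preload_libraries = auth_delay, pg_stat_statements;
ALTER SYSTEM SET shared_preload_libraries = 'auth_delay', 'pg_stat_statements';

-- Parsed as ONE element named 'auth_delay, pg_stat_statements' -- the
-- server will fail to find a library by that name at startup:
ALTER SYSTEM SET shared_preload_libraries = 'auth_delay, pg_stat_statements';
```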
--
Michael




Re: Skip vacuum log report code in lazy_scan_heap() if possible

2021-12-02 Thread Michael Paquier
On Thu, Dec 02, 2021 at 10:22:25PM +, Bossart, Nathan wrote:
> Since I have no further comments, I went ahead and marked this one as
> ready-for-committer.

Well, as you say, lazy_scan_heap() is only run once per relation, so
that's not a hot code path.  Looking at the callers of
message_level_is_interesting(), we apply that also in areas where a
lot of unnecessary work would be involved, like the drop of object 
dependencies or ProcSleep() (I recall that there were profiles where
standby replies in walsender.c could show up).  And based on the
amount of unnecessary work done at the end of lazy_scan_heap(), I'd
say that this is worth skipping, so let's do it.

There is always the argument that one may blindly add some logic at
the end of lazy_scan_heap(), causing it to be skipped depending on the
configuration, but that's unlikely to happen after the activity
is logged, so I am fine with applying what you have here.  Let's wait a bit
to see if others have any objections, first.
--
Michael




Re: should we document an example to set multiple libraries in shared_preload_libraries?

2021-12-02 Thread Bossart, Nathan
On 12/1/21, 5:59 AM, "Bharath Rupireddy" 
 wrote:
> On Wed, Dec 1, 2021 at 6:45 PM Tom Lane  wrote:
>> Considering the vanishingly small number of actual complaints we've
>> seen about this, that sounds ridiculously over-engineered.
>> A documentation example should be sufficient.
>
> Thanks. Here's the v1 patch adding examples in the documentation.

I think the problems you noted upthread are shared for all GUCs with
type GUC_LIST_QUOTE (e.g., search_path, temp_tablespaces).  Perhaps
the documentation for each of these GUCs should contain a short blurb
about how to properly SET a list of values.

Also upthread, I see that you gave the following example for an
incorrect way to set shared_preload_libraries:

ALTER SYSTEM SET shared_preload_libraries =
auth_delay,pg_stat_statements,sepgsql;  --> wrong

Why is this wrong?  It seems to work okay for me.

Nathan



Re: row filtering for logical replication

2021-12-02 Thread Peter Smith
On Fri, Dec 3, 2021 at 12:59 AM Euler Taveira  wrote:
>
> On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
>
> PSA a new v44* patch set.
>
> It includes a new patch 0006 which implements the idea above.
>
> ExprState cache logic is basically all the same as before (including
> all the OR combining), but there are now 4x ExprState caches keyed and
> separated by the 4x different pubactions.
>
> row filter is not applied for TRUNCATEs so it is just 3 operations.
>

Correct. The patch 0006 comment/code will be updated for this point in
the next version posted.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Skip vacuum log report code in lazy_scan_heap() if possible

2021-12-02 Thread Bossart, Nathan
On 10/29/21, 10:49 AM, "Bossart, Nathan"  wrote:
> On 10/29/21, 3:54 AM, "Greg Nancarrow"  wrote:
>> When recently debugging a vacuum-related issue, I noticed that there
>> is some log-report processing/formatting code at the end of
>> lazy_scan_heap() that, unless the vacuum VERBOSE option is specified,
>> typically results in no log output (as the log-level is then DEBUG2).
>> I think it is worth skipping this code if ultimately nothing is going
>> to be logged (and I found that even for a tiny database, a VACUUM may
>> execute this code hundreds of times).
>> I have attached a simple patch for this.
>
> I think this logging only happens once per table, so I'm not sure it's
> really worth it to short-circuit here.  If it was per-page, IMO there
> would be a much stronger case for it.  That being said, I don't think
> the proposed patch would hurt anything.

Since I have no further comments, I went ahead and marked this one as
ready-for-committer.

Nathan



Re: pg_dump versus ancient server versions

2021-12-02 Thread Andres Freund
Hi,

On 2021-12-02 11:01:47 +0100, Peter Eisentraut wrote:
> - The policy for other client-side tools (list at [0]) is less clear
>   and arguably less important.  I suggest we focus on pg_dump and psql
>   first, and then we can decide for the rest whether they want to
>   match a longer window, a shorter window, or a different policy
>   altogether (e.g., ecpg).

I think we should at least include pg_upgrade in this as well; it's
pretty closely tied to at least pg_dump.


> * pg_dump and psql will maintain compatibility with servers at least
>   ten major releases back.

Personally I think that's too long... It boils down to keeping branches buildable
for ~15 years after they've been released. That strikes me as pretty far into
diminishing-returns, and steeply increasing costs, territory.

I realize it's more complicated for users, but a policy based on supporting a
certain number of out-of-support branches calculated from the newest major
version is more realistic. I'd personally go for something like newest-major -
7 (i.e. 2 extra releases), but I realize that others think it's worthwhile to
support a few more.  I think there's a considerable advantage of having one
cutoff date across all branches.

That's not to say we'd remove support for older versions from back
branches. Just that we don't ever consider them supported (or test them) once
below the cutoff.


> I use the count of major releases here instead of some number of
> years, as was previously discussed, for two reasons.  First, it makes
> computing the cutoff easier, because you are not bothered by whether
> some old release was released a few weeks before or after the
> equivalent date in the current year for the new release.  Second,
> there is no ambiguity about what happens during the lifetime of a
> major release: If major release $NEW supports major release $OLD at
> the time of $NEW's release, then that stays the same for the whole
> life of $NEW; we don't start dropping support for $OLD in $NEW.5
> because a year has passed.

Makes sense.


> * We keep old major release branches buildable as long as a new major
>   release that has support for that old release is under support.

> Buildable for this purpose means just enough that you can use it to
> test pg_dump and psql.  This probably includes being able to run make
> installcheck and use pg_dump and psql against the regression database.

I think we should explicitly limit the number of platforms we care about for
this purpose. I don't think we should even try to keep 8.2 compiling on AIX or
whatnot.

Greetings,

Andres Freund




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-02 Thread Bossart, Nathan
On 12/2/21, 9:50 AM, "David Steele"  wrote:
> On 12/2/21 12:38, David Steele wrote:
>> On 12/2/21 11:00, Andrew Dunstan wrote:
>>>
>>> Should we really be getting rid of
>>> PostgreSQL::Test::Cluster::backup_fs_hot() ?
>>
>> Agreed, it would be better to update backup_fs_hot() to use exclusive
>> mode and save out backup_label instead.
>
> Oops, of course I meant non-exclusive mode.

+1.  I'll fix that in the next revision.

Nathan



Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-02 Thread Bossart, Nathan
On 12/1/21, 6:48 PM, "Bharath Rupireddy" 
 wrote:
> +1 for the overall idea of making the checkpoint faster. In fact, we
> here at our team have been thinking about this problem for a while. If
> there are a lot of files that checkpoint has to loop over and remove,
> IMO, that task can be delegated to someone else (maybe a background
> worker called background cleaner or bg cleaner, of course, we can have
> a GUC to enable or disable it). The checkpoint can just write some

Right.  IMO it isn't optimal to have critical things like startup and
checkpointing depend on somewhat-unrelated tasks.  I understand the
desire to avoid adding additional processes, and maybe it is a bigger
hammer than what is necessary to reduce the impact, but it seemed like
a natural solution for this problem.  That being said, I'm all for
exploring other ways to handle this.

> Another idea could be to parallelize the checkpoint i.e. IIUC, the
> tasks that checkpoint do in CheckPointGuts are independent and if we
> have some counters like (how many snapshot/mapping files that the
> server generated)

Could you elaborate on this?  Is your idea that the checkpointer would
create worker processes like autovacuum does?

Nathan



Dubious usage of TYPCATEGORY_STRING

2021-12-02 Thread Tom Lane
The parser's type-coercion heuristics include some special rules
for types belonging to the STRING category, which are predicated
on the assumption that such types are reasonably general-purpose
string types.  This assumption has been violated by a few types,
one error being ancient and the rest not so much:

1. The "char" type is labeled as STRING category, even though it's
(a) deprecated for general use and (b) unable to store more than
one byte, making "string" quite a misnomer.

2. Various types we invented to store special catalog data, such as
pg_node_tree and pg_ndistinct, are also labeled as STRING category.
This seems like a fairly bad idea too.

An example of the reasons not to treat these types as being
general-purpose strings can be seen at [1], where the "char"
type has acquired some never-intended cast behaviors.  Taking
that to an extreme, we currently accept

regression=# select '(1,2)'::point::"char";
 char 
------
 (
(1 row)

My first thought about fixing point 1 was to put "char" into some
other typcategory, but that turns out to break some of psql's
catalog queries, with results like:

regression=# \dp
ERROR:  operator is not unique: unknown || "char"
LINE 16:E' (' || polcmd || E'):'
  ^
HINT:  Could not choose a best candidate operator. You might need to add 
explicit type casts.

I looked briefly at rejiggering the casting rules so that that
would still work, but it looks like a mess.  The problem is that
unknown || "char" can match either text || text or text || anynonarray,
and it's only the special preference for preferred types *of the
same typcategory as the input type* that allows us to prefer one
of those over the other.

Hence, what 0001 below does is to leave "char" in the string
category, but explicitly disable its access to the special
cast-via-I/O rules.  This is a hack for sure, but it won't have
any surprising side-effects on other types, which changing the
general operator-matching rules could.  The only thing it breaks
in check-world is that contrib/citext expects casting between
"char" and citext to work.  I think that's not a very reasonable
expectation so I just took out the relevant test cases.  (If anyone
is hot about it, we could add explicit support for such casts in
the citext module ... but it doesn't seem worth the trouble.)

As for point 2, I haven't found any negative side-effects of
taking the special types out of the string category, so 0002
attached invents a separate TYPCATEGORY_INTERNAL category to
put them in.

Thoughts?

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/CAOC8YUcXymCMpC5d%3D7JvcwyjXPTT00WeebOM3UqTBreOD1N9hw%40mail.gmail.com

diff --git a/contrib/citext/expected/citext.out b/contrib/citext/expected/citext.out
index ec99aaed5d..307d292d56 100644
--- a/contrib/citext/expected/citext.out
+++ b/contrib/citext/expected/citext.out
@@ -721,18 +721,6 @@ SELECT 'f'::citext::char = 'f'::char AS t;
  t
 (1 row)
 
-SELECT 'f'::"char"::citext = 'f' AS t;
- t 
----
- t
-(1 row)
-
-SELECT 'f'::citext::"char" = 'f'::"char" AS t;
- t 
----
- t
-(1 row)
-
 SELECT '100'::money::citext = '$100.00' AS t;
  t 
 ---
@@ -1041,7 +1029,6 @@ CREATE TABLE caster (
 varchar varchar,
 bpchar  bpchar,
 charchar,
-chr "char",
 namename,
 bytea   bytea,
 boolean boolean,
@@ -1087,10 +1074,6 @@ INSERT INTO caster (char)  VALUES ('f'::text);
 INSERT INTO caster (text)  VALUES ('f'::char);
 INSERT INTO caster (char)  VALUES ('f'::citext);
 INSERT INTO caster (citext)VALUES ('f'::char);
-INSERT INTO caster (chr)   VALUES ('f'::text);
-INSERT INTO caster (text)  VALUES ('f'::"char");
-INSERT INTO caster (chr)   VALUES ('f'::citext);
-INSERT INTO caster (citext)VALUES ('f'::"char");
 INSERT INTO caster (name)  VALUES ('foo'::text);
 INSERT INTO caster (text)  VALUES ('foo'::name);
 INSERT INTO caster (name)  VALUES ('foo'::citext);
diff --git a/contrib/citext/expected/citext_1.out b/contrib/citext/expected/citext_1.out
index 75fd08b7cc..9f423b7496 100644
--- a/contrib/citext/expected/citext_1.out
+++ b/contrib/citext/expected/citext_1.out
@@ -721,18 +721,6 @@ SELECT 'f'::citext::char = 'f'::char AS t;
  t
 (1 row)
 
-SELECT 'f'::"char"::citext = 'f' AS t;
- t 
----
- t
-(1 row)
-
-SELECT 'f'::citext::"char" = 'f'::"char" AS t;
- t 
----
- t
-(1 row)
-
 SELECT '100'::money::citext = '$100.00' AS t;
  t 
 ---
@@ -1041,7 +1029,6 @@ CREATE TABLE caster (
 varchar varchar,
 bpchar  bpchar,
 charchar,
-chr "char",
 namename,
 bytea   bytea,
 boolean boolean,
@@ -1087,10 +1074,6 @@ INSERT INTO caster (char)  VALUES ('f'::text);
 INSERT INTO caster (text)  VALUES ('f'::char);
 INSERT INTO caster (char)  VALUES ('f'::citext);
 INSERT INTO caster (citext)

Re: O(n) tasks cause lengthy startups and checkpoints

2021-12-02 Thread Bossart, Nathan
On 12/1/21, 6:06 PM, "Euler Taveira"  wrote:
> Saying that a certain task is O(n) doesn't mean it needs a separate process to
> handle it. Did you have a use case or even better numbers (% of checkpoint /
> startup time) that makes your proposal worthwhile?

I don't have specific numbers on hand, but each of the four functions
I listed is something I routinely see impacting customers.

> For (3), there is already a GUC that would avoid the slowdown during startup.
> Use it if you think the startup time is more important than the disk space
> occupied by useless files.

Setting remove_temp_files_after_crash to false only prevents temp file
cleanup during restart after a backend crash.  It is always called for
other startups.

> For (4), you are forgetting that the on-disk state of replication slots is
> stored in the pg_replslot/SLOTNAME/state. It seems you cannot just rename the
> replication slot directory and copy the state file. What happen if there is a
> crash before copying the state file?

Good point.  I think it's possible to deal with this, though.  Perhaps
the files that should be deleted on startup should go in a separate
directory, or maybe we could devise a way to ensure the state file is
copied even if there is a crash at an inconvenient time.
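To illustrate the "separate directory" idea (purely a sketch with hypothetical names, not a patch): renaming the whole slot directory into a to-be-deleted area is a single atomic step, so a crash leaves either an intact slot or one that startup can finish discarding, with no torn state file in between:

```python
import os

def discard_slot(slotdir, trashdir="pg_replslot.discarded"):
    """Atomically move a slot directory (state file included) aside."""
    os.makedirs(trashdir, exist_ok=True)
    # rename() is atomic on the same filesystem; after it returns, the
    # slot is gone as a unit, so no partially-deleted slot can survive
    # a crash.
    os.rename(slotdir, os.path.join(trashdir, os.path.basename(slotdir)))
    # The actual unlinking can then happen lazily, outside the startup
    # and checkpoint paths.
```

The directory name and function are invented for illustration; the point is only that the atomicity of rename() sidesteps the crash-ordering problem.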

Nathan



Re: pg_dump versus ancient server versions

2021-12-02 Thread Andrew Dunstan


On 12/2/21 12:30, Tom Lane wrote:
>
>> In practice, the effort can focus on keeping the most recent cutoff
>> release buildable.  So in the above example, we really only need to
>> keep PG >=9.2 buildable to support ongoing development.  The chances
>> that someone needs to touch code pertaining to older versions in
>> backbranches is lower, so those really would need to be dealt with
>> very rarely.
> OK.  Also, when you do need to check that, there are often other ways
> than rebuilding the old branch on modern platforms --- people may
> well have still-executable builds laying about, even if rebuilding
> from source would be problematic.
>
>   



I have a very old fedora instance where I can build every release back
to 7.2 :-) And with only slight massaging for the very old releases,
these builds run on my Fedora 34 development system. Certainly 8.2 and
up wouldn't be a problem. Currently I have only tested building without
any extra libraries/PLs, but I can look at other combinations. So, long
story short this is fairly doable at least in some environments. This
provides a good use case for the work I have been doing on backwards
compatibility of the TAP framework. I need to get back to that now that
the great module namespace adjustment has settled down.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Commitfest 2021-11 Patch Triage - Part 1

2021-12-02 Thread Tomas Vondra




On 12/2/21 12:06, Daniel Gustafsson wrote:
> On 6 Nov 2021, at 17:20, Tomas Vondra  wrote:
>> On 11/5/21 22:15, Daniel Gustafsson wrote:
>>> ...
>>> 1651: GROUP BY optimization
>>> ===
>>> This is IMO a desired optimization, and after all the heavy lifting by Tomas
>>> the patch seems to be in pretty good shape.
>>
>> I agree it's desirable. To move the patch forward, I need some feedback on the
>> costing questions, mentioned in my last review, and the overall approach to
>> planning (considering multiple group by variants).
>
> I'm moving this to the next CF with the hope that more feedback is received.
> If you think "Ready for Committer" is a more suitable status, please do update
> it as I'm leaving it in Needs Review for now not having the full context.

I don't think anyone reviewed the costing, so I think "needs review" is
appropriate.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-12-02 Thread Tomas Vondra

Hi,

On 12/2/21 15:58, Arne Roland wrote:
> Afaiac we should add a simple testcase here, like I suggested in
> 477344d5f17c4a8e95d3a5bb6642718a. Apart from that I am not sure there
> is work to be done here.

Well, I mentioned three open questions in my first message, and I don't
think we've really addressed them yet. We've agreed we don't need to add
the incremental sort here, but that leaves

1) If get_cheapest_fractional_path_for_pathkeys returns NULL, should we
default to cheapest_startup or cheapest_total?

2) Should get_cheapest_fractional_path_for_pathkeys worry about
require_parallel_safe? I think yes, but we need to update the patch.

I'll take a closer look next week, once I get home from NYC, and I'll
submit an improved version for the January CF.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Windows: Wrong error message at connection termination

2021-12-02 Thread Tom Lane
Alexander Lakhin  writes:
> 29.11.2021 22:16, Tom Lane wrote:
>> After re-reading that thread and re-studying relevant Windows
>> documentation [1][2], I think the main open question is whether
>> we need to issue shutdown() or not, and if so, whether to use
>> SD_BOTH or just SD_SEND.  I'm inclined to prefer not calling
>> shutdown(), because [1] is self-contradictory as to whether it
>> can block, and [2] is pretty explicit that it's not necessary.

> I've tested the close-only patch with pg_sleep() in pqReadData(), and it
> works too.

Thanks for testing!

> So I wonder how to understand "To assure that all data is
> sent and received on a connected socket before it is closed, an
> application should use shutdown to close connection before calling
> closesocket." in [1].

I suppose their documentation has evolved over time.  This sentence
probably predates their explicit acknowledgement in [2] that you don't
have to call shutdown().  Maybe, once upon a time with very old
versions of Winsock, you did have to do so if you wanted graceful close.

I'll push the close-only change in a little bit.

regards, tom lane




Re: Column Filtering in Logical Replication

2021-12-02 Thread Alvaro Herrera
On 2021-Sep-16, Peter Smith wrote:

> I noticed that the latest v5 no longer includes the TAP test which was
> in the v4 patch.
> 
> (src/test/subscription/t/021_column_filter.pl)
> 
> Was that omission deliberate?

Somehow I not only failed to notice the omission, but also your email
where you told us about it.  I have since posted a version of the patch
that again includes it.

Thanks!

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No renuncies a nada. No te aferres a nada."




Re: Windows: Wrong error message at connection termination

2021-12-02 Thread Alexander Lakhin
Hello Tom,
29.11.2021 22:16, Tom Lane wrote:
> Hm, yeah, that discussion seems to have slipped through the cracks.
> Not sure why it didn't end up in pushing something.
>
> After re-reading that thread and re-studying relevant Windows
> documentation [1][2], I think the main open question is whether
> we need to issue shutdown() or not, and if so, whether to use
> SD_BOTH or just SD_SEND.  I'm inclined to prefer not calling
> shutdown(), because [1] is self-contradictory as to whether it
> can block, and [2] is pretty explicit that it's not necessary.
>
>   regards, tom lane
>
> [1] 
> https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-shutdown
> [2] 
> https://docs.microsoft.com/en-us/windows/win32/winsock/graceful-shutdown-linger-options-and-socket-closure-2
I've tested the close-only patch with pg_sleep() in pqReadData(), and it
works too. So I wonder how to understand "To assure that all data is
sent and received on a connected socket before it is closed, an
application should use shutdown to close connection before calling
closesocket." in [1].
Maybe they mean that shutdown should be used before, but not after
closesocket(). Or maybe the Windows' behaviour somehow evolved over
time. (With the patch I cannot reproduce the FATAL message loss even on
Windows 2012 R2.) So without a practical evidence of the importance of
shutdown() I'm inclined to a more simple solution too.

As to 268313a95, back in 2003 it was possible to compile server on
Windows only using Cygwin (though you could compile libpq with Visual C,
see [3]). So "#ifdef WIN32" that is proposed now, will not affect that
scenario anyway.

Best regards,
Alexander

[3]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=doc/src/sgml/install-win32.sgml;hb=268313a95




Re: pgcrypto: Remove explicit hex encoding/decoding from tests

2021-12-02 Thread Tom Lane
Peter Eisentraut  writes:
> pgcrypto tests use encode() and decode() calls to convert to/from hex 
> encoding.  This was from before the hex format was available in bytea. 
> Now we can remove the extra explicit encoding/decoding calls and rely on 
> the default output format.

Generally +1, but I see you removed some instances of

--- ensure consistent test output regardless of the default bytea format
-SET bytea_output TO escape;

I think that the principle still applies that this should work regardless
of the installation's default bytea format, so I'd recommend putting

-- ensure consistent test output regardless of the default bytea format
SET bytea_output TO hex;

at the top of each file instead.
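For illustration (not part of the patch), the output-format dependence that the SET guards against looks like this:

```sql
SET bytea_output TO hex;
SELECT 'abc'::bytea;     -- \x616263
SET bytea_output TO escape;
SELECT 'abc'::bytea;     -- abc
```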

regards, tom lane




Re: Replace uses of deprecated Python module distutils.sysconfig

2021-12-02 Thread Tom Lane
Peter Eisentraut  writes:
> With Python 3.10, configure spits out warnings about the module 
> distutils.sysconfig being deprecated and scheduled for removal in Python 
> 3.12:

Bleah.

> This patch changes the uses in configure to use the module sysconfig 
> instead.  The logic stays the same.  (It's basically the same module but 
> as its own top-level module.)
> Note that sysconfig exists since Python 2.7, so this moves the minimum 
> required version up from Python 2.6.

That's surely no problem in HEAD, but as you say, it is an issue for
the older branches.  How difficult would it be to teach configure to
try both ways, or adapt based on its python version check?
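One way to "try both ways" (a sketch of the probe configure could run, not the committed approach) is a runtime fallback, since the top-level sysconfig module exists since Python 2.7:

```python
# Prefer the top-level sysconfig module (Python >= 2.7); fall back to
# distutils.sysconfig on older interpreters, where it is the only option.
try:
    import sysconfig
    include_dir = sysconfig.get_path("include")
except ImportError:
    from distutils import sysconfig as _sysconfig
    include_dir = _sysconfig.get_python_inc()

print(include_dir)
```

On Python 3.12+, where distutils is removed, the first branch succeeds and the fallback is never reached.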

> I suggest leaving the backbranches alone for now.  At the moment, we 
> don't even know whether additional changes will be required for 3.12 
> (and 3.11) support, so the overall impact isn't known yet.  In a few 
> months, we will probably know more about this.

Agreed, this is a moving target so we shouldn't be too concerned
about it yet.

regards, tom lane




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-02 Thread David Steele

On 12/2/21 12:38, David Steele wrote:
> On 12/2/21 11:00, Andrew Dunstan wrote:
>> Should we really be getting rid of
>> PostgreSQL::Test::Cluster::backup_fs_hot() ?
>
> Agreed, it would be better to update backup_fs_hot() to use exclusive
> mode and save out backup_label instead.

Oops, of course I meant non-exclusive mode.

Regards,
-David




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-02 Thread David Steele

On 12/2/21 11:00, Andrew Dunstan wrote:
> On 12/1/21 19:30, Bossart, Nathan wrote:
>> On 12/1/21, 10:37 AM, "Bossart, Nathan"  wrote:
>>> On 12/1/21, 8:27 AM, "David Steele"  wrote:
>>>> On 11/30/21 18:31, Bossart, Nathan wrote:
>>>>> Do you think it's still worth trying to make it safe, or do you think
>>>>> we should just remove exclusive mode completely?
>>>> My preference would be to remove it completely, but I haven't gotten a
>>>> lot of traction so far.
>>> In this thread, I count 6 people who seem alright with removing it,
>>> and 2 who might be opposed, although I don't think anyone has
>>> explicitly stated they are against it.
>> I hastily rebased the patch from 2018 and got it building and passing
>> the tests.  I'm sure it will need additional changes, but I'll wait
>> for more feedback before I expend too much more effort on this.
>
> Should we really be getting rid of
> PostgreSQL::Test::Cluster::backup_fs_hot() ?

Agreed, it would be better to update backup_fs_hot() to use exclusive
mode and save out backup_label instead.


Regards,
-David




Re: pg_dump versus ancient server versions

2021-12-02 Thread Tom Lane
Peter Eisentraut  writes:
> Proposal:

> * pg_dump and psql will maintain compatibility with servers at least
>ten major releases back.
> * We keep old major release branches buildable as long as a new major
>release that has support for that old release is under support.

> This assumes a yearly major release cadence.

If the point is to not have to count dates carefully, why does the cadence
matter?

> I say "at least" because I wouldn't go around aggressively removing
> support for old releases.  If $NEW is supposed to support 9.5 but
> there is code that says `if (version > 9.4)`, I would not s/9.4/9.5/
> that unless that code is touched for other reasons.

I can get behind something roughly like this, but I wonder if it wouldn't
be better to formulate the policy in a reactive way, i.e. when X happens
we'll do Y.  If we don't plan to proactively remove some code every year,
then it seems like the policy really is more like "when something breaks,
then we'll make an attempt to keep it working if the release is less than
ten majors back; otherwise we'll declare that release no longer
buildable."

However, this'd imply continuing to test against releases that are out of
the ten-year window but have not yet been found to be broken.  Not sure
if that's a useful expenditure of test resources or not.

> Buildable for this purpose means just enough that you can use it to
> test pg_dump and psql.  This probably includes being able to run make
> installcheck and use pg_dump and psql against the regression database.
> It does not require support for any additional build-time options that
> are not required for this purpose (e.g., new OpenSSL releases).

I agree with the idea of being conservative about what outside
dependencies we will worry about for "buildable" old versions.
(Your nearby message about Python breakage is a good example of
why we must limit that.)  But I wonder about, say, libxml or libicu,
or even if we can afford to drop all the non-plpgsql PLs.  An
example of why that seems worrisome is that it's not clear we'd
have any meaningful coverage of transforms in pg_dump with no PLs.
I don't have any immediate proposal here, but it seems like an area
that needs some thought and specific policy.

> Example under this proposal:

> PG 15 supports PG 9.2
> PG 14 supports PG 9.1
> PG 13 supports PG 9.0
> PG 12 supports PG 8.4
> PG 11 supports PG 8.3
> PG 10 supports PG 8.2

I was going to express concern about having to resurrect branches
back to 8.2, but:

> In practice, the effort can focus on keeping the most recent cutoff
> release buildable.  So in the above example, we really only need to
> keep PG >=9.2 buildable to support ongoing development.  The chances
> that someone needs to touch code pertaining to older versions in
> backbranches is lower, so those really would need to be dealt with
> very rarely.

OK.  Also, when you do need to check that, there are often other ways
than rebuilding the old branch on modern platforms --- people may
well have still-executable builds laying about, even if rebuilding
from source would be problematic.

regards, tom lane




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-12-02 Thread Andrew Dunstan


On 12/1/21 19:30, Bossart, Nathan wrote:
> On 12/1/21, 10:37 AM, "Bossart, Nathan"  wrote:
>> On 12/1/21, 8:27 AM, "David Steele"  wrote:
>>> On 11/30/21 18:31, Bossart, Nathan wrote:
>>>> Do you think it's still worth trying to make it safe, or do you think
>>>> we should just remove exclusive mode completely?
>>> My preference would be to remove it completely, but I haven't gotten a
>>> lot of traction so far.
>> In this thread, I count 6 people who seem alright with removing it,
>> and 2 who might be opposed, although I don't think anyone has
>> explicitly stated they are against it.
> I hastily rebased the patch from 2018 and got it building and passing
> the tests.  I'm sure it will need additional changes, but I'll wait
> for more feedback before I expend too much more effort on this.
>


Should we really be getting rid of
PostgreSQL::Test::Cluster::backup_fs_hot() ?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Some RELKIND macro refactoring

2021-12-02 Thread Peter Eisentraut

On 24.11.21 05:20, Michael Paquier wrote:
> On Mon, Nov 22, 2021 at 11:21:52AM +0100, Peter Eisentraut wrote:
>> On 19.11.21 08:31, Michael Paquier wrote:
>>> Regarding 0001, I find the existing code a bit more self-documenting
>>> if we keep those checks in flagInhAttrs() and guessConstraintInheritance().
>>> So I would rather leave these.
>>
>> In that case, the existing check in guessConstraintInheritance() seems
>> wrong, because it doesn't check for RELKIND_MATVIEW.  Should we fix that?
>> It's dead code either way, but if the code isn't exercised, then these
>> kinds of inconsistency come about.
>
> Yeah, this one could be added.  Perhaps that comes down to one's taste
> at the end, but I would add it.

Ok, I have committed adding the missing relkind, as you suggest.  Patch
v3-0001 is therefore obsolete.






Re: Can I assume relation would not be invalid during from ExecutorRun to ExecutorEnd

2021-12-02 Thread Robert Haas
On Wed, Dec 1, 2021 at 10:58 PM Andy Fan  wrote:
> Thanks! I clearly understand what's wrong in my previous knowledge.

Cool, glad it helped.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Skipping logical replication transactions on subscriber side

2021-12-02 Thread Peter Eisentraut



On 02.12.21 07:48, Amit Kapila wrote:
>> a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
>> b. Alter Subscription <name> SET ( subscription_parameter [=value]
>> [, ... ] );
>> c. Alter Subscription <name> On Error ( subscription_parameter
>> [=value] [, ... ] );
>> d. Alter Subscription <name> SKIP ( subscription_parameter
>> [=value] [, ... ] );
>> where subscription_parameter can be one of:
>> xid = <xid>
>> lsn = <lsn>
>> ...
>
> As per discussion till now, option (d) seems preferable.

I agree.

> In this, we
> need to see how and what to allow as options. The simplest way for the
> first version is to just allow one xid to be specified at a time which
> would mean that specifying multiple xids should error out. We can also
> additionally allow specifying operations like 'insert', 'update',
> etc., and then relation list (list of oids). What that would mean is
> that for a transaction we can allow which particular operations and
> relations we want to skip.

I don't know how difficult it would be, but allowing multiple xids might
be desirable.  But this syntax gives you flexibility, so we can also
start with a simple implementation.

> I am not sure what exactly we can provide to users to allow skipping
> initial table sync as we can't specify XID there. One option that
> comes to mind is to allow specifying a combination of copy_data and
> relid to skip table sync for a particular relation. We might think of
> not doing anything for table sync workers but not sure if that is a
> good option.

I don't think this feature should affect tablesync.  The semantics are
not clear, and it's not really needed.  If the tablesync doesn't work,
you can try the setup again from scratch.

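Under option (d) as discussed above, a first version restricted to a single xid might look like this (hypothetical syntax for illustration only; nothing here is committed):

```sql
ALTER SUBSCRIPTION mysub SKIP (xid = 12345);
-- a later version could accept further parameters, e.g. an lsn,
-- or a way to clear the setting again
```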




Re: Is ssl_crl_file "SSL server cert revocation list"?

2021-12-02 Thread Peter Eisentraut

On 02.12.21 10:42, Daniel Gustafsson wrote:
> On 2 Dec 2021, at 06:07, Kyotaro Horiguchi  wrote:
>> At Thu, 02 Dec 2021 13:54:41 +0900 (JST), Kyotaro Horiguchi
>>  wrote in
>>> As discussed in the thread [1], I find the wording "SSL server
>>> certificate revocation list" as misleading or plain wrong.
>>
>> FWIW, I'm convinced that that's plain wrong after finding some
>> occurrences of "(SSL) client certificate" in the doc.
>
> I agree with this, the concepts have been a bit muddled.
>
> While in there I noticed that we omitted mentioning sslcrldir in a few cases.
> The attached v2 adds these and removes the whitespace changes from your patch
> for easier review.

This change looks correct to me.




Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-12-02 Thread Arne Roland
Afaiac we should add a simple testcase here, like I suggested in
477344d5f17c4a8e95d3a5bb6642718a. Apart from that I am not sure there
is work to be done here.


Am I wrong?


Regards

Arne


From: Arne Roland 
Sent: Saturday, June 26, 2021 5:50:49 PM
To: Tomas Vondra
Cc: pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in 
generate_orderedappend_path


Hi Tomas,

I don't think there is much work left to do here.

Did you have a look at the test case? Did it make sense to you?

And I am sorry. I had another look at this and it seems I was confused (again).

From: Arne Roland
Sent: Monday, April 26, 2021 13:00
To: Tomas Vondra; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in 
generate_orderedappend_path

> I think it should. We have a ParallelAppend node after all.
> I'm not really familiar with the way 
> get_cheapest_fractional_path_for_pathkeys is used, but a quick search 
> suggests to
> me, that build_minmax_path was thus far the only one using it. And minmax 
> paths are never parallel safe anyway. I think that is the reason it doesn't 
> do that already.

The whole segment we are talking about obviously assumes
require_parallel_safe is not needed. I wasn't aware of that in
set_append_rel_size. And I just realized there is a great comment explaining
why it rightfully does so:
 /*
 * If any live child is not parallel-safe, treat the whole appendrel
 * as not parallel-safe.  In future we might be able to generate plans
 * in which some children are farmed out to workers while others are
 * not; but we don't have that today, so it's a waste to consider
 * partial paths anywhere in the appendrel unless it's all safe.
 * (Child rels visited before this one will be unmarked in
 * set_append_rel_pathlist().)
 */
So afaik we don't need to think further about this.


From: Tomas Vondra 
Sent: Thursday, June 3, 2021 22:57
To: Zhihong Yu
Cc: Arne Roland; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in 
generate_orderedappend_path
> Actually, there are two comments
>
>/* XXX maybe we should have startup_new_fractional? */
>
> in the patch I posted - I completely forgot about that. But I think
> that's a typo, I think - it should be
>
> /* XXX maybe we should have startup_neq_fractional? */
>
> and the new flag would work similarly to startup_neq_total, i.e. it's
> pointless to add paths where startup == fractional cost.
>
> At least I think that was the idea when I wrote the patch, it way too
> long ago.

> Sorry, I almost forgot about this myself. I only got reminded upon seeing 
> that again with different queries/tables.
> Just to be sure I get this correctly: You mean startup_gt_fractional (cost) 
> as an additional condition, right?

Could you clarify that for me?

Regards
Arne



Re: Column Filtering in Logical Replication

2021-12-02 Thread Alvaro Herrera
On 2021-Dec-01, Alvaro Herrera wrote:

> Hi
> 
> I took the latest posted patch, rebased on current sources, fixed the
> conflicts, and pgindented.  No further changes.  Here's the result.  All
> tests are passing for me.  Some review comments that were posted have
> not been addressed yet; I'll look into that soon.

In v7 I have reinstated the test file and fixed the silly problem that
caused it to fail (probably a mistake of mine while rebasing).

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)
From 1fadd74c39967bca2807f5f7ad8f894ed7c4ad50 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 6 Sep 2021 10:34:29 -0300
Subject: [PATCH v7] Add column filtering to logical replication

Add capability to specify column names while linking
the table to a publication, at the time of CREATE or ALTER
publication. This will allow replicating only the specified
columns. Other columns, if any, on the subscriber will be populated
locally or NULL will be inserted if no value is supplied for the column
by the upstream during INSERT.
This facilitates replication to a table on subscriber
containing only the subscribed/filtered columns.
If no filter is specified, all the columns are replicated.
REPLICA IDENTITY columns are always replicated.
Thus, prohibit adding relation to publication, if column filters
do not contain REPLICA IDENTITY.
Add a tap test for the same in src/test/subscription.

Author: Rahila Syed 
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektnrh8nj1jf...@mail.gmail.com
---
 src/backend/access/common/relation.c |  22 +++
 src/backend/catalog/pg_publication.c |  58 +++-
 src/backend/commands/publicationcmds.c   |   8 +-
 src/backend/nodes/copyfuncs.c|   1 +
 src/backend/nodes/equalfuncs.c   |   1 +
 src/backend/parser/gram.y|  36 -
 src/backend/replication/logical/proto.c  | 103 ++---
 src/backend/replication/logical/tablesync.c  | 101 -
 src/backend/replication/pgoutput/pgoutput.c  |  77 --
 src/include/catalog/pg_publication.h |   1 +
 src/include/catalog/pg_publication_rel.h |   4 +
 src/include/nodes/parsenodes.h   |   1 +
 src/include/replication/logicalproto.h   |   6 +-
 src/include/utils/rel.h  |   1 +
 src/test/subscription/t/021_column_filter.pl | 143 +++
 15 files changed, 512 insertions(+), 51 deletions(-)
 create mode 100644 src/test/subscription/t/021_column_filter.pl

diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c
index 632d13c1ea..05d6fcba26 100644
--- a/src/backend/access/common/relation.c
+++ b/src/backend/access/common/relation.c
@@ -21,12 +21,14 @@
 #include "postgres.h"
 
 #include "access/relation.h"
+#include "access/sysattr.h"
 #include "access/xact.h"
 #include "catalog/namespace.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "storage/lmgr.h"
 #include "utils/inval.h"
+#include "utils/lsyscache.h"
 #include "utils/syscache.h"
 
 
@@ -215,3 +217,23 @@ relation_close(Relation relation, LOCKMODE lockmode)
 	if (lockmode != NoLock)
 		UnlockRelationId(&relid, lockmode);
 }
+
+/*
+ * Return a bitmapset of attributes given the list of column names
+ */
+Bitmapset *
+get_table_columnset(Oid relid, List *columns, Bitmapset *att_map)
+{
+	ListCell   *cell;
+
+	foreach(cell, columns)
+	{
+		const char *attname = lfirst(cell);
+		int			attnum = get_attnum(relid, attname);
+
+		if (!bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, att_map))
+			att_map = bms_add_member(att_map,
+	 attnum - FirstLowInvalidHeapAttributeNumber);
+	}
+	return att_map;
+}
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 63579b2f82..de5f3266cd 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -50,8 +50,12 @@
  * error if not.
  */
 static void
-check_publication_add_relation(Relation targetrel)
+check_publication_add_relation(Relation targetrel, List *targetcols)
 {
+	bool		replidentfull = (targetrel->rd_rel->relreplident == REPLICA_IDENTITY_FULL);
+	Oid			relid = RelationGetRelid(targetrel);
+	Bitmapset  *idattrs;
+
 	/* Must be a regular or partitioned table */
 	if (RelationGetForm(targetrel)->relkind != RELKIND_RELATION &&
 		RelationGetForm(targetrel)->relkind != RELKIND_PARTITIONED_TABLE)
@@ -82,6 +86,36 @@ check_publication_add_relation(Relation targetrel)
  errmsg("cannot add relation \"%s\" to publication",
 		RelationGetRelationName(targetrel)),
  errdetail("This operation is not supported for unlogged tables.")));
+
+	/*
+	 * Cannot specify a column filter when REPLICA IDENTITY is FULL, or if the
+	 * column filter does not contain the REPLICA IDENTITY columns.
+	 */
+	if (targetcols != NIL)
+	{
+		if (replidentfull)
+			

Re: allow partial union-all and improve parallel subquery costing

2021-12-02 Thread Daniel Gustafsson
With the thread stalled and requests for a test (documentation, really?) not
responded to, I'm marking this patch Returned with Feedback.

--
Daniel Gustafsson   https://vmware.com/





Re: row filtering for logical replication

2021-12-02 Thread Euler Taveira
On Thu, Dec 2, 2021, at 4:18 AM, Peter Smith wrote:
> PSA a new v44* patch set.
> 
> It includes a new patch 0006 which implements the idea above.
> 
> ExprState cache logic is basically all the same as before (including
> all the OR combining), but there are now 4x ExprState caches keyed and
> separated by the 4x different pubactions.
The row filter is not applied for TRUNCATE, so it is just 3 operations.
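
A minimal sketch of that caching scheme (assumed names, not pgoutput's actual
ExprState machinery): one compiled, OR-combined filter per publish action, with
TRUNCATE bypassing the cache entirely.

```python
# Illustrative sketch only: cache one compiled row filter per publish action.

PUBACTIONS = ("insert", "update", "delete")  # no row filter for truncate

class RowFilterCache:
    def __init__(self):
        self.compiled = {}  # pubaction -> compiled predicate

    def get(self, pubaction, exprs):
        if pubaction not in PUBACTIONS:
            return None  # truncate: no filtering
        if pubaction not in self.compiled:
            # OR-combine the per-publication filter expressions once
            self.compiled[pubaction] = lambda row, exprs=exprs: any(e(row) for e in exprs)
        return self.compiled[pubaction]

cache = RowFilterCache()
f = cache.get("insert", [lambda r: r["a"] > 1, lambda r: r["a"] > 3])
print(f({"a": 2}))                # True (first filter matches)
print(cache.get("truncate", []))  # None
```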


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-12-02 Thread Dilip Kumar
On Wed, Dec 1, 2021 at 6:04 PM Dilip Kumar  wrote:

> Thanks a lot for testing this. From the error, it seems like some of
> the old buffer w.r.t. the previous tablespace is not dropped after the
> movedb.  Actually, we are calling DropDatabaseBuffers() after copying
> to a new tablespace and dropping all the buffers of this database
> w.r.t the old tablespace.  But seems something is missing, I will
> reproduce this and try to fix it by tomorrow.  I will also fix the
> other review comments raised by you in the previous mail.

Okay, I found the issue: basically, we are dropping the database buffers
but not unregistering the existing sync requests for the database buffers
w.r.t. the old tablespace. The attached patch fixes that.  I also had to
extend ForgetDatabaseSyncRequests so that we can delete the sync requests
of the database for a particular tablespace, so I added another patch for
that (0006).
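
The fix described above, deleting pending sync requests for a database
restricted to one tablespace, can be sketched with a toy queue (the real
checkpointer request queue looks nothing like this; names are assumptions):

```python
# Toy model of forgetting fsync requests keyed by (database, tablespace).

pending = {
    ("db1", "ts_old"): ["seg1", "seg2"],
    ("db1", "ts_new"): ["seg3"],
    ("db2", "ts_old"): ["seg4"],
}

def forget_database_sync_requests(queue, dbid, tablespace=None):
    """Drop requests for dbid; if tablespace is given, only for that tablespace."""
    for key in list(queue):
        db, ts = key
        if db == dbid and (tablespace is None or ts == tablespace):
            del queue[key]

forget_database_sync_requests(pending, "db1", "ts_old")
print(sorted(pending))  # [('db1', 'ts_new'), ('db2', 'ts_old')]
```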

I will test the performance scenario suggested by John next week.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 91acf75aef203d2201ab462e21f26d36d12dad67 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Fri, 24 Sep 2021 18:13:25 +0530
Subject: [PATCH v7 3/7] Refactor index_copy_data

Make a separate interface for copying relation storage; this will
be used by a later patch for copying the database relations.
---
 src/backend/commands/tablecmds.c | 61 
 src/include/commands/tablecmds.h |  5 
 src/tools/pgindent/typedefs.list |  1 +
 3 files changed, 43 insertions(+), 24 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 5e9cae2..25f897f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14237,21 +14237,15 @@ AlterTableMoveAll(AlterTableMoveAllStmt *stmt)
 	return new_tablespaceoid;
 }
 
-static void
-index_copy_data(Relation rel, RelFileNode newrnode)
+/*
+ * Copy all of the source smgr relation's forks to the destination.
+ *
+ * copy_storage - storage copy function, which is passed by the caller.
+ */
+void
+RelationCopyAllFork(SMgrRelation src_smgr, SMgrRelation	dst_smgr,
+	char relpersistence, copy_relation_storage copy_storage)
 {
-	SMgrRelation dstrel;
-
-	dstrel = smgropen(newrnode, rel->rd_backend);
-
-	/*
-	 * Since we copy the file directly without looking at the shared buffers,
-	 * we'd better first flush out any pages of the source relation that are
-	 * in shared buffers.  We assume no new changes will be made while we are
-	 * holding exclusive lock on the rel.
-	 */
-	FlushRelationBuffers(rel);
-
 	/*
 	 * Create and copy all forks of the relation, and schedule unlinking of
 	 * old physical files.
@@ -14259,32 +14253,51 @@ index_copy_data(Relation rel, RelFileNode newrnode)
 	 * NOTE: any conflict in relfilenode value will be caught in
 	 * RelationCreateStorage().
 	 */
-	RelationCreateStorage(newrnode, rel->rd_rel->relpersistence);
+	RelationCreateStorage(dst_smgr->smgr_rnode.node, relpersistence);
 
 	/* copy main fork */
-	RelationCopyStorage(RelationGetSmgr(rel), dstrel, MAIN_FORKNUM,
-		rel->rd_rel->relpersistence);
+	copy_storage(src_smgr, dst_smgr, MAIN_FORKNUM, relpersistence);
 
 	/* copy those extra forks that exist */
 	for (ForkNumber forkNum = MAIN_FORKNUM + 1;
 		 forkNum <= MAX_FORKNUM; forkNum++)
 	{
-		if (smgrexists(RelationGetSmgr(rel), forkNum))
+		if (smgrexists(src_smgr, forkNum))
 		{
-			smgrcreate(dstrel, forkNum, false);
+			smgrcreate(dst_smgr, forkNum, false);
 
 			/*
 			 * WAL log creation if the relation is persistent, or this is the
 			 * init fork of an unlogged relation.
 			 */
-			if (RelationIsPermanent(rel) ||
-				(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
+			if (relpersistence == RELPERSISTENCE_PERMANENT ||
+				(relpersistence == RELPERSISTENCE_UNLOGGED &&
 				 forkNum == INIT_FORKNUM))
-				log_smgrcreate(&newrnode, forkNum);
-			RelationCopyStorage(RelationGetSmgr(rel), dstrel, forkNum,
-								rel->rd_rel->relpersistence);
+				log_smgrcreate(&dst_smgr->smgr_rnode.node, forkNum);
+
+			/* Copy a fork's data, block by block. */
+			copy_storage(src_smgr, dst_smgr, forkNum, relpersistence);
 		}
 	}
+}
+
+static void
+index_copy_data(Relation rel, RelFileNode newrnode)
+{
+	SMgrRelation dstrel;
+
+	dstrel = smgropen(newrnode, rel->rd_backend);
+
+	/*
+	 * Since we copy the file directly without looking at the shared buffers,
+	 * we'd better first flush out any pages of the source relation that are
+	 * in shared buffers.  We assume no new changes will be made while we are
+	 * holding exclusive lock on the rel.
+	 */
+	FlushRelationBuffers(rel);
+
+	RelationCopyAllFork(RelationGetSmgr(rel), dstrel,
+		rel->rd_rel->relpersistence, RelationCopyStorage);
 
 	/* drop old relation, and close new one */
 	RelationDropStorage(rel);
diff --git a/src/include/commands/tablecmds.h b/src/include/commands/tablecmds.h
index 336549c..e0e0aa5 100644
--- a/src/include/commands/tablecmds.h
+++ 
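
The shape of the refactoring above, a generic all-forks loop that takes the
per-fork copy routine as a callback so index copy and database copy can share
it, can be sketched like this (illustrative names only, not the C signatures):

```python
# Toy sketch of a callback-driven "copy all forks" loop.

FORKS = ("main", "fsm", "vm", "init")

def copy_all_forks(src, dst, copy_storage):
    """Copy every fork that exists in src, delegating the block copy."""
    copied = []
    for fork in FORKS:
        if fork in src:
            dst[fork] = None          # stand-in for "smgrcreate" of the fork
            copy_storage(src, dst, fork)
            copied.append(fork)
    return copied

def block_copy(src, dst, fork):
    dst[fork] = list(src[fork])       # block-by-block copy stand-in

src = {"main": [b"p0", b"p1"], "fsm": [b"f0"]}
dst = {}
print(copy_all_forks(src, dst, block_copy))  # ['main', 'fsm']
```

Keeping the loop generic means alternative copy strategies (e.g. a
WAL-logged, buffered copy for database creation) slot in without duplicating
the fork iteration.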

Re: pg_get_publication_tables() output duplicate relid

2021-12-02 Thread Amit Langote
On Thu, Dec 2, 2021 at 2:27 PM Amit Kapila  wrote:
> On Thu, Dec 2, 2021 at 8:42 AM Amit Langote  wrote:
> > Reading Alvaro's email at the top again gave me a pause to reconsider
> > what I had said in reply.  It might indeed have been nice if the
> > publication DDL itself had prevented the cases where a partition and
> > its ancestor are added to a publication, though as Hou-san mentioned,
> > partition ATTACHes make that a bit tricky to enforce at all times,
> > though of course not impossible.  Maybe it's okay to just de-duplicate
> > pg_publication_tables output as the patch does, even though it may
> > appear to be a band-aid solution if we one considers Alvaro's point
> > about consistency.
>
> True, I think even if we consider that idea it will be a much bigger
> change, and also as it will be a behavioral change so we might want to
> keep it just for HEAD and some of these fixes need to be backpatched.
> Having said that, if you or someone want to pursue that idea and come
> up with a better solution than what we have currently it is certainly
> welcome.

Okay, I did write a PoC patch this morning after sending out my
earlier email.  I polished it a bit; it is attached.
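
The de-duplication being discussed, keeping only the topmost published
ancestor when both a partition and one of its ancestors are published, can be
sketched as follows (illustrative only, not the catalog query):

```python
# Toy model: drop relations that have a published ancestor.

def dedup_published(published, parent_of):
    """Return the subset of published relations with no published ancestor."""
    result = set()
    for rel in published:
        anc, covered = parent_of.get(rel), False
        while anc is not None:
            if anc in published:
                covered = True
                break
            anc = parent_of.get(anc)
        if not covered:
            result.add(rel)
    return result

parent_of = {"part1": "parent", "part2": "parent"}
print(sorted(dedup_published({"parent", "part1"}, parent_of)))  # ['parent']
print(sorted(dedup_published({"part1", "part2"}, parent_of)))   # ['part1', 'part2']
```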

-- 
Amit Langote
EDB: http://www.enterprisedb.com


0001-wip-don-t-add-partition-to-publication-if-parent-pre.patch
Description: Binary data


Re: pgbench logging broken by time logic changes

2021-12-02 Thread Daniel Gustafsson
> On 4 Nov 2021, at 13:38, Daniel Gustafsson  wrote:
> 
>> On 11 Jul 2021, at 15:07, Fabien COELHO  wrote:
> 
>> Attached the fully "ignored" version of the time features test as a patch.
> 
> This version of the patch is failing to apply on top of HEAD, can you please
> submit a rebased version?

I'm marking this patch Returned with Feedback, please feel free to resubmit
this when there is an updated version of the patch available.

--
Daniel Gustafsson   https://vmware.com/





Re: pg_dump versus ancient server versions

2021-12-02 Thread Robert Haas
On Thu, Dec 2, 2021 at 5:01 AM Peter Eisentraut
 wrote:
> * pg_dump and psql will maintain compatibility with servers at least
>ten major releases back.
>
> * We keep old major release branches buildable as long as a new major
>release that has support for that old release is under support.
>
> Buildable for this purpose means just enough that you can use it to
> test pg_dump and psql.  This probably includes being able to run make
> installcheck and use pg_dump and psql against the regression database.
> It does not require support for any additional build-time options that
> are not required for this purpose (e.g., new OpenSSL releases).
> Conversely, it should be buildable with default compiler options.  For
> example, if it fails to build and test cleanly unless you use -O0,
> that should be fixed.  Fixes in very-old branches should normally be
> backpatches that have stabilized in under-support branches.  Changes
> that silence compiler warnings in newer compilers are by themselves
> not considered a backpatch-worthy fix.

Sounds reasonable. It doesn't really make sense to insist that the
tools have to be compatible with releases that most developers can't
actually build.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: preserve timestamps when installing headers

2021-12-02 Thread Aleksander Alekseev
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Personally, I don't often encounter the problem that Alexander is describing, 
but I agree that there are cases when the simplest way to debug a tricky bug is 
to make a modification to the core. In fact, I used this technique to diagnose 
[1].

Unless anyone can think of a scenario in which the proposed change would break
something, I would suggest merging it.

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=98ec35b0

The new status of this patch is: Ready for Committer


Re: [PATCH] Automatic HASH and LIST partition creation

2021-12-02 Thread Daniel Gustafsson
This thread has stalled since July with review comments unanswered, so I'm
marking the patch Returned with Feedback.  Please feel free to resubmit
when/if a new patch is available.

--
Daniel Gustafsson   https://vmware.com/





Re: Commitfest 2021-11 Patch Triage - Part 1

2021-12-02 Thread Daniel Gustafsson
> On 6 Nov 2021, at 17:20, Tomas Vondra  wrote:
> 
> On 11/5/21 22:15, Daniel Gustafsson wrote:
>> ...
>> 1651: GROUP BY optimization
>> ===
>> This is IMO a desired optimization, and after all the heavy lifting by Tomas
>> the patch seems to be in pretty good shape.
> 
> I agree it's desirable. To move the patch forward, I need some feedback on 
> the costing questions, mentioned in my last review, and the overall approach 
> to planning (considering multiple group by variants).

I'm moving this to the next CF in the hope that more feedback is received.
If you think "Ready for Committer" is a more suitable status, please do update
it; I'm leaving it in Needs Review for now, not having the full context.

--
Daniel Gustafsson   https://vmware.com/





Re: error_severity of brin work item

2021-12-02 Thread Daniel Gustafsson
This thread has stalled, and from the discussions there isn't consensus on how
to proceed (if at all), so I'm marking this Returned with Feedback.  If the
discussion and work pick up again, then it can be resubmitted.

--
Daniel Gustafsson   https://vmware.com/





Re: An out-of-date comment in nodeIndexonlyscan.c

2021-12-02 Thread Daniel Gustafsson
> On 9 Nov 2021, at 15:28, Daniel Gustafsson  wrote:
> 
> This patch now fails to apply, probably due to a mostly trivial collision with
> fdd88571454e2b00dbe446e8609c6e4294ca89ae in the test files.  Can you submit a
> rebased version?

I've marked this Returned with Feedback, please feel free to resubmit when a
new version of the patch is available.

--
Daniel Gustafsson   https://vmware.com/





Re: Iterating on IndexTuple attributes and nbtree page-level dynamic prefix truncation

2021-12-02 Thread Daniel Gustafsson
> On 24 Jun 2021, at 18:21, Matthias van de Meent 
>  wrote:
> 
> On Thu, 17 Jun 2021 at 17:14, Matthias van de Meent
>  wrote:
>> 
>> I'll try to
>> benchmark the patches in this patchset (both 0001, and 0001+0002) in
>> the upcoming weekend.
> 
> Somewhat delayed, benchmark results are attached.

I'm moving this patch to the next CF to allow for more review of the latest
revision and benchmark results.  It no longer applies though, so please post a
rebased version.

--
Daniel Gustafsson   https://vmware.com/





Re: Lowering the ever-growing heap->pd_lower

2021-12-02 Thread Daniel Gustafsson
This thread has stalled, and the request for benchmark/test has gone unanswered
so I'm marking this patch Returned with Feedback.  Please feel free to resubmit
this patch if it is picked up again.

--
Daniel Gustafsson   https://vmware.com/





Re: pg_dump versus ancient server versions

2021-12-02 Thread Peter Eisentraut

I was thinking a bit about formulating a policy for pg_dump backward
compatibility, based on the discussions in this thread.

Premises and preparatory thoughts:

- Users (and developers) want pg_dump to support server versions that
  are much older than non-EOL versions.

- Less critically, much-longer backward compatibility has also
  historically been provided for psql, so keeping those two the same
  would make sense.

- The policy for other client-side tools (list at [0]) is less clear
  and arguably less important.  I suggest we focus on pg_dump and psql
  first, and then we can decide for the rest whether they want to
  match a longer window, a shorter window, or a different policy
  altogether (e.g., ecpg).

- If we are going to maintain compatibility with very old server
  versions, we need to make sure the older server versions can at
  least still be built while an allegedly-compatible client tool is
  under support.

[0]: https://www.postgresql.org/docs/devel/reference-client.html

Proposal:

* pg_dump and psql will maintain compatibility with servers at least
  ten major releases back.

This assumes a yearly major release cadence.

I use the count of major releases here instead of some number of
years, as was previously discussed, for two reasons.  First, it makes
computing the cutoff easier, because you are not bothered by whether
some old release was released a few weeks before or after the
equivalent date in the current year for the new release.  Second,
there is no ambiguity about what happens during the lifetime of a
major release: If major release $NEW supports major release $OLD at
the time of $NEW's release, then that stays the same for the whole
life of $NEW; we don't start dropping support for $OLD in $NEW.5
because a year has passed.

I say "at least" because I wouldn't go around aggressively removing
support for old releases.  If $NEW is supposed to support 9.5 but
there is code that says `if (version > 9.4)`, I would not s/9.4/9.5/
that unless that code is touched for other reasons.

Then ...

* We keep old major release branches buildable as long as a new major
  release that has support for that old release is under support.

Buildable for this purpose means just enough that you can use it to
test pg_dump and psql.  This probably includes being able to run make
installcheck and use pg_dump and psql against the regression database.
It does not require support for any additional build-time options that
are not required for this purpose (e.g., new OpenSSL releases).
Conversely, it should be buildable with default compiler options.  For
example, if it fails to build and test cleanly unless you use -O0,
that should be fixed.  Fixes in very-old branches should normally be
backpatches that have stabilized in under-support branches.  Changes
that silence compiler warnings in newer compilers are by themselves
not considered a backpatch-worthy fix.

(In some cases, the support window of typical compilers should be
considered.  If adding support for a very new compiler with new
aggressive optimizations turns out to be too invasive, then that
compiler might simply be declared not supported for that release.  But
we should strive to support at least one compiler that still has some
upstream support.)

This keep-buildable effort is on an as-needed basis.  There is no
requirement to keep the buildability current at all times, and there
is no requirement to keep all platforms working at all times.
Obviously, any changes made to improve buildability should not
knowingly adversely affect other platforms.

(The above could be reconsidered if buildfarm support is available,
but I don't consider that necessary and wouldn't want to wait for it.)

There is no obligation on anyone backpatching fixes to supported
branches to also backpatch them to keep-buildable branches.  It is up
to those working on pg_dump/psql and requiring testing against old
versions to pick available fixes and apply them to keep-buildable
branches as needed.

Finally, none of this is meant to imply that there will be any
releases, packages, security support, production support, or community
support for keep-buildable branches.  This is a Git-repo-only,
developer-focusing effort.

Example under this proposal:

PG 15 supports PG 9.2
PG 14 supports PG 9.1
PG 13 supports PG 9.0
PG 12 supports PG 8.4
PG 11 supports PG 8.3
PG 10 supports PG 8.2

In practice, the effort can focus on keeping the most recent cutoff
release buildable.  So in the above example, we really only need to
keep PG >=9.2 buildable to support ongoing development.  The chances
that someone needs to touch code pertaining to older versions in
back branches are lower, so those would really need to be dealt with
very rarely.
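
The "ten major releases back" rule can be sketched by counting entries in the
release sequence, which sidesteps the 9.x-to-10 numbering change (the MAJORS
list below is an assumption of the relevant ordering):

```python
# Sketch of the proposed support-window rule, counting releases rather than years.

MAJORS = ["8.2", "8.3", "8.4", "9.0", "9.1", "9.2", "9.3", "9.4",
          "9.5", "9.6", "10", "11", "12", "13", "14", "15"]

def oldest_supported(new_major, back=10):
    """Oldest server version a given pg_dump/psql release must support."""
    i = MAJORS.index(new_major)
    return MAJORS[max(0, i - back)]

for v in ("15", "14", "13", "12", "11", "10"):
    print(v, "supports", oldest_supported(v))
# 15 supports 9.2, 14 supports 9.1, ..., 10 supports 8.2, matching the table above
```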


The parent message has proposed to remove support for PG <9.0 from
master.  But I think that was chosen mainly because it was a round
number.  I suggest we pick a cutoff based on years, as I had
described, and then proceed with that patch.




Re: pgcrypto: Remove explicit hex encoding/decoding from tests

2021-12-02 Thread Daniel Gustafsson
> On 2 Dec 2021, at 10:22, Peter Eisentraut  
> wrote:

> pgcrypto tests use encode() and decode() calls to convert to/from hex 
> encoding.  This was from before the hex format was available in bytea. Now we 
> can remove the extra explicit encoding/decoding calls and rely on the default 
> output format.

My eyes glazed over a bit but definitely a +1 on the idea.

--
Daniel Gustafsson   https://vmware.com/





Re: Is ssl_crl_file "SSL server cert revocation list"?

2021-12-02 Thread Daniel Gustafsson
> On 2 Dec 2021, at 06:07, Kyotaro Horiguchi  wrote:
> 
> At Thu, 02 Dec 2021 13:54:41 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 
>> As discussed in the thread [1], I find the wording "SSL server
>> certificate revocation list" as misleading or plain wrong.
> 
> FWIW, I'm convinced that that's plain wrong after finding some
> occurances of "(SSL) client certificate" in the doc.

I agree with this; the concepts have been a bit muddled.

While in there I noticed that we omitted mentioning sslcrldir in a few cases.
The attached v2 adds these and removes the whitespace changes from your patch
for easier review.

--
Daniel Gustafsson   https://vmware.com/



fix_crl_doc-v2.diff
Description: Binary data


Re: Non-superuser subscription owners

2021-12-02 Thread Amit Kapila
On Thu, Dec 2, 2021 at 12:51 AM Mark Dilger
 wrote:
>
>
> > On Dec 1, 2021, at 5:36 AM, Amit Kapila  wrote:
> >
> > On Wed, Dec 1, 2021 at 2:12 AM Jeff Davis  wrote:
> >>
> >> On Tue, 2021-11-30 at 17:25 +0530, Amit Kapila wrote:
> >>> I think it would be better to do it before we allow subscription
> >>> owners to be non-superusers.
> >>
> >> There are a couple other things to consider before allowing non-
> >> superusers to create subscriptions anyway. For instance, a non-
> >> superuser shouldn't be able to use a connection string that reads the
> >> certificate file from the server unless they also have
> >> pg_read_server_files privs.
> >>
> >
> > Isn't allowing to create subscriptions via non-superusers and allowing
> > to change the owner two different things? I am under the impression
> > that the latter one is more towards allowing the workers to apply
> > changes with a non-superuser role.
>
> The short-term goal is to have logical replication workers respect the 
> privileges of the role which owns the subscription.
>
> The long-term work probably includes creating a predefined role with 
> permission to create subscriptions, and the ability to transfer those 
> subscriptions to roles who might be neither superuser nor members of any 
> particular predefined role; the idea being that logical replication 
> subscriptions can be established without any superuser involvement, and may 
> thereafter run without any special privilege.
>
> The more recent patches on this thread are not as ambitious as the earlier 
> patch-sets.  We are no longer trying to support transferring subscriptions to 
> non-superusers.
>
> Right now, on HEAD, if a subscription owner has superuser revoked, the 
> subscription can continue to operate as superuser in so far as its 
> replication actions are concerned.  That seems like a pretty big security 
> hole.
>
> This patch mostly plugs that hole by adding permissions checks, so that a 
> subscription owned by a role who has privileges revoked cannot (for the most 
> part) continue to act under the old privileges.
>

If we want to maintain the property that subscriptions can only be
owned by superusers for your first version, then isn't a simple check
like (!superuser()) for each of the operations sufficient?

> There are two problematic edge cases that can occur after transfer of 
> ownership.  Remember, the new owner is required to be superuser for the 
> transfer of ownership to occur.
>
> 1) A subscription is transferred to a new owner, and the new owner then has 
> privilege revoked.
>
> 2) A subscription is transferred to a new owner, and then the old owner has 
> privileges increased.
>

In (2), I am not clear what you mean by "the old owner has
privileges increased". If owners can only be superusers, then what
does it mean to increase their privileges?

> In both cases, a currently running logical replication worker may finish a 
> transaction in progress acting with the current privileges of the old owner.  
> The clearest solution is, as you suggest, to refuse transfer of ownership of 
> subscriptions that are enabled.
>
> Doing so will create a failure case for REASSIGN OWNED BY.  Will that be ok?
>

I think so. Do we see any problem with that? We have some failure
cases currently as well: for example, an "All Tables" publication can
only be owned by superusers whereas others can be owned by
non-superusers, and similarly we can't change ownership of pinned
objects. The case being discussed is not exactly the same, but I am
not able to see a problem with it.
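
The proposed rule, refusing to transfer ownership of an enabled subscription
so a running apply worker cannot keep acting under stale privileges, can be
modeled as a toy (hypothetical names, not the catalog implementation):

```python
# Toy model of the ownership-transfer restriction under discussion.

class Subscription:
    def __init__(self, owner, enabled):
        self.owner, self.enabled = owner, enabled

    def alter_owner(self, new_owner, is_superuser):
        if not is_superuser(new_owner):
            raise PermissionError("new owner must be superuser")
        if self.enabled:
            raise RuntimeError("cannot transfer ownership of an enabled subscription")
        self.owner = new_owner

supers = {"alice", "bob"}
sub = Subscription("alice", enabled=False)
sub.alter_owner("bob", supers.__contains__)
print(sub.owner)  # bob
```

Under this model, REASSIGN OWNED BY would fail for enabled subscriptions, the
failure case accepted in the exchange above.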

-- 
With Regards,
Amit Kapila.




RE: Data is copied twice when specifying both child and parent table in publication

2021-12-02 Thread houzj.f...@fujitsu.com
On Thursday, December 2, 2021 12:50 PM Amit Kapila  
wrote:
> On Thu, Dec 2, 2021 at 9:41 AM Greg Nancarrow 
> wrote:
> >
> > On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow 
> wrote:
> > >
> > > If you updated my original description to say "(instead of just the
> > > individual partitions)", it would imply the same I think.
> > > But I don't mind if you want to explicitly state both cases to make it 
> > > clear.
> > >
> >
> > For example, something like:
> >
> > For publications of partitioned tables with publish_via_partition_root
> > set to true, only the partitioned table (and not its partitions) is
> > included in the view, whereas if publish_via_partition_root is set to
> > false, only the individual partitions are included in the view.
> >
> 
> Yeah, that sounds good to me.

It looks good to me as well.
Attached are the patches (for HEAD down to PG13) which merge the suggested doc
change. I prepared the code patch and test patch separately to make it easier
for the committer to confirm.

Best regards,
Hou zj




HEAD-0001-Fix-double-publish-of-child-table-s-data.patch
Description: HEAD-0001-Fix-double-publish-of-child-table-s-data.patch


PG14-0001-Fix-double-publish-of-child-table-s-data.patch
Description: PG14-0001-Fix-double-publish-of-child-table-s-data.patch


PG13-0001-Fix-double-publish-of-child-table-s-data.patch
Description: PG13-0001-Fix-double-publish-of-child-table-s-data.patch


HEAD-0002-testcases.patch
Description: HEAD-0002-testcases.patch


PG13-0002-testcases.patch
Description: PG13-0002-testcases.patch


PG14-0002-testcases.patch
Description: PG14-0002-testcases.patch


Re: row filtering for logical replication

2021-12-02 Thread vignesh C
On Thu, Dec 2, 2021 at 9:29 AM houzj.f...@fujitsu.com
 wrote:
>
> On Thur, Dec 2, 2021 5:21 AM Peter Smith  wrote:
> > PSA the v44* set of patches.
> >
> > The following review comments are addressed:
> >
> > v44-0001 main patch
> > - Renamed the TAP test 026->027 due to clash caused by recent commit [1]
> > - Refactored table_close [Houz 23/11] #2
> > - Alter compare where clauses [Amit 24/11] #0
> > - PG docs CREATE SUBSCRIPTION [Tang 30/11] #2
> > - PG docs CREATE PUBLICATION [Vignesh 30/11] #1, #4, [Tang 30/11] #1, [Tomas
> > 23/9] #2
> >
> > v44-0002 validation walker
> > - Add NullTest support [Peter 18/11]
> > - Update comments [Amit 24/11] #3
> > - Disallow user-defined types [Amit 24/11] #4
> > - Errmsg - skipped because handled by top-up [Vignesh 23/11] #2
> > - Removed #if 0 [Vignesh 30/11] #2
> >
> > v44-0003 new/old tuple
> > - NA
> >
> > v44-0004 tab-complete and pgdump
> > - Handle table-list commas better [Vignesh 23/11] #2
> >
> > v44-0005 top-up patch for validation
> > - (This patch will be added again later)
>
> Attach the v44-0005 top-up patch.

Thanks for the updated patch; a few comments:
1) Both the testpub5a and testpub5c publications are the same; one of them can be removed
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5a FOR TABLE testpub_rf_tbl1 WHERE (a > 1)
WITH (publish="insert");
+CREATE PUBLICATION testpub5b FOR TABLE testpub_rf_tbl1;
+CREATE PUBLICATION testpub5c FOR TABLE testpub_rf_tbl1 WHERE (a > 3)
WITH (publish="insert");
+RESET client_min_messages;
+\d+ testpub_rf_tbl1
+DROP PUBLICATION testpub5a, testpub5b, testpub5c;

testpub5b will be covered in the earlier existing case above:
ALTER PUBLICATION testpib_ins_trunct ADD TABLE pub_test.testpub_nopk,
testpub_tbl1;

\d+ pub_test.testpub_nopk
\d+ testpub_tbl1

I felt the test related to testpub5b is also not required.

2) testpub5 and testpub_syntax2 are similar; one of them can be removed:
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1,
testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);
+RESET client_min_messages;
+\dRp+ testpub5

+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_syntax2 FOR TABLE testpub_rf_tbl1,
testpub_rf_myschema.testpub_rf_tbl5 WHERE (h < 999);
+RESET client_min_messages;
+\dRp+ testpub_syntax2
+DROP PUBLICATION testpub_syntax2;

3) testpub7 can be renamed to testpub6 to maintain continuity,
since the previous testpub6 did not succeed:
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func, LEFTARG = integer,
RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+-- fail - WHERE not allowed in DROP
+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA
testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;

4) Did this test intend to include a WHERE clause for testpub_rf_tb16? If
so, it can be added:
+-- fail - cannot ALTER SET table which is a member of a pre-existing schema
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub7 FOR ALL TABLES IN SCHEMA testpub_rf_myschema1;
+ALTER PUBLICATION testpub7 SET ALL TABLES IN SCHEMA
testpub_rf_myschema1, TABLE testpub_rf_myschema1.testpub_rf_tb16;
+RESET client_min_messages;

5) It should be removed from typedefs.list too:
-/* For rowfilter_walker. */
-typedef struct {
-	Relation	rel;
-	bool		check_replident;	/* check if Var is bms_replident member? */
-	Bitmapset  *bms_replident;
-} rf_context;
-

Regards,
Vignesh