On Thu, 11 Dec 2025 at 04:10, Peter Smith <[email protected]> wrote: > > Hi Shlok - > > Here are some review comments for v31-0001 (EXCEPT (tablelist)) > > ====== > Commit message > > 1. > The new syntax allows specifying excluded relations when creating or altering > a publication. For example: > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE (t1,t2); > > ~ > > In v30, you removed all the ALTER PUBLICATION changes, so the "or > altering" in the message above also needs to be removed. > > ====== > doc/src/sgml/logical-replication.sgml > > 2. > <para> > - To add tables to a publication, the user must have ownership rights on the > - table. To add all tables in schema to a publication, the user must be a > - superuser. To create a publication that publishes all tables, all tables > in > - schema, or all sequences automatically, the user must be a superuser. > + To create a publication using <literal>FOR ALL TABLES</literal>, > + <literal>FOR ALL SEQUENCES</literal> or > + <literal>FOR TABLES IN SCHEMA</literal>, the user must be a > superuser. To add > + <literal>ALL TABLES</literal> or <literal>TABLES IN SCHEMA</literal> to a > + publication, the user must be a superuser. To add tables to a publication, > + the user must have ownership rights on the table. > </para> > > This is a good improvement, but I was not sure why it is in this > patch. Should it be a separate thread for a docs improvement? > > ====== > src/backend/catalog/pg_publication.c > > GetTopMostAncestorInPublication: > > 3. > { > Oid ancestor = lfirst_oid(lc); > - List *apubids = GetRelationPublications(ancestor); > - List *aschemaPubids = NIL; > + List *apubids = NIL; > + List *aexceptpubids = NIL; > + List *aschemapubids = NIL; > + bool set_top = false; > + > + GetRelationPublications(ancestor, &apubids, &aexceptpubids); > > level++; > > - if (list_member_oid(apubids, puboid)) > + /* check if member of table publications */ > + set_top = list_member_oid(apubids, puboid); > + if (!set_top) > { > - topmost_relid = ancestor; > + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor)); > > - if (ancestor_level) > - *ancestor_level = level; > + /* check if member of schema publications */ > + set_top = list_member_oid(aschemapubids, puboid); > + > + /* > + * If the publication is all tables publication and the table is > + * not part of exception tables. > + */ > + if (!set_top && puballtables) > + set_top = !list_member_oid(aexceptpubids, puboid); > } > - else > + > + if (set_top) > { > - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); > - if (list_member_oid(aschemaPubids, puboid)) > - { > - topmost_relid = ancestor; > + topmost_relid = ancestor; > > - if (ancestor_level) > - *ancestor_level = level; > - } > + if (ancestor_level) > + *ancestor_level = level; > } > > list_free(apubids); > - list_free(aschemaPubids); > + list_free(aschemapubids); > + list_free(aexceptpubids); > } > > That 'aschemapubids' can be declared and freed within the if block. > > ~~~ > > publication_add_relation: > > 4. > + /* > + * Check when a partition is excluded via EXCEPT TABLE while the > + * publication has publish_via_partition_root = true. > + */ > + if (pub->alltables && pub->pubviaroot && pri->except && > + targetrel->rd_rel->relispartition) > + ereport(WARNING, > > > This comment doesn't sound quite right: > > SUGGESTION > Handle the case where a partition is excluded by EXCEPT TABLE while > publish_via_partition_root = true. > > ~~~ > > 5. > + /* > + * Check when a partitioned table is excluded via EXCEPT TABLE while the > + * publication has publish_via_partition_root = false. > + */ > + if (pub->alltables && !pub->pubviaroot && pri->except && > + targetrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) > + ereport(WARNING, > > Ditto. Reword like suggested in the previous review comment. > > ~~~ > > 6. > +/* > + * Get the list of publication oids associated with a specified relation. > + * pubids is filled with the list of publication oids the relation is part > of. > + * except_pubids is filled with the list of publication oids the relation is > + * excluded from. > + * > + * This function returns true if the relation is part of any publication. > + */ > > Maybe putting 'pubids' and 'except_pubids' in single quotes will help > readability of this comment? > > Also, these are already Lists, so they are not filled with lists. > > SUGGESTION > Parameter 'pubids' returns the OIDs of the publications the relation is part > of. > Parameter 'except_pubids' returns the OIDs of publications the > relation is excluded from. > > ~~~ > > GetPublicationRelations: > > 7. > /* > - * Gets list of relation oids for a publication. > + * Return the list of relation OIDs for a publication. > + * > + * For a FOR ALL TABLES publication, this returns the list of tables that > were > + * explicitly excluded via an EXCEPT TABLE clause. > + * > + * For a FOR TABLE publication, this returns the list of tables explicitly > + * included in the publication. > * > - * This should only be used FOR TABLE publications, the FOR ALL > TABLES/SEQUENCES > - * should use GetAllPublicationRelations(). > + * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use > + * GetAllPublicationRelations() to obtain the complete set of tables covered > by > + * the publication. > */ > List * > GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt) > > 7a. > The function is called 'GetPublicationRelations', so it seems > unintuitive that it sometimes returns the list of all the tables that > are *excluded* from the publication. If you are going to have one > single function that does everything, then IMO it might be better to > hide that behind some wrapper functions like: > GetPublicationMemberRelations > GetPublicationExcludedRelations > > Consider also that all these assumptions might be OK today but they > won't be OK in the future. e.g. One day, when named FOR SEQUENCE > sq1,sq2 are supported then you will be alble to write a command like > FOR ALL TABLES EXCEPT (t1), FOR SEQUENCE sq1,sq2. That's going to be a > muddle of some included and some excluded relations. So, it is better > to cater for that scenario now, rather than have to rewrite all of > this function again in the future. e.g. Maybe instead of this function > returning one list it is better to return included/excluded Lists or > relations as output parameters? > > ~ > > 7b. > Also, comments like "Publications declared with FOR ALL TABLES or FOR > ALL SEQUENCES should use..." seems like too many assumptions are being > made. It would be better to enforce the calling requirements using > parameter checking and Asserts instead instead of hoping that callers > are going to abide by the comments. > > ~~~ > > GetAllPublicationRelations: > > 8. > + exceptlist = GetPublicationRelations(pubid, pubviaroot ? > + PUBLICATION_PART_ALL : > + PUBLICATION_PART_ROOT); > > This is similar to the above review comment. I'm not sure how you can > just assume that this must be the "except list" -- AFAICT this assumes > that 'GetAllPublicationRelations' can only be called by FOR ALL TABLES > (???). Seems like a lot of assumptions, that would be much better to > be enforced by Asserts in the code. > I agree with comments 7 and 8. I have added two functions 'GetPublicationIncludedRelations' and 'GetPublicationExcludedRelations'. To get Relations which are included or excluded in a publication. Both functions will call 'GetPublicationRelationsInternal' function. I have also reintroduced the 'except_flag' variable
> ======
> src/backend/commands/publicationcmds.c
>
> pub_rf_contains_invalid_column:
>
> 9.
> bool
> pub_rf_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
> - bool pubviaroot)
> + bool pubviaroot, bool puballtables)
>
> I felt that 'puballtables' is more "important" than 'pubviaroot' so
> maybe it should come earlier in the parameter list. (e.g. make it more
> similar to 'pub_contains_invalid_column')
>
> ======
> src/backend/parser/gram.y
>
> 10.
> + pub_except_obj_list opt_except_clause
>
> I felt that 'opt_except_clause' should better be called
> 'opt_pub_except_clause' or 'pub_opt_except_clause' because without
> 'pub' it is a bit vague.
>
I agree. I prefer 'opt_pub_except_clause'. By looking at other
variables it better make sense to start the variable name with 'opt_'
as it indicates that it is optional.
Made changes for the same.
> ~~~
>
> 11.
> +/*
> + * ALL TABLES EXCEPT ( table_name [, ...] ) specification
> + */
>
> 11a
> This comment should be up where all the other CREATE PUBLICATION
> syntax is commented.
>
> ~
>
> 11b.
> Also, there is a missing optional "[TABLE]" part.
>
> ~~~
>
> 12.
> +pub_except_obj_list: PublicationExceptObjSpec
> + { $$ = list_make1($1); }
> + | pub_except_obj_list ',' PublicationExceptObjSpec
> + { $$ = lappend($1, $3); }
> + ;
> +
> +opt_except_clause:
> + EXCEPT opt_table '(' pub_except_obj_list ')' { $$ = $4; }
> + | /*EMPTY*/ { $$ = NIL; }
> + ;
>
> I felt the clause should be defined before the obj list because that
> seems the natural order to read these.
>
> ======
> src/bin/pg_dump/pg_dump.c
>
> 13.
> +static SimplePtrList exceptinfo = {NULL, NULL};
>
> Having this as global seems a bit hacky. It has nothing in common with
> all the other nearby lists, which are commented as being based on
> "patterns given by command-line switches"
>
I agree, I have added it in the PublicationInfo struct and made the
corresponding code changes.
> ~~~
>
> dumpPublication:
>
> 14.
> + /* Include exception tables if the publication has EXCEPT TABLEs */
> + for (SimplePtrListCell *cell = exceptinfo.head; cell; cell = cell->next)
> + {
> + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
> + TableInfo *tbinfo;
> +
> + if (pubinfo == pubrinfo->publication)
> + {
> + tbinfo = pubrinfo->pubtable;
>
> That 'tbinfo' can be declared within the "if".
>
> ~~~
>
> 15.
> + appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));
>
> ONLY is not the default. How did you decide that "ONLY" is the correct
> thing to do here?
>
For pg_dump for publication we use "ONLY" by default while specifying the table
For Alter publication we use similar thing:
```
appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
fmtId(pubinfo->dobj.name));
```
Also if we specify a parent table in a publication(without ONLY) all
its child tables are also added to the pg_publication_rel table.
So when we dump such a publication we get something like:
.... EXCEPT TABLE(ONLY parent_table, ONLY child_table)...
> ~~~
>
> getPublicationTables:
>
> 16.
> - pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
> + if (prexcept)
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
> + else
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
> +
>
> Would a single assignment (ternary) make this code simpler and easier to read?
>
> SUGGESTION
> pubrinfo[j].dobj.objType = prexcept ?
> DO_PUBLICATION_EXCEPT_REL :
> DO_PUBLICATION_REL;
>
> ======
> src/bin/pg_dump/t/002_pg_dump.pl
>
> 17.
> + 'CREATE PUBLICATION pub10' => {
> + create_order => 50,
> + create_sql =>
> + 'CREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE
> (dump_test.test_table_generated);',
> + regexp => qr/^
> + \QCREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE (ONLY
> dump_test.test_table_generated, ONLY
> dump_test.test_table_generated_child2, ONLY
> dump_test.test_table_generated_child1) WITH (publish = 'insert,
> update, delete, truncate');\E
> + /xm,
> + like => { %full_runs, section_post_data => 1, },
> + },
> +
>
> These "generated" names seem unusual. I saw there are some other
> tables like 'dump_test.test_inheritance_child' and
> 'dump_test.test_inheritance_parent'. Can you use those more normal
> table names instead?
>
> Also curious - does the order of the tests matter? I saw that the
> CREATE TABLE tests seem to be coming after the CREATE PUBLICATION
> tests that are using them.
>
I looked into it and came to the conclusion that this is controlled
using 'create_order' while specifying the tests.
Tests with a lower create_order value are executed earlier.
So to ensure 'CREATE PUBLICATION' runs correctly we have to make sure
the 'create_order' of these statements is higher than that of the
respective 'CREATE TABLE' statement.
> ~~~
> 18.
> - if (!defined($tests{$test}->{all_runs})
> + if ( !defined($tests{$test}->{all_runs})
>
> Why add this whitespace?
>
pg_perltidy makes this change. I have reverted it.
> ======
> src/include/nodes/parsenodes.h
>
> 19.
> AP_SetObjects, /* set list of objects */
> + AP_Reset, /* reset the publication */
> } AlterPublicationAction;
>
> AFAIK, you removed all ALTER command changes from v30-0001. So this
> should not be here.
>
> ~~~
>
> 20.
> + bool for_all_tables; /* Special publication for all tables in db */
> AlterPublicationAction action; /* What action to perform with the given
> * objects */
> } AlterPublicationStmt;
>
> AFAIK, you removed all ALTER command changes from v30-0001. So this
> should not be here.
>
> ======
> src/test/regress/sql/publication.sql
>
> 21.
> +SET client_min_messages = 'ERROR';
> +CREATE PUBLICATION testpub_foralltables_excepttable FOR ALL TABLES
> EXCEPT TABLE (testpub_tbl1, testpub_tbl2);
> +-- specify EXCEPT without TABLE
> +CREATE PUBLICATION testpub_foralltables_excepttable1 FOR ALL TABLES
> EXCEPT (testpub_tbl1);
>
> Should be 2 comments here for the 2x CREATE:
>
> # Exclude tables using FOR ALL TABLES EXCEPT TABLE (tablelist)
>
> # Exclude tables using FOR ALL TABLES EXCEPT (tablelist)
>
> ~~~
>
> 22.
> CREATE TABLE testpub_tbl3 (a int);
> CREATE TABLE testpub_tbl3a (b text) INHERITS (testpub_tbl3);
>
> If you rename these tables like 'testpub_tbl_parent' and
> 'testpub_tbl_child', it will be much easier to see what is going on.
>
> ~~~
>
> 23.
> +CREATE PUBLICATION testpub5 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3);
>
> Missing comment -- something like:
> # Exclude parent table, omitting both of 'ONLY' and '*'
>
> ~~~
>
> 24.
> +-- EXCEPT with wildcard: exclude table and all descendants
> +CREATE PUBLICATION testpub6 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3*);
>
> 24a.
> TBH, I don't think this is a "wildcard" -- it is not doing any pattern
> matching. IMO just call it an "asterisk" or a "star".
>
> ~
>
> 24b.
> And put a space before the '*' here.
>
> ======
> .../t/037_rep_changes_except_table.pl
>
> 25.
> +# ============================================
> +# EXCEPT TABLE test cases for partition tables
> +# ============================================
> +# Check behavior of EXCEPT TABLE together with publish_via_partition_root
> +# when applied to a partitioned table and its partitions.
>
>
> Really, that "Check behavior" sentence is generic for all of the
> following tests, so it should also be (within the "=======" of the
> previous comment)
>
> ~~~
>
> 26.
> +$node_publisher->safe_psql(
> + 'postgres', qq(
> + CREATE TABLE sch1.t1(a int) PARTITION BY RANGE(a);
> + CREATE TABLE sch1.part1 PARTITION OF sch1.t1 FOR VALUES FROM (0) TO (5);
> + CREATE TABLE sch1.part2 PARTITION OF sch1.t1 FOR VALUES FROM (6) TO (10);
> + INSERT INTO sch1.t1 VALUES (1), (6);
> +));
> +
> +$node_subscriber->safe_psql(
> + 'postgres', qq(
> + CREATE TABLE sch1.t1(a int);
> + CREATE TABLE sch1.part1(a int);
> + CREATE TABLE sch1.part2(a int);
> +));
>
> 26a.
> There should be a comment for this part that just says something like
> "Setup partition table and partitions on the publisher that map to
> normal tables on the subscriber"
>
> ~
>
> 26b.
> The INSERT should be done later, after the CREATE PUBLICATION but
> before the CREATE SUBSCRIPTION. The pattern will be the same for all
> the test cases.
>
> ~~~
>
> 27.
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.part1)"
> +);
>
> Even though the publish_via_partition_root is 'false' by default, I
> think you should spell it out explicitly here for clarity.
>
> ~~~
>
> 28.
> +# EXCEPT TABLE (sch1.t1) with publish_via_partition_root = false
> +# Excluding the partitioned table while publish_via_partition_root = false
> +# still allows rows inserted into its partitions to be replicated.
>
> I felt you should word this differently. I don't think you should say
> "inserted into its partitions" because actually, you inserted into the
> partition table, and the data just ends up in the partitions.
>
> ~~~
>
> 29.
> +$node_publisher->safe_psql(
> + 'postgres', qq(
> + CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.t1);
> + INSERT INTO sch1.t1 VALUES (1), (6);
> +));
>
> Ditto earlier comment. Better to explicitly say
> "publish_via_partition_root=false".
>
I have also addressed the remaining comments and attached the latest patch.
Thanks,
Shlok Kyal
v32-0001-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch
Description: Binary data
