Hi SHubham, Here are my review comments for v40-0001 (code) Please don't post a blanket response of "I have fixed all the comments" response to this. Sometimes things get missed. Instead, please reply done/not-done/whatever individually, so I can track the changes properly.
====== src/backend/catalog/pg_publication.c 1. - if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated) - ereport(ERROR, - errcode(ERRCODE_INVALID_COLUMN_REFERENCE), - errmsg("cannot use generated column \"%s\" in publication column list", - colname)); - This function no longer rejects generated columns from a publication column list. So, now the function comment is wrong because it still says "Checks for and raises an ERROR for any; unknown columns, system columns, duplicate columns or generated columns." NOTE: This was fixed already in v39-0001, but then broken again in v40-0001 (???) nit - also remove that semicolon (;) from the function comment. ====== src/backend/commands/publicationcmds.c 2. - if (!isnull) + if (!isnull || !pubgencols) { int x; Bitmapset *idattrs; Bitmapset *columns = NULL; - /* With REPLICA IDENTITY FULL, no column list is allowed. */ - if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL) - result = true; + if (!isnull) + { + /* With REPLICA IDENTITY FULL, no column list is allowed. */ + if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL) + result = true; + + /* Transform the column list datum to a bitmapset. */ + columns = pub_collist_to_bitmapset(NULL, datum, NULL); + } + else + { + TupleDesc desc = RelationGetDescr(relation); + int nliveatts = 0; + + for (int i = 0; i < desc->natts; i++) + { + Form_pg_attribute att = TupleDescAttr(desc, i); + + /* Skip if the attribute is dropped or generated */ + if (att->attisdropped) + continue; + + nliveatts++; + + if (att->attgenerated) + continue; + + columns = bms_add_member(columns, i + 1); + } - /* Transform the column list datum to a bitmapset. */ - columns = pub_collist_to_bitmapset(NULL, datum, NULL); + /* Return if all columns of the table will be replicated */ + if (bms_num_members(columns) == nliveatts) + { + bms_free(columns); + ReleaseSysCache(tuple); + return false; + } + } 2a. AFAIK this code was written to deal with Sawada-San's comment about UPDATE/DELETE [1], but the logic is not very clear. It needs more comments on the "else" part, the generated columns and how the new logic solves Sawada-San's problem. ~ 2b. This function is now dealing both with 'publish_via_root' as well as 'publish_generated_columns'. I am suspicious that you may be handling each of these parameters independently (i.e.. there is some early return "Return if all columns of the table will be replicated") but there might be some scenarios where a *combination* of pubviaroot and gencols is not working as expected. For example, there is a whole comment later about this: "If pubviaroot is true, we are validating the column list of the parent table, but the bitmap contains the replica identity information of the child table." which I am not sure is being dealt with correctly. IMO there need to be more test cases added for these tricky combination scenarios to prove the code is good. ~ 2c. I expected to see some more REPLICA IDENTITY tests to reproduce the problem scenario that Sawada-San reported. Where are those? ====== src/backend/replication/logical/tablesync.c 3. make_copy_attnamelist Patch v38-0001 used to have a COPY error: + /* + * Check if the subscription table generated column has same name + * as a non-generated column in the corresponding publication + * table. + */ + if (!remotegenlist[remote_attnum]) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("logical replication target relation \"%s.%s\" has a generated column \"%s\" " + "but corresponding column on source relation is not a generated column", + rel->remoterel.nspname, rel->remoterel.relname, NameStr(attr->attname)))); + My most recent comment about this code was something along the lines of "Make this detailed useful error message common if possible". But, in v39 all this ERROR was removed (and it is still missing in v40). Why? I did not see any other review comments asking for this to be removed. AFAIK this was a correct error message for not_generated->generated. Won't the current code now just silently ignore/skip this, which is contrary behaviour to what normal replication would do? The recently deleted combo TAP tests could have detected this. ~~~ fetch_remote_table_info 4. if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000) There was a new 'server_version' variable added, so why not use it here? In previous versions, we used to do so, but now since v39, the code has reverted yet again. (???). ~~~ 5. appendStringInfo(&cmd, "SELECT DISTINCT" - " (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)" - " THEN NULL ELSE gpt.attrs END)" + " (gpt.attrs)" " FROM pg_publication p," " LATERAL pg_get_publication_tables(p.pubname) gpt," " pg_class c" Is this change OK? This fragment is guarded only by server_version >= 150000 (not 180000) so I was unsure. Can you explain why you made this change, and why it is correct even for versions older than 180000? ~~~ 6. lrel->natts = natt; - walrcv_clear_result(res); nit - this whitespace change has nothing to do with this patch. Remove it. ~~~ copy_table 7. + * We also need to use this same COPY (SELECT ...) syntax when + * 'publish_generated_columns' is specified as true and the remote + * table has generated columns, because copy of generated columns is + * not supported by the normal COPY. That mention about "when 'publish_generated_columns' is specified" is not strictly true anymore, because the publication column list overrides this parameter. It may be better to word this like below: SUGGESTION We also need to use this same COPY (SELECT ...) syntax when generated columns are published, because copying generated columns is not supported by the normal COPY. ====== src/backend/utils/cache/relcache.c 8. /* * Check if all columns are part of the REPLICA IDENTITY index or not. * - * If the publication is FOR ALL TABLES then it means the table has no - * column list and we can skip the validation. + * If the publication is FOR ALL TABLES and publication includes + * generated columns then it means that all the table will replicate + * all columns and we can skip the validation. */ /table/tables/ Also, I think this can be re-worded to name the 'publish_generated_columns' parameter which might be more clear. SUGGESTION: If the publication is FOR ALL TABLES then column lists are not possible. In this case, if 'publish_generated_columns' is true then all table columns will be replicated, so the validation can be skipped. ====== src/include/replication/logicalproto.h 9. Bitmapset *attkeys; /* Bitmap of key columns */ + bool *remotegenlist; /* Array to store whether each column is + * generated */ } LogicalRepRelation; Since this is just a palloc'ed array same as the other fields like 'attnames' and 'atttyps', maybe it should be named and commented more similarly to those? E.g more like: bool *attremotegen; /* remote column is generated? */ ====== src/test/regress/sql/publication.sql 10. +-- Remove generate columns from column list, when 'publish_generated_columns'=false +ALTER PUBLICATION pub2 SET TABLE gencols(a); +\dRp+ pub2 typo /generate/generated/ ====== src/test/subscription/t/100_bugs.pl 11. - CREATE TABLE generated_cols (a int, b_gen int GENERATED ALWAYS AS (5 * a) STORED, c int); + CREATE TABLE generated_cols (a int, b_gen int, c int); Hmm. But, should we be modifying tests in the "bugs" test file for a new feature? If it is really necessary, then at least update the comment to explain why. ~~~ 12. - CREATE TABLE generated_cols (a int, b_gen int GENERATED ALWAYS AS (5 * a) STORED, c int); + CREATE TABLE generated_cols (a int, b_gen int, c int); Now you've ended up with a table called 'generated_cols' which has no generated cols in it at all. Isn't this contrary to why this test case existed in the first place? In other words, it feels a bit like a hack. Perhaps you could have left all the tables as-is but just say 'publish_generated_columns=false'. But then that should be the default parameter value anyhow, so TBH it is not clear to me why 100_bugs.pl needed any changes at all. Can you give the reason? ====== PSA the nitpicks attachment which implements some of the cosmetic suggestions from above. ====== [1] https://www.postgresql.org/message-id/CAD21AoB%3DDBVDNCGBja%2BsDa2-w9tsM7_E%3DZgyw2qYMR1R0FwDsg%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia.
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c index e6e5506..eff876a 100644 --- a/src/backend/catalog/pg_publication.c +++ b/src/backend/catalog/pg_publication.c @@ -500,8 +500,7 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri, * pub_collist_validate * Process and validate the 'columns' list and ensure the columns are all * valid to use for a publication. Checks for and raises an ERROR for - * any; unknown columns, system columns, duplicate columns or generated - * columns. + * any unknown columns, system columns, or duplicate columns. * * Looks up each column's attnum and returns a 0-based Bitmapset of the * corresponding attnums. diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index 2f55fc5..b1ebefe 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -891,7 +891,7 @@ fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel, * We need to do this before fetching info about column names and types, * so that we can skip columns that should not be replicated. */ - if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000) + if (server_version >= 150000) { WalRcvExecResult *pubres; TupleTableSlot *tslot; @@ -1110,6 +1110,7 @@ fetch_remote_table_info(char *nspname, char *relname, LogicalRepRelation *lrel, ExecDropSingleTupleTableSlot(slot); lrel->natts = natt; + walrcv_clear_result(res); /* @@ -1288,9 +1289,8 @@ copy_table(Relation rel) * SELECT query with OR'ed row filters for COPY. * * We also need to use this same COPY (SELECT ...) syntax when - * 'publish_generated_columns' is specified as true and the remote - * table has generated columns, because copy of generated columns is - * not supported by the normal COPY. + * generated columns are published, because copy of generated columns + * is not supported by the normal COPY. */ int i = 0; diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 80e11a3..0216b57 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -5826,9 +5826,9 @@ RelationBuildPublicationDesc(Relation relation, PublicationDesc *pubdesc) /* * Check if all columns are part of the REPLICA IDENTITY index or not. * - * If the publication is FOR ALL TABLES and publication includes - * generated columns then it means that all the table will replicate - * all columns and we can skip the validation. + * If the publication is FOR ALL TABLES then column lists are not possible. + * In this case, if 'publish_generated_columns' is true then all table + * columns will be replicated, so the validation can be skipped. */ if (!(pubform->puballtables && pubform->pubgencols) && (pubform->pubupdate || pubform->pubdelete) && diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out index fc856d9..72943ef 100644 --- a/src/test/regress/expected/publication.out +++ b/src/test/regress/expected/publication.out @@ -1809,7 +1809,7 @@ ALTER PUBLICATION pub2 SET (publish_generated_columns = false); Tables: "public.gencols" (a, gen1) --- Remove generate columns from column list, when 'publish_generated_columns'=false +-- Remove generated columns from column list, when 'publish_generated_columns'=false ALTER PUBLICATION pub2 SET TABLE gencols(a); \dRp+ pub2 Publication pub2 diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql index 454a03b..1ee322f 100644 --- a/src/test/regress/sql/publication.sql +++ b/src/test/regress/sql/publication.sql @@ -1140,7 +1140,7 @@ CREATE PUBLICATION pub2 FOR table gencols(a, gen1) WITH (publish_generated_colum ALTER PUBLICATION pub2 SET (publish_generated_columns = false); \dRp+ pub2 --- Remove generate columns from column list, when 'publish_generated_columns'=false +-- Remove generated columns from column list, when 'publish_generated_columns'=false ALTER PUBLICATION pub2 SET TABLE gencols(a); \dRp+ pub2