Here are some review comments for patch v20-0001. ====== General.
1. That function 'pg_get_publication_tables' does not seem to be described in the PG documentation. Why isn't it in the "System Catalog Information Functions" table [1] ? I asked this same question a long time ago but then the reply [2] was like "it doesn't seem to be a function provided to users". Well, perhaps that just means that the documentation has been accidentally missing for a long time. Does anybody know for sure if the omission of this function from the documentation is deliberate? If nobody here knows, then maybe this can be asked/addressed in a separate thread. ====== src/backend/catalog/pg_publication.c 2. filter_partitions (review comment from my v19 review) > 2a. > My previous review [1] (see #1) suggested putting most code within the > condition. AFAICT my comment still is applicable but was not yet > addressed. 22/3 Wang-san replied: "Personally, I prefer the current style because the approach you mentioned adds some indentations." Sure, but there is more than just indentation/style differences here. Currently, there is some unnecessary code executed if the table is not a partition. And the reader cannot tell at-a-glance if (skip) will be true/false without looking more closely at the loop logic. So, I think changing it would be better, but anyway I won’t debate about it any more because it's not a functional problem. ====== src/backend/commands/subscriptioncmds.c 3. fetch_table_list + /* Get the list of tables from the publisher. */ + if (server_version >= 160000) + { + StringInfoData pub_names; - appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n" - " WHERE t.pubname IN ("); - get_publications_str(publications, &cmd, true); - appendStringInfoChar(&cmd, ')'); + initStringInfo(&pub_names); + get_publications_str(publications, &pub_names, true); + + /* + * From version 16, we allowed passing multiple publications to the + * function pg_get_publication_tables. This helped to filter out the + * partition table whose ancestor is also published in this + * publication array. + * + * Join pg_get_publication_tables with pg_publication to exclude + * non-existing publications. + */ + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n" + " ( SELECT array_agg(a.attname ORDER BY a.attnum)\n" + " FROM pg_attribute a\n" + " WHERE a.attrelid = GPT.relid AND\n" + " a.attnum = ANY(GPT.attrs)\n" + " ) AS attnames\n" + " FROM pg_class C\n" + " JOIN pg_namespace N ON N.oid = C.relnamespace\n" + " JOIN ( SELECT (pg_get_publication_tables(VARIADIC array_agg(pubname::text))).*\n" + " FROM pg_publication\n" + " WHERE pubname IN ( %s )) as GPT\n" + " ON GPT.relid = C.oid\n", + pub_names.data); + + pfree(pub_names.data); + } + else + { + appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename \n"); + + /* Get column lists for each relation if the publisher supports it */ + if (check_columnlist) + appendStringInfoString(&cmd, ", t.attnames\n"); + + appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n" + " WHERE t.pubname IN ("); + get_publications_str(publications, &cmd, true); + appendStringInfoChar(&cmd, ')'); + } I noticed the SQL "if" part is using uppercase aliases, but the SQL in the "else" part is using lowercase aliases. I think it would be better to be consistent (pick one). ====== src/test/subscription/t/013_partition.pl 4. -# for tab4, we publish changes through the "middle" partitioned table +# If we subscribe only to pub_lower_level, changes for tab4 will be published +# through the "middle" partition table. However, since we will be subscribing +# to both pub_lower_level and pub_all (see subscription sub2 below), we will +# publish changes via the root table (tab4). $node_publisher->safe_psql('postgres', "CREATE PUBLICATION pub_lower_level FOR TABLE tab4_1 WITH (publish_via_partition_root = true)" ); ~ This comment seemed a bit overkill IMO. I don't think you need to say much here except maybe: # Note that subscription "sub2" will later subscribe simultaneously to both pub_lower_level (i.e. just table tab4_1) and pub_all. ~~~ 5. I think maybe you could have another test scenario where you INSERT something into tab4_1_1 while only the publication for tab4_1 has publish_via_partition_root=true ~~~ 6. AFAICT the tab4 tests are only testing the initial sync, but are not testing normal replication. Maybe some more normal (post sync) INSERTS are needed to tab4, tab4_1, tab4_1_1 ? ====== src/test/subscription/t/028_row_filter.pl 7. +# insert data into partitioned table. +$node_publisher->safe_psql('postgres', + "INSERT INTO tab_rowfilter_viaroot_part(a) VALUES(13), (17)"); + $node_subscriber->safe_psql('postgres', "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b, tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1" ); @@ -707,13 +711,17 @@ is($result, qq(t|1), 'check replicated rows to tab_rowfilter_toast'); # the row filter for the top-level ancestor: # # tab_rowfilter_viaroot_part filter is: (a > 15) +# - INSERT (13) NO, 13 < 15 # - INSERT (14) NO, 14 < 15 # - INSERT (15) NO, 15 = 15 # - INSERT (16) YES, 16 > 15 +# - INSERT (17) YES, 17 > 15 $result = $node_subscriber->safe_psql('postgres', - "SELECT a FROM tab_rowfilter_viaroot_part"); -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part'); + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1"); +is( $result, qq(16 +17), + 'check replicated rows to tab_rowfilter_viaroot_part'); ~ I'm not 100% sure this is testing quite what you want to test. AFAICT the subscription is created like: "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2, tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b, tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1" Notice in this case BOTH the partitioned table and the partition had been published using "WITH (publish_via_partition_root)". But, IIUC won't it be better to test when only the partition's publication was using that option? For example, I think then it would be a better test of this "At least one" code: /* At least one publication is using publish_via_partition_root. */ if (pub_elem->pubviaroot) viaroot = true; ====== src/test/subscription/t/031_column_list.pl 8. - CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH (publish_via_partition_root = true); + CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH (publish_via_partition_root = true); + CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH (publish_via_partition_root = true); -- initial data INSERT INTO test_root VALUES (1, 2, 3); INSERT INTO test_root VALUES (10, 20, 30); )); +# Subscribe to pub_root_true_1 and pub_root_true_2 at the same time, which +# means that the initial data will be synced once, and only the column list of +# the parent table (test_root) in the publication pub_root_true_1 will be used +# for both table sync and data replication. $node_subscriber->safe_psql( 'postgres', qq( - CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_root_true; + CREATE ~ (This is simlar to the previous review comment #7 above) Won't it be a better test of the "At least one" code when only the publication of partition (test_root_1) is using "WITH (publish_via_partition_root = true)". e.g CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a); CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH (publish_via_partition_root = true); ------ [1] https://www.postgresql.org/docs/devel/functions-info.html#FUNCTIONS-INFO-CATALOG-TABLE [2] https://www.postgresql.org/message-id/OS3PR01MB6275FB5397C6A647F262A3A69E009%40OS3PR01MB6275.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia