Hi, Here are some review comments for patch v5-0001.
====== GENERAL G.1 The patch changes HEAD behaviour for PUBLICATION col-lists right? e.g. maybe before they were always ignored, but now they are not? OTOH, when 'include_generated_columns' is false then the PUBLICATION col-list will ignore any generated cols even when they are present in a PUBLICATION col-list, right? These kinds of points should be noted in the commit message and in the (col-list?) documentation. ====== Commit message General 1a. IMO the commit message needs some background to say something like: "Currently generated column values are not replicated because it is assumed that the corresponding subscriber-side table will generate its own values for those columns." ~ General 1b. Somewhere in this commit message, you need to give all the other special rules --- e.g. the docs says "If the subscriber-side column is also a generated column then this option has no effect" ~~~ 2. This commit enables support for the 'include_generated_columns' option in logical replication, allowing the transmission of generated column information and data alongside regular table changes. This option is particularly useful for scenarios where applications require access to generated column values for downstream processing or synchronization. ~ I don't think the sentence "This option is particularly useful..." is helpful. It seems like just saying "This commit supports option XXX. This is particularly useful if you want XXX". ~~~ 3. CREATE SUBSCRIPTION test1 connection 'dbname=postgres host=localhost port=9999 'publication pub1; ~ What is this CREATE SUBSCRIPTION for? Shouldn't it have an example of the new parameter being used in it? ~~~ 4. Currently copy_data option with include_generated_columns option is not supported. A future patch will remove this limitation. ~ Suggest to single-quote those parameter names for better readability. ~~~ 5. This commit aims to enhance the flexibility and utility of logical replication by allowing users to include generated column information in replication streams, paving the way for more robust data synchronization and processing workflows. ~ IMO this paragraph can be omitted. ====== .../test_decoding/sql/decoding_into_rel.sql 6. +-- check include-generated-columns option with generated column +CREATE TABLE gencoltable (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED); +INSERT INTO gencoltable (a) VALUES (1), (2), (3); +SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'include-generated-columns', '1'); +INSERT INTO gencoltable (a) VALUES (4), (5), (6); +SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'include-generated-columns', '0'); +DROP TABLE gencoltable; + 6a. I felt some additional explicit comments might help the readabilty of the output file. e.g.1 -- When 'include-generated=columns' = '1' the generated column 'b' values will be replicated SELECT data FROM pg_logical_slot_get_changes... e.g.2 -- When 'include-generated=columns' = '0' the generated column 'b' values will not be replicated SELECT data FROM pg_logical_slot_get_changes... ~~ 6b. Suggest adding one more test case (where 'include-generated=columns' is not set) to confirm/demonstrate the default behaviour for replicated generated cols. ====== doc/src/sgml/protocol.sgml 7. + <varlistentry> + <term><replaceable class="parameter">include-generated-columns</replaceable></term> + <listitem> + <para> + Boolean option to enable generated columns. + The include-generated-columns option controls whether generated + columns should be included in the string representation of tuples + during logical decoding in PostgreSQL. This allows users to + customize the output format based on whether they want to include + these columns or not. The default is false. + </para> + </listitem> + </varlistentry> 7a. It doesn't render properly. e.g. Should not be bold italic (probably the class is wrong?), because none of the nearby parameters look this way. ~ 7b. The name here should NOT have hyphens. It needs underscores same as all other nearby protocol parameters. ~ 7c. The description seems overly verbose. SUGGESTION Boolean option to enable generated columns. This option controls whether generated columns should be included in the string representation of tuples during logical decoding in PostgreSQL. The default is false. ====== doc/src/sgml/ref/create_subscription.sgml 8. + + <varlistentry id="sql-createsubscription-params-with-include-generated-column"> + <term><literal>include_generated_column</literal> (<type>boolean</type>)</term> + <listitem> + <para> + Specifies whether the generated columns present in the tables + associated with the subscription should be replicated. The default is + <literal>false</literal>. + </para> The parameter name should be plural (include_generated_columns). ====== src/backend/commands/subscriptioncmds.c 9. #define SUBOPT_ORIGIN 0x00008000 +#define SUBOPT_INCLUDE_GENERATED_COLUMN 0x00010000 Should be plural COLUMNS ~~~ 10. + else if (IsSet(supported_opts, SUBOPT_INCLUDE_GENERATED_COLUMN) && + strcmp(defel->defname, "include_generated_column") == 0) The new subscription parameter should be plural ("include_generated_columns"). ~~~ 11. + + /* + * Do additional checking for disallowed combination when copy_data and + * include_generated_column are true. COPY of generated columns is not supported + * yet. + */ + if (opts->copy_data && opts->include_generated_column) + { + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + /*- translator: both %s are strings of the form "option = value" */ + errmsg("%s and %s are mutually exclusive options", + "copy_data = true", "include_generated_column = true"))); + } /combination/combinations/ The parameter name should be plural in the comment and also in the error message. ====== src/bin/psql/tab-complete.c 12. COMPLETE_WITH("binary", "connect", "copy_data", "create_slot", "disable_on_error", "enabled", "failover", "origin", "password_required", "run_as_owner", "slot_name", - "streaming", "synchronous_commit", "two_phase"); + "streaming", "synchronous_commit", "two_phase","include_generated_columns"); The new param should be added in alphabetical order same as all the others. ====== src/include/catalog/pg_subscription.h 13. + bool subincludegeneratedcolumn; /* True if generated columns must be published */ + The field name should be plural. ~~~ 14. + bool includegeneratedcolumn; /* publish generated column data */ } Subscription; The field name should be plural. ====== src/include/replication/walreceiver.h 15. * prepare time */ char *origin; /* Only publish data originating from the * specified origin */ + bool include_generated_column; /* publish generated columns */ } logical; } proto; } WalRcvStreamOptions; ~ This new field name should be plural. ====== src/test/subscription/t/011_generated.pl 16. +my ($cmdret, $stdout, $stderr) = $node_subscriber->psql('postgres', qq( + CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2 WITH (include_generated_column = true) +)); +ok( $stderr =~ + qr/copy_data = true and include_generated_column = true are mutually exclusive options/, + 'cannot use both include_generated_column and copy_data as true'); Isn't this mutual exclusiveness of options something that could have been tested in the regress test suite instead of TAP tests? e.g. AFAIK you won't require a connection to test this case. ~~~ 17. Missing test? IIUC there is a missing test scenario. You can add another subscriber table TAB3 which *already* has generated cols (e.g. generating different values to the publisher) so then you can verify they are NOT overwritten, even when the 'include_generated_cols' is true. ====== Kind Regards, Peter Smith. Fujitsu Australia