I was going through this thread and testing and reviewing the patches, I think this is a great feature to have and one which customers would appreciate. I wanted to help out, and I saw a request for a test patch for a GUC to always enable streaming on logical replication. Here's one on top of patchset v31, just in case you still need it. By default the GUC is turned on, I ran the regression tests with it and didn't see any errors.
thanks, Ajin Fujitsu Australia On Wed, Jul 8, 2020 at 8:02 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > On Wed, Jul 8, 2020 at 9:36 AM Amit Kapila <amit.kapil...@gmail.com> > wrote: > > > > On Sun, Jul 5, 2020 at 8:37 PM Dilip Kumar <dilipbal...@gmail.com> > wrote: > > > > > > I have compared the changes logged at command end vs logged at commit > > > time. I have ignored the invalidation for the transaction which has > > > any aborted subtransaction in it. While testing this I found one > > > issue, the issue is that if there are some invalidation generated > > > between last command counter increment and the commit transaction then > > > those were not logged. I have fixed the issue by logging the pending > > > invalidation in RecordTransactionCommit. I will include the changes > > > in the next patch set. > > > > > > > I think it would have been better if you could have given examples for > > such cases where you need this extra logging. Anyway, below are few > > minor comments on this patch: > > > > 1. > > + /* > > + * Log any pending invalidations which are adding between the last > > + * command counter increment and the commit. > > + */ > > + if (XLogLogicalInfoActive()) > > + LogLogicalInvalidations(); > > > > I think we can change this comment slightly and extend a bit to say > > for which kind of special cases we are adding this. "Log any pending > > invalidations which are added between the last CommandCounterIncrement > > and the commit. Normally for DDLs, we log this at each command end, > > however for certain cases where we directly update the system table > > the invalidations were not logged at command end." > > > > Something like above based on cases that are not covered by command > > end WAL logging. > > > > 2. > > + * Emit WAL for invalidations. This is currently only used for logging > > + * invalidations at the command end. > > + */ > > +void > > +LogLogicalInvalidations() > > > > After this is getting used at a new place, it is better to modify the > > above comment to something like: "Emit WAL for invalidations. This is > > currently only used for logging invalidations at the command end or at > > commit time if any invalidations are pending." > > > > I have done some more review and below are my comments: > > Review-v31-0010-Provide-new-api-to-get-the-streaming-changes > > ---------------------------------------------------------------------------------------------- > 1. > --- a/src/backend/catalog/system_views.sql > +++ b/src/backend/catalog/system_views.sql > @@ -1240,6 +1240,14 @@ LANGUAGE INTERNAL > VOLATILE ROWS 1000 COST 1000 > AS 'pg_logical_slot_get_changes'; > > +CREATE OR REPLACE FUNCTION pg_logical_slot_get_streaming_changes( > + IN slot_name name, IN upto_lsn pg_lsn, IN upto_nchanges int, > VARIADIC options text[] DEFAULT '{}', > + OUT lsn pg_lsn, OUT xid xid, OUT data text) > +RETURNS SETOF RECORD > +LANGUAGE INTERNAL > +VOLATILE ROWS 1000 COST 1000 > +AS 'pg_logical_slot_get_streaming_changes'; > > If we are going to add a new streaming API for get_changes, don't we > need for pg_logical_slot_get_binary_changes, > pg_logical_slot_peek_changes and pg_logical_slot_peek_binary_changes > as well? I was thinking why not add a new parameter (streaming > boolean) instead of adding the new APIs. This could be an optional > parameter which if user doesn't specify will be considered as false. > We already have optional parameters for APIs like > pg_create_logical_replication_slot. > > 2. You forgot to update sgml/func.sgml. This will be required even if > we decide to add a new parameter instead of a new API. > > 3. > + /* If called has not asked for streaming changes then disable it. */ > + ctx->streaming &= streaming; > > /If called/If the caller > > 4. > diff --git a/.gitignore b/.gitignore > index 794e35b..6083744 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -42,3 +42,4 @@ lib*.pc > /Debug/ > /Release/ > /tmp_install/ > +/build/ > > Why the patch contains this change? > > 5. If I apply the first six patches and run the regressions, it fails > primarily because streaming got enabled by default. And then when I > applied this patch, the tests passed because it disables streaming by > default. I think this should be patch 0007. > > Replication Origins > ------------------------------ > I think we also need to conclude on origins related discussion [1]. > As far as I can see, the origin_id can be sent with the first startup > message. The origin_lsn and origin_commit can be sent with the last > start of streaming commit if we want but not sure if that is of use. > If we need to send it earlier then we need to record it with other WAL > records. The point is that those are set with > pg_replication_origin_xact_setup but not sure how and when that > function is called. The other alternative is that we can ignore that > for now and once the usage is clear we can enhance it. What do you > think? > > [1] - > https://www.postgresql.org/message-id/CAA4eK1JwXaCezFw%2BkZwoxbLKYD0nWpC2rPgx7vUsaDAc0AZaow%40mail.gmail.com > > -- > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com > > >
v31-0015-TEST-guc-always-streaming-logical.patch
Description: Binary data