Re: Pgoutput not capturing the generated columns
Hi, Here are some review comments for the patch v3-0001. I don't think v3 addressed any of my previous review comments for patches v1 and v2. [1][2] So the comments below are limited only to the new code (i.e. the v3 versus v2 differences). Meanwhile, all my previous review comments may still apply. == GENERAL The patch applied gives whitespace warnings: [postgres@CentOS7-x64 oss_postgres_misc]$ git apply ../patches_misc/v3-0001-Support-generated-column-capturing-generated-colu.patch ../patches_misc/v3-0001-Support-generated-column-capturing-generated-colu.patch:150: trailing whitespace. ../patches_misc/v3-0001-Support-generated-column-capturing-generated-colu.patch:202: trailing whitespace. ../patches_misc/v3-0001-Support-generated-column-capturing-generated-colu.patch:730: trailing whitespace. warning: 3 lines add whitespace errors. == contrib/test_decoding/test_decoding.c 1. pg_decode_change MemoryContext old; + bool include_generated_columns; + I'm not really convinced this variable saves any code. == doc/src/sgml/protocol.sgml 2. + + include-generated-columns + + +The include-generated-columns option controls whether generated columns should be included in the string representation of tuples during logical decoding in PostgreSQL. This allows users to customize the output format based on whether they want to include these columns or not. + + + 2a. Something is not correct when this name has hyphens and all the nearby parameter names do not. Shouldn't it be all uppercase like the other boolean parameter? ~ 2b. Text in the SGML file should be wrapped properly. ~ 2c. IMO the comment can be more terse and it also needs to specify that it is a boolean type, and what is the default value if not passed. SUGGESTION INCLUDE_GENERATED_COLUMNS [ boolean ] If true, then generated columns should be included in the string representation of tuples during logical decoding in PostgreSQL. The default is false. == src/backend/replication/logical/proto.c 3. logicalrep_write_tuple - if (!column_in_column_list(att->attnum, columns)) + if (!column_in_column_list(att->attnum, columns) && !att->attgenerated) + continue; + + if (att->attgenerated && !publish_generated_column) continue; 3a. This code seems overcomplicated checking the same flag multiple times. SUGGESTION if (att->attgenerated) { if (!publish_generated_column) continue; } else { if (!column_in_column_list(att->attnum, columns)) continue; } ~ 3b. The same logic occurs several times in logicalrep_write_tuple ~~~ 4. logicalrep_write_attrs if (!column_in_column_list(att->attnum, columns)) continue; + if (att->attgenerated && !publish_generated_column) + continue; + Shouldn't these code fragments (2x in this function) look the same as in logicalrep_write_tuple? See the above review comments. == src/backend/replication/pgoutput/pgoutput.c 5. maybe_send_schema TransactionId topxid = InvalidTransactionId; + bool publish_generated_column = data->publish_generated_column; I'm not convinced this saves any code, and anyway, it is not consistent with other fields in this function that are not extracted to another variable (e.g. data->streaming). ~~~ 6. pgoutput_change - + bool publish_generated_column = data->publish_generated_column; + I'm not convinced this saves any code, and anyway, it is not consistent with other fields in this function that are not extracted to another variable (e.g. data->binary). == [1] My v1 review - https://www.postgresql.org/message-id/CAHut+PsuJfcaeg6zst=6pe5uyjv_uxvrhu3ck7w2ahb1uqy...@mail.gmail.com [2] My v2 review - https://www.postgresql.org/message-id/CAHut%2BPv4RpOsUgkEaXDX%3DW2rhHAsJLiMWdUrUGZOcoRHuWj5%2BQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
ibers + from the primary server. + 7a. I felt that the step should just say "Confirm that the standby server is not lagging behind the subscribers.". So the text "This step can be skipped..." should be a separate paragraph. ~ 7b. The 2nd standby_slot_names should use a varname font. ~ 7c. /may vary at different points in time due to/can vary due to/ 8. + +Firstly, on the subscriber node check the last replayed WAL. +This step needs to be run on the database(s) that includes the failover +enabled subscription(s), to find the last replayed WAL on each database. 8a. Don't need to say "Firstly," ~ 8b. The text "This step..." can be simplified as below: BEFORE This step needs to be run on the database(s) that includes the failover enabled subscription(s), to find the last replayed WAL on each database. SUGGESTION This step needs to be run on any database that includes failover-enabled subscriptions. ~~~ 9. + + Next, on the standby server check that the last-received WAL location + is ahead of the replayed WAL location(s) on the subscriber identified + above. If the above SQL result was NULL, it means the subscriber has not + yet replayed any WAL, so the standby server must be ahead of the + subscriber, and this step can be skipped. Don't need to say "Next," ~~~ 10. + + If the result (failover_ready) of both above steps is + true, existing subscriptions will be able to continue without losing any data + that has been flushed to the new primary server. + Let's word this more like the same sentence top of the page. See review comment #3c SUGGESTION If the result (failover_ready) of both steps above is true, then existing subscriptions can continue subscribing to publications now on the new primary server without any loss of data. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Pgoutput not capturing the generated columns
mp; !publish_generated_column) + continue; ditto #6 ~~ 8. logicalrep_write_attrs - if (att->attisdropped || att->attgenerated) + if (att->attisdropped) continue; if (!column_in_column_list(att->attnum, columns)) continue; + if (att->attgenerated && !publish_generated_column) + continue; + ditto #6 ~~ 9. - if (att->attisdropped || att->attgenerated) + if (att->attisdropped) continue; if (!column_in_column_list(att->attnum, columns)) continue; + if (att->attgenerated && !publish_generated_column) + continue; ditto #6 == src/include/catalog/pg_subscription.h 10. CATALOG + bool subgeneratedcolumn; /* True if generated colums must be published */ /colums/columns/ == src/test/regress/sql/publication.sql 11. --- error: generated column "d" can't be in list +-- ok Maybe change "ok" to say like "ok: generated cols can be in the list too" == 12. GENERAL - Missing CREATE SUBSCRIPTION test? GENERAL - Missing ALTER SUBSCRIPTION test? How come this patch adds a new CREATE SUBSCRIPTION option but does not seem to include any test case for that option in either the CREATE SUBSCRIPTION or ALTER SUBSCRIPTION regression tests? == [1] My v1 review - https://www.postgresql.org/message-id/CAHut+PsuJfcaeg6zst=6pe5uyjv_uxvrhu3ck7w2ahb1uqy...@mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: inconsistent quoting in error messages
On Tue, May 21, 2024 at 2:56 PM Kyotaro Horiguchi wrote: > > I noticed that there are slightly inconsistent messages regarding > quoting policies. > > > This happens if you temporarily set "wal_level=minimal" on the server. > > WAL generated with "full_page_writes=off" was replayed during online backup > > > pg_log_standby_snapshot() can only be used if "wal_level" >= "replica" > > > WAL streaming ("max_wal_senders" > 0) requires "wal_level" to be "replica" > > or "logical" > > I think it's best to quote variable names and values separately, like > "wal_level" = "minimal" (but not use quotes for numeric values), as it > seems to be the most common practice. Anyway, we might want to unify > them. > > > Likewise, I saw two different versions of values with units. > > > "max_stack_depth" must not exceed %ldkB. > > "vacuum_buffer_usage_limit" must be 0 or between %d kB and %d kB > > I'm not sure, but it seems like the latter version is more common. > > regards. > Hi, I think it might be better to keep all the discussions about GUC quoting and related topics like this confined to the main thread here [1]. Otherwise, we might end up with a bunch of competing patches. == [1] https://www.postgresql.org/message-id/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: GUC names in messages
On Fri, May 17, 2024 at 9:57 PM Peter Eisentraut wrote: > > On 17.05.24 05:31, Peter Smith wrote: > >> I think we should accept your two patches > >> > >> v6-0001-GUC-names-docs.patch > >> v6-0002-GUC-names-add-quotes.patch > >> > >> which effectively everyone was in favor of and which seem to be the most > >> robust and sustainable solution. > >> > >> (The remaining three patches from the v6 set would be PG18 material at > >> this point.) > > Thanks very much for taking an interest in resurrecting this thread. > > > > It was always my intention to come back to this when the dust had > > settled on PG17. But it would be even better if the docs for the rule > > "just quote everything", and anything else you deem acceptable, can be > > pushed sooner. > > > > Of course, there will still be plenty more to do for PG18, including > > locating examples in newly pushed code for messages that have slipped > > through the cracks during the last few months using different formats, > > and other improvements, but those tasks should become easier if we can > > get some of these v6 patches out of the way first. > > I committed your 0001 and 0002 now, with some small fixes. > > There has also been quite a bit of new code, of course, since you posted > your patches, so we'll probably find a few more things that could use > adjustment. > > I'd be happy to consider the rest of your patch set after beta1 and/or > for PG18. > Thanks for pushing! I'll try to dedicate more time to this sometime soon to go through all the code again to track down those loose ends. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi Kuroda-san, I did not apply these v12* patches, but I have diff'ed all of them with the previous v11* patches and confirmed that all of my previous review comments now seem to be addressed. I don't have any more comments to make at this time. Thanks! == Kind Regards, Peter Smith. Fujitsu Australia
Re: GUC names in messages
On Thu, May 16, 2024 at 9:35 PM Peter Eisentraut wrote: > > On 04.01.24 07:53, Peter Smith wrote: > >> Now that I read this again, I think this is wrong. > >> > >> We should decide the quoting for a category, not the actual content. > >> Like, quote all file names; do not quote keywords. > >> > >> This led to the attempted patch to decide the quoting of GUC parameter > >> names dynamically based on the actual content, which no one really > >> liked. But then, to preserve consistency, we also need to be uniform in > >> quoting GUC parameter names where the name is hardcoded. > >> > > > > I agree. By attempting to define when to and when not to use quotes it > > has become overcomplicated. > > > > Earlier in the thread, I counted how quotes were used in the existing > > messages [5]; there were ~39 quoted and 164 not quoted. Based on that > > we chose to stay with the majority, and leave all the unquoted ones so > > only adding quotes "when necessary". In hindsight, that was probably > > the wrong choice because it opened a can of worms about what "when > > necessary" even means (e.g. what about underscores, mixed case etc). > > > > Certainly one simple rule "just quote everything" is easiest to follow. > > I've been going through the translation updates for PG17 these days and > was led back around to this issue. It seems we left it in an > intermediate state that no one was really happy with and which is > arguably as inconsistent or more so than before. > > I think we should accept your two patches > > v6-0001-GUC-names-docs.patch > v6-0002-GUC-names-add-quotes.patch > > which effectively everyone was in favor of and which seem to be the most > robust and sustainable solution. > > (The remaining three patches from the v6 set would be PG18 material at > this point.) Thanks very much for taking an interest in resurrecting this thread. It was always my intention to come back to this when the dust had settled on PG17. But it would be even better if the docs for the rule "just quote everything", and anything else you deem acceptable, can be pushed sooner. Of course, there will still be plenty more to do for PG18, including locating examples in newly pushed code for messages that have slipped through the cracks during the last few months using different formats, and other improvements, but those tasks should become easier if we can get some of these v6 patches out of the way first. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi, Here are my remaining review comments for the latest v11* patches. // v11-0001 // No changes. No comments. // v11-0002 // == doc/src/sgml/ref/alter_subscription.sgml 2.1. ALTER SUBSCRIPTION ... SET (failover = on|off) and - ALTER SUBSCRIPTION ... SET (two_phase = on|off) + ALTER SUBSCRIPTION ... SET (two_phase = off) My other thread patch has already been pushed [1], so now you can modify this to say "true|false" as previously suggested. // v11-0003 // == src/backend/commands/subscriptioncmds.c 3.1. AlterSubscription + subtwophase = LOGICALREP_TWOPHASE_STATE_DISABLED; + } + else + subtwophase = LOGICALREP_TWOPHASE_STATE_PENDING; + + /* Change system catalog acoordingly */ values[Anum_pg_subscription_subtwophasestate - 1] = - CharGetDatum(opts.twophase ? - LOGICALREP_TWOPHASE_STATE_PENDING : - LOGICALREP_TWOPHASE_STATE_DISABLED); + CharGetDatum(subtwophase); replaces[Anum_pg_subscription_subtwophasestate - 1] = true; Sorry, I don't think that 'subtwophase' change is an improvement -- IMO the existing ternary code was fine as-is. I only reported the excessive flag checking in the previous v10-0003 review because of some extra "if (!opts.twophase)" code but that was caused by what you called "wrong git operations." and is already fixed now. // v11-0004 // == src/sgml/catalogs.sgml 4.1. + + + subforcealter bool + + + If true, then the ALTER SUBSCRIPTION + can disable two_phase option, even if there are + uncommitted prepared transactions from when two_phase + was enabled + + + I think this description should be changed to say what it *really* does. IMO, the stuff about 'two_phase' is more like a side-effect. SUGGESTION (this is just pseudo-code. You can add the CREATE SUBSCRIPTION 'force_alter' link appropriately) If true, then the ALTER SUBSCRIPTION command can sometimes be forced to proceed instead of giving an error. See force_alter parameter for details about when this might be useful. == [1] https://github.com/postgres/postgres/commit/fa65a022db26bc446fb67ce1d7ac543fa4bb72e4 Kind Regards, Peter Smith. Fujitsu Australia
Re: Docs: Always use true|false instead of sometimes on|off for the subscription options
On Thu, May 16, 2024 at 10:42 PM David Rowley wrote: > > On Thu, 16 May 2024 at 19:05, Peter Smith wrote: > > > > On Thu, May 16, 2024 at 3:11 PM David Rowley wrote: > > > If you want to do this, what's the reason to limit it to just this one > > > page of the docs? > > > > Yeah, I had a vested interest in this one place because I've been > > reviewing the other thread [1] that was mentioned above. If that other > > thread chooses "true|false" then putting "true|false" adjacent to > > another "on|off" will look silly. > > OK, looking a bit further I see this option is new to v17. After a > bit more study of the sgml, I agree that it's worth changing this. > > I've pushed your patch. > Thanks very much for pushing. It was just a one-time thing -- I won't go looking for more examples like it. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Docs: Always use true|false instead of sometimes on|off for the subscription options
On Thu, May 16, 2024 at 3:11 PM David Rowley wrote: > > On Thu, 16 May 2024 at 12:29, Peter Smith wrote: > > There are lots of subscription options listed on the CREATE > > SUBSCRIPTION page [1]. > > > > Although these boolean options are capable of accepting different > > values like "1|0", "on|off", "true|false", here they are all described > > only using values "true|false". > > If you want to do this, what's the reason to limit it to just this one > page of the docs? Yeah, I had a vested interest in this one place because I've been reviewing the other thread [1] that was mentioned above. If that other thread chooses "true|false" then putting "true|false" adjacent to another "on|off" will look silly. But if that other thread adopts the existing 'failover=on|off' values then it will diverge even further from being consistent with the CREATE SUBSCRIPTION page. Unfortunately, that other thread cannot take it upon itself to make this change because it has nothing to do with the 'failover' option, So I saw no choice but to post an independent patch for this. I checked all the PUBLICATION/SUBSCRIPTION reference pages and this was the only inconsistent value that I found. But I might be mistaken. > > If the following is anything to go by, it doesn't seem we're very > consistent about this over the entire documentation. > > doc$ git grep "on" | wc -l > 122 > > doc$ git grep "true" | wc -l > 222 > > And: > > doc$ git grep "off" | wc -l > 102 > > doc$ git grep "false" | wc -l > 162 > Hmm. I'm not entirely sure if those stats are meaningful because I'm not saying option values should avoid "on|off" -- the point was I just suggesting they should be used consistent with where they are described. For example, the CREATE SUBSCRIPTION page describes every boolean option value as "true|false", so let's use "true|false" in every other docs place where those are mentioned. OTOH, other options on other pages may be described as "on|off" which is fine by me, but then those should similarly be using "on|off" again wherever they are mentioned. > I think unless we're going to standardise on something then there's > not much point in adjusting individual cases. IMO, there could be an > endless stream of follow-on patches as a result of accepting this. > Standardisation might be ideal, but certainly, I'm not going to attempt to make a giant patch that impacts the entire documentation just for this one small improvement. It seems a shame if "perfect" becomes the enemy of "good"; How else do you suggest I can make the ALTER SUBSCRIPTION page better? If this one-line change is rejected then the most likely outcome is nothing will ever happen to change it. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Pgoutput not capturing the generated columns
uldn't they both look like this: if (att->attisdropped) continue; if (att->attgenerated && !publish_generated_column) continue; if (!column_in_column_list(att->attnum, columns)) continue; == src/backend/replication/pgoutput/pgoutput.c 5. static void send_relation_and_attrs(Relation relation, TransactionId xid, LogicalDecodingContext *ctx, - Bitmapset *columns); + Bitmapset *columns, + bool publish_generated_column); Use plural. /publish_generated_column/publish_generated_columns/ ~~~ 6. parse_output_parameters bool origin_option_given = false; + bool generate_column_option_given = false; data->binary = false; data->streaming = LOGICALREP_STREAM_OFF; data->messages = false; data->two_phase = false; + data->publish_generated_column = false; I think the 1st var should be 'include_generated_columns_option_given' for consistency with the name of the actual option that was given. == src/include/replication/logicalproto.h 7. (Same as a previous review comment) For all the API changes the new parameter name should be plural. /publish_generated_column/publish_generated_columns/ == src/include/replication/pgoutput.h 8. bool publish_no_origin; + bool publish_generated_column; } PGOutputData; /publish_generated_column/publish_generated_columns/ == Kind Regards, Peter Smith. Fujitsu Australia
Docs: Always use true|false instead of sometimes on|off for the subscription options
Hi hackers, There are lots of subscription options listed on the CREATE SUBSCRIPTION page [1]. Although these boolean options are capable of accepting different values like "1|0", "on|off", "true|false", here they are all described only using values "true|false". ~ IMO this consistent way of describing the boolean option values ought to be followed also on the ALTER SUBSCRIPTION page [2]. Specifically, the ALTER SUBSCRIPTION page has one mention of "SET (failover = on|off)" which I think should be changed to say "SET (failover = true|false)" Now this little change hardly seems important, but actually, it is motivated by another thread [3] under development where this ALTER SUBSCRIPTION "(failover = on|off)" was copied again, thereby making the consistency between the CREATE SUBSCRIPTION and ALTER SUBSCRIPTION pages grow further apart, so I think it is best to nip that in the bud and simply use "true|false" values everywhere. PSA a patch to make this proposed change. == [1] https://www.postgresql.org/docs/devel/sql-createsubscription.html [2] https://www.postgresql.org/docs/devel/sql-altersubscription.html [3] https://www.postgresql.org/message-id/flat/8fab8-65d74c80-1-2f28e880%4039088166 Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Always-use-true-false-instead-of-on-off-for-boole.patch Description: Binary data
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
On Tue, May 14, 2024 at 10:03 PM Hayato Kuroda (Fujitsu) wrote: > > Dear Peter, > ... > > 4.11. > > > > +# Alter the two_phase with the force_alter option. Apart from the the last > > +# ALTER SUBSCRIPTION command, the command will abort the prepared > > transaction > > +# and succeed. > > > > There is typo "the the" and the wording is a bit strange. Why not just say: > > > > SUGGESTION > > Alter the two_phase true to false with the force_alter option enabled. > > This command will succeed after aborting the prepared transaction. > > Fixed. > You wrote "Fixed" for that patch v9-0004 suggestion but I don't think anything was changed at all. Accidentally missed? == Kind Regards, Peter Smith. Futjisu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
istBySubid(subid)) != NIL) { /* Abort all listed transactions */ foreach_ptr(char, gid, prepared_xacts) FinishPreparedTransaction(gid, false); list_free_deep(prepared_xacts); } } /* Change system catalog acoordingly */ values[Anum_pg_subscription_subtwophasestate - 1] = CharGetDatum(opts.twophase ? LOGICALREP_TWOPHASE_STATE_PENDING : LOGICALREP_TWOPHASE_STATE_DISABLED); replaces[Anum_pg_subscription_subtwophasestate - 1] = true; } ~ Why is "if (!opts.twophase)" being checked at the top and then immediately being checed again here: + if (!opts.twophase) + PreventInTransactionBlock(isTopLevel, + "ALTER SUBSCRIPTION ... SET (two_phase = false)"); And then again here: CharGetDatum(opts.twophase ? LOGICALREP_TWOPHASE_STATE_PENDING : LOGICALREP_TWOPHASE_STATE_DISABLED); There is no need to re-check a flag that was already checked, so clearly some of this logic/code is either wrong or redundant. == src/test/subscription/t/099_twophase_added.pl (Let's change these on|off to true|false to match what you did already in patch 0002). 3.3. +# +# Check the case that prepared transactions exist on the subscriber node +# +# If the two_phase is altering from "on" to "off" and there are prepared +# transactions on the subscriber, they must be aborted. This test checks it. /off/false/ /on/true/ ~~~ 3.4. +# Verify the prepared transaction has been replicated to the subscriber because +# two_phase is set to "on". /on/true/ ~~~ 3.5. +# Toggle the two_phase to "off" before the COMMIT PREPARED +$node_subscriber->safe_psql( +'postgres', " +ALTER SUBSCRIPTION regress_sub DISABLE; +ALTER SUBSCRIPTION regress_sub SET (two_phase = off); +ALTER SUBSCRIPTION regress_sub ENABLE;"); /off/false/ /two_phase = off/two_phase = false/ ~~~ 3.6. +# Verify any prepared transactions are aborted because two_phase is changed to +# "off". /off/false/ // patch v10-0004 // == 4.g1. GENERAL - document rendering fails FYI - The document failed to build after I apply patch 0003. Did you try it? In my environment it reported some unbalanced tags: ref/create_subscription.sgml:448: parser error : Opening and ending tag mismatch: link line 436 and para ^ ref/create_subscription.sgml:449: parser error : Opening and ending tag mismatch: para line 435 and listitem etc. == doc/src/sgml/ref/alter_subscription.sgml 4.1. The two_phase parameter can only be altered when the - subscription is disabled. When altering the parameter from true - to false, the backend process checks for any incomplete - prepared transactions done by the logical replication worker (from when - two_phase parameter was still true) - and, if any are found, those are aborted. + subscription is disabled. Altering the parameter from true + to false will give an error when when there are + prepared transactions done by the logical replication worker. If you want + to alter the parameter forcibly in this case, + force_alter + option must be set to true at the same time. TYPO: "when when" Why is necessary to say "at the same time"? == doc/src/sgml/ref/create_subscription.sgml 4.2. + +force_alter (boolean) + + + Specifies if the ALTER SUBSCRIPTION + can be forced to proceed instead of giving an error. There is + currently only one scenario where this parameter has any effect: When + altering two_phase option from true + to false it is possible for there to be incomplete + prepared transactions done by the logical replication worker (from + when two_phase parameter was still true). + If force_alter is false, then + this will give an error; if force_alter is + true, then the incomplete prepared transactions + are aborted and the alter will proceed. + The default is false. + + + IMO this will be better broken into multiple paragraphs. 1. Specifies... 2. There is... 3. The default is... == src/test/subscription/t/099_twophase_added.pl (Let's change all the on|off to true|false like you already did in patch 0002. 4.3. +# Try altering the two_phase option to "off." The command will fail since there +# is a prepared transaction and the 'force_alter' option is not specified as +# true. +my $stdout; +my $stderr; /off./false/ == Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
-+-+-++---+--+--++---+---+--++-+-- - regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d| f| none | t | f | f| off| dbname=regress_doesnotexist | 0/0 + List of subscriptions + Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Force_alter | Synchronous commit | Conninfo | Skip LSN +--+---+-+-++---+--+--++---+---+--+-++-+-- The column heading should be "Force alter", as already mentioned by an earlier review comment (#4.7) == src/test/subscription/t/099_twophase_added.pl 4.11. +# Alter the two_phase with the force_alter option. Apart from the the last +# ALTER SUBSCRIPTION command, the command will abort the prepared transaction +# and succeed. There is typo "the the" and the wording is a bit strange. Why not just say: SUGGESTION Alter the two_phase true to false with the force_alter option enabled. This command will succeed after aborting the prepared transaction. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi Kuroda-san, I'm having second thoughts about how these patches mention the option values "on|off". These are used in the ALTER SUBSCRIPTION document page for 'two_phase' and 'failover' parameters, and then those "on|off" get propagated to the code comments, error messages, and tests... Now I see that on the CREATE SUBSCRIPTION page [1], every boolean parameter (even including 'two_phase' and 'failover') is described in terms of "true|false" (not "on|off"). In hindsight, it is probably better to refer only to true|false everywhere for these boolean parameters, instead of sometimes using different values like on|off. What do you think? == [1] https://www.postgresql.org/docs/devel/sql-createsubscription.html Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi, Here are some review comments for v8-0002. == Commit message 1. Regarding the off->on case, the logical replication already has a mechanism for it, so there is no need to do anything special for the on->off case; however, we must connect to the publisher and expressly change the parameter. The operation cannot be rolled back, and altering the parameter from "on" to "off" within a transaction is prohibited. ~ I think the difference between "off"-to"on" and "on"-to"off" needs to be explained in more detail. Specifically "already has a mechanism for it" seems very vague. == src/backend/commands/subscriptioncmds.c 2. /* - * The changed two_phase option of the slot can't be rolled - * back. + * Since the altering two_phase option of subscriptions + * also leads to the change of slot option, this command + * cannot be rolled back. So prevent we are in the + * transaction block. */ - PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET (two_phase)"); + if (!opts.twophase) + PreventInTransactionBlock(isTopLevel, + 2a. There is a typo: "So prevent we are". SUGGESTION (minor adjustment and typo fix) Since altering the two_phase option of subscriptions also leads to changing the slot option, this command cannot be rolled back. So prevent this if we are in a transaction block. ~ 2b. But, in my previous review [v7-0002#3] I asked if the comment could explain why this check is only needed for two_phase "on"-to-"off" but not for "off"-to-"on". That explanation/reason is still not yet given in the latest comment. ~~~ 3. /* - * Try to acquire the connection necessary for altering slot. + * Check the need to alter the replication slot. Failover and two_phase + * options are controlled by both the publisher (as a slot option) and the + * subscriber (as a subscription option). + */ + update_failover = replaces[Anum_pg_subscription_subfailover - 1]; + update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate - 1] && + !opts.twophase); (This is similar to the previous comment). In my previous review [v7-0002#3a] I asked why update_two_phase is TRUE only if 'two-phase' is being updated "on"-to-"off", but not when it is being updated "off"-to-"on". That explanation/reason is still not yet given in the latest comment. == src/backend/replication/libpqwalreceiver/libpqwalreceiver.c 4. - appendStringInfo(, "ALTER_REPLICATION_SLOT %s ( FAILOVER %s, TWO_PHASE %s )", - quote_identifier(slotname), - failover ? "true" : "false", - two_phase ? "true" : "false"); + appendStringInfo(, "ALTER_REPLICATION_SLOT %s ( ", + quote_identifier(slotname)); + + if (failover) + appendStringInfo(, "FAILOVER %s", + *failover ? "true" : "false"); + + if (failover && two_phase) + appendStringInfo(, ", "); + + if (two_phase) + appendStringInfo(, "TWO_PHASE %s", + *two_phase ? "true" : "false"); + + appendStringInfoString(, ");"); It will be better if that last line includes the extra space like I had suggested in [v7-0002#4a] so the result will have the same spacing as in the original code. e.g. + appendStringInfoString(, " );"); == src/test/subscription/t/099_twophase_added.pl 5. +# Check the case that prepared transactions exist on the publisher node. +# +# Since the two_phase is "off", then normally, this PREPARE will do nothing +# until the COMMIT PREPARED, but in this test, we toggle the two_phase to "on" +# again before the COMMIT PREPARED happens. This is a major test part so IMO this comment should have # like it had before, to distinguish it from all the sub-step comments. == My v7-0002 review - https://www.postgresql.org/message-id/CAHut%2BPtu_w_UCGR-5DbenA%2By7wRiA8QPi_ZP%3DCCJ3SGdTn-%3D%3Dg%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia.
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
on done by worker is aborted"); /transaction are aborted/transaction was aborted/ == Response to my v7-0004 review -- https://www.postgresql.org/message-id/OSBPR01MB2552F738ACF1DA6838025C4FF5E62%40OSBPR01MB2552.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi, Here are some review comments for v8-0003. == src/sgml/ref/alter_subscription.sgml 1. + + The two_phase parameter can only be altered when the + subscription is disabled. When altering the parameter from on + to off, the backend process checks prepared + transactions done by the logical replication worker and aborts them. + The text may be OK as-is, but I was wondering if it might be better to give a more verbose explanation. BEFORE ... the backend process checks prepared transactions done by the logical replication worker and aborts them. SUGGESTION ... the backend process checks for any incomplete prepared transactions done by the logical replication worker (from when two_phase parameter was still "on") and, if any are found, those are aborted. == src/backend/commands/subscriptioncmds.c 2. AlterSubscription - /* - * Since the altering two_phase option of subscriptions - * also leads to the change of slot option, this command - * cannot be rolled back. So prevent we are in the - * transaction block. + * If two_phase was enabled, there is a possibility the + * transactions has already been PREPARE'd. They must be + * checked and rolled back. */ BEFORE ... there is a possibility the transactions has already been PREPARE'd. SUGGESTION ... there is a possibility that transactions have already been PREPARE'd. ~~~ 3. AlterSubscription + /* + * Since the altering two_phase option of subscriptions + * (especially on->off case) also leads to the + * change of slot option, this command cannot be rolled + * back. So prevent we are in the transaction block. + */ PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET (two_phase = off)"); This comment is a bit vague and includes some typos, but IIUC these problems will already be addressed by the 0002 patch changes.AFAIK patch 0003 is only moving the 0002 comment. == src/test/subscription/t/099_twophase_added.pl 4. +# Check the case that prepared transactions exist on the subscriber node +# +# If the two_phase is altering from "on" to "off" and there are prepared +# transactions on the subscriber, they must be aborted. This test checks it. + Similar to the comment that I gave for v8-0002. I think there should be comment for the major test comment to distinguish it from comments for the sub-steps. ~~~ 5. +# Verify the prepared transaction are aborted because two_phase is changed to +# "off". +$result = $node_subscriber->safe_psql('postgres', +"SELECT count(*) FROM pg_prepared_xacts;"); +is($result, q(0), "prepared transaction done by worker is aborted"); + /the prepared transaction are aborted/any prepared transactions are aborted/ == Kind Regards, Peter Smith Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi, Here are some review comments for patch v7-0004 == Commit message 1. A detailed commit message is needed to describe the purpose and details of this patch. == doc/src/sgml/ref/alter_subscription.sgml 2. CREATE SUBSCRIPTION Shouldn't there be an entry for "force_alter" parameter in the CREATE SUBSCRIPTION "parameters" section, instead of just vaguely mentioning it in passing when describing the "two_phase" in ALTER SUBSCRIPTION? ~ 3. ALTER SUBSCRIPTION - alterable parameters And shouldn't this new option also be named in the ALTER SUBSCRIPTION list: "The parameters that can be altered are..." == src/backend/commands/subscriptioncmds.c 4. XLogRecPtr lsn; + bool twophase_force; } SubOpts; IMO this field ought to be called 'force_alter' to be the same as the option name. Sure, now it is only relevant for 'two_phase', but that might not always be the case in the future. ~~~ 5. AlterSubscription + /* + * Abort prepared transactions if force option is also + * specified. Otherwise raise an ERROR. + */ + if (!opts.twophase_force) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot alter %s when there are prepared transactions", + "two_phase = false"))); + 5a. /if force option is also specified/only if the 'force_alter' option is true/ ~ 5b. "two_phase = false" -- IMO that should say "two_phase = off" ~ 5c. IMO this ereport should include a errhint to tell the user they can use 'force_alter = true' to avoid getting this error. ~~~ 6. + /* force_alter cannot be used standalone */ + if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) && + !IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT)) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("%s must be specified with %s", + "force_alter", "two_phase"))); + IMO this rule is not necessary so the code should be removed. I think using 'force_alter' standalone doesn't do anything at all (certainly, it does no harm) so why add more complications (more rules, more code, more tests) just for the sake of it? == src/test/subscription/t/099_twophase_added.pl 7. +$node_subscriber->safe_psql('postgres', +"ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);"); "force" is a verb, so it is better to say 'force_alter = true' instead of 'force_alter = on'. ~~~ 8. $result = $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM pg_prepared_xacts;"); is($result, q(0), "prepared transaction done by worker is aborted"); +$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE;"); + I felt the ENABLE statement should be above the SELECT statement so that the code is more like it was before applying the patch. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi, Here are some review comments for the patch v7-0003. == Commit Message 1. The patch needs a commit message to describe the purpose and highlight any limitations and other details. == doc/src/sgml/ref/alter_subscription.sgml 2. + + + The two_phase parameter can only be altered when the + subscription is disabled. When altering the parameter from true + to false, the backend process checks prepared + transactions done by the logical replication worker and aborts them. + Here, the para is referring to "true" and "false" but earlier on this page it talks about "twophase = off". IMO it is better to use a consistent terminology like "on|off" everywhere instead of randomly changing the way it is described each time. == src/backend/commands/subscriptioncmds.c 3. AlterSubscription if (IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT)) { + List *prepared_xacts = NIL; This 'prepared_xacts' can be declared at a lower scrope because it is only used if (!opts.twophase). Furthermore, IIUC you don't need to assign NIL in the declaration because there is no chance for it to be unassigned anyway. ~~~ 4. AlterSubscription + /* + * The changed two_phase option (true->false) of the + * slot can't be rolled back. + */ PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET (two_phase = off)"); Here is another example of inconsistent mixing of the terminology where the comment says "true"/"false" but the message says "off". Let's keep everything consistent. (I prefer on|off). ~~~ 5. + if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED && + (prepared_xacts = GetGidListBySubid(subid)) != NIL) + { + ListCell *cell; + + /* Abort all listed transactions */ + foreach(cell, prepared_xacts) + FinishPreparedTransaction((char *) lfirst(cell), + false); + + list_free(prepared_xacts); + } 5A. IIRC there is a cleaner way to write this loop without needing ListCell variable -- e.g. foreach_ptr() macro? ~ 5B. Shouldn't this be using list_free_deep() so the pstrdup gid gets freed too? == src/test/subscription/t/099_twophase_added.pl 6. +## +# Check the case that prepared transactions exist on subscriber node +## + Give some more detailed comments here similar to the review comment of patch v7-0002 for the other part of this TAP test. ~~~ 7. TAP test - comments Same as for my v7-0002 review comments, I think this test case also needs a few more one-line comments to describe the sub-steps. e.g.: # prepare a transaction to insert some rows to the table # verify the prepared tx is replicated to the subscriber (because 'two_phase = on') # toggle the two_phase to 'off' *before* the COMMIT PREPARED # verify the prepared tx got aborted # do the COMMIT PREPARED (note that now two_phase is 'off') # verify the inserted rows got replicated ok ~~~ 8. TAP test - subscription name It's better to rename the SUBSCRIPTION in this TAP test so you can avoid getting log warnings like: psql::4: WARNING: subscriptions created by regression test cases should have names starting with "regress_" psql::4: NOTICE: created replication slot "sub" on publisher == Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
over ? "true" : "false"); + + if (failover && two_phase) + appendStringInfo(, ", "); + + if (two_phase) + appendStringInfo(, "TWO_PHASE %s", + *two_phase ? "true" : "false"); + + appendStringInfoString(, " );"); ~~ 4b. Like I said above, IMO the current separator logic in v7 is broken. So it is a bit concerning the tests all passed anyway. How did that happen? I think this indicates that there needs to be an additional test scenario where both 'failover' and 'two_phase' get altered at the same time so this code gets exercised properly. == src/test/subscription/t/099_twophase_added.pl 5. +# Define pre-existing tables on both nodes Why say they are "pre-existing"? They are not pre-existing because you are creating them right here! ~~~ 6. +## +# Check the case that prepared transactions exist on publisher node +## I think this needs a slightly more detailed comment. SUGGESTION (this is just an example, but you can surely improve it) # Check the case that prepared transactions exist on the publisher node. # # Since two_phase is "off", then normally this PREPARE will do nothing until # the COMMIT PREPARED, but in this test, we toggle the two_phase to "on" again # before the COMMIT PREPARED happens. ~~~ 7. Maybe this test case needs a few more one-line comments for each of the sub-steps. e.g.: # prepare a transaction to insert some rows to the table # verify the prepared tx is not yet replicated to the subscriber (because 'two_phase = off') # toggle the two_phase to 'on' *before* the COMMIT PREPARED # verify the inserted rows got replicated ok ~~~ 8. IIUC this test will behave the same even if you DON'T do the toggle 'two_phase = on'. So I wonder is there something more you can do to test this scenario more convincingly? == Kind Regards, Peter Smith Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi Kuroda-san, Thanks for addressing most of my v6-0001 review comments. Below are some minor follow-up comments for v7-0001. == src/backend/access/transam/twophase.c 1. IsTwoPhaseTransactionGidForSubid +/* + * IsTwoPhaseTransactionGidForSubid + * Check whether the given GID is formed by TwoPhaseTransactionGid. + */ +static bool +IsTwoPhaseTransactionGidForSubid(Oid subid, char *gid) I think the function comment should mention something about 'subid'. SUGGESTION Check whether the given GID (as formed by TwoPhaseTransactionGid) is for the specified 'subid'. == src/backend/commands/subscriptioncmds.c 2. AlterSubscription + if (!opts.twophase && + form->subtwophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED && + LookupGXactBySubid(subid)) + /* Add error message */ + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot disable two_phase when uncommitted prepared transactions present"), + errhint("Resolve these transactions and try again"))); The comment "/* Add error message */" seems unnecessary. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Slow catchup of 2PC (twophase) transactions on replica in LR
see that this code was extracted from where it was previously inlined in subscriptioncmds.c, so maybe the 'false' is necessary for another reason? At least maybe some explanatory comment is needed for why you are passing this flag as false? == src/backend/replication/logical/worker.c 11. - /* two-phase should not be altered */ + /* two-phase should not be altered while the worker exists */ Assert(newsub->twophasestate == MySubscription->twophasestate); /should not/cannot/ == src/backend/replication/slot.c 12. void -ReplicationSlotAlter(const char *name, bool failover) +ReplicationSlotAlter(const char *name, bool two_phase, bool failover) Same comment as mentioned elsewhere (#15), IMO the new 'two_phase' parameter should be last. ~~~ 13. + if (MyReplicationSlot->data.two_phase != two_phase) + { + SpinLockAcquire(>mutex); + MyReplicationSlot->data.two_phase = two_phase; + SpinLockRelease(>mutex); + + update_slot = true; + } + + if (MyReplicationSlot->data.failover != failover) { SpinLockAcquire(>mutex); MyReplicationSlot->data.failover = failover; SpinLockRelease(>mutex); + update_slot = true; + } 13a. Doesn't it make more sense for the whole check/set to be "atomic", i.e. put the mutex also around the check? SUGGEST SpinLockAcquire(>mutex); if (MyReplicationSlot->data.two_phase != two_phase) { MyReplicationSlot->data.two_phase = two_phase; update_slot = true; } SpinLockRelease(>mutex); ~ Also, (if you agree with the above) why not include both checks (two_phase and failover) within the same mutex instead of acquiring/releasing it twice: SUGGEST SpinLockAcquire(>mutex); if (MyReplicationSlot->data.two_phase != two_phase) { MyReplicationSlot->data.two_phase = two_phase; update_slot = true; } if (MyReplicationSlot->data.failover != failover) { MyReplicationSlot->data.failover = failover; update_slot = true; } SpinLockAcquire(>mutex); ~~~ 13b. There are double blank lines after the first if-block. == src/backend/replication/walsender.c 14. static void -ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover) +ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, + bool *two_phase, bool *failover) Same comment as mentioned elsewhere (#15), IMO the new 'two_phase' parameter should be last. == src/include/replication/walreceiver.h 15. typedef void (*walrcv_alter_slot_fn) (WalReceiverConn *conn, const char *slotname, + bool two_phase, bool failover); Somehow, I feel it is more normal to add the new code (the 'two_phase' parameter) at the END, instead of into the middle of the existing parameters. It also keeps it alphabetical which makes it consistent with other places like the tab-completion code. This comment about swapping the order (putting new stuff last) will propagate changes to lots of other related places. I refer to this comment in a few other places in this post but there are probably more the same that I missed. == src/test/regress/sql/subscription.sql 16. I know you do this already in the TAP test, but doesn't the test case to demonstrate that 'two-phase' option can be altered when the subscription is disabled actually belong here in the regression instead? == src/test/subscription/t/021_twophase.pl 17. +# Disable the subscription and alter it to two_phase = false, +# verify that the altered subscription reflects the two_phase option. /verify/then verify/ ~~~ 18. +# Now do a prepare on publisher and make sure that it is not replicated. +$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub"); +$node_publisher->safe_psql( + 'postgres', qq{ +BEGIN; +INSERT INTO tab_copy VALUES (100); +PREPARE TRANSACTION 'newgid'; + }); + 18a. /on publisher/on the publisher/ 18b. What is that "DROP SUBSCRIPTION tap_sub" doing here? It seems misplaced under this comment. ~~~ 19. +# Make sure that there is 0 prepared transaction on the subscriber +$result = $node_subscriber->safe_psql('postgres', + "SELECT count(*) FROM pg_prepared_xacts;"); +is($result, qq(0), 'transaction is prepared on subscriber'); 19a. SUGGESTION Make sure there are no prepared transactions on the subscriber ~~~ 19b. /'transaction is prepared on subscriber'/'should be no prepared transactions on subscriber'/ ~~~ 20. +# Made sure that the commited transaction is replicated. /Made sure/Make sure/ /commited/committed/ ~~~ 21. +# Make sure that the two-phase is enabled on the subscriber +$result = $node_subscriber->safe_psql('postgres', + "SELECT subtwophasestate FROM pg_subscription WHERE subname = 'tap_sub_copy';" +); +is($result, qq(e), 'two-phase is disabled'); The 'two-phase is disabled' is the identical message used in the opposite case earlier, so something is amiss. Maybe this one should say 'two-phase should be enabled' and the earlier counterpart should say 'two-phase should be disabled'. == Kind Regards, Peter Smith Fujitsu Australia
Re: Improve the connection failure error messages
Hi, just by visual inspection of the v3/v4 patch diffs, the latest v4 LGTM. == Kind Regards, Peter Smith. Fujitsu Australia
Re: logicalrep_worker_launch -- counting/checking the worker limits
On Sun, Mar 31, 2024 at 8:12 PM Andrey M. Borodin wrote: > > > > > On 15 Aug 2023, at 07:38, Peter Smith wrote: > > > > A rebase was needed due to a recent push [1]. > > > > PSA v3. > > > > On 14 Jan 2024, at 10:43, vignesh C wrote: > > > > I have changed the status of the patch to "Waiting on Author" as > > Amit's queries at [1] have not been verified and concluded. Please > > feel free to address them and change the status back again. > > Hi Peter! > > Are you still interested in this thread? If so - please post an answer to > Amit's question. > If you are not interested - please Withdraw a CF entry [0]. > > Thanks! Yeah, sorry for the long period of inactivity on this thread. Although I still have some interest in it, I don't know when I'll get back to it again so meantime I've withdrawn this from the CF as requested. Kind regards, Peter Smith Fujitsu Australia
Re: Improve the connection failure error messages
FYI -- some more code has been pushed since this patch was last updated. AFAICT perhaps you'll want to update this patch again for the following new connection messages on HEAD: - slotfuncs.c [1] - slotsync.c [2] -- [1] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/slotfuncs.c#L989 [2] https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/logical/slotsync.c#L1258 Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve eviction algorithm in ReorderBuffer
On Wed, Mar 13, 2024 at 12:48 PM Masahiko Sawada wrote: > > On Wed, Mar 13, 2024 at 10:15 AM Peter Smith wrote: > > > > On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith wrote: > > > > > > ... > > > > > > 5. > > > > > > + * > > > > > > + * If 'indexed' is true, we create a hash table to track of each > > > > > > node's > > > > > > + * index in the heap, enabling to perform some operations such as > > > > > > removing > > > > > > + * the node from the heap. > > > > > > */ > > > > > > binaryheap * > > > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > > void *arg) > > > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > > + bool indexed, void *arg) > > > > > > > > > > > > BEFORE > > > > > > ... enabling to perform some operations such as removing the node > > > > > > from the heap. > > > > > > > > > > > > SUGGESTION > > > > > > ... to help make operations such as removing nodes more efficient. > > > > > > > > > > > > > > > > But these operations literally require the indexed binary heap as we > > > > > have an assertion: > > > > > > > > > > void > > > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d) > > > > > { > > > > > bh_nodeidx_entry *ent; > > > > > > > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > > > > > Assert(heap->bh_indexed); > > > > > > > > > > > > > I didn’t quite understand -- the operations mentioned are "operations > > > > such as removing the node", but binaryheap_remove_node() also removes > > > > a node from the heap. So I still felt the comment wording of the patch > > > > is not quite correct. > > > > > > Now I understand your point. That's a valid point. > > > > > > > > > > > Now, if the removal of a node from an indexed heap can *only* be done > > > > using binaryheap_remove_node_ptr() then: > > > > - the other removal functions (binaryheap_remove_*) probably need some > > > > comments to make sure nobody is tempted to call them directly for an > > > > indexed heap. > > > > - maybe some refactoring and assertions are needed to ensure those > > > > *cannot* be called directly for an indexed heap. > > > > > > > > > > If the 'index' is true, the caller can not only use the existing > > > functions but also newly added functions such as > > > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about > > > something like below? > > > > > > > You said: "can not only use the existing functions but also..." > > > > Hmm. Is that right? IIUC those existing "remove" functions should NOT > > be called directly if the heap was "indexed" because they'll delete > > the node from the heap OK, but any corresponding index for that > > deleted node will be left lying around -- i.e. everything gets out of > > sync. This was the reason for my original concern. > > > > All existing binaryheap functions should be available even if the > binaryheap is 'indexed'. For instance, with the patch, > binaryheap_remote_node() is: > > void > binaryheap_remove_node(binaryheap *heap, int n) > { > int cmp; > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > Assert(n >= 0 && n < heap->bh_size); > > /* compare last node to the one that is being removed */ > cmp = heap->bh_compare(heap->bh_nodes[--heap->bh_size], >heap->bh_nodes[n], >heap->bh_arg); > > /* remove the last node, placing it in the vacated entry */ > replace_node(heap, n, heap->bh_nodes[heap->bh_size]); > > /* sift as needed to preserve the heap property */ > if (cmp > 0) > sift_up(heap, n); > else if (cmp < 0) > sift_down(heap, n); > } > > The replace_node(), sift_up() and sift_down() update node's index as > well if the bin
Re: Improve eviction algorithm in ReorderBuffer
On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada wrote: > > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith wrote: > > ... > > > > 5. > > > > + * > > > > + * If 'indexed' is true, we create a hash table to track of each node's > > > > + * index in the heap, enabling to perform some operations such as > > > > removing > > > > + * the node from the heap. > > > > */ > > > > binaryheap * > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, void > > > > *arg) > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > + bool indexed, void *arg) > > > > > > > > BEFORE > > > > ... enabling to perform some operations such as removing the node from > > > > the heap. > > > > > > > > SUGGESTION > > > > ... to help make operations such as removing nodes more efficient. > > > > > > > > > > But these operations literally require the indexed binary heap as we > > > have an assertion: > > > > > > void > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d) > > > { > > > bh_nodeidx_entry *ent; > > > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > > > Assert(heap->bh_indexed); > > > > > > > I didn’t quite understand -- the operations mentioned are "operations > > such as removing the node", but binaryheap_remove_node() also removes > > a node from the heap. So I still felt the comment wording of the patch > > is not quite correct. > > Now I understand your point. That's a valid point. > > > > > Now, if the removal of a node from an indexed heap can *only* be done > > using binaryheap_remove_node_ptr() then: > > - the other removal functions (binaryheap_remove_*) probably need some > > comments to make sure nobody is tempted to call them directly for an > > indexed heap. > > - maybe some refactoring and assertions are needed to ensure those > > *cannot* be called directly for an indexed heap. > > > > If the 'index' is true, the caller can not only use the existing > functions but also newly added functions such as > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about > something like below? > You said: "can not only use the existing functions but also..." Hmm. Is that right? IIUC those existing "remove" functions should NOT be called directly if the heap was "indexed" because they'll delete the node from the heap OK, but any corresponding index for that deleted node will be left lying around -- i.e. everything gets out of sync. This was the reason for my original concern. > * If 'indexed' is true, we create a hash table to track each node's > * index in the heap, enabling to perform some operations such as > * binaryheap_remove_node_ptr() etc. > Yeah, something like that... I'll wait for the next patch version before commenting further. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve eviction algorithm in ReorderBuffer
ing: Assert(!binaryheap_empty(rb->txn_heap)); ~~~ 10. + +/* + * Compare between sizes of two transactions. This is for a binary heap + * comparison function. + */ +static int +ReorderBufferTXNSizeCompare(Datum a, Datum b, void *arg) 10a. /Compare between sizes of two transactions./Compare two transactions by size./ ~~~ 10b. IMO this comparator function belongs just before the ReorderBufferAllocate() function since that is the only place where it is used. == src/include/replication/reorderbuffer.h 11. +/* State of how to track the memory usage of each transaction being decoded */ +typedef enum ReorderBufferMemTrackState +{ + /* + * We don't update max-heap while updating the memory counter. The + * max-heap is built before use. + */ + REORDER_BUFFER_MEM_TRACK_NO_MAXHEAP, + + /* + * We also update the max-heap when updating the memory counter so the + * heap property is always preserved. + */ + REORDER_BUFFER_MEM_TRACK_MAINTAIN_MAXHEAP, +} ReorderBufferMemTrackState; + In my GENERAL review comment #0, I suggested the removal of this entire enum. e.g. It could be replaced with a boolean field 'track_txn_sizes' TBH, I think there is a better way to handle this "state". IIUC - the txn_heap is always allocated up-front. - you only "build" it when > threshold and - when it drops < 0.9 x threshold you reset it. Therefore, AFAICT you do not need to maintain any “switch states” at all; you simply need to check binaryheap_empty(txn_heap), right? * If the heap is empty…. It means you are NOT tracking, so don’t use it * If the heap is NOT empty …. It means you ARE tracking, so use it. ~ Using my idea to remove the state flag will have the side effect of simplifying many other parts of this patch. For example BEFORE +static void +ReorderBufferMaybeChangeNoMaxHeap(ReorderBuffer *rb) +{ + if (rb->memtrack_state == REORDER_BUFFER_MEM_TRACK_NO_MAXHEAP) + return; + ... + if (binaryheap_size(rb->txn_heap) < REORDER_BUFFER_MEM_TRACK_THRESHOLD * 0.9) + { + rb->memtrack_state = REORDER_BUFFER_MEM_TRACK_NO_MAXHEAP; + binaryheap_reset(rb->txn_heap); + } +} AFTER +static void +ReorderBufferMaybeChangeNoMaxHeap(ReorderBuffer *rb) +{ + if (binaryheap_empty(rb->txn_heap)) + return; + ... + if (binaryheap_size(rb->txn_heap) < REORDER_BUFFER_MEM_TRACK_THRESHOLD * 0.9) + binaryheap_reset(rb->txn_heap); +} ~~~ 12. struct ReorderBuffer + /* Max-heap for sizes of all top-level and sub transactions */ + ReorderBufferMemTrackState memtrack_state; + binaryheap *txn_heap; + 12a. Why is this being referred to in the commit message and code comments as "max-heap" when the field is not called by that same name? Won't it be better to give the field a better name -- e.g. "txn_maxheap" or similar? ~ 12b. This comment should also say that the heap is ordered by tx size -- (e.g. the comparator is ReorderBufferTXNSizeCompare) -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve eviction algorithm in ReorderBuffer
On Thu, Mar 7, 2024 at 2:16 PM Masahiko Sawada wrote: > > On Tue, Mar 5, 2024 at 3:28 PM Peter Smith wrote: > > > > 4a. > > The comment in simplehash.h says > > * The following parameters are only relevant when SH_DEFINE is defined: > > * - SH_KEY - ... > > * - SH_EQUAL(table, a, b) - ... > > * - SH_HASH_KEY(table, key) - ... > > * - SH_STORE_HASH - ... > > * - SH_GET_HASH(tb, a) - ... > > > > So maybe it is nicer to reorder the #defines in that same order? > > > > SUGGESTION: > > +#define SH_PREFIX bh_nodeidx > > +#define SH_ELEMENT_TYPE bh_nodeidx_entry > > +#define SH_KEY_TYPE bh_node_type > > +#define SH_SCOPE extern > > +#ifdef FRONTEND > > +#define SH_RAW_ALLOCATOR pg_malloc0 > > +#endif > > > > +#define SH_DEFINE > > +#define SH_KEY key > > +#define SH_EQUAL(tb, a, b) (memcmp(, , sizeof(bh_node_type)) == 0) > > +#define SH_HASH_KEY(tb, key) \ > > + hash_bytes((const unsigned char *) , sizeof(bh_node_type)) > > +#include "lib/simplehash.h" > > I'm really not sure it helps increase readability. For instance, for > me it's readable if SH_DEFINE and SH_DECLARE come to the last before > #include since it's more obvious whether we want to declare, define or > both. Other simplehash.h users also do so. > OK. > > 5. > > + * > > + * If 'indexed' is true, we create a hash table to track of each node's > > + * index in the heap, enabling to perform some operations such as removing > > + * the node from the heap. > > */ > > binaryheap * > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, void *arg) > > +binaryheap_allocate(int capacity, binaryheap_comparator compare, > > + bool indexed, void *arg) > > > > BEFORE > > ... enabling to perform some operations such as removing the node from the > > heap. > > > > SUGGESTION > > ... to help make operations such as removing nodes more efficient. > > > > But these operations literally require the indexed binary heap as we > have an assertion: > > void > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d) > { > bh_nodeidx_entry *ent; > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > Assert(heap->bh_indexed); > I didn’t quite understand -- the operations mentioned are "operations such as removing the node", but binaryheap_remove_node() also removes a node from the heap. So I still felt the comment wording of the patch is not quite correct. Now, if the removal of a node from an indexed heap can *only* be done using binaryheap_remove_node_ptr() then: - the other removal functions (binaryheap_remove_*) probably need some comments to make sure nobody is tempted to call them directly for an indexed heap. - maybe some refactoring and assertions are needed to ensure those *cannot* be called directly for an indexed heap. > > 7b. > > IMO the parameters would be better the other way around (e.g. 'index' > > before the 'node') because that's what the assignments look like: > > > > > > heap->bh_nodes[heap->bh_size] = d; > > > > becomes: > > bh_set_node(heap, heap->bh_size, d); > > > > I think it assumes heap->bh_nodes is an array. But if we change it in > the future, it will no longer make sense. I think it would make more > sense if we define the parameters in an order like "we set the 'node' > at 'index'". What do you think? YMMV. The patch code is also OK by me if you prefer it. // And, here are some review comments for v8-0002. == 1. delete_nodeidx +/* + * Remove the node's index from the hash table if the heap is indexed. + */ +static bool +delete_nodeidx(binaryheap *heap, bh_node_type node) +{ + if (!binaryheap_indexed(heap)) + return false; + + return bh_nodeidx_delete(heap->bh_nodeidx, node); +} 1a. In v8 this function was changed to now return bool, so, I think the function comment should explain the meaning of that return value. ~ 1b. I felt the function body is better expressed positively: "If this then do that", instead of "If not this then do nothing otherwise do that" SUGGESTION if (binaryheap_indexed(heap)) return bh_nodeidx_delete(heap->bh_nodeidx, node); return false; ~~~ 2. +static void +replace_node(binaryheap *heap, int index, bh_node_type new_node) +{ + bool found PG_USED_FOR_ASSERTS_ONLY; + + /* Quick return if not necessary to move */ + if (heap->bh_nodes[index] == new_node) + return; + + /* + * Remove overwritten node's index. The overwritten node's position must + * have been tracked, if enabled. + */ + found = delete_nodeidx(heap, heap->bh_nodes[index]);
Re: Synchronizing slots from primary to standby
Here are some review comments for v107-0001 == src/backend/replication/slot.c 1. +/* + * Struct for the configuration of standby_slot_names. + * + * Note: this must be a flat representation that can be held in a single chunk + * of guc_malloc'd memory, so that it can be stored as the "extra" data for the + * standby_slot_names GUC. + */ +typedef struct +{ + int slot_num; + + /* slot_names contains nmembers consecutive nul-terminated C strings */ + char slot_names[FLEXIBLE_ARRAY_MEMBER]; +} StandbySlotConfigData; + 1a. To avoid any ambiguity this 1st field is somehow a slot ID number, I felt a better name would be 'nslotnames' or even just 'n' or 'count', ~ 1b. (fix typo) SUGGESTION for the 2nd field comment slot_names is a chunk of 'n' X consecutive null-terminated C strings ~ 1c. A more explanatory name for this typedef maybe is 'StandbySlotNamesConfigData' ? ~~~ 2. +/* This is parsed and cached configuration for standby_slot_names */ +static StandbySlotConfigData *standby_slot_config; 2a. /This is parsed and cached configuration for .../This is the parsed and cached configuration for .../ ~ 2b. Similar to above -- since this only has name information maybe it is more correct to call it 'standby_slot_names_config'? ~~~ 3. +/* + * A helper function to validate slots specified in GUC standby_slot_names. + * + * The rawname will be parsed, and the parsed result will be saved into + * *elemlist. + */ +static bool +validate_standby_slots(char *rawname, List **elemlist) /and the parsed result/and the result/ ~~~ 4. check_standby_slot_names + /* Need a modifiable copy of string */ + rawname = pstrdup(*newval); /copy of string/copy of the GUC string/ ~~~ 5. +assign_standby_slot_names(const char *newval, void *extra) +{ + /* + * The standby slots may have changed, so we must recompute the oldest + * LSN. + */ + ss_oldest_flush_lsn = InvalidXLogRecPtr; + + standby_slot_config = (StandbySlotConfigData *) extra; +} To avoid leaking don't we need to somewhere take care to free any memory used by a previous value (if any) of this 'standby_slot_config'? ~~~ 6. AcquiredStandbySlot +/* + * Return true if the currently acquired slot is specified in + * standby_slot_names GUC; otherwise, return false. + */ +bool +AcquiredStandbySlot(void) +{ + const char *name; + + /* Return false if there is no value in standby_slot_names */ + if (standby_slot_config == NULL) + return false; + + name = standby_slot_config->slot_names; + for (int i = 0; i < standby_slot_config->slot_num; i++) + { + if (strcmp(name, NameStr(MyReplicationSlot->data.name)) == 0) + return true; + + name += strlen(name) + 1; + } + + return false; +} 6a. Just checking "(standby_slot_config == NULL)" doesn't seem enough to me, because IIUIC it is possible when 'standby_slot_names' has no value then maybe standby_slot_config is not NULL but standby_slot_config->slot_num is 0. ~ 6b. IMO this function would be tidier written such that the MyReplicationSlot->data.name is passed as a parameter. Then you can name the function more naturally like: IsSlotInStandbySlotNames(const char *slot_name) ~ 6c. IMO the body of the function will be tidier if written so there are only 2 returns instead of 3 like SUGGESTION: if (...) { for (...) { ... return true; } } return false; ~~~ 7. + /* + * Don't need to wait for the standbys to catch up if there is no value in + * standby_slot_names. + */ + if (standby_slot_config == NULL) + return true; (similar to a previous review comment) This check doesn't seem enough because IIUIC it is possible when 'standby_slot_names' has no value then maybe standby_slot_config is not NULL but standby_slot_config->slot_num is 0. ~~~ 8. WaitForStandbyConfirmation + /* + * Don't need to wait for the standby to catch up if the current acquired + * slot is not a logical failover slot, or there is no value in + * standby_slot_names. + */ + if (!MyReplicationSlot->data.failover || !standby_slot_config) + return; (similar to a previous review comment) IIUIC it is possible that when 'standby_slot_names' has no value, then standby_slot_config is not NULL but standby_slot_config->slot_num is 0. So shouldn't that be checked too? Perhaps it is convenient to encapsulate this check using some macro: #define StandbySlotNamesHasNoValue() (standby_slot_config = NULL || standby_slot_config->slot_num == 0) -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve eviction algorithm in ReorderBuffer
h_indexed) (void) bh_nodeidx_delete(heap->bh_nodeidx, node); ~~~ 10. +/* + * Replace the node at 'idx' with the given node 'replaced_by'. Also + * update their positions accordingly. + */ +static void +bh_replace_node(binaryheap *heap, int idx, bh_node_type replaced_by) 10a. Would 'node' or 'new_node' or 'replacement' be a better name than 'replaced_by'? ~ 10b. I noticed that the index param is called 'idx' here but in other functions, it is called 'index'. I think either is good (I prefer 'idx') but at least everywhere should use the same name for consistency. ~~~ 11. +static void +bh_replace_node(binaryheap *heap, int idx, bh_node_type replaced_by) +{ + /* Remove overwritten node's index */ + bh_delete_nodeidx(heap, heap->bh_nodes[idx]); + + /* Replace it with the given new node */ + if (idx < heap->bh_size) + { + bool found PG_USED_FOR_ASSERTS_ONLY; + + found = bh_set_node(heap, replaced_by, idx); + + /* The overwritten node's index must already be tracked */ + Assert(!heap->bh_indexed || found); + } +} I did not understand the condition. e.g. Can you explain when is idx NOT less than heap->bh_size? e.g. If this condition failed then nothing gets replaced (??) ~~~ == src/include/lib/binaryheap.h 12. +/* + * Struct for A hash table element to store the node's index in the bh_nodes + * array. + */ +typedef struct bh_nodeidx_entry /for A hash table/for a hash table/ ~~~ 13. +/* define parameters necessary to generate the hash table interface */ Suggest uppercase "Define" and add a period. ~~~ 14. + + /* + * If bh_indexed is true, the bh_nodeidx is used to track of each node's + * index in bh_nodes. This enables the caller to perform + * binaryheap_remove_node_ptr(), binaryheap_update_up/down in O(log n). + */ + bool bh_indexed; + bh_nodeidx_hash *bh_nodeidx; } binaryheap; I'm wondering why the separate 'bh_indexed' is necessary at all. Can't you just use the bh_nodeidx value? E.g. If bh_nodeidx == NULL then it means there is no index tracking, otherwise there is. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve eviction algorithm in ReorderBuffer
Hi, Here are some review comments for v7-0001 1. /* * binaryheap_free * * Releases memory used by the given binaryheap. */ void binaryheap_free(binaryheap *heap) { pfree(heap); } Shouldn't the above function (not modified by the patch) also firstly free the memory allocated for the heap->bh_nodes? ~~~ 2. +/* + * Make sure there is enough space for nodes. + */ +static void +bh_enlarge_node_array(binaryheap *heap) +{ + heap->bh_space *= 2; + heap->bh_nodes = repalloc(heap->bh_nodes, + sizeof(bh_node_type) * heap->bh_space); +} Strictly speaking, this function doesn't really "Make sure" of anything because the caller does the check whether we need more space. All that happens here is allocating more space. Maybe this function comment should say something like "Double the space allocated for nodes." -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Here are some review comments for v105-0001 == doc/src/sgml/config.sgml 1. + +The standbys corresponding to the physical replication slots in +standby_slot_names must configure +sync_replication_slots = true so they can receive +logical failover slots changes from the primary. + /slots changes/slot changes/ == doc/src/sgml/func.sgml 2. +The function may be waiting if the specified slot is a logical failover +slot and standby_slot_names +is configured. I know this has been through multiple versions already, but this latest wording "may be waiting..." doesn't seem very good to me. How about one of these? * The function may not be able to return immediately if the specified slot is a logical failover slot and standby_slot_names is configured. * The function return might be blocked if the specified slot is a logical failover slot and standby_slot_names is configured. * If the specified slot is a logical failover slot then the function will block until all physical slots specified in standby_slot_names have confirmed WAL receipt. * If the specified slot is a logical failover slot then the function will not return until all physical slots specified in standby_slot_names have confirmed WAL receipt. ~~~ 3. +slot may return to an earlier position. The function may be waiting if +the specified slot is a logical failover slot and +standby_slot_names Same as previous review comment #2 == src/backend/replication/slot.c 4. WaitForStandbyConfirmation + * Used by logical decoding SQL functions that acquired logical failover slot. IIUC it doesn't work like that. pg_logical_slot_get_changes_guts() calls here unconditionally (i.e. the SQL functions don't even check if they are failover slots before calling this) so the comment seems misleading/redundant. == src/backend/replication/walsender.c 5. NeedToWaitForWal + /* + * Check if the standby slots have caught up to the flushed position. It + * is good to wait up to the flushed position and then let the WalSender + * send the changes to logical subscribers one by one which are already + * covered by the flushed position without needing to wait on every change + * for standby confirmation. + */ + if (NeedToWaitForStandbys(flushed_lsn, wait_event)) + return true; + + *wait_event = 0; + return false; +} + 5a. The comment (or part of it?) seems misplaced because it is talking WalSender sending changes, but that is not happening in this function. Also, isn't what this is saying already described by the other comment in the caller? e.g.: + /* + * Within the loop, we wait for the necessary WALs to be flushed to + * disk first, followed by waiting for standbys to catch up if there + * are enough WALs or upon receiving the shutdown signal. To avoid the + * scenario where standbys need to catch up to a newer WAL location in + * each iteration, we update our idea of the currently flushed + * position only if we are not waiting for standbys to catch up. + */ ~ 5b. Most of the code is unnecessary. AFAICT all this is exactly same as just 1 line: return NeedToWaitForStandbys(flushed_lsn, wait_event); ~~~ 6. WalSndWaitForWal + /* + * Within the loop, we wait for the necessary WALs to be flushed to + * disk first, followed by waiting for standbys to catch up if there + * are enough WALs or upon receiving the shutdown signal. To avoid the + * scenario where standbys need to catch up to a newer WAL location in + * each iteration, we update our idea of the currently flushed + * position only if we are not waiting for standbys to catch up. + */ Regarding that 1st sentence: maybe this logic used to be done explicitly "within the loop" but IIUC this logic is now hidden inside NeedToWaitForWal() so the comment should mention that. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Mon, Mar 4, 2024 at 2:38 PM Amit Kapila wrote: > > On Mon, Mar 4, 2024 at 6:57 AM Peter Smith wrote: > > > > On Sun, Mar 3, 2024 at 2:56 PM Amit Kapila wrote: > > > > > > On Sun, Mar 3, 2024 at 5:17 AM Peter Smith wrote: > > > > > > ... > > > > 9. NeedToWaitForWal > > > > > > > > + /* > > > > + * Check if the standby slots have caught up to the flushed position. > > > > It > > > > + * is good to wait up to flushed position and then let it send the > > > > changes > > > > + * to logical subscribers one by one which are already covered in > > > > flushed > > > > + * position without needing to wait on every change for standby > > > > + * confirmation. Note that after receiving the shutdown signal, an > > > > ERROR > > > > + * is reported if any slots are dropped, invalidated, or inactive. This > > > > + * measure is taken to prevent the walsender from waiting indefinitely. > > > > + */ > > > > + if (NeedToWaitForStandby(target_lsn, flushed_lsn, wait_event)) > > > > + return true; > > > > > > > > I felt it was confusing things for this function to also call to the > > > > other one -- it seems an overlapping/muddling of the purpose of these. > > > > I think it will be easier to understand if the calling code just calls > > > > to one or both of these functions as required. > > > > > > > > > > I felt otherwise because the caller has to call these functions at > > > more than one place which makes the caller's code difficult to follow. > > > It is better to encapsulate the computation of wait_event. > > > > > > > You may have misinterpreted my review comment because I didn't say > > anything about changing the encapsulation of the computation of the > > wait_event. > > > > No, I have understood it in the same way as you have outlined in this > email and liked the way the current patch has it. > OK, if the code will remain as-is wouldn't it be better to anyway change the function name to indicate what it really does? e.g. NeedToWaitForWal --> NeedToWaitForWalFlushOrStandbys -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Mon, Mar 4, 2024 at 2:49 PM Amit Kapila wrote: > > On Mon, Mar 4, 2024 at 7:25 AM Peter Smith wrote: > > > > > > == > > doc/src/sgml/config.sgml > > > > 2. > > +pg_logical_slot_peek_changes, > > +when used with failover enabled logical slots, will block until all > > +physical slots specified in standby_slot_names > > have > > +confirmed WAL receipt. > > > > /failover enabled logical slots/failover-enabled logical slots/ > > > > How about just saying logical failover slots at this and other places? > Yes, that wording works too. -- Kind Regards, Peter Smith. Fujitsu Australia.
Re: Synchronizing slots from primary to standby
On Sun, Mar 3, 2024 at 6:51 PM Zhijie Hou (Fujitsu) wrote: > > On Sunday, March 3, 2024 7:47 AM Peter Smith wrote: > > > > 3. > > + > > + > > + Value * is not accepted as it is inappropriate > > to > > + block logical replication for physical slots that either lack > > + associated standbys or have standbys associated that are not > > enabled > > + for replication slot synchronization. (see > > + > linkend="logicaldecoding-replication-slots-synchronization"/>). > > + > > + > > > > Why does the document need to provide an excuse/reason for the rule? > > You could just say something like: > > > > SUGGESTION > > The slots must be named explicitly. For example, specifying wildcard values > > like > > * is not permitted. > > As suggested by Amit, I moved this to code comments. Was the total removal of this note deliberate? I only suggested removing the *reason* for the rule, not the entire rule. Otherwise, the user won't know to avoid doing this until they try it and get an error. > > > > 9. NeedToWaitForWal > > > > + /* > > + * Check if the standby slots have caught up to the flushed position. > > + It > > + * is good to wait up to flushed position and then let it send the > > + changes > > + * to logical subscribers one by one which are already covered in > > + flushed > > + * position without needing to wait on every change for standby > > + * confirmation. Note that after receiving the shutdown signal, an > > + ERROR > > + * is reported if any slots are dropped, invalidated, or inactive. This > > + * measure is taken to prevent the walsender from waiting indefinitely. > > + */ > > + if (NeedToWaitForStandby(target_lsn, flushed_lsn, wait_event)) return > > + true; > > > > I felt it was confusing things for this function to also call to the other > > one -- it > > seems an overlapping/muddling of the purpose of these. > > I think it will be easier to understand if the calling code just calls to > > one or both > > of these functions as required. > > Same as Amit, I didn't change this. AFAICT my previous review comment was misinterpreted. Please see [1] for more details. Here are some more review comments for v104-1 == Commit message 1. Additionally, The SQL functions pg_logical_slot_get_changes, pg_logical_slot_peek_changes and pg_replication_slot_advance are modified to wait for the replication slots specified in 'standby_slot_names' to catch up before returning. ~ Maybe that should be expressed using similar wording as the docs... SUGGESTION Additionally, The SQL functions ... are modified. Now, when used with failover-enabled logical slots, these functions will block until all physical slots specified in 'standby_slot_names' have confirmed WAL receipt. == doc/src/sgml/config.sgml 2. +pg_logical_slot_peek_changes, +when used with failover enabled logical slots, will block until all +physical slots specified in standby_slot_names have +confirmed WAL receipt. /failover enabled logical slots/failover-enabled logical slots/ == doc/src/sgml/func.sgml 3. +The function may be blocked if the specified slot is a failover enabled +slot and standby_slot_names +is configured. /a failover enabled slot//a failover-enabled slot/ ~~~ 4. +slot may return to an earlier position. The function may be blocked if +the specified slot is a failover enabled slot and +standby_slot_names +is configured. /a failover enabled slot//a failover-enabled slot/ == src/backend/replication/slot.c 5. +/* + * Wait for physical standbys to confirm receiving the given lsn. + * + * Used by logical decoding SQL functions that acquired failover enabled slot. + * It waits for physical standbys corresponding to the physical slots specified + * in the standby_slot_names GUC. + */ +void +WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) /failover enabled slot/failover-enabled slot/ ~~~ 6. + /* + * Don't need to wait for the standby to catch up if the current acquired + * slot is not a failover enabled slot, or there is no value in + * standby_slot_names. + */ /failover enabled slot/failover-enabled slot/ == src/backend/replication/slotfuncs.c 7. + + /* + * Wake up logical walsenders holding failover enabled slots after + * updating the restart_lsn of the physical slot. + */ + PhysicalWakeupLogicalWalSnd(); /failover enabled slots/failover-enabled slots/ == src/backend/replication/walsender.c 8. +/* + * Wake up the logical walsender processes with failover enabled slots if the + * currently acquired physical s
Re: Synchronizing slots from primary to standby
On Sun, Mar 3, 2024 at 2:56 PM Amit Kapila wrote: > > On Sun, Mar 3, 2024 at 5:17 AM Peter Smith wrote: > > ... > > 9. NeedToWaitForWal > > > > + /* > > + * Check if the standby slots have caught up to the flushed position. It > > + * is good to wait up to flushed position and then let it send the changes > > + * to logical subscribers one by one which are already covered in flushed > > + * position without needing to wait on every change for standby > > + * confirmation. Note that after receiving the shutdown signal, an ERROR > > + * is reported if any slots are dropped, invalidated, or inactive. This > > + * measure is taken to prevent the walsender from waiting indefinitely. > > + */ > > + if (NeedToWaitForStandby(target_lsn, flushed_lsn, wait_event)) > > + return true; > > > > I felt it was confusing things for this function to also call to the > > other one -- it seems an overlapping/muddling of the purpose of these. > > I think it will be easier to understand if the calling code just calls > > to one or both of these functions as required. > > > > I felt otherwise because the caller has to call these functions at > more than one place which makes the caller's code difficult to follow. > It is better to encapsulate the computation of wait_event. > You may have misinterpreted my review comment because I didn't say anything about changing the encapsulation of the computation of the wait_event. I only wrote it is better IMO for the functions to stick to just one job each according to their function name. E.g.: - NeedToWaitForStandby should *only* check if we need to wait for standbys - NeedToWaitForWal should *only* check if we need to wait for WAL flush; i.e. this shouldn't be also calling NeedToWaitForStandby(). Also, AFAICT the caller changes should not be difficult. Indeed, these changes will make the code aligned properly with what the comments are saying: BEFORE /* * Fast path to avoid acquiring the spinlock in case we already know we * have enough WAL available and all the standby servers have confirmed * receipt of WAL up to RecentFlushPtr. This is particularly interesting * if we're far behind. */ if (!XLogRecPtrIsInvalid(RecentFlushPtr) && !NeedToWaitForWal(loc, RecentFlushPtr, _event)) return RecentFlushPtr; SUGGESTED ... if (!XLogRecPtrIsInvalid(RecentFlushPtr) && !NeedToWaitForWal(loc, RecentFlushPtr, _event) && !NeedToWaitForStandby(loc, RecentFlushPtr, _event) return RecentFlushPtr; ~~~ BEFORE /* * Exit the loop if already caught up and doesn't need to wait for * standby slots. */ if (!wait_for_standby_at_stop && !NeedToWaitForWal(loc, RecentFlushPtr, _event)) break; SUGGESTED ... if (!wait_for_standby_at_stop && !NeedToWaitForWal(loc, RecentFlushPtr, _event) && !NeedToWaitForStandby(loc, RecentFlushPtr, _event)) break; -- Kind Regards, Peter Smith. Fujitsu Australia
Re: pub/sub - specifying optional parameters without values.
On Fri, Jan 12, 2024 at 4:07 PM Peter Smith wrote: > > On Mon, Jan 30, 2023 at 8:36 AM Tom Lane wrote: > > > > Zheng Li writes: > > > The behavior is due to the following code > > > https://github.com/postgres/postgres/blob/master/src/backend/commands/define.c#L113 > > > > Yeah, so you can grep for places that have this behavior by looking > > for defGetBoolean calls ... and there are quite a few. That leads > > me to the conclusion that we'd better invent a fairly stylized > > documentation solution that we can plug into a lot of places, > > rather than thinking of slightly different ways to say it and > > places to say it. I'm not necessarily opposed to Peter's desire > > to fix replication-related commands first, but we have more to do > > later. > > > > I'm also not that thrilled with putting the addition up at the top > > of the relevant text. This behavior is at least two decades old, > > so if we've escaped documenting it at all up to now, it can't be > > that important to most people. > > > > I also notice that ALTER SUBSCRIPTION has fully three different > > sub-sections with about equal claims on this note, if we're going > > to stick it directly into the affected option lists. > > > > That all leads me to propose that we add the new text at the end of > > the Parameters in the affected man pages. So about > > like the attached. (I left out alter_publication.sgml, as I'm not > > sure it needs its own copy of this text --- it doesn't describe > > individual parameters at all, just refer to CREATE PUBLICATION.) > > > > regards, tom lane > > > > Hi, > > Here is a similar update for another page: "55.4 Streaming Replication > Protocol" [0]. This patch was prompted by a review comment reply at > [1] (#2). > > I've used text almost the same as the boilerplate text added by the > previous commit [2] > > ~ > > PSA patch v4. > > == > [0] https://www.postgresql.org/docs/devel/protocol-replication.html > [1] > https://www.postgresql.org/message-id/OS0PR01MB571663BCE8B28597D462FADE946A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com > [2] > https://github.com/postgres/postgres/commit/ec7e053a98f39a9e3c7e6d35f0d2e83933882399 > > Kind Regards, > Peter Smith. > Fujitsu Australia Bump. Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Sat, Mar 2, 2024 at 2:51 PM Zhijie Hou (Fujitsu) wrote: > > On Friday, March 1, 2024 12:23 PM Peter Smith wrote: > > ... > > == > > src/backend/replication/slot.c > > > > 2. validate_standby_slots > > > > + else if (!ReplicationSlotCtl) > > + { > > + /* > > + * We cannot validate the replication slot if the replication slots' > > + * data has not been initialized. This is ok as we will validate the > > + * specified slot when waiting for them to catch up. See > > + * StandbySlotsHaveCaughtup for details. > > + */ > > + } > > + else > > + { > > + /* > > + * If the replication slots' data have been initialized, verify if the > > + * specified slots exist and are logical slots. > > + */ > > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > > > > IMO that 2nd comment does not need to say "If the replication slots' > > data have been initialized," because that is implicit from the if/else. > > Removed. I only meant to suggest removing the 1st part, not the entire comment. I thought it is still useful to have a comment like: /* Check that the specified slots exist and are logical slots.*/ == And, here are some review comments for v103-0001. == Commit message 1. Additionally, The SQL functions pg_logical_slot_get_changes, pg_logical_slot_peek_changes and pg_replication_slot_advance are modified to wait for the replication slots mentioned in 'standby_slot_names' to catch up before returning. ~ (use the same wording as previous in this message) /mentioned in/specified in/ == doc/src/sgml/config.sgml 2. +Additionally, when using the replication management functions + +pg_replication_slot_advance, + +pg_logical_slot_get_changes, and + +pg_logical_slot_peek_changes, +with failover enabled logical slots, the functions will wait for the +physical slots specified in standby_slot_names to +confirm WAL receipt before proceeding. + It says "the ... functions" twice so maybe reword it slightly. BEFORE Additionally, when using the replication management functions pg_replication_slot_advance, pg_logical_slot_get_changes, and pg_logical_slot_peek_changes, with failover enabled logical slots, the functions will wait for the physical slots specified in standby_slot_names to confirm WAL receipt before proceeding. SUGGESTION Additionally, the replication management functions pg_replication_slot_advance, pg_logical_slot_get_changes, and pg_logical_slot_peek_changes, when used with failover enabled logical slots, will wait for the physical slots specified in standby_slot_names to confirm WAL receipt before proceeding. (Actually the "will wait ... before proceeding" is also a bit tricky, so below is another possible rewording) SUGGESTION #2 Additionally, the replication management functions pg_replication_slot_advance, pg_logical_slot_get_changes, and pg_logical_slot_peek_changes, when used with failover enabled logical slots, will block until all physical slots specified in standby_slot_names have confirmed WAL receipt. ~~~ 3. + + + Value * is not accepted as it is inappropriate to + block logical replication for physical slots that either lack + associated standbys or have standbys associated that are not enabled + for replication slot synchronization. (see + ). + + Why does the document need to provide an excuse/reason for the rule? You could just say something like: SUGGESTION The slots must be named explicitly. For example, specifying wildcard values like * is not permitted. == doc/src/sgml/func.sgml 4. @@ -28150,7 +28150,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset - + pg_logical_slot_get_changes @@ -28177,7 +28177,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset - + pg_logical_slot_peek_changes @@ -28232,7 +28232,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset - + pg_replication_slot_advance Should these 3 functions say something about how their behaviour is affected by 'standby_slot_names' and give a link back to the GUC 'standby_slot_names' docs? == src/backend/replication/slot.c 5. StandbySlotsHaveCaughtup + if (!slot) + { + /* + * If the provided slot does not exist, report a message and exit + * the loop. It is possible for a user to specify a slot name in + * standby_slot_names that does not exist just before the server + * startup. The GUC check_hook(validate_standby_slots) cannot + * validate such a slot during startup
Re: Synchronizing slots from primary to standby
On Fri, Mar 1, 2024 at 5:11 PM Masahiko Sawada wrote: > ... > +/* > + * "*" is not accepted as in that case primary will not be able to > know > + * for which all standbys to wait for. Even if we have physical slots > + * info, there is no way to confirm whether there is any standby > + * configured for the known physical slots. > + */ > +if (strcmp(*newval, "*") == 0) > +{ > +GUC_check_errdetail("\"*\" is not accepted for > standby_slot_names"); > +return false; > +} > > Why only '*' is checked aside from validate_standby_slots()? I think > that the doc doesn't mention anything about '*' and '*' cannot be used > as a replication slot name. So even if we don't have this check, it > might be no problem. > Hi, a while ago I asked this same question. See [1 #28] for the response.. -- [1] https://www.postgresql.org/message-id/OS0PR01MB571646B8186F6A06404BD50194BDA%40OS0PR01MB5716.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
the shutdown signal + * is received. + * + * Set failover_slot to true if the current acquired slot is a failover enabled + * slot and we are streaming. + * + * If returning true, the function sets the appropriate wait event in + * wait_event; otherwise, wait_event is set to 0. + */ +static bool +NeedToWaitForWal(XLogRecPtr target_lsn, XLogRecPtr flushed_lsn, + bool prioritize_stop, bool failover_slot, + uint32 *wait_event) 8a. /Set prioritize_stop to true/Specify prioritize_stop=true/ /Set failover_slot to true/Specify failover_slot=true/ ~ 8b. Aren't the static function names typically snake_case? ~~~ 9. + /* + * Check if we need to wait for WALs to be flushed to disk. We don't need + * to wait for WALs after receiving the shutdown signal unless the + * wait_for_wal_on_stop is true. + */ + if (target_lsn > flushed_lsn && !(prioritize_stop && got_STOPPING)) + *wait_event = WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL; The comment says 'wait_for_wal_on_stop' but the code says 'prioritize_stop' (??) ~~~ 10. + /* + * Check if we need to wait for WALs to be flushed to disk. We don't need + * to wait for WALs after receiving the shutdown signal unless the + * wait_for_wal_on_stop is true. + */ + if (target_lsn > flushed_lsn && !(prioritize_stop && got_STOPPING)) + *wait_event = WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL; + + /* + * Check if the standby slots have caught up to the flushed position. It is + * good to wait up to RecentFlushPtr and then let it send the changes to + * logical subscribers one by one which are already covered in + * RecentFlushPtr without needing to wait on every change for standby + * confirmation. Note that after receiving the shutdown signal, an ERROR is + * reported if any slots are dropped, invalidated, or inactive. This + * measure is taken to prevent the walsender from waiting indefinitely. + */ + else if (failover_slot && !StandbySlotsHaveCaughtup(flushed_lsn, elevel)) + *wait_event = WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION; + else + return false; + + return true; This if/else/else seems overly difficult to read. IMO it will be easier if written like: SUGGESTION if (target_lsn > flushed_lsn && !(prioritize_stop && got_STOPPING)) { *wait_event = WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL; return true; } if (failover_slot && !StandbySlotsHaveCaughtup(flushed_lsn, elevel)) { *wait_event = WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION; return true; } return false; -- Kind Regards, Peter Smith. Fujitsu Australia
DOCS: Avoid using abbreviation "aka"
While checking some recently pushed changes [1] I noticed documentation [2] that includes the abbreviation "aka". IMO it is preferable to avoid informal abbreviations like "aka" in the documents, because not everyone will understand the meaning. Furthermore, I think this is reinforced by the fact this was the *only* example of "aka" that I could find in all of the .sgml. Indeed, assuming that "aka" is short for "also known as" then the sentence still doesn't seem correct even after those words are substituted. HEAD For the synchronization to work, it is mandatory to have a physical replication slot between the primary and the standby aka primary_slot_name should be configured on the standby, and hot_standby_feedback must be enabled on the standby. SUGGESTION For the synchronization to work, it is mandatory to have a physical replication slot between the primary and the standby (i.e., primary_slot_name should be configured on the standby), and hot_standby_feedback must be enabled on the standby. ~ I found that the "aka" was introduced in v86-0001 [3]. So my replacement text above restores to something similar to how it was in v85-0001. PSA a patch for the same. -- [1] https://github.com/postgres/postgres/commit/ddd5f4f54a026db6a6692876d0d44aef902ab686#diff-29c2d2e0480177b04f9c3d82c1454f8c00a11b8e761a9c9f5f4f6d61e6f19252 [2] https://www.postgresql.org/docs/devel/logicaldecoding-explanation.html#LOGICALDECODING-REPLICATION-SLOTS-SYNCHRONIZATION [3] [1] https://www.postgresql.org/message-id/OS0PR01MB5716E581B4227DDEB4DE6C30944F2%40OS0PR01MB5716.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Replace-aka-in-docs.patch Description: Binary data
Re: Synchronizing slots from primary to standby
On Tue, Feb 27, 2024 at 11:35 PM Amit Kapila wrote: > > On Tue, Feb 27, 2024 at 12:48 PM Peter Smith wrote: > > > > Here are some review comments for v99-0001 > > > > == > > 0. GENERAL. > > > > +#standby_slot_names = '' # streaming replication standby server slot names > > that > > + # logical walsender processes will wait for > > > > IMO the GUC name is too generic. There is nothing in this name to > > suggest it has anything to do with logical slot synchronization; that > > meaning is only found in the accompanying comment -- it would be > > better if the GUC name itself were more self-explanatory. > > > > e.g. Maybe like 'wal_sender_sync_standby_slot_names' or > > 'wal_sender_standby_slot_names', 'wal_sender_wait_for_standby_slots', > > or ... > > > > It would be wrong and or misleading to append wal_sender to this GUC > name as this is used during SQL APIs as well. Fair enough, but the fact that some SQL functions might wait is also not mentioned in the config file comment, nor in the documentation for GUC 'standby_slot_names'. Seems like a docs omission? > Also, adding wait sounds > more like a boolean. So, I don't see the proposed names any better > than the current one. > Anyway, the point is that the current GUC name 'standby_slot_names' is not ideal IMO because it doesn't have enough meaning by itself -- e.g. you have to read the accompanying comment or documentation to have any idea of its purpose. My suggested GUC names were mostly just to get people thinking about it. Maybe others can come up with better names. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Wed, Feb 28, 2024 at 1:23 PM Zhijie Hou (Fujitsu) wrote: > > On Tuesday, February 27, 2024 3:18 PM Peter Smith > wrote: ... > > 20. > > +# > > +# | > standby1 (primary_slot_name = sb1_slot) # | > standby2 > > +(primary_slot_name = sb2_slot) # primary - | # | > subscriber1 > > +(failover = true) # | > subscriber2 (failover = false) > > > > In the diagram, the "--->" means a mixture of physical standbys and logical > > pub/sub replication. Maybe it can be a bit clearer? > > > > SUGGESTION > > > > # primary (publisher) > > # > > # (physical standbys) > > # | > standby1 (primary_slot_name = sb1_slot) > > # | > standby2 (primary_slot_name = sb2_slot) > > # > > # (logical replication) > > # | > subscriber1 (failover = true, slot_name = lsub1_slot) > > # | > subscriber2 (failover = false, slot_name = lsub2_slot) > > > > I think one can distinguish it based on the 'standby' and 'subscriber' as > well, because > 'standby' normally refer to physical standby while the other refer to logical. > Ok, but shouldn't it at least include info about the logical slot names associated with the subscribers (slot_name = lsub1_slot, slot_name = lsub2_slot) like suggested above? == Here are some more review comments for v100-0001 == doc/src/sgml/config.sgml 1. + +Lists the streaming replication standby server slot names that logical +WAL sender processes will wait for. Logical WAL sender processes will +send decoded changes to plugins only after the specified replication +slots confirm receiving WAL. This guarantees that logical replication +slots with failover enabled do not consume changes until those changes +are received and flushed to corresponding physical standbys. If a +logical replication connection is meant to switch to a physical standby +after the standby is promoted, the physical replication slot for the +standby should be listed here. Note that logical replication will not +proceed if the slots specified in the standby_slot_names do not exist or +are invalidated. + Is that note ("Note that logical replication will not proceed if the slots specified in the standby_slot_names do not exist or are invalidated") meant only for subscriptions marked for 'failover' or any subscription? Maybe wording can be modified to help clarify it? == src/backend/replication/slot.c 2. +/* + * A helper function to validate slots specified in GUC standby_slot_names. + */ +static bool +validate_standby_slots(char **newval) +{ + char*rawname; + List*elemlist; + bool ok; + + /* Need a modifiable copy of string */ + rawname = pstrdup(*newval); + + /* Verify syntax and parse string into a list of identifiers */ + ok = SplitIdentifierString(rawname, ',', ); + + if (!ok) + { + GUC_check_errdetail("List syntax is invalid."); + } + + /* + * If the replication slots' data have been initialized, verify if the + * specified slots exist and are logical slots. + */ + else if (ReplicationSlotCtl) + { + foreach_ptr(char, name, elemlist) + { + ReplicationSlot *slot; + + slot = SearchNamedReplicationSlot(name, true); + + if (!slot) + { + GUC_check_errdetail("replication slot \"%s\" does not exist", + name); + ok = false; + break; + } + + if (!SlotIsPhysical(slot)) + { + GUC_check_errdetail("\"%s\" is not a physical replication slot", + name); + ok = false; + break; + } + } + } + + pfree(rawname); + list_free(elemlist); + return ok; +} 2a. I didn't mention this previously because I thought this function was not going to change anymore, but since Bertrand suggested some changes [1], I will say IMO the { } are fine here for the single statement, but I think it will be better to rearrange this code to be like below. Having a 2nd NOP 'else' gives a much better place where you can put your ReplicationSlotCtl comment. if (!ok) { GUC_check_errdetail("List syntax is invalid."); } else if (!ReplicationSlotCtl) { } else { foreach_ptr ... } ~ 2b. In any case even if don't refactor anything I still think you need to extend that comment to explain how skipping ReplicationSlotCtl is NULL is only OK because there will be some later checking also done in the FilterStandbySlots() function to catch any config problems. -- [1] https://www.postgresql.org/message-id/Zd8ahZXw82ieFxX/%40ip-10-97-1-34.eu-west-3.compute.internal Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Wed, Feb 28, 2024 at 10:21 PM Amit Kapila wrote: > > On Wed, Feb 28, 2024 at 3:26 PM shveta malik wrote: > > > > > > Here is the patch which addresses the above comments. Also optimized > > the test a little bit. Now we use pg_sync_replication_slots() function > > instead of worker to test the operator-redirection using search-patch. > > This has been done to simplify the test case and reduce the added > > time. > > > > I have slightly adjusted the comments in the attached, otherwise, LGTM. > > -- - if (logical) + /* + * Set always-secure search path for the cases where the connection is + * used to run SQL queries, so malicious users can't get control. + */ + if (logical || !replication) { PGresult *res; I found this condition a bit confusing. According to the libpqrcv_connect function comment: * This function can be used for both replication and regular connections. * If it is a replication connection, it could be either logical or physical * based on input argument 'logical'. IIUC that comment is saying the 'replication' flag is like the main categorization and the 'logical' flag is like a subcategory (for when 'replication' is true). Therefore, won't the modified check be better to be written the other way around? This will also be consistent with the way the Assert was written. SUGGESTION if (!replication || logical) { ... == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
29. +# Stop the standby associated with the specified physical replication slot so +# that the logical replication slot won't receive changes until the standby +# slot's restart_lsn is advanced or the slot is removed from the +# standby_slot_names list. +$primary->safe_psql('postgres', "TRUNCATE tab_int;"); +$primary->wait_for_catchup('regress_mysub1'); +$standby1->stop; Isn't this fragment more like the first step of the *next* TEST instead of the last step of this one? ~~~ 30. +## +# Verify that when using pg_logical_slot_get_changes to consume changes from a +# logical slot with failover enabled, it will also wait for the slots specified +# in standby_slot_names to catch up. +## AFAICT this test is checking only that the function cannot return while waiting for the stopped standby, but it doesn't seem to check that it *does* return when the stopped standby comes alive again. ~~~ 31. +$result = + $subscriber1->safe_psql('postgres', "SELECT count(*) = 0 FROM tab_int;"); +is($result, 't', + "subscriber1 doesn't get data as the sb1_slot doesn't catch up"); Do you think this fragment should have a comment? == Kind Regards, Peter Smith. Fujitsu Australia
Re: Add publisher and subscriber to glossary documentation.
Hi, the patch v4 LGTM. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Add publisher and subscriber to glossary documentation.
Here are some comments for patch v3: 1. + + Publication node + + + A node where a + publication is defined + for logical replication. + + + + I felt the word "node" here should link to the glossary term "Node", instead of directly to the term "Instance". ~~ 2. + + Subscription node + + + A node where a + subscription is defined + for logical replication. + + + + (same comment as above) I felt the word "node" here should link to the glossary term "Node", instead of directly to the term "Instance". ~~ Apart from those links, it looks good to me. Let's see what others think. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Logging parallel worker draught
On Mon, Feb 26, 2024 at 6:13 AM Tomas Vondra wrote: > > 1) name of the GUC > > I find the "log_parallel_worker_draught" to be rather unclear :-( Maybe > it's just me and everyone else just immediately understands what this > does / what will happen after it's set to "on", but I find it rather > non-intuitive. > Also, I don't understand how the word "draught" (aka "draft") makes sense here -- I assume the intended word was "drought" (???). == Kind Regards, Peter Smith. Fujitsu Australia.
Re: About a recently-added message
On Thu, Feb 22, 2024 at 11:36 AM Kyotaro Horiguchi wrote: > > At Wed, 21 Feb 2024 14:57:42 +0530, Amit Kapila > wrote in > > On Tue, Feb 20, 2024 at 3:21 PM shveta malik wrote: > > > > > > okay, attached v2 patch with changed error msgs and double quotes > > > around logical. > > > > > > > Horiguchi-San, does this address all your concerns related to > > translation with these new messages? > > Yes, I'm happy with all of the changes. The proposed patch appears to > cover all instances related to slotsync.c, and it looks fine to > me. Thanks! > > I found that logica.c is also using the policy that I complained > about, but it is a separate issue. > > ./logical.c 122:errmsg("logical decoding requires wal_level >= > logical"))); > Hmm. I have a currently stalled patch-set to simplify the quoting of all the GUC names by using one rule. The consensus is to *always* quote them. See [1]. And those patches already are addressing that logical.c code mentioned above. IMO it would be good if we could try to get that patch set approved to fix everything in one go, instead of ad-hoc hunting for and fixing cases one at a time. == [1] https://www.postgresql.org/message-id/CAHut%2BPsf3NewXbsFKY88Qn1ON1_dMD6343MuWdMiiM2Ds9a_wA%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Add publisher and subscriber to glossary documentation.
Here are some comments for patch v2. == 1. There are whitespace problems [postgres@CentOS7-x64 oss_postgres_misc]$ git apply ../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch ../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:43: trailing whitespace. A node where publication is defined for ../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:45: trailing whitespace. It replicates a set of changes from a table or a group of tables in ../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:66: trailing whitespace. A node where subscription is defined for ../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:67: trailing whitespace. logical replication. ../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:68: trailing whitespace. It subscribes to one or more publications on a publisher node and pulls warning: squelched 2 whitespace errors warning: 7 lines add whitespace errors. ~~~ 2. Publisher node + + Publisher node + + + A node where publication is defined for + logical replication. + It replicates a set of changes from a table or a group of tables in + publication to the subscriber node. + + + For more information, see + . + + + + 2a. I felt the term here should be "Publication node". Indeed, there should also be additional glossary terms like "Publisher" (i.e. it is the same as "Publication node") and "Subscriber" (i.e. it is the same as "Subscription node"). These definitions will then be consistent with node descriptions already in sections 30.1 and 30.2 ~ 2b. "node" should include a link to the glossary term. Same for any other terms mentioned ~ 2c. /A node where publication is defined for/A node where a publication is defined for/ ~ 2d. "It replicates" is misleading because it is the PUBLICATION doing the replicating, not the node. IMO it will be better to include 2 more glossary terms "Publication" and "Subscription" where you can say this kind of information. Then the link "logical-replication-publication" also belongs under the "Publication" term. ~~~ 3. + + Subscriber node + + + A node where subscription is defined for + logical replication. + It subscribes to one or more publications on a publisher node and pulls + a set of changes from a table or a group of tables in publications it + subscribes to. + + + For more information, see + . + + + All comments are similar to those above. == In summary, IMO it should be a bit more like below: SUGGESTION (include all the necessary links etc) Publisher See "Publication node" Publication A publication replicates the changes of one or more tables to a subscription. For more information, see Section 30.1 Publication node A node where a publication is defined for logical replication. Subscriber See "Subscription node" Subscription A subscription receives the changes of one or more tables from the publications it subscribes to. For more information, see Section 30.2 Subscription node A node where a subscription is defined for logical replication. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Documentation to upgrade logical replication cluster
Hi Vignesh, I have no further comments. Patch v9 LGTM. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Add lookup table for replication slot invalidation causes
On Thu, Feb 22, 2024 at 5:56 PM Michael Paquier wrote: > > On Thu, Feb 22, 2024 at 05:30:08PM +1100, Peter Smith wrote: > > Oops. Perhaps I meant more like below -- in any case, the point was > > the same -- to ensure RS_INVAL_NONE is what returns if something > > unexpected happens. > > You are right that this could be a bit confusing, even if we should > never reach this state. How about avoiding to return the index of the > loop as result, as of the attached? Would you find that cleaner? > -- Hi, yes, it should never happen, but thanks for making the changes. I would've just removed every local variable instead of adding more of them. I also felt the iteration starting from RS_INVAL_NONE instead of 0 is asserting RS_INVAL_NONE must always be the first enum and can't be rearranged. Probably it will never happen, but why require it? -- ReplicationSlotInvalidationCause GetSlotInvalidationCause(const char *conflict_reason) { for (ReplicationSlotInvalidationCause cause = 0; cause <= RS_INVAL_MAX_CAUSES; cause++) if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0) return cause; Assert(0); return RS_INVAL_NONE; } -- But maybe those nits are a matter of personal choice. Your patch code addressed my main concern, so it LGTM. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Add lookup table for replication slot invalidation causes
On Thu, Feb 22, 2024 at 5:19 PM Peter Smith wrote: > > Hi, Sorry for the late comment but isn't the pushed logic now > different to what it was there before? > > IIUC previously (in a non-debug build) if the specified > conflict_reason was not found, it returned RS_INVAL_NONE -- now it > seems to return whatever enum happens to be last. > > How about something more like below: > > -- > ReplicationSlotInvalidationCause > GetSlotInvalidationCause(const char *conflict_reason) > { > ReplicationSlotInvalidationCause cause; > boolfound = false; > > for (cause = 0; !found && cause <= RS_INVAL_MAX_CAUSES; cause++) > found = strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0; > > Assert(found); > return found ? cause : RS_INVAL_NONE; > } > -- > Oops. Perhaps I meant more like below -- in any case, the point was the same -- to ensure RS_INVAL_NONE is what returns if something unexpected happens. ReplicationSlotInvalidationCause GetSlotInvalidationCause(const char *conflict_reason) { ReplicationSlotInvalidationCause cause; for (cause = 0; cause <= RS_INVAL_MAX_CAUSES; cause++) { if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0) return cause; } Assert(0); return RS_INVAL_NONE; } -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Add lookup table for replication slot invalidation causes
Hi, Sorry for the late comment but isn't the pushed logic now different to what it was there before? IIUC previously (in a non-debug build) if the specified conflict_reason was not found, it returned RS_INVAL_NONE -- now it seems to return whatever enum happens to be last. How about something more like below: -- ReplicationSlotInvalidationCause GetSlotInvalidationCause(const char *conflict_reason) { ReplicationSlotInvalidationCause cause; boolfound = false; for (cause = 0; !found && cause <= RS_INVAL_MAX_CAUSES; cause++) found = strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0; Assert(found); return found ? cause : RS_INVAL_NONE; } -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Documentation to upgrade logical replication cluster
Here are some minor comments for patch v8-0001. == doc/src/sgml/glossary.sgml 1. + + + A set of publisher and subscriber instances with publisher instance + replicating changes to the subscriber instance. + + /with publisher instance/with the publisher instance/ ~~~ 2. There are 2 SQL fragments but they are wrapped differently (see below). e.g. it is not clear why is the 2nd fragment wrapped since it is shorter than the 1st. OTOH, maybe you want the 1st fragment to wrap. Either way, consistency wrapping would be better. postgres=# SELECT slot_name FROM pg_replication_slots WHERE slot_type = 'logical' AND conflict_reason IS NOT NULL; slot_name --- (0 rows) versus SELECT count(*) FROM pg_replication_slots WHERE slot_type = 'logical' AND temporary IS false; == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
FYI -- I checked patch v81-0001 to find which of the #includes are strictly needed. == src/backend/replication/logical/slotsync.c 1. +#include "postgres.h" + +#include + +#include "access/genam.h" +#include "access/table.h" +#include "access/xlog_internal.h" +#include "access/xlogrecovery.h" +#include "catalog/pg_database.h" +#include "commands/dbcommands.h" +#include "libpq/pqsignal.h" +#include "pgstat.h" +#include "postmaster/bgworker.h" +#include "postmaster/fork_process.h" +#include "postmaster/interrupt.h" +#include "postmaster/postmaster.h" +#include "replication/logical.h" +#include "replication/logicallauncher.h" +#include "replication/walreceiver.h" +#include "replication/slotsync.h" +#include "storage/ipc.h" +#include "storage/lmgr.h" +#include "storage/procarray.h" +#include "tcop/tcopprot.h" +#include "utils/builtins.h" +#include "utils/fmgroids.h" +#include "utils/guc_hooks.h" +#include "utils/pg_lsn.h" +#include "utils/ps_status.h" +#include "utils/timeout.h" +#include "utils/varlena.h" Many of these #includes seem unnecessary. e.g. I was able to remove all those that are commented-out below, and the file still compiles OK with no warnings: #include "postgres.h" //#include //#include "access/genam.h" //#include "access/table.h" #include "access/xlog_internal.h" #include "access/xlogrecovery.h" #include "catalog/pg_database.h" #include "commands/dbcommands.h" //#include "libpq/pqsignal.h" //#include "pgstat.h" //#include "postmaster/bgworker.h" //#include "postmaster/fork_process.h" //#include "postmaster/interrupt.h" //#include "postmaster/postmaster.h" #include "replication/logical.h" //#include "replication/logicallauncher.h" //#include "replication/walreceiver.h" #include "replication/slotsync.h" #include "storage/ipc.h" #include "storage/lmgr.h" #include "storage/procarray.h" //#include "tcop/tcopprot.h" #include "utils/builtins.h" //#include "utils/fmgroids.h" //#include "utils/guc_hooks.h" #include "utils/pg_lsn.h" //#include "utils/ps_status.h" //#include "utils/timeout.h" //#include "utils/varlena.h" == src/backend/replication/slot.c 2. #include "pgstat.h" +#include "replication/slotsync.h" #include "replication/slot.h" +#include "replication/walsender.h" #include "storage/fd.h" The #include "replication/walsender.h" seems to be unnecessary. == src/backend/replication/walsender.c 3. #include "replication/logical.h" +#include "replication/slotsync.h" #include "replication/slot.h" The #include "replication/slotsync.h" is needed, but only for Assert code: Assert(am_cascading_walsender || IsSyncingReplicationSlots()); So you could #ifdef around that #include if you wish to. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Documentation to upgrade logical replication cluster
Here are some review comments for patch v7-0001. == doc/src/sgml/glossary.sgml 1. + + Logical replication cluster + + + A set of publisher and subscriber instance with publisher instance + replicating changes to the subscriber instance. + + + 1a. /instance with/instances with/ ~~~ 1b. The description then made me want to look up the glossary definition of a "publisher instance" and "subscriber instance", but then I was quite surprised that even "Publisher" and "Subscriber" terms are not described in the glossary. Should this patch add those, or should we start another thread for adding them? == doc/src/sgml/logical-replication.sgml 2. + + Migration of logical replication clusters is possible only when all the + members of the old logical replication clusters are version 17.0 or later. + Here is where "logical replication clusters" is mentioned. Shouldn't this first reference be linked to that new to the glossary entry -- e.g. logical replication clusters ~~~ 3. + +Following are the prerequisites for pg_upgrade +to be able to upgrade the logical slots. If these are not met an error +will be reported. + SUGGESTION The following prerequisites are required for ... ~~~ 4. + + All slots on the old cluster must be usable, i.e., there are no slots + whose + pg_replication_slots.conflict_reason + is not NULL. + The double-negative is too tricky "no slots whose ... not NULL", needs rewording. Maybe it is better to instead use an example as the next bullet point does. ~~~ 5. + + +Following are the prerequisites for pg_upgrade to +be able to upgrade the subscriptions. If these are not met an error +will be reported. + SUGGESTION The following prerequisites are required for ... == doc/src/sgml/ref/pgupgrade.sgml 6. + + + The steps to upgrade logical replication clusters are not covered here; + refer to for details. + + Maybe here too there should be a link to the glossary term "logical replication clusters". == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
irmed_lsn or catalog_xmin is invalid but slot + * is not invalidated, that means we have fetched the remote_slot in + * its RS_EPHEMERAL state itself. In such a case, avoid syncing it + * yet. We can always sync it in the next sync cycle when the + * remote_slot is persisted and has valid lsn(s) and xmin values. + * + * XXX: In future, if we plan to expose 'slot->data.persistency' in + * pg_replication_slots view, then we can avoid fetching RS_EPHEMERAL + * slots in the first place. + */ SUGGESTION (1st para) If restart_lsn, confirmed_lsn or catalog_xmin is invalid but the slot is valid, that means we have fetched the remote_slot in its RS_EPHEMERAL state. In such a case, don't sync it; we can always sync it in the next ... ~~~ 7. + /* + * Use shared lock to prevent a conflict with + * ReplicationSlotsDropDBSlots(), trying to drop the same slot during + * a drop-database operation. + */ + LockSharedObject(DatabaseRelationId, remote_dbid, 0, AccessShareLock); + + synchronize_one_slot(remote_slot, remote_dbid); + + UnlockSharedObject(DatabaseRelationId, remote_dbid, 0, AccessShareLock); IMO remove the blank lines (e.g., you don't use this kind of formatting for spin locks) == Kind Regards, Peter Smith. Fujitsu Australia
Re: Make documentation builds reproducible
On Thu, Feb 8, 2024 at 9:47 PM Peter Eisentraut wrote: > > On 23.01.24 02:06, Peter Smith wrote: > > This has been working forever, but seems to have broken due to commit > > [1] having an undeclared variable. > > > runtime error: file stylesheet-html-common.xsl line 452 element if > > Variable 'autolink.index.see' has not been declared. > > make: *** [html-stamp] Error 10 > > I have committed a fix for this. I have successfully tested docbook-xsl > 1.77.1 through 1.79.*. > Yes, the latest is working for me now. Thanks! == Kind Regards, Peter Smith. Fujitsu Australia.
Re: Synchronizing slots from primary to standby
press this comment? ~~~ 10. +/* + * Validates if all necessary GUCs for slot synchronization are set + * appropriately, otherwise raise ERROR. + */ /Validates if all/Check all/ == Kind Regards, Peter Smith. Fujitsu Australia
Re: Question about behavior of deletes with REPLICA IDENTITY NOTHING
On Thu, Feb 8, 2024 at 11:12 AM James Coleman wrote: > > On Wed, Feb 7, 2024 at 6:04 PM Peter Smith wrote: > > > > On Thu, Feb 8, 2024 at 9:04 AM James Coleman wrote: > > > > > > On Wed, Feb 7, 2024 at 3:22 PM Laurenz Albe > > > wrote: > > > > > > > > On Wed, 2024-02-07 at 15:12 -0500, James Coleman wrote: > > > > > We recently noticed some behavior that seems reasonable but also > > > > > surprised our engineers based on the docs. > > > > > > > > > > If we have this setup: > > > > > create table items(i int); > > > > > insert into items(i) values (1); > > > > > create publication test_pub for all tables; > > > > > > > > > > Then when we: > > > > > delete from items where i = 1; > > > > > > > > > > we get: > > > > > ERROR: cannot delete from table "items" because it does not have a > > > > > replica identity and publishes deletes > > > > > HINT: To enable deleting from the table, set REPLICA IDENTITY using > > > > > ALTER TABLE. > > > > > > > > > > Fair enough. But if we do this: > > > > > alter table items replica identity nothing; > > > > > > > > > > because the docs [1] say that NOTHING means "Records no information > > > > > about the old row." We still get the same error when we try the DELETE > > > > > again. > > > > > > > > Well, "REPLICA IDENTITY NOTHING" is the same as "has no replica > > > > identity". > > > > So is "REPLICA IDENTITY DEFAULT" if there is no primary key, or > > > > "REPLICA IDENTITY USING INDEX ..." if the index is dropped. > > > > > > > > See "pg_class": the column "relreplident" is not nullable. > > > > > > Right, I think the confusing point for us is that the docs for NOTHING > > > ("Records no information about the old row") imply you can decide you > > > don't have to record anything if you don't want to do so, but the > > > publication feature is effectively overriding that and asserting that > > > you can't make that choice. > > > > > > > Hi, I can see how the current docs could be interpreted in a way that > > was not intended. > > > > ~~~ > > > > To emphasise the DEFAULT behaviour that Laurenze described, I felt > > there could be another sentence about DEFAULT, the same as there is > > already for the USING INDEX case. > > > > BEFORE [1] > > Records the old values of the columns of the primary key, if any. This > > is the default for non-system tables. > > > > SUGGESTION > > Records the old values of the columns of the primary key, if any. This > > is the default for non-system tables. If there is no primary key, the > > behavior is the same as NOTHING. > > > > ~~~ > > > > If that is done, then would a publication docs tweak like the one > > below clarify things sufficiently? > > > > BEFORE [2] > > If a table without a replica identity is added to a publication that > > replicates UPDATE or DELETE operations then subsequent UPDATE or > > DELETE operations will cause an error on the publisher. > > > > SUGGESTION > > If a table without a replica identity (or with replica identity > > behavior equivalent to NOTHING) is added to a publication that > > replicates UPDATE or DELETE operations then subsequent UPDATE or > > DELETE operations will cause an error on the publisher. > > > > == > > [1] > > https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY > > [2] > > https://www.postgresql.org/docs/current/logical-replication-publication.html > > > > Kind Regards, > > Peter Smith. > > Fujitsu Australia > > Thanks for looking at this! > > Yes, both of those changes together would make this unambiguous (and, > I think, easier to mentally parse). > OK, here then is a patch to do like that. == Kind Regards, Peter Smith. Fujitsu Australia v1-0001-replica-identity-clarifications.patch Description: Binary data
Re: Question about behavior of deletes with REPLICA IDENTITY NOTHING
On Thu, Feb 8, 2024 at 9:04 AM James Coleman wrote: > > On Wed, Feb 7, 2024 at 3:22 PM Laurenz Albe wrote: > > > > On Wed, 2024-02-07 at 15:12 -0500, James Coleman wrote: > > > We recently noticed some behavior that seems reasonable but also > > > surprised our engineers based on the docs. > > > > > > If we have this setup: > > > create table items(i int); > > > insert into items(i) values (1); > > > create publication test_pub for all tables; > > > > > > Then when we: > > > delete from items where i = 1; > > > > > > we get: > > > ERROR: cannot delete from table "items" because it does not have a > > > replica identity and publishes deletes > > > HINT: To enable deleting from the table, set REPLICA IDENTITY using > > > ALTER TABLE. > > > > > > Fair enough. But if we do this: > > > alter table items replica identity nothing; > > > > > > because the docs [1] say that NOTHING means "Records no information > > > about the old row." We still get the same error when we try the DELETE > > > again. > > > > Well, "REPLICA IDENTITY NOTHING" is the same as "has no replica identity". > > So is "REPLICA IDENTITY DEFAULT" if there is no primary key, or > > "REPLICA IDENTITY USING INDEX ..." if the index is dropped. > > > > See "pg_class": the column "relreplident" is not nullable. > > Right, I think the confusing point for us is that the docs for NOTHING > ("Records no information about the old row") imply you can decide you > don't have to record anything if you don't want to do so, but the > publication feature is effectively overriding that and asserting that > you can't make that choice. > Hi, I can see how the current docs could be interpreted in a way that was not intended. ~~~ To emphasise the DEFAULT behaviour that Laurenze described, I felt there could be another sentence about DEFAULT, the same as there is already for the USING INDEX case. BEFORE [1] Records the old values of the columns of the primary key, if any. This is the default for non-system tables. SUGGESTION Records the old values of the columns of the primary key, if any. This is the default for non-system tables. If there is no primary key, the behavior is the same as NOTHING. ~~~ If that is done, then would a publication docs tweak like the one below clarify things sufficiently? BEFORE [2] If a table without a replica identity is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. SUGGESTION If a table without a replica identity (or with replica identity behavior equivalent to NOTHING) is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. == [1] https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY [2] https://www.postgresql.org/docs/current/logical-replication-publication.html Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
EL_LOGICAL) + { + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("bad configuration for slot synchronization"), + errhint("\"wal_level\" must be >= logical.")); + return false; + } + + /* + * The primary_conninfo is required to make connection to primary for + * getting slots information. + */ + if (PrimaryConnInfo == NULL || *PrimaryConnInfo == '\0') + { + ereport(ERROR, + /* translator: %s is a GUC variable name */ + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("bad configuration for slot synchronization"), + errhint("\"%s\" must be defined.", "primary_conninfo")); + return false; + } + + /* + * The slot synchronization needs a database connection for walrcv_exec to + * work. + */ + dbname = walrcv_get_dbname_from_conninfo(PrimaryConnInfo); + if (dbname == NULL) + { + ereport(ERROR, + + /* + * translator: 'dbname' is a specific option; %s is a GUC variable + * name + */ + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("bad configuration for slot synchronization"), + errhint("'dbname' must be specified in \"%s\".", "primary_conninfo")); + return false; + } + + return true; +} The code of this function has been flip-flopping between versions. Now, it is always giving an ERROR when something is wrong, so all of the "return false" are unreachable. It also means the function comment is wrong, and the boolean return is unused/unnecessary. ~~~ 6. SlotSyncShmemInit +/* + * Allocate and initialize slot sync shared memory. + */ This comment should use the same style wording as the other nearby shmem function comments. SUGGESTION Allocate and initialize the shared memory of slot synchronization. ~~~ 7. +/* + * Cleanup the shared memory of slot synchronization. + */ +static void +SlotSyncShmemExit(int code, Datum arg) Since this is static, should it use the snake case naming convention? -- e.g. slot_sync_shmem_exit. ~~~ 8. +/* + * Register the callback function to clean up the shared memory of slot + * synchronization. + */ +void +SlotSyncInitialize(void) +{ + before_shmem_exit(SlotSyncShmemExit, 0); +} This is only doing registration for cleanup of shmem stuff. So, does it really need it to be a separate function, or can this be registered within SlotSyncShmemInit() itself? ~~~ 9. SyncReplicationSlots + PG_TRY(); + { + validate_primary_slot_name(wrconn); + + (void) synchronize_slots(wrconn); + } + PG_FINALLY(); + { + if (syncing_slots) + { + SpinLockAcquire(>mutex); + SlotSyncCtx->syncing = false; + SpinLockRelease(>mutex); + + syncing_slots = false; + } + + walrcv_disconnect(wrconn); + } + PG_END_TRY(); IIUC, the "if (syncing_slots)" part is not really for normal operation, but it is a safe-guard for cleaning up if some unexpected ERROR happens. Maybe there should be a comment to say that. == src/test/recovery/t/040_standby_failover_slots_sync.pl 10. +# Confirm that the logical failover slot is created on the standby and is +# flagged as 'synced' +is($standby1->safe_psql('postgres', + q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN ('lsub1_slot', 'lsub2_slot') AND synced;}), + "t", + 'logical slots have synced as true on standby'); /slot is created/slots are created/ /and is flagged/and are flagged/ == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Hi, I took another high-level look at all the funtion names of the slotsync.c file. == src/backend/replication/logical/slotsync.c +static bool +local_slot_update(RemoteSlot * remote_slot, Oid remote_dbid) +static List * +get_local_synced_slots(void) +static bool +check_sync_slot_on_remote(ReplicationSlot *local_slot, List *remote_slots, +static void +drop_obsolete_slots(List *remote_slot_list) +static void +reserve_wal_for_slot(XLogRecPtr restart_lsn) +static bool +update_and_persist_slot(RemoteSlot * remote_slot, Oid remote_dbid) +static bool +synchronize_one_slot(RemoteSlot * remote_slot, Oid remote_dbid) +get_slot_invalidation_cause(char *conflict_reason) +static bool +synchronize_slots(WalReceiverConn *wrconn) +static void +validate_primary_slot(WalReceiverConn *wrconn, int slot_invalid_elevel) +static bool +validate_slotsync_params(int elevel) +bool +IsSyncingReplicationSlots(void) +Datum +pg_sync_replication_slots(PG_FUNCTION_ARGS) ~~~ There seems some muddling of names here: - "local" versus ? and "remote" versus "primary"; or sometimes the function does not give an indication. - "sync_slot" versus "synced_slot" versus nothing - "check" versus "validate" - etc. Below are some suggestions (some are unchanged); probably there are better ideas for names but my point is that the current names could be improved: CURRENT SUGGESTION get_local_synced_slots get_local_synced_slots check_sync_slot_on_remote check_local_synced_slot_exists_on_remote drop_obsolete_slots drop_local_synced_slots reserve_wal_for_slot reserve_wal_for_local_slot local_slot_update update_local_synced_slot update_and_persist_slot update_and_persist_local_synced_slot get_slot_invalidation_cause get_slot_conflict_reason synchronize_slots synchronize_remote_slots_to_local synchronize_one_slotsynchronize_remote_slot_to_local validate_primary_slotcheck_remote_synced_slot_exists validate_slotsync_params check_local_config IsSyncingReplicationSlots IsSyncingReplicationSlots pg_sync_replication_slots pg_sync_replication_slots == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Hi, Previously ([1] #19 and #22) I had suggested that some conflict_reason code could be split off and pushed first as a prerequisite/ preparatory/ independent patch. At the time, my suggestion was premature because there was still a lot of code under development. But now the affected code is in the first patch 1, and there are already other precedents of slot-sync preparatory patches getting pushed. So I thought to resurrect this splitting suggestion again, as perhaps is the right time to do it. Details are below: == IMO the new #defines for the conflict_reason stuff plus where they get used can be pushed as an independent patch. Specifically, this stuff: ~~~ >From src/include/replication/slot.h: +/* + * The possible values for 'conflict_reason' returned in + * pg_get_replication_slots. + */ +#define SLOT_INVAL_WAL_REMOVED_TEXT "wal_removed" +#define SLOT_INVAL_HORIZON_TEXT "rows_removed" +#define SLOT_INVAL_WAL_LEVEL_TEXT "wal_level_insufficient" + ~~~ >From src/backend/replication/logical/slotsync.c: Also, IMO this function should live in slot.c; Although slotsync.c might be the only caller, this is not really a slot-sync specific function. +/* + * Maps the pg_replication_slots.conflict_reason text value to + * ReplicationSlotInvalidationCause enum value + */ +static ReplicationSlotInvalidationCause +get_slot_invalidation_cause(char *conflict_reason) +{ + Assert(conflict_reason); + + if (strcmp(conflict_reason, SLOT_INVAL_WAL_REMOVED_TEXT) == 0) + return RS_INVAL_WAL_REMOVED; + else if (strcmp(conflict_reason, SLOT_INVAL_HORIZON_TEXT) == 0) + return RS_INVAL_HORIZON; + else if (strcmp(conflict_reason, SLOT_INVAL_WAL_LEVEL_TEXT) == 0) + return RS_INVAL_WAL_LEVEL; + else + Assert(0); + + /* Keep compiler quiet */ + return RS_INVAL_NONE; +} ~~~ >From src/backend/replication/slotfuncs.c: case RS_INVAL_WAL_REMOVED: - values[i++] = CStringGetTextDatum("wal_removed"); + values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_REMOVED_TEXT); break; case RS_INVAL_HORIZON: - values[i++] = CStringGetTextDatum("rows_removed"); + values[i++] = CStringGetTextDatum(SLOT_INVAL_HORIZON_TEXT); break; case RS_INVAL_WAL_LEVEL: - values[i++] = CStringGetTextDatum("wal_level_insufficient"); + values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_LEVEL_TEXT); break; ~~~ Thoughts? == [1] https://www.postgresql.org/message-id/CAHut%2BPtJAAPghc4GPt0k%3DjeMz1qu4H7mnaDifOHsVsMqi-qOLA%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
ndStringInfo(_name, "%s_", cluster_name); appendStringInfoString(_name, "slotsync"); ~~~ 22. + /* + * Establish the connection to the primary server for slots + * synchronization. + */ + wrconn = walrcv_connect(PrimaryConnInfo, false, false, false, + app_name.data, ); Unnecessarily verbose? SUGGESTION Connect to the primary server. ~~~ 23. + syncing_slots = true; + + PG_TRY(); + { + /* + * Using the specified primary server connection, validates the slot + * in primary_slot_name. + */ + validate_primary_slot(wrconn, ERROR); + + (void) synchronize_slots(wrconn); + } + PG_FINALLY(); + { + syncing_slots = false; + walrcv_disconnect(wrconn); + } + PG_END_TRY(); 23a. IMO the "syncing_slots = true;" can be deferred until immediately before call to synchronize_slots(); ~ 23b. I felt the comment seems backwards, so can be worded as suggested elsewhere in this post. SUGGESTION Validate the 'primary_slot_name' using the specified primary server connection. OTOH, if you can change the function name to validate_primary_slot_name() then no comment is needed because then it becomes self-explanatory. == src/backend/replication/slot.c 24. + /* + * Do not allow users to create the slots with failover enabled on the + * standby as we do not support sync to the cascading standby. + * + * Slots with failover enabled can still be created when doing slot + * synchronization, as it needs to maintain this value in sync with the + * remote slots. + */ + if (failover && RecoveryInProgress() && !IsSyncingReplicationSlots()) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot enable failover for a replication slot" +" created on the standby")); I felt it started to become confusing using "synchronization" and "sync" in the same sentence. SUGGESTION However, slots with failover enabled can be created during slot synchronization because we need to retain the same values as the remote slot. == .../t/040_standby_failover_slots_sync.pl 25. + +$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();"); Since this is where we use the function added by this patch, it deserves to have a comment. SUGGESTION # Synchronize the primary server slots to the standby. == src/tools/pgindent/typedefs.list 26. It looks like 'RemoteSlot' should be included in the typedefs.list file. Probably this is the explanation for the space problems I reported earlier in this post. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Fri, Feb 2, 2024 at 11:18 PM shveta malik wrote: > > On Fri, Feb 2, 2024 at 12:25 PM Peter Smith wrote: > > > > Here are some review comments for v750002. > > Thanks for the feedback Peter. Addressed all in v76 except one. > > > (this is a WIP but this is what I found so far...) > > > I wonder if it is better to log all the problems in one go instead of > > making users stumble onto them one at a time after fixing one and then > > hitting the next problem. e.g. just set some variable "all_ok = > > false;" each time instead of all the "return false;" > > > > Then at the end of the function just "return all_ok;" > > If we do this way, then we need to find a way to combine the msgs as > well, otherwise the same msg will be repeated multiple times. For the > concerned functionality (which needs one time config effort by user), > I feel the existing way looks okay. We may consider optimizing it if > we get more comments here. > I don't think combining messages is necessary; I considered these all as different (not the same msg repeated multiple times) since they all have different errhints. I felt a user would only know to make a configuration correction when they are informed something is wrong, so my review point was we could tell them all the wrong things up-front so then those can all be fixed with a "one time config effort by user". Otherwise, if multiple settings (e.g. from the list below) have wrong values, I imagined the user will fix the first reported one, then the next bad config will be reported, then the user will fix that one, then the next bad config will be reported, then the user will fix that one, and so on. It just seemed potentially/unnecessarilly painful. - errhint("\"%s\" must be defined.", "primary_slot_name")); - errhint("\"%s\" must be enabled.", "hot_standby_feedback")); - errhint("\"wal_level\" must be >= logical.")); - errhint("\"%s\" must be defined.", "primary_conninfo")); - errhint("'dbname' must be specified in \"%s\".", "primary_conninfo")); ~ Anyway, I just wanted to explain my review comment some more because maybe my reason wasn't clear the first time. Whatever your decision is, it is fine by me. == Kind Regards, Peter Smith. Fujitsu Australia
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Sat, Feb 3, 2024 at 5:28 PM Amit Kapila wrote: > > On Thu, Feb 1, 2024 at 5:58 AM Peter Smith wrote: > > > > OK. Now using your suggested 2nd sentence: > > > > +# The subscription's running status and failover option should be preserved > > +# in the upgraded instance. So regress_sub1 should still have > > subenabled,subfailover > > +# set to true, while regress_sub2 should have both set to false. > > > > ~ > > > > I also tweaked some other nearby comments/messages which did not > > mention the 'failover' preservation. > > > > Looks mostly good to me. One minor nitpick: > * > along with retaining the replication origin's remote lsn > -# and subscription's running status. > +# and subscription's running status and failover option. > > I don't think we need to use 'and' twice in the above sentence. We > should use ',' between different properties. I can change this on > Monday and push it unless you think otherwise. > +1 == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Info == '\0') + { + ereport(LOG, + /* translator: %s is a GUC variable name */ + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("bad configuration for slot synchronization"), + errhint("\"%s\" must be defined.", "primary_conninfo")); + return false; + } + + /* + * The slot sync worker needs a database connection for walrcv_exec to + * work. + */ + *dbname = walrcv_get_dbname_from_conninfo(PrimaryConnInfo); + if (*dbname == NULL) + { + ereport(LOG, + + /* + * translator: 'dbname' is a specific option; %s is a GUC variable + * name + */ + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("bad configuration for slot synchronization"), + errhint("'dbname' must be specified in \"%s\".", "primary_conninfo")); + return false; + } + + return true; +} I wonder if it is better to log all the problems in one go instead of making users stumble onto them one at a time after fixing one and then hitting the next problem. e.g. just set some variable "all_ok = false;" each time instead of all the "return false;" Then at the end of the function just "return all_ok;" == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Here are some review comments for v750001. == Commit message 1. This patch provides support for non-replication connection in libpqrcv_connect(). ~ 1a. /connection/connections/ ~ 1b. Maybe there needs to be a few more sentences just to describe what you mean by "non-replication connection". ~ 1c. IIUC although the 'replication' parameter is added, in this patch AFAICT every call to the connect function is still passing that argument as true. If that's correct, probably this patch comment should emphasise that this patch doesn't change any functionality at all but is just preparation for later patches which *will* pass false for the replication arg. ~~~ 2. This patch also implements a new API libpqrcv_get_dbname_from_conninfo() to extract database name from the given connection-info ~ /extract database name/the extract database name/ == .../libpqwalreceiver/libpqwalreceiver.c 3. + * Apart from walreceiver, the libpq-specific routines here are now being used + * by logical replication worker as well. /worker/workers/ ~~~ 4. libpqrcv_connect /* - * Establish the connection to the primary server for XLOG streaming + * Establish the connection to the primary server. + * + * The connection established could be either a replication one or + * a non-replication one based on input argument 'replication'. And further + * if it is a replication connection, it could be either logical or physical + * based on input argument 'logical'. That first comment ("could be either a replication one or...") seemed a bit meaningless (e.g. it like saying "this boolean argument can be true or false") because it doesn't describe what is the meaning of a "replication connection" versus what is a "non-replication connection". ~~~ 5. /* We can not have logical without replication */ Assert(replication || !logical); if (replication) { keys[++i] = "replication"; vals[i] = logical ? "database" : "true"; if (!logical) { /* * The database name is ignored by the server in replication mode, * but specify "replication" for .pgpass lookup. */ keys[++i] = "dbname"; vals[i] = "replication"; } } keys[++i] = "fallback_application_name"; vals[i] = appname; if (logical) { ... } ~ The Assert already says we cannot be 'logical' if not 'replication', therefore IMO it seemed strange that the code was not refactored to bring that 2nd "if (logical)" code to within the scope of the "if (replication)". e.g. Can't you do something like this: Assert(replication || !logical); if (replication) { ... if (logical) { ... } else { ... } } keys[++i] = "fallback_application_name"; vals[i] = appname; ~~~ 6. libpqrcv_get_dbname_from_conninfo + for (PQconninfoOption *opt = opts; opt->keyword != NULL; ++opt) + { + /* + * If multiple dbnames are specified, then the last one will be + * returned + */ + if (strcmp(opt->keyword, "dbname") == 0 && opt->val && + opt->val[0] != '\0') + dbname = pstrdup(opt->val); + } Should you also pfree the old dbname instead of gathering a bunch of strdups if there happened to be multiple dbnames specified ? SUGGESTION if (strcmp(opt->keyword, "dbname") == 0 && opt->val && *opt->val) { if (dbname) pfree(dbname); dbname = pstrdup(opt->val); } == src/include/replication/walreceiver.h 7. /* * walrcv_connect_fn * * Establish connection to a cluster. 'logical' is true if the * connection is logical, and false if the connection is physical. * 'appname' is a name associated to the connection, to use for example * with fallback_application_name or application_name. Returns the * details about the connection established, as defined by * WalReceiverConn for each WAL receiver module. On error, NULL is * returned with 'err' including the error generated. */ typedef WalReceiverConn *(*walrcv_connect_fn) (const char *conninfo, bool replication, bool logical, bool must_use_password, const char *appname, char **err); ~ The comment is missing any description of the new parameter 'replication'. ~~~ 8. +/* + * walrcv_get_dbname_from_conninfo_fn + * + * Returns the dbid from the primary_conninfo + */ +typedef char *(*walrcv_get_dbname_from_conninfo_fn) (const char *conninfo); + /dbid/database name/ == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Here are some review comments for v740001. == src/sgml/logicaldecoding.sgml 1. + +Replication Slot Synchronization + + A logical replication slot on the primary can be synchronized to the hot + standby by enabling the failover option during slot + creation and setting + enable_syncslot + on the standby. For the synchronization + to work, it is mandatory to have a physical replication slot between the + primary and the standby, and + hot_standby_feedback + must be enabled on the standby. It is also necessary to specify a valid + dbname in the + primary_conninfo + string, which is used for slot synchronization and is ignored for streaming. + IMO we don't need to repeat that last part ", which is used for slot synchronization and is ignored for streaming." because that is a detail about the primary_conninfo GUC, and the same information is already described in that GUC section. == 2. ALTER_REPLICATION_SLOT slot_name ( option [, ...] ) # - If true, the slot is enabled to be synced to the standbys. + If true, the slot is enabled to be synced to the standbys + so that logical replication can be resumed after failover. This also should have the sentence "The default is false.", e.g. the same as the same option in CREATE_REPLICATION_SLOT says. == synchronize_one_slot 3. + /* + * Make sure that concerned WAL is received and flushed before syncing + * slot to target lsn received from the primary server. + * + * This check will never pass if on the primary server, user has + * configured standby_slot_names GUC correctly, otherwise this can hit + * frequently. + */ + latestFlushPtr = GetStandbyFlushRecPtr(NULL); + if (remote_slot->confirmed_lsn > latestFlushPtr) BEFORE This check will never pass if on the primary server, user has configured standby_slot_names GUC correctly, otherwise this can hit frequently. SUGGESTION (simpler way to say the same thing?) This will always be the case unless the standby_slot_names GUC is not correctly configured on the primary server. ~~~ 4. + /* User created slot with the same name exists, raise ERROR. */ /User created/User-created/ ~~~ 5. synchronize_slots, and also drop_obsolete_slots + /* + * Use shared lock to prevent a conflict with + * ReplicationSlotsDropDBSlots(), trying to drop the same slot while + * drop-database operation. + */ (same code comment is in a couple of places) SUGGESTION (while -> during, etc.) Use a shared lock to prevent conflicting with ReplicationSlotsDropDBSlots() trying to drop the same slot during a drop-database operation. ~~~ 6. validate_parameters_and_get_dbname strcmp() just for the empty string "" might be overkill. 6a. + if (PrimarySlotName == NULL || strcmp(PrimarySlotName, "") == 0) SUGGESTION if (PrimarySlotName == NULL || *PrimarySlotName == '\0') ~~ 6b. + if (PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0) SUGGESTION if (PrimaryConnInfo == NULL || *PrimaryConnInfo == '\0') == Kind Regards, Peter Smith. Fujitsu Australia
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Wed, Jan 31, 2024 at 7:48 PM Alvaro Herrera wrote: > > How about rewording it more extensively? It doesn't read great IMO. > I would use something like > > # In the upgraded instance, the running status and failover option of the > # subscription with the failover option should have been preserved; the other > # should not. > # So regress_sub1 should still have subenabled,subfailover set to true, > # while regress_sub2 should have both set to false. > IIUC this suggested comment is implying that the running status is *only* preserved when the failover option is true. But AFAIK that is not correct. e.g. I hacked the test to keep subscription regress_sub2 as ENABLED but the result after the upgrade was subenabled/subfailover = t/f, not f/f. > I think the symmetry between the two lines confuses more than helps. > It's not a huge thing but since we're editing anyway, why not? > OK. Now using your suggested 2nd sentence: +# The subscription's running status and failover option should be preserved +# in the upgraded instance. So regress_sub1 should still have subenabled,subfailover +# set to true, while regress_sub2 should have both set to false. ~ I also tweaked some other nearby comments/messages which did not mention the 'failover' preservation. PSA v2. == Kind Regards, Peter Smith. Fujitsu Australia v2-0001-Fix-pg_upgrade-test-comment.patch Description: Binary data
Re: Documentation: warn about two_phase when altering a subscription
On Wed, Jan 31, 2024 at 4:55 PM Bertrand Drouvot wrote: ... > As a non native English speaker somehow I have to rely on you for those > suggestions ;-) > > They make sense to me so applied both in v2 attached. > Patch v2 looks OK to me, but probably there is still room for improvement. Let's see what others think. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve the connection failure error messages
On Wed, Jan 31, 2024 at 9:58 PM Nisha Moond wrote: > > >> AFAIK some recent commits patches (e,g [1] for the "slot sync" >> development) have created some more cases of "could not connect..." >> messages. So, you might need to enhance your patch to deal with any >> new ones in the latest HEAD. >> >> == >> [1] >> https://github.com/postgres/postgres/commit/776621a5e4796fa214b6b29a7ca134f6c138572a >> > Thank you for the update. > The v3 patch has the changes needed as per the latest HEAD. > Hi, just going by visual inspection of the v2/v3 patch diffs, the latest v3 LGTM. == Kind Regards, Peter Smith. Fujitsu Australia
Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix
On Wed, Jan 31, 2024 at 4:51 PM vignesh C wrote: > > On Wed, 31 Jan 2024 at 10:27, Peter Smith wrote: > > > > Hi, > > > > PSA a small fix for a misleading comment found in the pg_upgrade test code. > > Is the double # intentional here, as I did not see this style of > commenting used elsewhere: > +# # Upgraded regress_sub1 should still have enabled and failover as true. > +# # Upgraded regress_sub2 should still have enabled and failover as false. > Unintentional caused by copy/paste. Thanks for reporting. I will post a fixed patch tomorrow. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve the connection failure error messages
AFAIK some recent commits patches (e,g [1] for the "slot sync" development) have created some more cases of "could not connect..." messages. So, you might need to enhance your patch to deal with any new ones in the latest HEAD. == [1] https://github.com/postgres/postgres/commit/776621a5e4796fa214b6b29a7ca134f6c138572a Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
On Wed, Jan 31, 2024 at 2:18 PM Amit Kapila wrote: > > On Wed, Jan 31, 2024 at 7:27 AM Peter Smith wrote: > > > > I saw that v73-0001 was pushed, but it included some last-minute > > changes that I did not get a chance to check yesterday. > > > > Here are some review comments for the new parts of that patch. > > > > == > > doc/src/sgml/ref/create_subscription.sgml > > > > 1. > > connect (boolean) > > > > Specifies whether the CREATE SUBSCRIPTION command should connect > > to the publisher at all. The default is true. Setting this to false > > will force the values of create_slot, enabled, copy_data, and failover > > to false. (You cannot combine setting connect to false with setting > > create_slot, enabled, copy_data, or failover to true.) > > > > ~ > > > > I don't think the first part "Setting this to false will force the > > values ... failover to false." is strictly correct. > > > > I think is correct to say all those *other* properties (create_slot, > > enabled, copy_data) are forced to false because those otherwise have > > default true values. > > > > So, won't when connect=false, the user has to explicitly provide such > values (create_slot, enabled, etc.) as false? If so, is using 'force' > strictly correct? Perhaps the original docs text could be worded differently; I think the word "force" here just meant setting connection=false forces/causes/makes those other options behave "as if" they had been set to false without the user explicitly doing anything to them. The point is they are made to behave *differently* to their normal defaults. So, connect=false ==> this actually sets enabled=false (you can see this with \dRs+), which is different to the default setting of 'enabled' connect=false ==> will not create a slot (because there is no connection), which is different to the default behaviour for 'create-slot' connect=false ==> will not copy tables (because there is no connection), which is different to the default behaviour for 'copy_data;' OTOH, failover is different connect=false ==> failover is not possible (because there is no connection), but the 'failover' default is false anyway, so no change to the default behaviour for this one. > > > ~~~ > > > > 2. > > > >Since no connection is made when this option is > >false, no tables are subscribed. To initiate > >replication, you must manually create the replication slot, > > enable > > - the subscription, and refresh the subscription. See > > + the failover if required, enable the subscription, and refresh > > the > > + subscription. See > > > linkend="logical-replication-subscription-examples-deferred-slot"/> > >for examples. > > > > > > IMO "see the failover if required" is very vague. See what failover? > > > > AFAICS, the committed docs says: "To initiate replication, you must > manually create the replication slot, enable the failover if required, > ...". I am not sure what you are referring to. > My mistake. I was misreading the patch code without applying the patch. Sorry for the noise. > > > > == > > src/bin/pg_upgrade/t/004_subscription.pl > > > > 5. > > -# The subscription's running status should be preserved. Old subscription > > -# regress_sub1 should be enabled and old subscription regress_sub2 should > > be > > -# disabled. > > +# The subscription's running status and failover option should be > > preserved. > > +# Old subscription regress_sub1 should have enabled and failover as true > > while > > +# old subscription regress_sub2 should have enabled and failover as false. > > $result = > >$new_sub->safe_psql('postgres', > > - "SELECT subname, subenabled FROM pg_subscription ORDER BY subname"); > > -is( $result, qq(regress_sub1|t > > -regress_sub2|f), > > + "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER > > BY subname"); > > +is( $result, qq(regress_sub1|t|t > > +regress_sub2|f|f), > > "check that the subscription's running status are preserved"); > > > > ~ > > > > Calling those "old subscriptions" seems misleading. Aren't these the > > new/upgraded subscriptions being checked here? > > > > Again the quoted wording is not introduced by this patch. But, I see > your point and it is better if you can start a separate thread for it. > OK. I created a separate thread for this [1] == [1] https://www.postgresql.org/message-id/cahut+pu1uslphrysptacy1k_q-ddsrxnfhmj_2u1nfqbc1y...@mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia.
src/bin/pg_upgrade/t/004_subscription.pl test comment fix
Hi, PSA a small fix for a misleading comment found in the pg_upgrade test code. == Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Fix-test-comment.patch Description: Binary data
Re: Documentation: warn about two_phase when altering a subscription
Hi, thanks for the patch. Here are some review comments for v1 == (below is not showing the links and other sgml rendering -- all those LGTM) BEFORE When altering the slot_name, the failover and two_phase properties values of the named slot may differ from their counterparts failover and two_phase parameters specified in the subscription. When creating the slot, ensure the slot failover and two_phase properties match their counterparts parameters values of the subscription. SUGGESTION When altering the slot_name, the failover and two_phase property values of the named slot may differ from the counterpart failover and two_phase parameters specified by the subscription. When creating the slot, ensure the slot properties failover and two_phase match their counterpart parameters of the subscription. ~ BEFORE Otherwise, the slot on the publisher may behave differently from what subscription's failover and two_phase options say: for example, the slot on the publisher could ... SUGGESTION: Otherwise, the slot on the publisher may behave differently from what these subscription options say: for example, the slot on the publisher could ... == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Hi, I saw that v73-0001 was pushed, but it included some last-minute changes that I did not get a chance to check yesterday. Here are some review comments for the new parts of that patch. == doc/src/sgml/ref/create_subscription.sgml 1. connect (boolean) Specifies whether the CREATE SUBSCRIPTION command should connect to the publisher at all. The default is true. Setting this to false will force the values of create_slot, enabled, copy_data, and failover to false. (You cannot combine setting connect to false with setting create_slot, enabled, copy_data, or failover to true.) ~ I don't think the first part "Setting this to false will force the values ... failover to false." is strictly correct. I think is correct to say all those *other* properties (create_slot, enabled, copy_data) are forced to false because those otherwise have default true values. But the 'failover' has default false, so it cannot get force-changed at all because you can't set connect to false when failover is true as the second part ("You cannot combine...") explains. IMO remove 'failover' from that first sentence. ~~~ 2. Since no connection is made when this option is false, no tables are subscribed. To initiate replication, you must manually create the replication slot, enable - the subscription, and refresh the subscription. See + the failover if required, enable the subscription, and refresh the + subscription. See for examples. IMO "see the failover if required" is very vague. See what failover? The slot property failover, or the subscription option failover? And "see" it for what purpose? I think the intention was probably to say something like "ensure the manually created slot has the same matching failover property value as the subscriber failover option", but that is not clear from the current text. == doc/src/sgml/ref/pg_dump.sgml 3. dump can be restored without requiring network access to the remote servers. It is then up to the user to reactivate the subscriptions in a suitable way. If the involved hosts have changed, the connection - information might have to be changed. It might also be appropriate to + information might have to be changed. If the subscription needs to + be enabled for + failover, + then same needs to be done by executing + + ALTER SUBSCRIPTION ... SET(failover = true) + after the slot has been created. It might also be appropriate to "then same needs to be done" (English?) BEFORE If the subscription needs to be enabled for failover, then same needs to be done by executing ALTER SUBSCRIPTION ... SET(failover = true) after the slot has been created. SUGGESTION If the subscription needs to be enabled for failover, execute ALTER SUBSCRIPTION ... SET(failover = true) after the slot has been created. == src/backend/commands/subscriptioncmds.c 4. #define SUBOPT_RUN_AS_OWNER 0x1000 -#define SUBOPT_LSN 0x2000 -#define SUBOPT_ORIGIN 0x4000 +#define SUBOPT_FAILOVER 0x2000 +#define SUBOPT_LSN 0x4000 +#define SUBOPT_ORIGIN 0x8000 + A spurious blank line was added. == src/bin/pg_upgrade/t/004_subscription.pl 5. -# The subscription's running status should be preserved. Old subscription -# regress_sub1 should be enabled and old subscription regress_sub2 should be -# disabled. +# The subscription's running status and failover option should be preserved. +# Old subscription regress_sub1 should have enabled and failover as true while +# old subscription regress_sub2 should have enabled and failover as false. $result = $new_sub->safe_psql('postgres', - "SELECT subname, subenabled FROM pg_subscription ORDER BY subname"); -is( $result, qq(regress_sub1|t -regress_sub2|f), + "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER BY subname"); +is( $result, qq(regress_sub1|t|t +regress_sub2|f|f), "check that the subscription's running status are preserved"); ~ Calling those "old subscriptions" seems misleading. Aren't these the new/upgraded subscriptions being checked here? Should the comment be more like: # The subscription's running status and failover option should be preserved. # Upgraded regress_sub1 should still have enabled and failover as true. # Upgraded regress_sub2 should still have enabled and failover as false. == Kind Regards, Peter Smith. Fujitsu Australia.
Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo
On Fri, Jan 19, 2024 at 7:21 PM Masahiko Sawada wrote: > > On Thu, Jan 18, 2024 at 6:54 PM Amit Kapila wrote: > > > > On Thu, Jan 18, 2024 at 11:15 AM Peter Smith wrote: > > > > > > On Thu, Jan 18, 2024 at 12:55 PM Masahiko Sawada > > > wrote: > > > > > > > ... > > > > > > > > Although we can improve it to handle this case too, I'm not sure it's > > > > a bug. The doc says[1]: > > > > > > > > Specifies whether the subscription should be automatically disabled if > > > > any errors are detected by subscription workers during data > > > > replication from the publisher. > > > > > > > > When an apply worker is trying to establish a connection, it's not > > > > replicating data from the publisher. > > > > > > > > Regards, > > > > > > > > [1] > > > > https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-PARAMS-WITH-DISABLE-ON-ERROR > > > > > > > > -- > > > > Masahiko Sawada > > > > Amazon Web Services: https://aws.amazon.com > > > > > > Yeah, I had also seen that wording of those docs. And I agree it > > > leaves open some room for doubts because strictly from that wording it > > > can be interpreted that establishing the connection is not actually > > > "data replication from the publisher" in which case maybe there is no > > > bug. > > > > > > > As far as I remember that was the intention. The idea was if there is > > any conflict during apply that users manually need to fix, they have > > the provision to stop repeating the error. If we wish to extend the > > purpose of this option for another valid use case and there is a good > > way to achieve the same then we can discuss but I don't think we need > > to change it in back-branches. > > I agree not to change it in back-branches. > > What is the use case of extending disable_on_error? > The use-case is that with my user-hat on I had assumed disable_on_error behaviour was as per the name implied so the subscription would disable on getting (any) error. OTOH I agree, my expectation is not exactly what the current docs say. Also, I had thought the motivation for that option was to avoid having infinite repeating errors that might be caused by the user or data. e.g. a simple typo in the conninfo can cause this error and AFAIK the ALTER will appear successful so the user won't know anything about it unless they also check the logs. OTOH something like a connection error may only be temporary (caused by a network issue?) and not caused by a user typo at all, so I can see perhaps that is why disable_on_error is OK to be excluding connection errors. TBH I think there are pros and cons to doing nothing and leaving the existing behaviour as-is or extending it -- I'm happy to go either way. Another idea is to leave behaviour unchanged, but add a note in the docs like "Note: connection errors (e.g. specifying a bad conninfo using ALTER SUBSCRIPTION) will not cause the subscription to become disabled" Thoughts? == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Here are some review comments for v72-0001 == doc/src/sgml/ref/alter_subscription.sgml 1. + parameter value of the subscription. Otherwise, the slot on the + publisher may behave differently from what subscription's + failover + option says. The slot on the publisher could either be + synced to the standbys even when the subscription's + failover + option is disabled or could be disabled for sync + even when the subscription's + failover + option is enabled. + It is a bit wordy to keep saying "disabled/enabled" BEFORE The slot on the publisher could either be synced to the standbys even when the subscription's failover option is disabled or could be disabled for sync even when the subscription's failover option is enabled. SUGGESTION The slot on the publisher could be synced to the standbys even when the subscription's failover = false or may not be syncing even when the subscription's failover = true. == .../t/040_standby_failover_slots_sync.pl 2. +# Enable subscription +$subscriber1->safe_psql('postgres', + "ALTER SUBSCRIPTION regress_mysub1 ENABLE"); + +# Disable failover for enabled subscription +my ($result, $stdout, $stderr) = $subscriber1->psql('postgres', + "ALTER SUBSCRIPTION regress_mysub1 SET (failover = false)"); +ok( $stderr =~ /ERROR: cannot set failover for enabled subscription/, + "altering failover is not allowed for enabled subscription"); + Currently, those tests are under scope the big comment: +## +# Test that changing the failover property of a subscription updates the +# corresponding failover property of the slot. +## But that comment is not quite relevant to these tests. So, add another one just these: SUGGESTION: ## # Test that cannot modify the failover option for enabled subscriptions. ###### == Kind Regards, Peter Smith. Fujitsu Australia
Re: Make documentation builds reproducible
On Thu, Jan 25, 2024 at 9:12 AM Peter Smith wrote: > > On Tue, Jan 23, 2024 at 12:32 PM Peter Smith wrote: > > > > On Tue, Jan 23, 2024 at 12:13 PM Tom Lane wrote: > > > > > > Peter Smith writes: > > > > I usually the HTML documentation locally using command: > > > > make STYLE=website html > > > > This has been working forever, but seems to have broken due to commit > > > > [1] having an undeclared variable. > > > > > > Interestingly, that still works fine for me, on RHEL8 with > > > > > > docbook-dtds-1.0-69.el8.noarch > > > docbook-style-xsl-1.79.2-9.el8.noarch > > > docbook-style-dsssl-1.79-25.el8.noarch > > > > > > What docbook version are you using? > > > > > > > [postgres@CentOS7-x64 sgml]$ sudo yum list installed | grep docbook > > docbook-dtds.noarch 1.0-60.el7 @anaconda > > docbook-style-dsssl.noarch1.79-18.el7@base > > docbook-style-xsl.noarch 1.78.1-3.el7 @anaconda > > > > IIUC these releases notes [1] say autolink.index.see existed since > v1.79.1, but unfortunately, that is more recent than my ancient > installed v1.78.1 > > From the release notes: > -- > Robert Stayton: autolink.index.see.xml > > New param to control automatic links in index from see and > seealso to indexterm primary. > -- > > == > [1] https://docbook.sourceforge.net/release/xsl/1.79.1/RELEASE-NOTES.html > Is anything going to be changed for this? Since the recent commit [1] when building the docs now each time I need to first hack (e.g. either the Makefile or stylesheet-html-common.xml) to declare the missing ‘autolink.index.see’ variable. I know that my old OS is approaching EOL but I thought my docbook installation was still valid. == [1] https://github.com/postgres/postgres/commit/b0f0a9432d0b6f53634a96715f2666f6d4ea25a1 Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Here are some review comments for v70-0001. == doc/src/sgml/protocol.sgml 1. Related to this, please also review my other patch to the same docs page protocol.sgml [1]. == src/backend/replication/logical/tablesync.c 2. walrcv_create_slot(LogRepWorkerWalRcvConn, slotname, false /* permanent */ , false /* two_phase */ , +false, CRS_USE_SNAPSHOT, origin_startpos); I know it was previously mentioned in this thread that inline parameter comments are unnecessary, but here they are already in the existing code so shouldn't we do the same? == src/backend/replication/repl_gram.y 3. +/* ALTER_REPLICATION_SLOT slot options */ +alter_replication_slot: + K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')' + { + AlterReplicationSlotCmd *cmd; + cmd = makeNode(AlterReplicationSlotCmd); + cmd->slotname = $2; + cmd->options = $4; + $$ = (Node *) cmd; + } + ; + IMO write that comment with parentheses, so it matches the code. SUGGESTION ALTER_REPLICATION_SLOT slot ( options ) == [1] https://www.postgresql.org/message-id/CAHut%2BPtDWSmW8uiRJF1LfGQJikmo7V2jdysLuRmtsanNZc7fNw%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Documentation to upgrade logical replication cluster
Hi Vignesh, Here are some review comments for patch v4. These are cosmetic only; otherwise v4 LGTM. == doc/src/sgml/ref/pgupgrade.sgml 1. Configure the servers for log shipping. (You do not need to run pg_backup_start() and pg_backup_stop() or take a file system backup as the standbys are still synchronized - with the primary.) Only logical slots on the primary are copied to the - new standby, but other slots on the old standby are not copied so must - be recreated manually. + with the primary.) If the old cluster is prior to 17.0, then no slots + on the primary are copied to the new standby, so all the slots must be + recreated manually. If the old cluster is 17.0 or later, then only + logical slots on the primary are copied to the new standby, but other + slots on the old standby are not copied so must be recreated manually. Perhaps the part from "If the old cluster is prior..." should be in a new paragraph. == doc/src/sgml/logical-replication.sgml 2. + +Setup the +subscriber configurations in the new subscriber. +pg_upgrade attempts to migrate subscription +dependencies which includes the subscription's table information present in +pg_subscription_rel +system catalog and also the subscription's replication origin. This allows +logical replication on the new subscriber to continue from where the +old subscriber was up to. Migration of subscription dependencies is only +supported when the old cluster is version 17.0 or later. Subscription +dependencies on clusters before version 17.0 will silently be ignored. + Perhaps the part from "pg_upgrade attempts..." should be in a new paragraph. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Here are some review comments for v67-0002. == src/backend/replication/logical/slotsync.c 1. +/* The sleep time (ms) between slot-sync cycles varies dynamically + * (within a MIN/MAX range) according to slot activity. See + * wait_for_slot_activity() for details. + */ +#define MIN_WORKER_NAPTIME_MS 200 +#define MAX_WORKER_NAPTIME_MS 3 /* 30s */ + +static long sleep_ms = MIN_WORKER_NAPTIME_MS; In my previous review for this, I meant for there to be no whitespace between the #defines and the static long sleep_ms so the prior comment then clearly belongs to all 3 lines ~~~ 2. synchronize_one_slot + /* + * Sanity check: Make sure that concerned WAL is received and flushed + * before syncing slot to target lsn received from the primary server. + * + * This check should never pass as on the primary server, we have waited + * for the standby's confirmation before updating the logical slot. + */ + latestFlushPtr = GetStandbyFlushRecPtr(NULL); + if (remote_slot->confirmed_lsn > latestFlushPtr) + { + ereport(LOG, + errmsg("skipping slot synchronization as the received slot sync" +" LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X", +LSN_FORMAT_ARGS(remote_slot->confirmed_lsn), +remote_slot->name, +LSN_FORMAT_ARGS(latestFlushPtr))); + + return false; + } Previously in v65 this was an elog, but now it is an ereport. But since this is a sanity check condition that "should never pass" wasn't the elog the more appropriate choice? ~~~ 3. synchronize_one_slot + /* + * We don't want any complicated code while holding a spinlock, so do + * namestrcpy() and get_database_oid() outside. + */ + namestrcpy(_name, remote_slot->plugin); + dbid = get_database_oid(remote_slot->database, false); IMO just simplify the whole comment, here and for the other similar comment in local_slot_update(). SUGGESTION /* Avoid expensive operations while holding a spinlock. */ ~~~ 4. synchronize_slots + /* Construct the remote_slot tuple and synchronize each slot locally */ + tupslot = MakeSingleTupleTableSlot(res->tupledesc, ); + while (tuplestore_gettupleslot(res->tuplestore, true, false, tupslot)) + { + bool isnull; + RemoteSlot *remote_slot = palloc0(sizeof(RemoteSlot)); + Datum d; + + remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, 1, )); + Assert(!isnull); + + remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, 2, )); + Assert(!isnull); + + /* + * It is possible to get null values for LSN and Xmin if slot is + * invalidated on the primary server, so handle accordingly. + */ + d = slot_getattr(tupslot, 3, ); + remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr : + DatumGetLSN(d); + + d = slot_getattr(tupslot, 4, ); + remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d); + + d = slot_getattr(tupslot, 5, ); + remote_slot->catalog_xmin = isnull ? InvalidTransactionId : + DatumGetTransactionId(d); + + remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, 6, )); + Assert(!isnull); + + remote_slot->database = TextDatumGetCString(slot_getattr(tupslot, + 7, )); + Assert(!isnull); + + d = slot_getattr(tupslot, 8, ); + remote_slot->invalidated = isnull ? RS_INVAL_NONE : + get_slot_invalidation_cause(TextDatumGetCString(d)); Would it be better to get rid of the hardwired column numbers and then be able to use the SLOTSYNC_COLUMN_COUNT already defined as a sanity check? SUGGESTION int col = 0; ... remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, ++col, )); ... remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, ++col, )); ... d = slot_getattr(tupslot, ++col, ); remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d); ... d = slot_getattr(tupslot, ++col, ); remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d); ... d = slot_getattr(tupslot, ++col, ); remote_slot->catalog_xmin = isnull ? InvalidTransactionId : DatumGetTransactionId(d); ... remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, ++col, )); ... remote_slot->database = TextDatumGetCString(slot_getattr(tupslot, ++col, )); ... d = slot_getattr(tupslot, ++col, ); remote_slot->invalidated = isnull ? RS_INVAL_NONE : get_slot_invalidation_cause(TextDatumGetCString(d)); /* Sanity check */ Asert(col == SLOTSYNC_COLUMN_COUNT); ~~~ 5. +static char * +validate_parameters_and_get_dbname(void) +{ + char*dbname; These are configuration issues, so probably all these ereports could also set errcode(ERRCODE_INVALID_PARAMETER_VALUE). == Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
Here are some review comments for the patch v67-0001. == 1. There are a couple of places checking for failover usage on a standby. + if (RecoveryInProgress() && failover) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot enable failover for a replication slot" +" created on the standby")); and + if (RecoveryInProgress() && failover) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot enable failover for a replication slot" +" on the standby")); IMO the conditions should be written the other way around (failover && RecoveryInProgress()) to avoid the unnecessary function calls when 'failover' flag is probably mostly default false anyway. == Kind Regards, Peter Smith. Fujitsu Australia
Re: Documentation to upgrade logical replication cluster
Here are some review comments for patch v3. == doc/src/sgml/ref/pgupgrade.sgml 1. + + + This page does not cover steps to upgrade logical replication clusters, refer +for details on upgrading + logical replication clusters. + + I felt that maybe this note was misplaced. Won't it be better to put this down in the "Usage" section of this page? BEFORE These are the steps to perform an upgrade with pg_upgrade: SUGGESTION (or something like this) Below are the steps to perform an upgrade with pg_upgrade. Note, the steps to upgrade logical replication clusters are not covered here; refer to for details. ~~~ 2. Configure the servers for log shipping. (You do not need to run pg_backup_start() and pg_backup_stop() or take a file system backup as the standbys are still synchronized - with the primary.) Only logical slots on the primary are copied to the - new standby, but other slots on the old standby are not copied so must - be recreated manually. + with the primary.) In version 17.0 or later, only logical slots on the + primary are copied to the new standby, but other slots on the old standby + are not copied so must be recreated manually. This para was still unclear to me. What is version 17.0 referring to -- the old_cluster version? Do you mean something like: If the old cluster is < v17 then logical slots are not copied. If the old_cluster is >= v17 then... == doc/src/sgml/logical-replication.sgml 3. + +While upgrading a subscriber, write operations can be performed in the +publisher, these changes will be replicated to the subscriber once the +subscriber upgradation is completed. + 3a. /publisher, these changes/publisher. These changes/ ~ 3b. "upgradation" ??. See [1] maybe just /upgradation/upgrade/ ~~~ 4. GENERAL - prompts/paths I noticed in v3 you removed all the cmd prompts like: dba@node1:/opt/PostgreSQL/postgres/17/bin$ dba@node2:/opt/PostgreSQL/postgres/18/bin$ etc. I thought those were helpful to disambiguate which server/version was being operated on. I wonder if there is some way to keep information still but not make it look like a real current directory that Kuroda-san did not like: e.g. Maybe something like the below is possible? (dba@node1: v17) pg_upgrade... (dba@node2: v18) pg_upgrade... == [1] https://english.stackexchange.com/questions/192187/upgradation-not-universally-accepted#:~:text=Not%20all%20dictionaries%20(or%20native,by%20most%20non%2DIE%20speakers. Kind Regards, Peter Smith. Fujitsu Australia
Re: Make documentation builds reproducible
On Tue, Jan 23, 2024 at 12:32 PM Peter Smith wrote: > > On Tue, Jan 23, 2024 at 12:13 PM Tom Lane wrote: > > > > Peter Smith writes: > > > I usually the HTML documentation locally using command: > > > make STYLE=website html > > > This has been working forever, but seems to have broken due to commit > > > [1] having an undeclared variable. > > > > Interestingly, that still works fine for me, on RHEL8 with > > > > docbook-dtds-1.0-69.el8.noarch > > docbook-style-xsl-1.79.2-9.el8.noarch > > docbook-style-dsssl-1.79-25.el8.noarch > > > > What docbook version are you using? > > > > [postgres@CentOS7-x64 sgml]$ sudo yum list installed | grep docbook > docbook-dtds.noarch 1.0-60.el7 @anaconda > docbook-style-dsssl.noarch1.79-18.el7@base > docbook-style-xsl.noarch 1.78.1-3.el7 @anaconda > IIUC these releases notes [1] say autolink.index.see existed since v1.79.1, but unfortunately, that is more recent than my ancient installed v1.78.1 >From the release notes: -- Robert Stayton: autolink.index.see.xml New param to control automatic links in index from see and seealso to indexterm primary. -- == [1] https://docbook.sourceforge.net/release/xsl/1.79.1/RELEASE-NOTES.html Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
as we do not support sync to the cascading standby. + */ + if (RecoveryInProgress() && failover) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot alter replication slot to have failover" +" enabled on the standby")); I felt the errmsg could be expressed with less ambiguity: SUGGESTION: cannot enable failover for a replication slot on the standby == src/backend/replication/slotfuncs.c 11. create_physical_replication_slot /* acquire replication slot, this will check for conflicting names */ ReplicationSlotCreate(name, false, - temporary ? RS_TEMPORARY : RS_PERSISTENT, false); + temporary ? RS_TEMPORARY : RS_PERSISTENT, false, + false); Having an inline comment might be helpful here instead of passing "false,false" SUGGESTION ReplicationSlotCreate(name, false, temporary ? RS_TEMPORARY : RS_PERSISTENT, false, false /* failover */); ~~~ 12. create_logical_replication_slot + /* + * Do not allow users to create the slots with failover enabled on the + * standby as we do not support sync to the cascading standby. + */ + if (RecoveryInProgress() && failover) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot create replication slot with failover" +" enabled on the standby")); (similar to previous comment) SUGGESTION: cannot enable failover for a replication slot created on the standby ~~~ 13. copy_replication_slot * hence pass find_startpoint false. confirmed_flush will be set * below, by copying from the source slot. + * + * To avoid potential issues with the slotsync worker when the + * restart_lsn of a replication slot goes backwards, we set the + * failover option to false here. This situation occurs when a slot on + * the primary server is dropped and immediately replaced with a new + * slot of the same name, created by copying from another existing + * slot. However, the slotsync worker will only observe the restart_lsn + * of the same slot going backwards. */ create_logical_replication_slot(NameStr(*dst_name), plugin, temporary, false, + false, src_restart_lsn, false); (similar to an earlier comment) Having an inline comment might be helpful here. e.g. false /* failover */, == src/backend/replication/walreceiver.c 14. - walrcv_create_slot(wrconn, slotname, true, false, 0, NULL); + walrcv_create_slot(wrconn, slotname, true, false, false, 0, NULL); (similar to an earlier comment) Having an inline comment might be helpful here: SUGGESTION walrcv_create_slot(wrconn, slotname, true, false, false /* failover */, 0, NULL); == src/backend/replication/walsender.c 15. CreateReplicationSlot ReplicationSlotCreate(cmd->slotname, false, cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT, - false); + false, false); (similar to an earlier comment) Having an inline comment might be helpful here. e.g. false /* failover */, ~~~ 16. CreateReplicationSlot + /* + * Do not allow users to create the slots with failover enabled on the + * standby as we do not support sync to the cascading standby. + */ + if (RecoveryInProgress() && failover) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot create replication slot with failover" +" enabled on the standby")); + /* * Initially create persistent slot as ephemeral - that allows us to * nicely handle errors during initialization because it'll get @@ -1243,7 +1265,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd) */ ReplicationSlotCreate(cmd->slotname, true, cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL, - two_phase); + two_phase, failover); This errmsg seems to be repeated in a few places, so I wondered if this code can be refactored to call direct to create_logical_replication_slot() so the errmsg can be just once in a common place. OTOH, if it cannot be refactored, then needs to be using same errmsg as suggested by earlier review comments (see above). == src/include/catalog/pg_subscription.h 17. + bool subfailover; /* True if the associated replication slots + * (i.e. the main slot and the table sync + * slots) in the upstream database are enabled + * to be synchronized to the physical + * standbys. */ (same as earlier) /physical standbys/physical standby/ ~~~ 18. + bool failover; /* True if the associated replication slots + * (i.e. the main slot and the table sync + * slots) in the upstream database are enabled + * to be synchronized to the physical + * standbys. */ (same as earlier) /physical standbys/physical standby/ == src/include/replication/slot.h 19. + + /* + * Is this a failover slot (sync candidate for physical standbys)? Only + * relevant for logical slots on the primary server. + */ + bool failover; (same as earlier) /physical standbys/physical standby/ == [1] Nisha errmsg - https://www.postgresql.org/message-id/CABdArM5-VR4Akt_AHap_0Ofne0cTcsdnN6FcNe%2BMU8eXsa_ERQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Synchronizing slots from primary to standby
overy", should there also be a check for that (e.g. RecoveryInProgress) in the added Assert? == src/include/replication/walreceiver.h 5. typedef char *(*walrcv_identify_system_fn) (WalReceiverConn *conn, TimeLineID *primary_tli); +/* + * walrcv_get_dbname_from_conninfo_fn + * + * Returns the dbid from the primary_conninfo + */ +typedef char *(*walrcv_get_dbname_from_conninfo_fn) (const char *conninfo); It looks like a blank line that previously existed has been lost. == [1] https://www.postgresql.org/message-id/CAHut%2BPsf3NewXbsFKY88Qn1ON1_dMD6343MuWdMiiM2Ds9a_wA%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Make documentation builds reproducible
On Tue, Jan 23, 2024 at 12:13 PM Tom Lane wrote: > > Peter Smith writes: > > I usually the HTML documentation locally using command: > > make STYLE=website html > > This has been working forever, but seems to have broken due to commit > > [1] having an undeclared variable. > > Interestingly, that still works fine for me, on RHEL8 with > > docbook-dtds-1.0-69.el8.noarch > docbook-style-xsl-1.79.2-9.el8.noarch > docbook-style-dsssl-1.79-25.el8.noarch > > What docbook version are you using? > [postgres@CentOS7-x64 sgml]$ sudo yum list installed | grep docbook docbook-dtds.noarch 1.0-60.el7 @anaconda docbook-style-dsssl.noarch1.79-18.el7@base docbook-style-xsl.noarch 1.78.1-3.el7 @anaconda == Kind Regards, Peter Smith. Fujitsu Australia
Re: Make documentation builds reproducible
Hi, I usually the HTML documentation locally using command: make STYLE=website html ~ This has been working forever, but seems to have broken due to commit [1] having an undeclared variable. e.g. [postgres@CentOS7-x64 sgml]$ make STYLE=website html { \ echo ""; \ echo ""; \ } > version.sgml '/usr/bin/perl' ./mk_feature_tables.pl YES ../../../src/backend/catalog/sql_feature_packages.txt ../../../src/backend/catalog/sql_features.txt > features-supported.sgml '/usr/bin/perl' ./mk_feature_tables.pl NO ../../../src/backend/catalog/sql_feature_packages.txt ../../../src/backend/catalog/sql_features.txt > features-unsupported.sgml '/usr/bin/perl' ./generate-errcodes-table.pl ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml '/usr/bin/perl' ./generate-targets-meson.pl targets-meson.txt generate-targets-meson.pl > targets-meson.sgml '/usr/bin/perl' ../../../src/backend/utils/activity/generate-wait_event_types.pl --docs ../../../src/backend/utils/activity/wait_event_names.txt /usr/bin/xmllint --nonet --path . --path . --output postgres-full.xml --noent --valid postgres.sgml /usr/bin/xsltproc --nonet --path . --path . --stringparam pg.version '17devel' --param website.stylesheet 1 stylesheet.xsl postgres-full.xml runtime error: file stylesheet-html-common.xsl line 452 element if Variable 'autolink.index.see' has not been declared. make: *** [html-stamp] Error 10 == [1] https://github.com/postgres/postgres/commit/b0f0a9432d0b6f53634a96715f2666f6d4ea25a1 Kind Regards, Peter Smith. Fujitsu Australia
Re: POC: Extension for adding distributed tracing - pg_tracing
2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems there were CFbot test failures last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/4456/ [2] https://cirrus-ci.com/task/5581154296791040 Kind Regards, Peter Smith.
Re: Shared detoast Datum proposal
2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems there were CFbot test failures last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/4759/ [2] https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4759 Kind Regards, Peter Smith.
Re: Things I don't like about \du's "Attributes" column
2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems there were CFbot test failures last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/4738/ [2] https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4738 Kind Regards, Peter Smith.
Re: pg_stat_statements and "IN" conditions
2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems there was a CFbot test failure last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/2837/ [2] https://cirrus-ci.com/task/6688578378399744 Kind Regards, Peter Smith.