Re: Synchronizing slots from primary to standby
On Wed, Jun 28, 2023 at 12:19 PM Drouvot, Bertrand wrote:
>
> On 6/26/23 12:34 PM, Amit Kapila wrote:
> > On Mon, Jun 26, 2023 at 11:15 AM Drouvot, Bertrand wrote:
> >>
> >> On 6/20/23 12:22 PM, Amit Kapila wrote:
> >>> On Mon, Jun 19, 2023 at 9:56 PM Drouvot, Bertrand wrote:
> >>>> In such a case (slot valid on the primary but invalidated on the
> >>>> standby) then I think we could drop and recreate the invalidated
> >>>> slot on the standby.
> >>>
> >>> Will it be safe? Because after recreating the slot, it will reserve
> >>> the new WAL location and build the snapshot based on that which might
> >>> miss some important information in the snapshot. For example, to
> >>> update the slot's position with new information from the primary, the
> >>> patch uses pg_logical_replication_slot_advance() which means it will
> >>> process all records and update the snapshot via
> >>> DecodeCommit->SnapBuildCommitTxn().
> >>
> >> Your concern is that the slot could have been consumed on the standby?
> >>
> >> I mean, if we suppose the "synchronized" slot can't be consumed on the
> >> standby then drop/recreate such an invalidated slot would be ok?
> >
> > That also may not be sufficient because as soon as the slot is
> > invalidated/dropped, the required WAL could be removed on standby.
>
> Yeah, I think once the slot is dropped we just have to wait for the slot
> to be re-created on the standby according to the new
> synchronize_slot_names GUC.
>
> Assuming the initial slot "creation" on the standby (coming from the
> synchronize_slot_names usage) is working "correctly" then it should also
> work "correctly" once the slot is dropped.

I also think so.

> If we agree that a synchronized slot can not/should not be consumed
> (will implement this behavior) then I think the proposed scenario above
> should make sense, do you agree?

Yeah, I also can't think of a use case for this. So, we can probably
disallow it and document the same. I guess if we came across a use case
for this, we can rethink allowing to consume the changes from
synchronized slots.

--
With Regards,
Amit Kapila.
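The policy agreed above (a synchronized slot cannot be consumed on the standby, and an invalidated synced slot is dropped so the sync worker can re-create it) can be sketched as a small model. All names here are illustrative, not the patch's actual code:

```python
# Toy model of the discussed policy: slots synchronized from the primary
# are not consumable while the standby is in recovery, and a slot that is
# invalidated locally (but still valid on the primary) is dropped and
# re-created by the slot-sync worker on its next cycle.

class SyncedSlot:
    def __init__(self, name):
        self.name = name
        self.synced = True        # created by the slot-sync worker
        self.invalidated = False  # e.g. required rows removed on the standby

def acquire_for_decoding(slot, in_recovery):
    """Refuse to consume a synchronized slot while still in recovery."""
    if in_recovery and slot.synced:
        raise PermissionError(
            f'cannot use replication slot "{slot.name}": '
            f'it is being synchronized from the primary')
    return slot

def maybe_drop_and_recreate(slot, valid_on_primary):
    """If the slot is invalidated locally but still valid on the primary,
    drop it; the sync worker re-creates it as a fresh slot."""
    if slot.invalidated and valid_on_primary:
        return SyncedSlot(slot.name)  # fresh, re-synchronized slot
    return slot
```

After promotion (`in_recovery=False`), the same slot becomes consumable without any further state change, which is why disallowing consumption during recovery is cheap.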
RE: Synchronizing slots from primary to standby
Dear Drouvot,

Hi, I'm also interested in the feature. The following are my high-level
comments. I have not commented on detailed notation because the patch is
not at that stage yet, and I'm very sorry that I could not follow all of
this discussion.

1. I think we should not reuse the logical replication launcher for
another purpose; a background worker should have only one task. I would
like to ask other people's opinions on this.

2. I want to confirm the reason why a new replication command is added.
IIUC the launcher connects to the primary using the primary_conninfo
connection string, but that establishes a physical replication
connection, so no SQL can be executed. Is that right? Another approach
that avoids the new command, though not smart, is to specify the target
database via a GUC. What do you think?

3. You chose the per-db worker approach; however, that makes it
difficult to extend the feature to support physical slots, which may be
problematic. Were there any reasons for that choice? I suspected that
ReplicationSlotCreate() or the advance functions might not be usable
from other databases and that this may be the reason, but I'm not sure.
If these operations can be done without connecting to a specific
database, I think the architecture could be changed.

4. Currently the launcher establishes the connection every time. Isn't
it better to reuse the same one instead?

The following comments assume this configuration, maybe the
straightforward one:

primary -> standby
   |-----> subscriber

5. After constructing the system, I dropped the subscription on the
subscriber. In this case the logical slot on the primary was removed,
but that removal was not replicated to the standby server. Is this
workload supported or not?

```
$ psql -U postgres -p $port_sub -c "DROP SUBSCRIPTION sub"
NOTICE:  dropped replication slot "sub" on publisher
DROP SUBSCRIPTION

$ psql -U postgres -p $port_primary -c "SELECT * FROM pg_replication_slots"
 slot_name | plugin | slot_type | datoid | database |...
-----------+--------+-----------+--------+----------+...
(0 rows)

$ psql -U postgres -p $port_standby -c "SELECT * FROM pg_replication_slots"
 slot_name |  plugin  | slot_type | datoid | database |...
-----------+----------+-----------+--------+----------+...
 sub       | pgoutput | logical   |      5 | postgres |...
(1 row)
```

6. The current approach may delay the starting point of synchronization.
Assume the physical replication system is created first, and then the
subscriber connects to the publisher node. In this case the launcher
connects to the primary earlier than the apply worker and reads the
slots. At that time there are no slots on the primary, so the launcher
disconnects from the primary and waits for a time period (up to 3 min).
Even if the apply worker then creates the slot on the publisher, the
launcher on the standby cannot notice that, so synchronization may start
up to 3 min later. I'm not sure how to fix this, or whether it could be
acceptable. Thoughts?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
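The timing concern in point 6 can be made concrete with a toy model of the launcher's polling behaviour (the nap time and names here are illustrative, based on the "up to 3 min" figure mentioned above, not on the patch's actual code):

```python
# Toy model of point 6: the launcher polls the primary for slots matching
# synchronize_slot_names; if none exist it sleeps for its nap time before
# checking again, so a slot created just after a check is only noticed on
# the next cycle.

NAP_TIME_S = 180.0  # assumed worst-case launcher sleep between checks

def worst_case_sync_delay(slot_created_at, last_check_at, nap=NAP_TIME_S):
    """Seconds between slot creation on the primary and the first launcher
    check that can observe it. Times are seconds on a shared clock."""
    if slot_created_at <= last_check_at:
        return 0.0  # the slot was already visible at the last check
    next_check = last_check_at + nap
    return max(0.0, next_check - slot_created_at)
```

For example, a slot created 10 s after an empty check is not seen for another 170 s, which matches the "synchronization may start 3 min later" worst case described above. A notification channel (or a shorter nap after observing activity) would shrink this window.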
Re: Synchronizing slots from primary to standby
Hi,

On 6/26/23 12:34 PM, Amit Kapila wrote:
> On Mon, Jun 26, 2023 at 11:15 AM Drouvot, Bertrand wrote:
>> On 6/20/23 12:22 PM, Amit Kapila wrote:
>>> On Mon, Jun 19, 2023 at 9:56 PM Drouvot, Bertrand wrote:
>>>> In such a case (slot valid on the primary but invalidated on the
>>>> standby) then I think we could drop and recreate the invalidated
>>>> slot on the standby.
>>>
>>> Will it be safe? Because after recreating the slot, it will reserve
>>> the new WAL location and build the snapshot based on that which might
>>> miss some important information in the snapshot. For example, to
>>> update the slot's position with new information from the primary, the
>>> patch uses pg_logical_replication_slot_advance() which means it will
>>> process all records and update the snapshot via
>>> DecodeCommit->SnapBuildCommitTxn().
>>
>> Your concern is that the slot could have been consumed on the standby?
>>
>> I mean, if we suppose the "synchronized" slot can't be consumed on the
>> standby then drop/recreate such an invalidated slot would be ok?
>
> That also may not be sufficient because as soon as the slot is
> invalidated/dropped, the required WAL could be removed on standby.

Yeah, I think once the slot is dropped we just have to wait for the slot
to be re-created on the standby according to the new
synchronize_slot_names GUC.

Assuming the initial slot "creation" on the standby (coming from the
synchronize_slot_names usage) is working "correctly" then it should also
work "correctly" once the slot is dropped.

If we agree that a synchronized slot can not/should not be consumed (will
implement this behavior) then I think the proposed scenario above should
make sense, do you agree?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Synchronizing slots from primary to standby
On Mon, Jun 26, 2023 at 11:15 AM Drouvot, Bertrand wrote:
>
> On 6/20/23 12:22 PM, Amit Kapila wrote:
> > On Mon, Jun 19, 2023 at 9:56 PM Drouvot, Bertrand wrote:
> >> In such a case (slot valid on the primary but invalidated on the
> >> standby) then I think we could drop and recreate the invalidated
> >> slot on the standby.
> >
> > Will it be safe? Because after recreating the slot, it will reserve
> > the new WAL location and build the snapshot based on that which might
> > miss some important information in the snapshot. For example, to
> > update the slot's position with new information from the primary, the
> > patch uses pg_logical_replication_slot_advance() which means it will
> > process all records and update the snapshot via
> > DecodeCommit->SnapBuildCommitTxn().
>
> Your concern is that the slot could have been consumed on the standby?
>
> I mean, if we suppose the "synchronized" slot can't be consumed on the
> standby then drop/recreate such an invalidated slot would be ok?
>

That also may not be sufficient because as soon as the slot is
invalidated/dropped, the required WAL could be removed on standby.

--
With Regards,
Amit Kapila.
Re: Synchronizing slots from primary to standby
Hi,

On 6/20/23 12:22 PM, Amit Kapila wrote:
> On Mon, Jun 19, 2023 at 9:56 PM Drouvot, Bertrand wrote:
>> In such a case (slot valid on the primary but invalidated on the
>> standby) then I think we could drop and recreate the invalidated slot
>> on the standby.
>
> Will it be safe? Because after recreating the slot, it will reserve the
> new WAL location and build the snapshot based on that which might miss
> some important information in the snapshot. For example, to update the
> slot's position with new information from the primary, the patch uses
> pg_logical_replication_slot_advance() which means it will process all
> records and update the snapshot via DecodeCommit->SnapBuildCommitTxn().

Your concern is that the slot could have been consumed on the standby?

I mean, if we suppose the "synchronized" slot can't be consumed on the
standby then drop/recreate such an invalidated slot would be ok?

Asking, because I'm not sure we should allow consumption of a
"synchronized" slot until the standby gets promoted. When the patch has
been initially proposed, logical decoding from a standby was not
implemented yet.

> The other related thing is that do we somehow need to ensure that WAL
> is replayed on standby before moving the slot's position to the target
> location received from the primary?

Yeah, will check if this is currently done that way in the patch
proposal.

>>> BTW, does the patch handles drop of logical slots on standby when the
>>> same slot is dropped on publisher/primary?
>>
>> from what I've seen, yes it looks like it behaves that way (will look
>> closer).
>
> Okay, I have asked because I don't see a call to ReplicationSlotDrop()
> in the patch.

Right. I'd need to look closer to understand how it works (for the
moment the "only" thing I've done was the re-base shared up-thread).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Synchronizing slots from primary to standby
On Mon, Jun 19, 2023 at 9:56 PM Drouvot, Bertrand wrote:
>
> On 6/19/23 12:03 PM, Amit Kapila wrote:
> > On Mon, Jun 19, 2023 at 11:34 AM Drouvot, Bertrand wrote:
> >>
> >> Also I think we need to handle the case of invalidated replication
> >> slot(s): should we drop/recreate it/them? (as the main goal is to
> >> have sync slot(s) on the standby).
> >>
> >
> > Do you intend to ask what happens to logical slots invalidated (due to
> > say max_slot_wal_keep_size) on publisher? I think those should be
> > invalidated on standby too.
>
> Agree that it should behave that way.
>
> > Another thought whether there is chance that the slot on standby gets
> > invalidated due to conflict (say required rows removed on primary)?
>
> That's the scenario I had in mind when asking the question above.
>
> > I think in such cases the slot on primary/publisher should have been
> > dropped/invalidated by that time.
>
> I don't think so.
>
> For example, such a scenario could occur:
>
> - there is no physical slot between the standby and the primary
> - the standby is shut down
> - logical decoding on the primary is moving forward and now there is
>   vacuum operations that will conflict on the standby
> - the standby starts and reports the logical slot being invalidated
>   (while it is not on the primary)
>
> In such a case (slot valid on the primary but invalidated on the
> standby) then I think we could drop and recreate the invalidated slot
> on the standby.
>

Will it be safe? Because after recreating the slot, it will reserve the
new WAL location and build the snapshot based on that which might miss
some important information in the snapshot. For example, to update the
slot's position with new information from the primary, the patch uses
pg_logical_replication_slot_advance() which means it will process all
records and update the snapshot via DecodeCommit->SnapBuildCommitTxn().

The other related thing is that do we somehow need to ensure that WAL is
replayed on standby before moving the slot's position to the target
location received from the primary?

> > BTW, does the patch handles drop of logical slots on standby when the
> > same slot is dropped on publisher/primary?
> >
>
> from what I've seen, yes it looks like it behaves that way (will look
> closer).
>

Okay, I have asked because I don't see a call to ReplicationSlotDrop()
in the patch.

--
With Regards,
Amit Kapila.
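The safeguard raised here (never move a synced slot past what the standby has actually replayed) amounts to clamping the advance target. A minimal sketch, with illustrative function names and PostgreSQL's textual `X/Y` LSN format:

```python
# Sketch of the guard discussed above: before advancing a synchronized
# slot to the position received from the primary, clamp the target to the
# standby's replay position so the slot never points into WAL the standby
# has not replayed yet. Names are illustrative, not the patch's code.

def lsn_to_int(lsn):
    """Parse a PostgreSQL LSN of the form 'X/Y' (two hex halves) into a
    single comparable integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def target_advance_lsn(primary_confirmed_flush, standby_replay_lsn):
    """Advance target = min(position reported by the primary, position
    the standby has replayed). The caller would retry later once replay
    catches up."""
    return min(primary_confirmed_flush, standby_replay_lsn, key=lsn_to_int)
```

In the real patch the alternative to clamping is waiting for replay to reach the primary's position; either way the invariant is the same: the synced slot's position must not exceed the standby's replay LSN.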
Re: Synchronizing slots from primary to standby
Hi,

On 6/19/23 12:03 PM, Amit Kapila wrote:
> On Mon, Jun 19, 2023 at 11:34 AM Drouvot, Bertrand wrote:
>> Also I think we need to handle the case of invalidated replication
>> slot(s): should we drop/recreate it/them? (as the main goal is to have
>> sync slot(s) on the standby).
>
> Do you intend to ask what happens to logical slots invalidated (due to
> say max_slot_wal_keep_size) on publisher? I think those should be
> invalidated on standby too.

Agree that it should behave that way.

> Another thought whether there is chance that the slot on standby gets
> invalidated due to conflict (say required rows removed on primary)?

That's the scenario I had in mind when asking the question above.

> I think in such cases the slot on primary/publisher should have been
> dropped/invalidated by that time.

I don't think so.

For example, such a scenario could occur:

- there is no physical slot between the standby and the primary
- the standby is shut down
- logical decoding on the primary is moving forward and now there is
  vacuum operations that will conflict on the standby
- the standby starts and reports the logical slot being invalidated
  (while it is not on the primary)

In such a case (slot valid on the primary but invalidated on the
standby) then I think we could drop and recreate the invalidated slot on
the standby.

> BTW, does the patch handles drop of logical slots on standby when the
> same slot is dropped on publisher/primary?

from what I've seen, yes it looks like it behaves that way (will look
closer).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Synchronizing slots from primary to standby
On Mon, Jun 19, 2023 at 11:34 AM Drouvot, Bertrand wrote:
>
> Also I think we need to handle the case of invalidated replication
> slot(s): should we drop/recreate it/them? (as the main goal is to have
> sync slot(s) on the standby).
>

Do you intend to ask what happens to logical slots invalidated (due to
say max_slot_wal_keep_size) on publisher? I think those should be
invalidated on standby too. Another thought whether there is chance that
the slot on standby gets invalidated due to conflict (say required rows
removed on primary)? I think in such cases the slot on primary/publisher
should have been dropped/invalidated by that time.

BTW, does the patch handles drop of logical slots on standby when the
same slot is dropped on publisher/primary?

--
With Regards,
Amit Kapila.
Re: Synchronizing slots from primary to standby
Hi,

On 6/16/23 11:56 AM, Amit Kapila wrote:
> On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand wrote:
>> Please find attached V5 (a rebase of V4 posted up-thread).
>>
>> In addition to the "rebasing" work, the TAP test adds a test about
>> conflict handling (logical slot invalidation) relying on the work done
>> in the "Minimal logical decoding on standby" patch series.
>>
>> I did not look more at the patch (than what's was needed for the
>> rebase) but plan to do so.
>
> Are you still planning to continue working on this?

Yes, I think it would be great to have such a feature in core.

> Some miscellaneous comments while going through this patch are as
> follows.

Thanks! I'll look at them and will try to come back to you by mid of
next week.

Also I think we need to handle the case of invalidated replication
slot(s): should we drop/recreate it/them? (as the main goal is to have
sync slot(s) on the standby).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Synchronizing slots from primary to standby
On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand wrote:
>
> Please find attached V5 (a rebase of V4 posted up-thread).
>
> In addition to the "rebasing" work, the TAP test adds a test about
> conflict handling (logical slot invalidation) relying on the work done
> in the "Minimal logical decoding on standby" patch series.
>
> I did not look more at the patch (than what's was needed for the
> rebase) but plan to do so.
>

Are you still planning to continue working on this? Some miscellaneous
comments while going through this patch are as follows:

1. Can you please try to explain the functionality of the overall patch
somewhere, in the form of comments and/or a commit message?

2. It seems that the initially synchronized list of slots is only used
to launch a per-database worker to synchronize all the slots
corresponding to that database. If so, then why do we need to fetch all
the slot-related information via the LIST_SLOTS command?

3. As mentioned in the initial email, I think it would be better to
replace the LIST_SLOTS command with a SELECT query.

4. How is the limit on sync_slot workers decided? Can we document such a
piece of information? Do we need a new GUC to decide the number of
workers? Ideally, it would be better to avoid a GUC; can we use any
existing logical replication worker related GUC?

5. Can we separate out the functionality related to standby_slot_names
into a separate patch, probably the first one? I think that will make
the patch easier to review.

6. In libpqrcv_list_slots(), two-phase related slot information is not
retrieved. Is there a reason for the same?

7.
+static void
+wait_for_standby_confirmation(XLogRecPtr commit_lsn)

Some comments atop this function would make it easier to review.

8.
+/*-
+ * slotsync.c
+ *    PostgreSQL worker for synchronizing slots to a standby from primary
+ *
+ * Copyright (c) 2016-2018, PostgreSQL Global Development Group
+ *

The copyright notice is out-of-date.

9. Why does synchronize_one_slot() compare
MyReplicationSlot->data.restart_lsn with the value of confirmed_flush_lsn
passed to it? Also, why does it do so only for new slots but not existing
slots?

10. Can we somehow test if the restart_lsn is advanced properly after
sync? I think it is important to ensure that because otherwise after
standby's promotion, the subscriber can start syncing from the wrong
position.

--
With Regards,
Amit Kapila.
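Point 10 above can be expressed as a small comparison check: after a sync cycle, each slot's restart_lsn on the standby should have reached the primary's value. A hypothetical helper (names and the dict-based representation are illustrative, not the patch's):

```python
# Sketch of a verification for point 10: given the restart_lsn of each
# slot on the primary and on the standby (as comparable integers, e.g.
# already parsed from the 'X/Y' text form), report the slots whose copy
# on the standby is still behind. An empty result means the sync cycle
# caught up everywhere.

def slots_in_sync(primary_slots, standby_slots):
    """primary_slots / standby_slots: dicts of slot_name -> restart_lsn.
    Returns {slot_name: (primary_lsn, standby_lsn)} for lagging slots."""
    lagging = {}
    for name, primary_lsn in primary_slots.items():
        standby_lsn = standby_slots.get(name)
        if standby_lsn is None or standby_lsn < primary_lsn:
            lagging[name] = (primary_lsn, standby_lsn)
    return lagging
```

A TAP test could implement the same idea by querying pg_replication_slots on both nodes after a sync cycle and asserting the lagging set is empty.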
Re: Synchronizing slots from primary to standby
Hi,

On 4/14/23 3:22 PM, Drouvot, Bertrand wrote:
> Now that the "Minimal logical decoding on standby" patch series
> (mentioned up-thread) has been committed, I think we can resume working
> on this one ("Synchronizing slots from primary to standby").
>
> I'll work on a rebase and share it once done (unless someone already
> started working on a rebase).

Please find attached V5 (a rebase of V4 posted up-thread).

In addition to the "rebasing" work, the TAP test adds a test about
conflict handling (logical slot invalidation) relying on the work done in
the "Minimal logical decoding on standby" patch series.

I did not look more at the patch (than what's was needed for the rebase)
but plan to do so.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 655359eedf37d8f2e522aeb1ec8c48adfc1759b1 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot
Date: Thu, 13 Apr 2023 11:32:28 +
Subject: [PATCH v5] Synchronize logical replication slots from primary to
 standby

---
 doc/src/sgml/config.sgml                      |  34 ++
 src/backend/commands/subscriptioncmds.c       |   4 +-
 src/backend/postmaster/bgworker.c             |   3 +
 .../libpqwalreceiver/libpqwalreceiver.c       |  95 ++++
 src/backend/replication/logical/Makefile      |   1 +
 src/backend/replication/logical/launcher.c    | 263 +++++++
 src/backend/replication/logical/meson.build   |   1 +
 .../replication/logical/reorderbuffer.c       |  86 ++++
 src/backend/replication/logical/slotsync.c    | 413 ++++++++++
 src/backend/replication/logical/tablesync.c   |  13 +-
 src/backend/replication/logical/worker.c      |   3 +-
 src/backend/replication/repl_gram.y           |  32 +-
 src/backend/replication/repl_scanner.l        |   2 +
 src/backend/replication/slotfuncs.c           |   2 +-
 src/backend/replication/walsender.c           | 195 +++++
 src/backend/utils/activity/wait_event.c       |   3 +
 src/backend/utils/misc/guc_tables.c           |  26 ++
 src/backend/utils/misc/postgresql.conf.sample |   2 +
 src/include/commands/subscriptioncmds.h       |   3 +
 src/include/nodes/replnodes.h                 |   9 +
 src/include/replication/logicallauncher.h     |   2 +
 src/include/replication/logicalworker.h       |   9 +
 src/include/replication/slot.h                |   5 +-
 src/include/replication/walreceiver.h         |  20 +
 src/include/replication/worker_internal.h     |   8 +-
 src/include/utils/wait_event.h                |   1 +
 src/test/recovery/meson.build                 |   1 +
 src/test/recovery/t/037_slot_sync.pl          | 130 ++++
 28 files changed, 1272 insertions(+), 94 deletions(-)
  3.8% doc/src/sgml/
  7.1% src/backend/replication/libpqwalreceiver/
 54.7% src/backend/replication/logical/
 14.9% src/backend/replication/
  3.3% src/backend/
  4.0% src/include/replication/
 10.9% src/test/recovery/t/

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 091a79d4f3..1360885208 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4466,6 +4466,23 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
+
+      standby_slot_names (string)
+      standby_slot_names configuration parameter
+
+        List of physical replication slots that logical replication waits
+        for. If a logical replication connection is meant to switch to a
+        physical standby after the standby is promoted, the physical
+        replication slot for the standby should be listed here. This
+        ensures that logical replication is not ahead of the physical
+        standby.
+
@@ -4649,6 +4666,23 @@ ANY num_sync (
+      synchronize_slot_names (string)
+      synchronize_slot_names configuration parameter
+
+        Specifies a list of logical replication slots that a physical
+        standby should synchronize from the primary server. This is
+        necessary to be able to retarget those logical replication
+        connections to this standby if it gets promoted. Specify * to
+        synchronize all logical replication slots. The default is empty.
+
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 3251d89ba8..8721706b79 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -991,7 +991,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 			RemoveSubscriptionRel(sub->oid, relid);

-			logicalrep_worker_stop(sub->oid, relid);
+			logicalrep_worker_stop(MyDatabaseId, sub->oid, relid);

 			/*
 			 * For READY state, we would have already dropped the
@@
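For reference, here is a configuration sketch combining the two GUCs added by the patch, as described in the doc hunks above. The slot names, host, and user are illustrative, and whether `primary_conninfo` needs a `dbname` is one of the open questions in this thread:

```
# postgresql.conf on the primary: keep logical walsenders from running
# ahead of the physical standby that may take over their slots
standby_slot_names = 'standby1_slot'

# postgresql.conf on the standby: which logical slots to mirror from
# the primary ('*' mirrors all of them, per the patch's docs)
synchronize_slot_names = '*'
primary_conninfo = 'host=primary user=repl'
```

With this in place, a logical subscriber can be retargeted at the standby after promotion without its slot having advanced past what the standby received.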
Re: Synchronizing slots from primary to standby
Hi,

On 11/15/22 10:02 AM, Drouvot, Bertrand wrote:
> Hi,
>
> On 2/11/22 3:26 PM, Peter Eisentraut wrote:
>> On 10.02.22 22:47, Bruce Momjian wrote:
>>> On Tue, Feb 8, 2022 at 08:27:32PM +0530, Ashutosh Sharma wrote:
>>>>>> Which means that if e.g. the standby_slot_names GUC differs from
>>>>>> synchronize_slot_names on the physical replica, the slots
>>>>>> synchronized on the physical replica are not going to be valid.
>>>>>> Or if the primary drops its logical slots.
>>>>>
>>>>> Should the redo function for the drop replication slot have the
>>>>> capability to drop it on standby and its subscribers (if any) as
>>>>> well?
>>>>
>>>> Slots are not WAL logged (and shouldn't be). I think you pretty much
>>>> need the recovery conflict handling infrastructure I referenced
>>>> upthread, which recognized during replay if a record has a conflict
>>>> with a slot on a standby. And then ontop of that you can build
>>>> something like this patch.
>>>
>>> OK. Understood, thanks Andres. I would love to see this feature in
>>> PG 15. Can someone explain its current status? Thanks.
>>
>> The way I understand it:
>>
>> 1. This feature (probably) depends on the "Minimal logical decoding on
>> standbys" patch. The details there aren't totally clear (to me). That
>> patch had some activity lately but I don't see it in a state that it's
>> nearing readiness.
>
> FWIW, a proposal has been submitted in [1] to add information in the
> WAL records in preparation for logical slot conflict handling.
>
> [1]: https://www.postgresql.org/message-id/178cf7da-9bd7-e328-9c49-e28ac4701...@gmail.com

Now that the "Minimal logical decoding on standby" patch series
(mentioned up-thread) has been committed, I think we can resume working
on this one ("Synchronizing slots from primary to standby").

I'll work on a rebase and share it once done (unless someone already
started working on a rebase).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Synchronizing slots from primary to standby
Hi,

On 2/11/22 3:26 PM, Peter Eisentraut wrote:
> On 10.02.22 22:47, Bruce Momjian wrote:
>> On Tue, Feb 8, 2022 at 08:27:32PM +0530, Ashutosh Sharma wrote:
>>>>> Which means that if e.g. the standby_slot_names GUC differs from
>>>>> synchronize_slot_names on the physical replica, the slots
>>>>> synchronized on the physical replica are not going to be valid. Or
>>>>> if the primary drops its logical slots.
>>>>
>>>> Should the redo function for the drop replication slot have the
>>>> capability to drop it on standby and its subscribers (if any) as
>>>> well?
>>>
>>> Slots are not WAL logged (and shouldn't be). I think you pretty much
>>> need the recovery conflict handling infrastructure I referenced
>>> upthread, which recognized during replay if a record has a conflict
>>> with a slot on a standby. And then ontop of that you can build
>>> something like this patch.
>>
>> OK. Understood, thanks Andres. I would love to see this feature in
>> PG 15. Can someone explain its current status? Thanks.
>
> The way I understand it:
>
> 1. This feature (probably) depends on the "Minimal logical decoding on
> standbys" patch. The details there aren't totally clear (to me). That
> patch had some activity lately but I don't see it in a state that it's
> nearing readiness.

FWIW, a proposal has been submitted in [1] to add information in the WAL
records in preparation for logical slot conflict handling.

[1]: https://www.postgresql.org/message-id/178cf7da-9bd7-e328-9c49-e28ac4701...@gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Synchronizing slots from primary to standby
Hi,

I have spent a little time trying to understand the concern raised by
Andres, and while doing so I could think of a couple of issues which I
would like to share here, although I'm not quite sure how in line these
are with the problems seen by Andres.

1) Firstly, what if we come across a situation where the failover occurs
when the confirmed flush lsn has been updated on the primary but is yet
to be updated on the standby? I believe this may very well be the case,
especially considering that the standby sends SQL queries to the primary
to synchronize the replication slots at regular intervals. If the
primary dies just after updating the confirmed flush lsn of its logical
subscribers, the standby may not be able to get this information/update
from the primary, which means we'll probably end up having a broken
logical replication slot on the new primary.

2) Secondly, if the standby goes down, the logical subscribers will stop
receiving new changes from the primary as per the design of this patch.
Likewise, if the standby lags behind the primary for whatever reason, it
will have a direct impact on the logical subscribers as well.

--
With Regards,
Ashutosh Sharma.

On Sat, Feb 19, 2022 at 3:53 AM Andres Freund wrote:
>
> Hi,
>
> On 2022-02-11 15:28:19 +0100, Peter Eisentraut wrote:
> > On 05.02.22 20:59, Andres Freund wrote:
> > > On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote:
> > > > From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001
> > > > From: Peter Eisentraut
> > > > Date: Mon, 3 Jan 2022 14:43:36 +0100
> > > > Subject: [PATCH v3] Synchronize logical replication slots from
> > > > primary to standby
> > >
> > > I've just skimmed the patch and the related threads. As far as I can
> > > tell this cannot be safely used without the conflict handling in
> > > [1], is that correct?
> >
> > This or similar questions have been asked a few times about this or
> > similar patches, but they always come with some doubt.
>
> I'm certain it's a problem - the only reason I couched it was that there
> could have been something clever in the patch preventing problems that I
> missed because I just skimmed it.
>
> > If we think so, it would be useful perhaps if we could come up with
> > test cases that would demonstrate why that other patch/feature is
> > necessary. (I'm not questioning it personally, I'm just throwing out
> > ideas here.)
>
> The patch as-is just breaks one of the fundamental guarantees necessary
> for logical decoding, that no row versions can be removed that are
> still required for logical decoding (signalled via catalog_xmin). So
> there needs to be an explicit mechanism upholding that guarantee, but
> there is not right now from what I can see.
>
> One piece of the referenced patchset is that it adds information about
> removed catalog rows to a few WAL records, and then verifies during
> replay that no record can be replayed that removes resources that are
> still needed. If such a conflict exists it's dealt with as a recovery
> conflict.
>
> That itself doesn't provide prevention against removal of required
> rows, but it provides detection. The prevention against removal can
> then be done using a physical replication slot with hot standby
> feedback or some other mechanism (e.g. the slot syncing mechanism could
> maintain a "placeholder" slot on the primary for all sync targets or
> something like that).
>
> Even if that infrastructure existed / was merged, the slot sync stuff
> would still need some very careful logic to protect against problems
> due to concurrent WAL replay and "synchronized slot" creation. But
> that's doable.
>
> Greetings,
>
> Andres Freund
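The detection mechanism Andres describes (WAL records carry information about the catalog rows they remove, and replay flags a recovery conflict for any slot that still needs them) can be sketched roughly as follows. The names and the exact comparison are illustrative simplifications, not PostgreSQL's actual implementation:

```python
# Rough model of the replay-time conflict check described above: a
# pruning/vacuum WAL record carries the cutoff XID of the catalog rows
# it removed; any slot whose catalog_xmin is older than that cutoff may
# still need those rows for decoding, so replaying the record conflicts
# with the slot and must be handled as a recovery conflict.

def conflicting_slots(record_catalog_cutoff_xid, slots):
    """slots: dict of slot_name -> catalog_xmin (None if the slot does
    not retain catalog rows). Returns the names of conflicting slots."""
    return [name for name, catalog_xmin in slots.items()
            if catalog_xmin is not None
            and catalog_xmin < record_catalog_cutoff_xid]
```

Detection alone (as noted above) is not prevention: preventing the removal in the first place still requires something like a physical slot with hot_standby_feedback, or placeholder slots on the primary.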
Re: Synchronizing slots from primary to standby
On Thu, Feb 24, 2022 at 12:46 AM James Coleman wrote:
> I've been working on adding test coverage to prove this out, but I've
> encountered the problem reported in [1].
>
> My assumption, but Andres please correct me if I'm wrong, that we
> should see issues with the following steps (given the primary,
> physical replica, and logical subscriber already created in the test):
>
> 1. Ensure both logical subscriber and physical replica are caught up
> 2. Disable logical subscription
> 3. Make a catalog change on the primary (currently renaming the
>    primary key column)
> 4. Vacuum pg_class
> 5. Ensure physical replication is caught up
> 6. Stop primary and promote the replica
> 7. Write to the changed table
> 8. Update subscription to point to promoted replica
> 9. Re-enable logical subscription
>
> I'm attaching my test as an additional patch in the series for
> reference. Currently I have steps 3 and 4 commented out to show that
> the issues in [1] occur without any attempt to trigger the catalog
> xmin problem.
>
> Given this error seems pretty significant in terms of indicating
> fundamental lack of test coverage (the primary stated benefit of the
> patch is physical failover), it is currently a blocker to testing
> more deeply.

A few of my initial concerns specified at [1] are these:

1) Instead of a new LIST_SLOT command, can't we use
READ_REPLICATION_SLOT (slight modifications need to be done to make it
support logical replication slots and to get more information from the
subscriber)?

2) How frequently is the new bg worker going to sync the slot info? How
can it ensure that the latest information exists, say, when the
subscriber is down/crashed before it picks up the latest slot
information?

4) IIUC, the proposal works only for logical replication slots, but do
you also see the need for supporting some kind of synchronization of
physical replication slots as well? IMO, we need a better and
consistent way for both types of replication slots. If the walsender
can somehow push the slot info from the primary (for physical
replication slots)/publisher (for logical replication slots) to the
standby/subscribers, this will be a more consistent and simplistic
design. However, I'm not sure if this design is doable at all.

Can anyone help clarify these?

[1] https://www.postgresql.org/message-id/CALj2ACUGNGfWRtwwZwT-Y6feEP8EtOMhVTE87rdeY14mBpsRUA%40mail.gmail.com

Regards,
Bharath Rupireddy.
Re: Synchronizing slots from primary to standby
On Fri, Feb 18, 2022 at 5:23 PM Andres Freund wrote:
> Hi,
>
> On 2022-02-11 15:28:19 +0100, Peter Eisentraut wrote:
> > On 05.02.22 20:59, Andres Freund wrote:
> > > On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote:
> > > > From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001
> > > > From: Peter Eisentraut
> > > > Date: Mon, 3 Jan 2022 14:43:36 +0100
> > > > Subject: [PATCH v3] Synchronize logical replication slots from primary to standby
> > > I've just skimmed the patch and the related threads. As far as I can tell this cannot be safely used without the conflict handling in [1], is that correct?
> >
> > This or similar questions have been asked a few times about this or similar patches, but they always come with some doubt.
>
> I'm certain it's a problem - the only reason I couched it was that there could have been something clever in the patch preventing problems that I missed because I just skimmed it.
>
> > If we think so, it would be useful perhaps if we could come up with test cases that would demonstrate why that other patch/feature is necessary. (I'm not questioning it personally, I'm just throwing out ideas here.)
>
> The patch as-is just breaks one of the fundamental guarantees necessary for logical decoding, that no row versions can be removed that are still required for logical decoding (signalled via catalog_xmin). So there needs to be an explicit mechanism upholding that guarantee, but there is not right now from what I can see.

I've been working on adding test coverage to prove this out, but I've encountered the problem reported in [1].

My assumption, but Andres please correct me if I'm wrong, is that we should see issues with the following steps (given the primary, physical replica, and logical subscriber already created in the test):

1. Ensure both logical subscriber and physical replica are caught up
2. Disable logical subscription
3. Make a catalog change on the primary (currently renaming the primary key column)
4. Vacuum pg_class
5. Ensure physical replication is caught up
6. Stop primary and promote the replica
7. Write to the changed table
8. Update subscription to point to promoted replica
9. Re-enable logical subscription

I'm attaching my test as an additional patch in the series for reference. Currently I have steps 3 and 4 commented out to show that the issues in [1] occur without any attempt to trigger the catalog xmin problem.

Given that this error seems pretty significant in terms of indicating a fundamental lack of test coverage (the primary stated benefit of the patch is physical failover), it currently is a blocker to testing more deeply.

Thanks, James Coleman

1: https://www.postgresql.org/message-id/TYCPR01MB684949EA7AA904EE938548C79F3A9%40TYCPR01MB6849.jpnprd01.prod.outlook.com

v4-0002-Add-failover-test.patch Description: Binary data
v4-0001-Synchronize-logical-replication-slots-from-primar.patch Description: Binary data
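For reference, the steps above map roughly to the following SQL. This is a hedged sketch only: the table, subscription, and connection names are made up for illustration, and the promotion/caught-up checks that the actual TAP test performs via the test harness are reduced to single calls.

```sql
-- On the logical subscriber: pause replication (step 2)
ALTER SUBSCRIPTION sub1 DISABLE;

-- On the primary: catalog change plus catalog vacuum (steps 3 and 4)
ALTER TABLE test RENAME COLUMN id TO renamed_id;
VACUUM pg_class;

-- On the physical replica, once it has caught up: promote it (step 6)
SELECT pg_promote();

-- On the promoted replica: write to the changed table (step 7)
INSERT INTO test (renamed_id) VALUES (1);

-- On the subscriber: repoint the subscription and resume (steps 8 and 9)
ALTER SUBSCRIPTION sub1 CONNECTION 'host=promoted-replica dbname=postgres';
ALTER SUBSCRIPTION sub1 ENABLE;
```

If the synchronized slot on the replica is missing catalog information, step 9 is where decoding should fail or produce wrong output.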
RE: Synchronizing slots from primary to standby
Hello, The status of this patch is already "Returned with feedback". However, I applied this patch on 27b77ecf9f and tested it, so I am reporting the results. make installcheck-world passes. However, when I promote the standby server and update on the new primary server, the apply worker could not start logical replication and emits the following errors.

LOG: background worker "logical replication worker" (PID 14506) exited with exit code 1
LOG: logical replication apply worker for subscription "sub1" has started
ERROR: terminating logical replication worker due to timeout
LOG: background worker "logical replication worker" (PID 14535) exited with exit code 1
LOG: logical replication apply worker for subscription "sub1" has started

It seems that the apply worker does not start because a walsender already exists on the new primary. Do you have any thoughts about what the cause might be? The test script is attached. regards, sho kato failover.sh Description: failover.sh
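One way to check whether a stale walsender is still attached to the slot on the promoted primary is to query the monitoring views directly (a diagnostic sketch, not part of the patch; run on the new primary after promotion):

```sql
-- Connected streaming clients and their state
SELECT pid, application_name, state, sent_lsn
FROM pg_stat_replication;

-- walsender backends also show up in pg_stat_activity
SELECT pid, backend_type, application_name, state
FROM pg_stat_activity
WHERE backend_type = 'walsender';

-- Is the synchronized slot still marked active, and by which PID?
SELECT slot_name, active, active_pid
FROM pg_replication_slots;
```

If `active` is true with a lingering `active_pid`, the new apply worker cannot acquire the slot until that walsender exits.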
Re: Synchronizing slots from primary to standby
Hi,

On 2022-02-11 15:28:19 +0100, Peter Eisentraut wrote:
> On 05.02.22 20:59, Andres Freund wrote:
> > On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote:
> > > From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001
> > > From: Peter Eisentraut
> > > Date: Mon, 3 Jan 2022 14:43:36 +0100
> > > Subject: [PATCH v3] Synchronize logical replication slots from primary to standby
> > I've just skimmed the patch and the related threads. As far as I can tell this cannot be safely used without the conflict handling in [1], is that correct?
>
> This or similar questions have been asked a few times about this or similar patches, but they always come with some doubt.

I'm certain it's a problem - the only reason I couched it was that there could have been something clever in the patch preventing problems that I missed because I just skimmed it.

> If we think so, it would be useful perhaps if we could come up with test cases that would demonstrate why that other patch/feature is necessary. (I'm not questioning it personally, I'm just throwing out ideas here.)

The patch as-is just breaks one of the fundamental guarantees necessary for logical decoding, that no row versions can be removed that are still required for logical decoding (signalled via catalog_xmin). So there needs to be an explicit mechanism upholding that guarantee, but there is not right now from what I can see.

One piece of the referenced patchset is that it adds information about removed catalog rows to a few WAL records, and then verifies during replay that no record can be replayed that removes resources that are still needed. If such a conflict exists it's dealt with as a recovery conflict. That itself doesn't provide prevention against removal of required rows, but it provides detection. The prevention against removal can then be done using a physical replication slot with hot standby feedback or some other mechanism (e.g.
slot syncing mechanism could maintain a "placeholder" slot on the primary for all sync targets or something like that). Even if that infrastructure existed / was merged, the slot sync stuff would still need some very careful logic to protect against problems due to concurrent WAL replay and "synchronized slot" creation. But that's doable. Greetings, Andres Freund
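The conventional prevention mechanism Andres alludes to, a physical replication slot combined with hot standby feedback, can be set up like this (a minimal sketch; the slot name is illustrative, and note that plain hot_standby_feedback propagates xmin/catalog_xmin but does not by itself make decoding on a standby safe, which is exactly what the referenced patchset addresses):

```sql
-- On the primary: reserve WAL and xmin horizon for the standby
SELECT pg_create_physical_replication_slot('standby1');

-- On the standby: stream through that slot and feed xmin back
ALTER SYSTEM SET primary_slot_name = 'standby1';
ALTER SYSTEM SET hot_standby_feedback = on;
-- primary_slot_name requires a restart of the standby to take effect
```

With this in place the primary will not remove row versions still needed by the standby, at the cost of potential bloat on the primary if the standby falls far behind.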
Re: Synchronizing slots from primary to standby
On Fri, Feb 11, 2022 at 9:26 AM Peter Eisentraut wrote:
> The way I understand it:
>
> 1. This feature (probably) depends on the "Minimal logical decoding on standbys" patch. The details there aren't totally clear (to me). That patch had some activity lately but I don't see it in a state that it's nearing readiness.
>
> 2. I think the way this (my) patch is currently written needs some refactoring about how we launch and manage workers. Right now, it's all mangled together with logical replication, since that is a convenient way to launch and manage workers, but it really doesn't need to be tied to logical replication, since it can also be used for other logical slots.
>
> 3. It's an open question how to configure this. My patch shows a very minimal configuration that allows you to keep all logical slots always behind one physical slot, which addresses one particular use case. In general, you might have things like, one set of logical slots should stay behind one physical slot, another set behind another physical slot, another set should not care, etc. This could turn into something like the synchronous replication feature, where it ends up with its own configuration language.
>
> Each of these is clearly a significant job on its own.

Thanks for bringing this topic back up again. I haven't gotten a chance to do any testing on the patch yet, but here are my initial notes from reviewing it:

First, reusing the logical replication launcher seems a bit gross. It's obviously a pragmatic choice, but I find it confusing and likely to become only more so given the fact that there's nothing about slot syncing that's inherently limited to logical slots. Plus the feature currently is about syncing slots on a physical replica. So I think that probably should change.

Second, it seems to me that the worker-per-DB architecture means that this is unworkable on a cluster with a large number of DBs.
The original thread said that was because "logical slots are per database, walrcv_exec needs db connection, etc". As to the walrcv_exec, we're (re)connecting to the primary for each synchronization anyway, so that doesn't seem like a significant reason. I don't understand why logical slots being per-database means we have to do it this way. Is there something about the background worker architecture (I'm revealing my own ignorance here I suppose) that requires this?

Also it seems that we reconnect to the primary every time we want to synchronize slots. Maybe that's OK, but it struck me as a bit odd, so I wanted to ask about it.

Third, wait_for_standby_confirmation() needs a function comment. Andres noted this earlier, but it seems like we're doing quite a bit of work in this function for each commit. Some of it is obviously duplicative, like the parsing of standby_slot_names. The waiting introduced also doesn't seem like a good idea. Andres also commented on that earlier; I'd echo his comments here absent an explanation of why it's preferable/necessary to do it this way.

> + if (flush_pos >= commit_lsn && wait_slots_remaining > 0)
> + wait_slots_remaining --;

I might be missing something re: project style, but the space before the "--" looks odd to my eyes.

> * Call either PREPARE (for two-phase transactions) or COMMIT (for
> * regular ones).
> */
> +
> + wait_for_standby_confirmation(commit_lsn);
> +
> if (rbtxn_prepared(txn))
> rb->prepare(rb, txn, commit_lsn);
> else

It appears the addition of this call splits the comment from the code it goes with.

> + * Wait for remote slot to pass localy reserved position.

Typo ("localy" -> "locally").

This patch would be a significant improvement for us; I'm hoping we can see some activity on it. I'm also hoping to try to do some testing next week and see if I can poke any holes in the functionality (with the goal of verifying Andres's concerns about the safety without the minimal logical decoding on a replica patch).
Thanks, James Coleman
Re: Synchronizing slots from primary to standby
On Fri, Feb 11, 2022 at 9:26 AM Peter Eisentraut wrote: > > On 10.02.22 22:47, Bruce Momjian wrote: > > I would love to see this feature in PG 15. Can someone explain its > > current status? Thanks. > > The way I understand it: > ... Hi Peter, I'm starting to review this patch, and last time I checked I noticed it didn't seem to apply cleanly to master anymore. Would you be able to send a rebased version? Thanks, James Coleman
Re: Synchronizing slots from primary to standby
On 05.02.22 20:59, Andres Freund wrote: On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote: From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 3 Jan 2022 14:43:36 +0100 Subject: [PATCH v3] Synchronize logical replication slots from primary to standby I've just skimmed the patch and the related threads. As far as I can tell this cannot be safely used without the conflict handling in [1], is that correct? This or similar questions have been asked a few times about this or similar patches, but they always come with some doubt. If we think so, it would be useful perhaps if we could come up with test cases that would demonstrate why that other patch/feature is necessary. (I'm not questioning it personally, I'm just throwing out ideas here.)
Re: Synchronizing slots from primary to standby
On 10.02.22 22:47, Bruce Momjian wrote: On Tue, Feb 8, 2022 at 08:27:32PM +0530, Ashutosh Sharma wrote: Which means that if e.g. the standby_slot_names GUC differs from synchronize_slot_names on the physical replica, the slots synchronized on the physical replica are not going to be valid. Or if the primary drops its logical slots. Should the redo function for the drop replication slot have the capability to drop it on standby and its subscribers (if any) as well? Slots are not WAL logged (and shouldn't be). I think you pretty much need the recovery conflict handling infrastructure I referenced upthread, which recognizes during replay if a record has a conflict with a slot on a standby. And then on top of that you can build something like this patch. OK. Understood, thanks Andres. I would love to see this feature in PG 15. Can someone explain its current status? Thanks.

The way I understand it:

1. This feature (probably) depends on the "Minimal logical decoding on standbys" patch. The details there aren't totally clear (to me). That patch had some activity lately but I don't see it in a state that it's nearing readiness.

2. I think the way this (my) patch is currently written needs some refactoring about how we launch and manage workers. Right now, it's all mangled together with logical replication, since that is a convenient way to launch and manage workers, but it really doesn't need to be tied to logical replication, since it can also be used for other logical slots.

3. It's an open question how to configure this. My patch shows a very minimal configuration that allows you to keep all logical slots always behind one physical slot, which addresses one particular use case. In general, you might have things like, one set of logical slots should stay behind one physical slot, another set behind another physical slot, another set should not care, etc. This could turn into something like the synchronous replication feature, where it ends up with its own configuration language.

Each of these is clearly a significant job on its own.
Re: Synchronizing slots from primary to standby
On Tue, Feb 8, 2022 at 08:27:32PM +0530, Ashutosh Sharma wrote: > > Which means that if e.g. the standby_slot_names GUC differs from > > synchronize_slot_names on the physical replica, the slots synchronized on > > the > > physical replica are not going to be valid. Or if the primary drops its > > logical slots. > > > > > > > Should the redo function for the drop replication slot have the capability > > > to drop it on standby and its subscribers (if any) as well? > > > > Slots are not WAL logged (and shouldn't be). > > > > I think you pretty much need the recovery conflict handling infrastructure I > > referenced upthread, which recognized during replay if a record has a > > conflict > > with a slot on a standby. And then ontop of that you can build something > > like > > this patch. > > > > OK. Understood, thanks Andres. I would love to see this feature in PG 15. Can someone explain its current status? Thanks. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: Synchronizing slots from primary to standby
On Tue, Feb 8, 2022 at 2:02 AM Andres Freund wrote: > > Hi, > > On 2022-02-07 13:38:38 +0530, Ashutosh Sharma wrote: > > Are you talking about this scenario - what if the logical replication > > slot on the publisher is dropped, but is being referenced by the > > standby where the slot is synchronized? > > It's a bit hard to say, because neither in this thread nor in the patch I've > found a clear description of what the syncing needs to & tries to > guarantee. It might be that that was discussed in one of the precursor > threads, but... > > Generally I don't think we can permit scenarios where a slot can be in a > "corrupt" state, i.e. missing required catalog entries, after "normal" > administrative commands (i.e. not mucking around in catalog entries / on-disk > files). Even if the sequence of commands may be a bit weird. All such cases > need to be either prevented or detected. > > > As far as I can tell, the way this patch keeps slots on physical replicas > "valid" is solely by reorderbuffer.c blocking during replay via > wait_for_standby_confirmation(). > > Which means that if e.g. the standby_slot_names GUC differs from > synchronize_slot_names on the physical replica, the slots synchronized on the > physical replica are not going to be valid. Or if the primary drops its > logical slots. > > > > Should the redo function for the drop replication slot have the capability > > to drop it on standby and its subscribers (if any) as well? > > Slots are not WAL logged (and shouldn't be). > > I think you pretty much need the recovery conflict handling infrastructure I > referenced upthread, which recognized during replay if a record has a conflict > with a slot on a standby. And then ontop of that you can build something like > this patch. > OK. Understood, thanks Andres. -- With Regards, Ashutosh Sharma.
Re: Synchronizing slots from primary to standby
Hi,

On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote:
> +static void
> +ApplyLauncherStartSlotSync(TimestampTz *last_start_time, long *wait_time)
> +{
> [...]
> +
> +	foreach(lc, slots)
> +	{
> +		WalRecvReplicationSlotData *slot_data = lfirst(lc);
> +		LogicalRepWorker *w;
> +
> +		if (!OidIsValid(slot_data->database))
> +			continue;
> +
> +		LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
> +		w = logicalrep_worker_find(slot_data->database, InvalidOid,
> +								   InvalidOid, false);
> +		LWLockRelease(LogicalRepWorkerLock);
> +
> +		if (w == NULL)
> +		{
> +			*last_start_time = now;
> +			*wait_time = wal_retrieve_retry_interval;
> +
> +			logicalrep_worker_launch(slot_data->database, InvalidOid, NULL,
> +									 BOOTSTRAP_SUPERUSERID, InvalidOid);

Do we really need a dedicated worker for each single slot? That seems excessively expensive.

> +++ b/src/backend/replication/logical/reorderbuffer.c
> [...]
> +static void
> +wait_for_standby_confirmation(XLogRecPtr commit_lsn)
> +{
> +	char	   *rawname;
> +	List	   *namelist;
> +	ListCell   *lc;
> +	XLogRecPtr	flush_pos = InvalidXLogRecPtr;
> +
> +	if (strcmp(standby_slot_names, "") == 0)
> +		return;
> +
> +	rawname = pstrdup(standby_slot_names);
> +	SplitIdentifierString(rawname, ',', &namelist);
> +
> +	while (true)
> +	{
> +		int			wait_slots_remaining;
> +		XLogRecPtr	oldest_flush_pos = InvalidXLogRecPtr;
> +		int			rc;
> +
> +		wait_slots_remaining = list_length(namelist);
> +
> +		LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> +		for (int i = 0; i < max_replication_slots; i++)
> +		{
> +			ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
> +			bool		inlist;
> +
> +			if (!s->in_use)
> +				continue;
> +
> +			inlist = false;
> +			foreach (lc, namelist)
> +			{
> +				char	   *name = lfirst(lc);
> +				if (strcmp(name, NameStr(s->data.name)) == 0)
> +				{
> +					inlist = true;
> +					break;
> +				}
> +			}
> +			if (!inlist)
> +				continue;
> +
> +			SpinLockAcquire(&s->mutex);

It doesn't seem like a good idea to perform O(max_replication_slots * #standby_slot_names) work on each decoded commit.
Nor to SplitIdentifierString(pstrdup(standby_slot_names)) every time.

> +			if (s->data.database == InvalidOid)
> +				/* Physical slots advance restart_lsn on flush and ignore confirmed_flush_lsn */
> +				flush_pos = s->data.restart_lsn;
> +			else
> +				/* For logical slots we must wait for commit and flush */
> +				flush_pos = s->data.confirmed_flush;
> +
> +			SpinLockRelease(&s->mutex);
> +
> +			/* We want to find out the min(flush pos) over all named slots */
> +			if (oldest_flush_pos == InvalidXLogRecPtr
> +				|| oldest_flush_pos > flush_pos)
> +				oldest_flush_pos = flush_pos;
> +
> +			if (flush_pos >= commit_lsn && wait_slots_remaining > 0)
> +				wait_slots_remaining --;
> +		}
> +		LWLockRelease(ReplicationSlotControlLock);
> +
> +		if (wait_slots_remaining == 0)
> +			return;
> +
> +		rc = WaitLatch(MyLatch,
> +					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
> +					   1000L, PG_WAIT_EXTENSION);

I don't think it's a good idea to block here like this - no walsender specific handling is going to happen. E.g. not sending replies to receivers will cause them to time out. And for the SQL functions this will cause blocking even though the interface expects to return when reaching the end of the WAL - which this pretty much is.

I think this needs to be restructured so that you only do the checking of the "up to this point" position when needed, rather than every commit. We already *have* a check for not replaying further than the flushed WAL position, see the GetFlushRecPtr() calls in WalSndWaitForWal(), pg_logical_slot_get_changes_guts(). I think you'd basically need to
Re: Synchronizing slots from primary to standby
Hi, On 2022-02-07 13:38:38 +0530, Ashutosh Sharma wrote: > Are you talking about this scenario - what if the logical replication > slot on the publisher is dropped, but is being referenced by the > standby where the slot is synchronized? It's a bit hard to say, because neither in this thread nor in the patch I've found a clear description of what the syncing needs to & tries to guarantee. It might be that that was discussed in one of the precursor threads, but... Generally I don't think we can permit scenarios where a slot can be in a "corrupt" state, i.e. missing required catalog entries, after "normal" administrative commands (i.e. not mucking around in catalog entries / on-disk files). Even if the sequence of commands may be a bit weird. All such cases need to be either prevented or detected. As far as I can tell, the way this patch keeps slots on physical replicas "valid" is solely by reorderbuffer.c blocking during replay via wait_for_standby_confirmation(). Which means that if e.g. the standby_slot_names GUC differs from synchronize_slot_names on the physical replica, the slots synchronized on the physical replica are not going to be valid. Or if the primary drops its logical slots. > Should the redo function for the drop replication slot have the capability > to drop it on standby and its subscribers (if any) as well? Slots are not WAL logged (and shouldn't be). I think you pretty much need the recovery conflict handling infrastructure I referenced upthread, which recognized during replay if a record has a conflict with a slot on a standby. And then ontop of that you can build something like this patch. Greetings, Andres Freund
Re: Synchronizing slots from primary to standby
Hi Andres, Are you talking about this scenario - what if the logical replication slot on the publisher is dropped, but is being referenced by the standby where the slot is synchronized? Should the redo function for the drop replication slot have the capability to drop it on standby and its subscribers (if any) as well? -- With Regards, Ashutosh Sharma. On Sun, Feb 6, 2022 at 1:29 AM Andres Freund wrote: > > Hi, > > On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote: > > From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001 > > From: Peter Eisentraut > > Date: Mon, 3 Jan 2022 14:43:36 +0100 > > Subject: [PATCH v3] Synchronize logical replication slots from primary to > > standby > > I've just skimmed the patch and the related threads. As far as I can tell this > cannot be safely used without the conflict handling in [1], is that correct? > > > Greetings, > > Andres Freund > > [1] > https://postgr.es/m/CA%2BTgmoZd-JqNL1-R3RJ0jQRD%2B-dc94X0nPJgh%2BdwdDF0rFuE3g%40mail.gmail.com > >
Re: Synchronizing slots from primary to standby
Hi, On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote: > From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut > Date: Mon, 3 Jan 2022 14:43:36 +0100 > Subject: [PATCH v3] Synchronize logical replication slots from primary to > standby I've just skimmed the patch and the related threads. As far as I can tell this cannot be safely used without the conflict handling in [1], is that correct? Greetings, Andres Freund [1] https://postgr.es/m/CA%2BTgmoZd-JqNL1-R3RJ0jQRD%2B-dc94X0nPJgh%2BdwdDF0rFuE3g%40mail.gmail.com
Re: Synchronizing slots from primary to standby
On Sat, Jan 22, 2022 at 4:33 AM Hsu, John wrote:
> >> I might be missing something but isn’t it okay even if the new primary server is behind the subscribers? IOW, even if two slot's LSNs (i.e., restart_lsn and confirm_flush_lsn) are behind the subscriber's remote LSN (i.e., pg_replication_origin.remote_lsn), the primary sends only transactions that were committed after the remote_lsn. So the subscriber can resume logical replication with the new primary without any data loss.
>
> Maybe I'm misreading, but I thought the purpose of this to make sure that the logical subscriber does not have data that has not been replicated to the new primary. The use-case I can think of would be if synchronous_commit were enabled and fail-over occurs. If we didn't have this set, isn't it possible that this logical subscriber has extra commits that aren't present on the newly promoted primary?

This is very much possible if the new primary used to be an asynchronous standby. But, it seems like the current patch is trying to hold back logical replication until the data has been replicated to the physical standby when synchronous_slot_names is set. This will ensure that the logical subscriber is never ahead of the new primary. However, AFAIU that's not the primary use-case of this patch; instead this is to ensure that the logical subscribers continue getting data from the new primary when the failover occurs.

> If a logical slot was dropped on the writer, should the worker drop logical slots that it was previously synchronizing but are no longer present? Or should we leave that to the user to manage? I'm trying to think why users would want to sync logical slots to a reader but not have that be dropped as well if it's no longer present.

AFAIU this should be taken care of by the background worker used to synchronize the replication slot. -- With Regards, Ashutosh Sharma.
Re: Synchronizing slots from primary to standby
> I might be missing something but isn’t it okay even if the new primary server is behind the subscribers? IOW, even if two slot's LSNs (i.e., restart_lsn and confirm_flush_lsn) are behind the subscriber's remote LSN (i.e., pg_replication_origin.remote_lsn), the primary sends only transactions that were committed after the remote_lsn. So the subscriber can resume logical replication with the new primary without any data loss.

Maybe I'm misreading, but I thought the purpose of this is to make sure that the logical subscriber does not have data that has not been replicated to the new primary. The use-case I can think of would be if synchronous_commit were enabled and fail-over occurs. If we didn't have this set, isn't it possible that this logical subscriber has extra commits that aren't present on the newly promoted primary?

And sorry, I accidentally started a new thread in my last reply. Re-pasting some of my previous questions/comments:

wait_for_standby_confirmation does not update standby_slot_names once it's in a loop and can't be fixed with SIGHUP. Similarly, synchronize_slot_names isn't updated once the worker is launched.

If a logical slot was dropped on the writer, should the worker drop logical slots that it was previously synchronizing but are no longer present? Or should we leave that to the user to manage? I'm trying to think why users would want to sync logical slots to a reader but not have that be dropped as well if it's no longer present.

Is there a reason we're deciding to use one-worker syncing per database instead of one general worker that syncs across all the databases? I imagine I'm missing something obvious here.

As for how standby_slot_names should be configured, I'd prefer flexibility similar to what we have for synchronous_standby_names since that seems the most analogous. It'd provide flexibility for failovers, which I imagine is the most common use-case.
On 1/20/22, 9:34 PM, "Masahiko Sawada" wrote: On Wed, Dec 15, 2021 at 7:13 AM Peter Eisentraut wrote: > > On 31.10.21 11:08, Peter Eisentraut wrote: > > I want to reactivate $subject. I took Petr Jelinek's patch from [0], > > rebased it, added a bit of testing. It basically works, but as > > mentioned in [0], there are various issues to work out. > > > > The idea is that the standby runs a background worker to periodically > > fetch replication slot information from the primary. On failover, a > > logical subscriber would then ideally find up-to-date replication slots > > on the new publisher and can just continue normally. > > > So, again, this isn't anywhere near ready, but there is already a lot > > here to gather feedback about how it works, how it should work, how to > > configure it, and how it fits into an overall replication and HA > > architecture. > > The second, > standby_slot_names, is set on the primary. It holds back logical > replication until the listed physical standbys have caught up. That > way, when failover is necessary, the promoted standby is not behind the > logical replication consumers. I might be missing something but isn’t it okay even if the new primary server is behind the subscribers? IOW, even if two slot's LSNs (i.e., restart_lsn and confirm_flush_lsn) are behind the subscriber's remote LSN (i.e., pg_replication_origin.remote_lsn), the primary sends only transactions that were committed after the remote_lsn. So the subscriber can resume logical replication with the new primary without any data loss. The new primary should not be ahead of the subscribers because it forwards the logical replication start LSN to the slot’s confirm_flush_lsn in this case.
But it cannot happen since the remote LSN of the subscriber’s origin is always updated first, then the confirm_flush_lsn of the slot on the primary is updated, and then the confirm_flush_lsn of the slot on the standby is synchronized. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
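The LSNs involved in that ordering argument can be inspected directly on each node; a small sketch using the standard catalog views (column names as in released PostgreSQL; the patch's synchronized slots would show up in the standby's pg_replication_slots the same way):

```sql
-- On the subscriber: the origin's remote LSN
SELECT external_id, remote_lsn
FROM pg_replication_origin_status;

-- On the primary (and, once synchronized, on the standby):
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

Per the explanation above, the subscriber's remote_lsn is updated first, then the primary slot's confirmed_flush_lsn, then the standby's synchronized copy, so the standby's value should never exceed the subscriber's.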
Re: Synchronizing slots from primary to standby
On Wed, Dec 15, 2021 at 7:13 AM Peter Eisentraut wrote: > > On 31.10.21 11:08, Peter Eisentraut wrote: > > I want to reactivate $subject. I took Petr Jelinek's patch from [0], > > rebased it, added a bit of testing. It basically works, but as > > mentioned in [0], there are various issues to work out. > > > > The idea is that the standby runs a background worker to periodically > > fetch replication slot information from the primary. On failover, a > > logical subscriber would then ideally find up-to-date replication slots > > on the new publisher and can just continue normally. > > > So, again, this isn't anywhere near ready, but there is already a lot > > here to gather feedback about how it works, how it should work, how to > > configure it, and how it fits into an overall replication and HA > > architecture. > > The second, > standby_slot_names, is set on the primary. It holds back logical > replication until the listed physical standbys have caught up. That > way, when failover is necessary, the promoted standby is not behind the > logical replication consumers. I might be missing something but isn’t it okay even if the new primary server is behind the subscribers? IOW, even if two slot's LSNs (i.e., restart_lsn and confirm_flush_lsn) are behind the subscriber's remote LSN (i.e., pg_replication_origin.remote_lsn), the primary sends only transactions that were committed after the remote_lsn. So the subscriber can resume logical replication with the new primary without any data loss. The new primary should not be ahead of the subscribers because it forwards the logical replication start LSN to the slot’s confirm_flush_lsn in this case. But it cannot happen since the remote LSN of the subscriber’s origin is always updated first, then the confirm_flush_lsn of the slot on the primary is updated, and then the confirm_flush_lsn of the slot on the standby is synchronized. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Synchronizing slots from primary to standby
Here is an updated patch to fix some build failures. No feature changes.

On 14.12.21 23:12, Peter Eisentraut wrote:
> On 31.10.21 11:08, Peter Eisentraut wrote:
> > I want to reactivate $subject. I took Petr Jelinek's patch from [0], rebased it, added a bit of testing. It basically works, but as mentioned in [0], there are various issues to work out.
> >
> > The idea is that the standby runs a background worker to periodically fetch replication slot information from the primary. On failover, a logical subscriber would then ideally find up-to-date replication slots on the new publisher and can just continue normally.
> >
> > So, again, this isn't anywhere near ready, but there is already a lot here to gather feedback about how it works, how it should work, how to configure it, and how it fits into an overall replication and HA architecture.
>
> Here is an updated patch. The main changes are that I added two configuration parameters. The first, synchronize_slot_names, is set on the physical standby to specify which slots to sync from the primary. By default, it is empty. (This also fixes the recovery test failures that I had to disable in the previous patch version.) The second, standby_slot_names, is set on the primary. It holds back logical replication until the listed physical standbys have caught up. That way, when failover is necessary, the promoted standby is not behind the logical replication consumers.
>
> In principle, this works now, I think. I haven't made much progress in creating more test cases for this; that's something that needs more attention.
>
> It's worth pondering what the configuration language for standby_slot_names should be. Right now, it's just a list of slots that all need to be caught up. More complicated setups are conceivable. Maybe you have standbys S1 and S2 that are potential failover targets for logical replication consumers L1 and L2, and also standbys S3 and S4 that are potential failover targets for logical replication consumers L3 and L4.
Viewed like that, this setting could be a replication slot setting. The setting might also have some relationship with synchronous_standby_names. Like, if you have synchronous_standby_names set, then that's a pretty good indication that you also want some or all of those standbys in standby_slot_names. (But note that one is slots and one is application names.) So there are a variety of possibilities.

From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Mon, 3 Jan 2022 14:43:36 +0100
Subject: [PATCH v3] Synchronize logical replication slots from primary to standby

Discussion: https://www.postgresql.org/message-id/flat/514f6f2f-6833-4539-39f1-96cd1e011f23%40enterprisedb.com
---
 doc/src/sgml/config.sgml                      | 34 ++
 src/backend/commands/subscriptioncmds.c       | 4 +-
 src/backend/postmaster/bgworker.c             | 3 +
 .../libpqwalreceiver/libpqwalreceiver.c       | 94
 src/backend/replication/logical/Makefile      | 1 +
 src/backend/replication/logical/launcher.c    | 202 ++---
 .../replication/logical/reorderbuffer.c       | 85
 src/backend/replication/logical/slotsync.c    | 412 ++
 src/backend/replication/logical/tablesync.c   | 13 +-
 src/backend/replication/repl_gram.y           | 32 +-
 src/backend/replication/repl_scanner.l        | 1 +
 src/backend/replication/slotfuncs.c           | 2 +-
 src/backend/replication/walsender.c           | 196 -
 src/backend/utils/activity/wait_event.c       | 3 +
 src/backend/utils/misc/guc.c                  | 23 +
 src/backend/utils/misc/postgresql.conf.sample | 2 +
 src/include/commands/subscriptioncmds.h       | 3 +
 src/include/nodes/nodes.h                     | 1 +
 src/include/nodes/replnodes.h                 | 9 +
 src/include/replication/logicalworker.h       | 9 +
 src/include/replication/slot.h                | 4 +-
 src/include/replication/walreceiver.h         | 16 +
 src/include/replication/worker_internal.h     | 8 +-
 src/include/utils/wait_event.h                | 1 +
 src/test/recovery/t/030_slot_sync.pl          | 58 +++
 25 files changed, 1146 insertions(+), 70 deletions(-)
 create mode 100644 src/backend/replication/logical/slotsync.c
 create mode 100644 src/test/recovery/t/030_slot_sync.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index afbb6c35e3..2b2a21a251 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4406,6 +4406,23 @@ Primary Server
+
+ standby_slot_names (string)
+
+ standby_slot_names configuration parameter
+
+
+
+
+List of physical replication slots that logical replication waits for.
+If a logical replication connection is meant to switch to a physical
+standby after the standby is promoted, the physical replication slot
+for the standby should be
Re: Synchronizing slots from primary to standby
Hello, I started taking a brief look at the v2 patch, and it does appear to work for the basic case: the logical slot is synchronized across, and I can connect to the promoted standby and stream changes afterwards.

It's not clear to me what the correct behavior is when a logical slot that has been synced to the replica is then deleted on the writer. Would we expect this to be propagated, or is it left up to the end user to manage?

> + rawname = pstrdup(standby_slot_names);
> + SplitIdentifierString(rawname, ',', &namelist);
> +
> + while (true)
> + {
> + int wait_slots_remaining;
> + XLogRecPtr oldest_flush_pos = InvalidXLogRecPtr;
> + int rc;
> +
> + wait_slots_remaining = list_length(namelist);
> +
> + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> + for (int i = 0; i < max_replication_slots; i++)
> + {

Even though standby_slot_names is PGC_SIGHUP, we never reload/re-process the value. If we have a wrong entry in there, the backend becomes stuck until we re-establish the logical connection. Adding "postmaster/interrupt.h" with ConfigReloadPending / ProcessConfigFile does seem to work.

Another thing I noticed is that once it starts waiting in this block, Ctrl+C doesn't seem to terminate the backend:

pg_recvlogical -d postgres -p 5432 --slot regression_slot --start -f -
..
^Cpg_recvlogical: error: unexpected termination of replication stream:

The logical backend connection is still present:

ps aux | grep 51263
hsuchen 51263 80.7 0.0 320180 14304 ? Rs 01:11 3:04 postgres: walsender hsuchen [local] START_REPLICATION

pstack 51263
#0 0x7ffee99e79a5 in clock_gettime ()
#1 0x7f8705e88246 in clock_gettime () from /lib64/libc.so.6
#2 0x0075f141 in WaitEventSetWait ()
#3 0x0075f565 in WaitLatch ()
#4 0x00720aea in ReorderBufferProcessTXN ()
#5 0x007142a6 in DecodeXactOp ()
#6 0x0071460f in LogicalDecodingProcessRecord ()

It can be terminated with pg_terminate_backend, though.
If we have a physical slot with name foo on the standby, and a logical slot is then created on the writer with the same slot_name, it errors out on the replica; that also prevents other slots from being synchronized, which is probably fine.

2021-12-16 02:10:29.709 UTC [73788] LOG: replication slot synchronization worker for database "postgres" has started
2021-12-16 02:10:29.713 UTC [73788] ERROR: cannot use physical replication slot for logical decoding
2021-12-16 02:10:29.714 UTC [73037] DEBUG: unregistering background worker "replication slot synchronization worker"

On 12/14/21, 2:26 PM, "Peter Eisentraut" wrote:
> On 28.11.21 07:52, Bharath Rupireddy wrote:
> > 1) Instead of a new LIST_SLOT command, can't we use READ_REPLICATION_SLOT (slight modifications needs to be done to make it support logical replication slots and to get more information from the subscriber).
>
> I looked at that but didn't see an obvious way to consolidate them. This is something we could look at again later.
>
> > 2) How frequently the new bg worker is going to sync the slot info? How can it ensure that the latest information exists say when the subscriber is down/crashed before it picks up the latest slot information?
>
> The interval is currently hardcoded, but could be a configuration setting. In the v2 patch, there is a new setting that orders physical replication before logical so that the logical subscribers cannot get ahead of the physical standby.
>
> > 3) Instead of the subscriber pulling the slot info, why can't the publisher (via the walsender or a new bg worker maybe?) push the latest slot info? I'm not sure we want to add more functionality to the walsender, if yes, isn't it going to be much simpler?
>
> This sounds like the failover slot feature, which was rejected.
Re: Synchronizing slots from primary to standby
On 28.11.21 07:52, Bharath Rupireddy wrote:
> 1) Instead of a new LIST_SLOT command, can't we use READ_REPLICATION_SLOT (slight modifications need to be made so it supports logical replication slots and returns more information to the subscriber)?

I looked at that but didn't see an obvious way to consolidate them. This is something we could look at again later.

> 2) How frequently is the new bg worker going to sync the slot info? How can it ensure that the latest information exists, say, when the subscriber is down/crashed before it picks up the latest slot information?

The interval is currently hardcoded, but could be a configuration setting. In the v2 patch, there is a new setting that orders physical replication before logical, so that the logical subscribers cannot get ahead of the physical standby.

> 3) Instead of the subscriber pulling the slot info, why can't the publisher (via the walsender or a new bg worker maybe?) push the latest slot info? I'm not sure we want to add more functionality to the walsender; if yes, isn't it going to be much simpler?

This sounds like the failover slot feature, which was rejected.
Re: Synchronizing slots from primary to standby
On 24.11.21 17:25, Dimitri Fontaine wrote:
> Is there a case to be made about doing the same thing for physical replication slots too?

It has been considered. At the moment, I'm not doing it, because it would add more code and complexity and it's not that important. But it could be added in the future.

> Given the admitted state of the patch, I didn't focus on tests. I could successfully apply the patch on top of the current master branch, and cleanly compile and `make check`. Then I also updated pg_auto_failover to support Postgres 15devel [2] so that I could then `make NODES=3 cluster` there and play with the new replication command:
>
> $ psql -d "port=5501 replication=1" -c "LIST_SLOTS;"
> psql:/Users/dim/.psqlrc:24: ERROR: XX000: cannot execute SQL commands in WAL sender for physical replication
> LOCATION: exec_replication_command, walsender.c:1830
>
> ... I'm not too sure about this idea of running SQL in a replication protocol connection that you're mentioning, but I suppose that's just me needing to brush up on the topic.

FWIW, the way the replication command parser works, if there is a parse error, it tries to interpret the command as a plain SQL command. But that only works for logical replication connections. So in physical replication, if you try to run anything that does not parse, you will get this error. But that has nothing to do with this feature. The above command works for me, so maybe something else went wrong in your situation.

> Maybe the first question about configuration would be about selecting which slots a standby should maintain from the primary. Is it all of the slots that exist on both nodes, or a sublist of that? Is it possible to have a slot with the same name on a primary and a standby node, in a way that the standby's slot would be a completely separate entity from the primary's slot? If yes (I just don't know at the moment), well then, should we continue to allow that?

This has been added in v2.
> Also, do we want to even consider having the slot management on a primary node depend on the ability to sync the advancing on one or more standby nodes? I'm not sure I see that one as a good idea, but maybe we want to kill it publicly very early then ;-)

I don't know what you mean by this.
Re: Synchronizing slots from primary to standby
On 24.11.21 07:11, Masahiko Sawada wrote:
> I haven't looked at the patch deeply, but regarding 007_sync_rep.pl, the tests seem to fail since they rely on the order of the walsender array in shared memory. Since a background worker for synchronizing replication slots periodically connects to the walsender on the primary and disconnects, it breaks the assumption about the order.
>
> Regarding 010_logical_decoding_timelines.pl, I guess that the patch breaks the test because the background worker for synchronizing slots on the replica periodically advances the replica's slot.
>
> I think we need to have a way to disable the slot synchronization or to specify the slot name to sync with the primary. I'm not sure we already discussed this topic, but I think we need it at least for testing purposes.

This has been addressed by patch v2, which adds such a setting.
Re: Synchronizing slots from primary to standby
On 31.10.21 11:08, Peter Eisentraut wrote:
> I want to reactivate $subject. I took Petr Jelinek's patch from [0], rebased it, added a bit of testing. It basically works, but as mentioned in [0], there are various issues to work out.
>
> The idea is that the standby runs a background worker to periodically fetch replication slot information from the primary. On failover, a logical subscriber would then ideally find up-to-date replication slots on the new publisher and can just continue normally.
>
> So, again, this isn't anywhere near ready, but there is already a lot here to gather feedback about how it works, how it should work, how to configure it, and how it fits into an overall replication and HA architecture.

Here is an updated patch. The main changes are that I added two configuration parameters. The first, synchronize_slot_names, is set on the physical standby to specify which slots to sync from the primary. By default, it is empty. (This also fixes the recovery test failures that I had to disable in the previous patch version.) The second, standby_slot_names, is set on the primary. It holds back logical replication until the listed physical standbys have caught up. That way, when failover is necessary, the promoted standby is not behind the logical replication consumers.

In principle, this works now, I think. I haven't made much progress in creating more test cases for this; that's something that needs more attention.

It's worth pondering what the configuration language for standby_slot_names should be. Right now, it's just a list of slots that all need to be caught up. More complicated setups are conceivable. Maybe you have standbys S1 and S2 that are potential failover targets for logical replication consumers L1 and L2, and also standbys S3 and S4 that are potential failover targets for logical replication consumers L3 and L4. Viewed like that, this setting could be a replication slot setting.
The setting might also have some relationship with synchronous_standby_names. Like, if you have synchronous_standby_names set, then that's a pretty good indication that you also want some or all of those standbys in standby_slot_names. (But note that one is slots and one is application names.) So there are a variety of possibilities.

From f1d306bfe3eb657b5d14b0ef3024586083beb4ed Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Tue, 14 Dec 2021 22:20:59 +0100
Subject: [PATCH v2] Synchronize logical replication slots from primary to standby

Discussion: https://www.postgresql.org/message-id/flat/514f6f2f-6833-4539-39f1-96cd1e011f23%40enterprisedb.com
---
 doc/src/sgml/config.sgml                      | 34 ++
 src/backend/commands/subscriptioncmds.c       | 4 +-
 src/backend/postmaster/bgworker.c             | 3 +
 .../libpqwalreceiver/libpqwalreceiver.c       | 96
 src/backend/replication/logical/Makefile      | 1 +
 src/backend/replication/logical/launcher.c    | 202 ++---
 .../replication/logical/reorderbuffer.c       | 85
 src/backend/replication/logical/slotsync.c    | 412 ++
 src/backend/replication/logical/tablesync.c   | 13 +-
 src/backend/replication/repl_gram.y           | 32 +-
 src/backend/replication/repl_scanner.l        | 1 +
 src/backend/replication/slotfuncs.c           | 2 +-
 src/backend/replication/walsender.c           | 196 -
 src/backend/utils/activity/wait_event.c       | 3 +
 src/backend/utils/misc/guc.c                  | 23 +
 src/backend/utils/misc/postgresql.conf.sample | 2 +
 src/include/commands/subscriptioncmds.h       | 3 +
 src/include/nodes/nodes.h                     | 1 +
 src/include/nodes/replnodes.h                 | 9 +
 src/include/replication/logicalworker.h       | 9 +
 src/include/replication/slot.h                | 4 +-
 src/include/replication/walreceiver.h         | 16 +
 src/include/replication/worker_internal.h     | 8 +-
 src/include/utils/wait_event.h                | 1 +
 src/test/recovery/t/030_slot_sync.pl          | 58 +++
 25 files changed, 1148 insertions(+), 70 deletions(-)
 create mode 100644 src/backend/replication/logical/slotsync.c
 create mode 100644 src/test/recovery/t/030_slot_sync.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index afbb6c35e3..2b2a21a251 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4406,6 +4406,23 @@ Primary Server
+
+ standby_slot_names (string)
+
+ standby_slot_names configuration parameter
+
+
+
+
+List of physical replication slots that logical replication waits for.
+If a logical replication connection is meant to switch to a physical
+standby after the standby is promoted, the physical replication slot
+for the standby should be listed here. This ensures that logical
+replication is not ahead of the physical standby.
+
+
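For reference, wiring the two GUCs from the v2 patch together on a primary/standby pair might look roughly like this (a sketch: the GUC names synchronize_slot_names and standby_slot_names are from the patch, but the slot names and comma-separated-list syntax shown are illustrative assumptions):

```
# postgresql.conf on the physical standby:
# which logical slots to sync from the primary (empty by default)
synchronize_slot_names = 'sub1_slot, sub2_slot'

# postgresql.conf on the primary:
# hold back logical replication until this physical standby has caught up
standby_slot_names = 'standby1_slot'
```

With both set, a subscriber that fails over to the promoted standby should find its slot present and no further ahead than the standby itself.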
Re: Synchronizing slots from primary to standby
On Mon, Nov 29, 2021 at 1:10 PM Dilip Kumar wrote: > > On Mon, Nov 29, 2021 at 12:19 PM Bharath Rupireddy > wrote: > > > > On Mon, Nov 29, 2021 at 11:14 AM Dilip Kumar wrote: > > > > > > On Mon, Nov 29, 2021 at 9:40 AM Bharath Rupireddy > > > wrote: > > > > > > > > On Mon, Nov 29, 2021 at 1:48 AM SATYANARAYANA NARLAPURAM > > > > wrote: > > > > > > > > > >> 3) Instead of the subscriber pulling the slot info, why can't the > > > > >> publisher (via the walsender or a new bg worker maybe?) push the > > > > >> latest slot info? I'm not sure we want to add more functionality to > > > > >> the walsender, if yes, isn't it going to be much simpler? > > > > > > > > > > Standby pulling the information or at least making a first attempt to > > > > > connect to the primary is a better design as primary doesn't need to > > > > > spend its cycles repeatedly connecting to an unreachable standby. In > > > > > fact, primary wouldn't even need to know the followers, for example > > > > > followers / log shipping standbys > > > > > > > > My idea was to let the existing walsender from the primary/publisher > > > > to send the slot info (both logical and physical replication slots) to > > > > the standby/subscriber, probably by piggybacking the slot info with > > > > the WAL currently it sends. Having said that, I don't know the > > > > feasibility of it. Anyways, I'm not in favour of having a new bg > > > > worker to just ship the slot info. The standby/subscriber, while > > > > making connection to primary/publisher, can choose to get the > > > > replication slot info. > > > > > > I think it is possible that the standby is restoring the WAL directly > > > from the archive location and there might not be any wal sender at > > > time. So I think the idea of standby pulling the WAL looks better to > > > me. 
> > My point was that why can't we let the walreceiver (of course users can configure it on the standby/subscriber) choose whether or not to receive the replication (both physical and logical) slot info from the primary/publisher, and if yes, the walsender (on the primary/publisher) sends it, probably as a new WAL record or just by piggybacking the replication slot info on any of the existing WAL records.
>
> Okay, I thought your point was that the primary pushing is better than the standby pulling the slot info, but now it seems that you also agree that standby pulling is better, right? Now it appears your point is about whether we will use the same connection for pulling the slot information which we are using for streaming the data, or another connection. I mean, in this patch also we are creating a replication connection and pulling the slot information over there; the point is just that we are starting a separate worker for pulling the slot information, and I think that approach is better as it will not impact the performance of the other replication connection which we are using for communicating the data.

The easiest way to implement this feature so far is to use a common bg worker (as opposed to the bg worker proposed originally in this thread which, IIUC, works only for logical replication) running on the standby (in case of streaming replication with physical replication slots) or the subscriber (in case of logical replication with logical replication slots) for getting both the physical and logical replication slot info from the primary or publisher. This bg worker requires at least two GUCs: 1) one to enable/disable the worker, and 2) one to define the slot sync interval (the bg worker fetches the slot info after every sync interval). Thoughts?

Regards,
Bharath Rupireddy.
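Bharath's proposal boils down to two knobs; a hypothetical sketch of what they could look like in postgresql.conf (both GUC names and values are made up for illustration and do not come from any posted patch):

```
# enable/disable the common slot-sync bg worker on the standby/subscriber
slot_sync_worker = on

# how often the worker pulls slot info from the primary/publisher
slot_sync_interval = 10s
```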
Re: Synchronizing slots from primary to standby
On Mon, Nov 29, 2021 at 12:19 PM Bharath Rupireddy wrote: > > On Mon, Nov 29, 2021 at 11:14 AM Dilip Kumar wrote: > > > > On Mon, Nov 29, 2021 at 9:40 AM Bharath Rupireddy > > wrote: > > > > > > On Mon, Nov 29, 2021 at 1:48 AM SATYANARAYANA NARLAPURAM > > > wrote: > > > > > > > >> 3) Instead of the subscriber pulling the slot info, why can't the > > > >> publisher (via the walsender or a new bg worker maybe?) push the > > > >> latest slot info? I'm not sure we want to add more functionality to > > > >> the walsender, if yes, isn't it going to be much simpler? > > > > > > > > Standby pulling the information or at least making a first attempt to > > > > connect to the primary is a better design as primary doesn't need to > > > > spend its cycles repeatedly connecting to an unreachable standby. In > > > > fact, primary wouldn't even need to know the followers, for example > > > > followers / log shipping standbys > > > > > > My idea was to let the existing walsender from the primary/publisher > > > to send the slot info (both logical and physical replication slots) to > > > the standby/subscriber, probably by piggybacking the slot info with > > > the WAL currently it sends. Having said that, I don't know the > > > feasibility of it. Anyways, I'm not in favour of having a new bg > > > worker to just ship the slot info. The standby/subscriber, while > > > making connection to primary/publisher, can choose to get the > > > replication slot info. > > > > I think it is possible that the standby is restoring the WAL directly > > from the archive location and there might not be any wal sender at > > time. So I think the idea of standby pulling the WAL looks better to > > me. 
> > My point was that why can't we let the walreceiver (of course users > can configure it on the standby/subscriber) to choose whether or not > to receive the replication (both physical and logical) slot info from > the primary/publisher and if yes, the walsender(on the > primary/publisher) sending it probably as a new WAL record or just > piggybacking the replication slot info with any of the existing WAL > records. Okay, I thought your point was that the primary pushing is better over standby pulling the slot info, but now it seems that you also agree that standby pulling is better right? Now it appears your point is about whether we will use the same connection for pulling the slot information which we are using for streaming the data or any other connection? I mean in this patch also we are creating a replication connection and pulling the slot information over there, just point is we are starting a separate worker for pulling the slot information, and I think that approach is better as this will not impact the performance of the other replication connection which we are using for communicating the data. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Synchronizing slots from primary to standby
On Mon, Nov 29, 2021 at 11:14 AM Dilip Kumar wrote: > > On Mon, Nov 29, 2021 at 9:40 AM Bharath Rupireddy > wrote: > > > > On Mon, Nov 29, 2021 at 1:48 AM SATYANARAYANA NARLAPURAM > > wrote: > > > > > >> 3) Instead of the subscriber pulling the slot info, why can't the > > >> publisher (via the walsender or a new bg worker maybe?) push the > > >> latest slot info? I'm not sure we want to add more functionality to > > >> the walsender, if yes, isn't it going to be much simpler? > > > > > > Standby pulling the information or at least making a first attempt to > > > connect to the primary is a better design as primary doesn't need to > > > spend its cycles repeatedly connecting to an unreachable standby. In > > > fact, primary wouldn't even need to know the followers, for example > > > followers / log shipping standbys > > > > My idea was to let the existing walsender from the primary/publisher > > to send the slot info (both logical and physical replication slots) to > > the standby/subscriber, probably by piggybacking the slot info with > > the WAL currently it sends. Having said that, I don't know the > > feasibility of it. Anyways, I'm not in favour of having a new bg > > worker to just ship the slot info. The standby/subscriber, while > > making connection to primary/publisher, can choose to get the > > replication slot info. > > I think it is possible that the standby is restoring the WAL directly > from the archive location and there might not be any wal sender at > time. So I think the idea of standby pulling the WAL looks better to > me. My point was that why can't we let the walreceiver (of course users can configure it on the standby/subscriber) to choose whether or not to receive the replication (both physical and logical) slot info from the primary/publisher and if yes, the walsender(on the primary/publisher) sending it probably as a new WAL record or just piggybacking the replication slot info with any of the existing WAL records. 
Or simply a common bg worker (as opposed to the bg worker proposed originally in this thread which, IIUC, works only for logical replication) running on the standby/subscriber for getting both the physical and logical replication slot info.

> > As I said upthread, the problem I see with standby/subscriber pulling the info is: how frequently is the standby/subscriber going to sync the slot info from the primary/publisher? How can it ensure that the latest information exists, say, when the subscriber is down/crashed before it picks up the latest slot information?
>
> Yeah, that is a good question: how frequently should the subscriber fetch the slot information? I think that should be a configurable value. And the longer the time delay, the greater the chances of losing the latest slot information.

I agree that it should be configurable. Even if the primary/publisher is down/crashed, one can still compare the latest slot info from the primary/publisher and the standby/subscriber using the new tool pg_replslotdata proposed at [1], see which slots missed the latest replication slot info and by how much, and probably drop only those to recreate them, retaining the other slots as-is.

[1] - https://www.postgresql.org/message-id/CALj2ACW0rV5gWK8A3m6_X62qH%2BVfaq5hznC%3Di0R5Wojt5%2Byhyw%40mail.gmail.com

Regards,
Bharath Rupireddy.
Re: Synchronizing slots from primary to standby
On Mon, Nov 29, 2021 at 9:40 AM Bharath Rupireddy wrote:
> On Mon, Nov 29, 2021 at 1:48 AM SATYANARAYANA NARLAPURAM wrote:
> > > 3) Instead of the subscriber pulling the slot info, why can't the publisher (via the walsender or a new bg worker maybe?) push the latest slot info? I'm not sure we want to add more functionality to the walsender, if yes, isn't it going to be much simpler?
> >
> > Standby pulling the information or at least making a first attempt to connect to the primary is a better design as primary doesn't need to spend its cycles repeatedly connecting to an unreachable standby. In fact, primary wouldn't even need to know the followers, for example followers / log shipping standbys
>
> My idea was to let the existing walsender on the primary/publisher send the slot info (both logical and physical replication slots) to the standby/subscriber, probably by piggybacking the slot info on the WAL it currently sends. Having said that, I don't know the feasibility of it. Anyways, I'm not in favour of having a new bg worker just to ship the slot info. The standby/subscriber, while making the connection to the primary/publisher, can choose to get the replication slot info.

I think it is possible that the standby is restoring the WAL directly from the archive location, and there might not be any walsender at that time. So I think the idea of the standby pulling the slot info looks better to me.

> As I said upthread, the problem I see with standby/subscriber pulling the info is: how frequently is the standby/subscriber going to sync the slot info from the primary/publisher? How can it ensure that the latest information exists, say, when the subscriber is down/crashed before it picks up the latest slot information?

Yeah, that is a good question: how frequently should the subscriber fetch the slot information? I think that should be a configurable value.
And the longer the time delay, the greater the chances of losing the latest slot information.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: Synchronizing slots from primary to standby
On Mon, Nov 29, 2021 at 1:48 AM SATYANARAYANA NARLAPURAM wrote:
> > 3) Instead of the subscriber pulling the slot info, why can't the publisher (via the walsender or a new bg worker maybe?) push the latest slot info? I'm not sure we want to add more functionality to the walsender, if yes, isn't it going to be much simpler?
>
> Standby pulling the information, or at least making the first attempt to connect to the primary, is a better design, as the primary doesn't need to spend its cycles repeatedly connecting to an unreachable standby. In fact, the primary wouldn't even need to know about the followers, for example log-shipping standbys.

My idea was to let the existing walsender on the primary/publisher send the slot info (both logical and physical replication slots) to the standby/subscriber, probably by piggybacking the slot info on the WAL it currently sends. Having said that, I don't know the feasibility of it. Anyways, I'm not in favour of having a new bg worker just to ship the slot info. The standby/subscriber, while making the connection to the primary/publisher, can choose to get the replication slot info.

As I said upthread, the problem I see with the standby/subscriber pulling the info is: how frequently is the standby/subscriber going to sync the slot info from the primary/publisher? How can it ensure that the latest information exists, say, when the subscriber is down/crashed before it picks up the latest slot information?

IIUC, the initial idea proposed in this patch deals only with logical replication slots, not physical replication slots; what I'm thinking of is a generic way to deal with both of them.

Note: In the above description, I used primary-standby and publisher-subscriber to represent the physical and logical replication slots respectively.

Regards,
Bharath Rupireddy.
Re: Synchronizing slots from primary to standby
> 3) Instead of the subscriber pulling the slot info, why can't the > publisher (via the walsender or a new bg worker maybe?) push the > latest slot info? I'm not sure we want to add more functionality to > the walsender, if yes, isn't it going to be much simpler? > Standby pulling the information or at least making a first attempt to connect to the primary is a better design as primary doesn't need to spend its cycles repeatedly connecting to an unreachable standby. In fact, primary wouldn't even need to know the followers, for example followers / log shipping standbys
Re: Synchronizing slots from primary to standby
On Sun, Oct 31, 2021 at 3:38 PM Peter Eisentraut wrote: > > I want to reactivate $subject. I took Petr Jelinek's patch from [0], > rebased it, added a bit of testing. It basically works, but as > mentioned in [0], there are various issues to work out. > > The idea is that the standby runs a background worker to periodically > fetch replication slot information from the primary. On failover, a > logical subscriber would then ideally find up-to-date replication slots > on the new publisher and can just continue normally. > > The previous thread didn't have a lot of discussion, but I have gathered > from off-line conversations that there is a wider agreement on this > approach. So the next steps would be to make it more robust and > configurable and documented. As I said, I added a small test case to > show that it works at all, but I think a lot more tests should be added. > I have also found that this breaks some seemingly unrelated tests in > the recovery test suite. I have disabled these here. I'm not sure if > the patch actually breaks anything or if these are just differences in > timing or implementation dependencies. This patch adds a LIST_SLOTS > replication command, but I think this could be replaced with just a > SELECT FROM pg_replication_slots query now. (This patch is originally > older than when you could run SELECT queries over the replication protocol.) > > So, again, this isn't anywhere near ready, but there is already a lot > here to gather feedback about how it works, how it should work, how to > configure it, and how it fits into an overall replication and HA > architecture. > > > [0]: > https://www.postgresql.org/message-id/flat/3095349b-44d4-bf11-1b33-7eefb585d578%402ndquadrant.com Thanks for working on this patch. This feature will be useful as it avoids manual intervention during the failover. 
Here are some thoughts:

1) Instead of a new LIST_SLOTS command, can't we use READ_REPLICATION_SLOT (slight modifications need to be made to make it support logical replication slots and to get more information from the subscriber)?

2) How frequently is the new bg worker going to sync the slot info? How can it ensure that the latest information exists, say, when the subscriber is down/crashed before it picks up the latest slot information?

3) Instead of the subscriber pulling the slot info, why can't the publisher (via the walsender or a new bg worker maybe?) push the latest slot info? I'm not sure we want to add more functionality to the walsender; if yes, isn't it going to be much simpler?

4) IIUC, the proposal works only for logical replication slots, but do you also see the need for supporting some kind of synchronization of physical replication slots as well? IMO, we need a better and consistent way for both types of replication slots. If the walsender can somehow push the slot info from the primary (for physical replication slots) / publisher (for logical replication slots) to the standby/subscribers, this will be a more consistent and simpler design. However, I'm not sure if this design is doable at all.

Regards,
Bharath Rupireddy.
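For context on question 2), the worker being discussed is essentially a poll loop: fetch slot positions from the primary, advance local copies forward, sleep, repeat. A minimal Python simulation of that shape (the dicts stand in for pg_replication_slots rows; a real worker would use a libpq connection and pg_logical_replication_slot_advance(), none of which is modeled here):

```python
def sync_once(primary_slots, standby_slots):
    """One iteration of the pull model: read the primary's slot
    positions and advance the standby's local copies. Positions only
    ever move forward, mirroring slot-advance semantics - a stale
    answer from the primary must never move a slot backwards."""
    for name, remote_lsn in primary_slots.items():
        local_lsn = standby_slots.get(name, 0)
        if remote_lsn > local_lsn:
            standby_slots[name] = remote_lsn
    return standby_slots

def sync_worker(primary_slots, standby_slots, iterations=1):
    # The real worker would loop forever with a configurable naptime
    # between polls - which is exactly why the polling frequency and
    # the crash window in question 2) matter: anything the subscriber
    # consumed between two polls is only on the primary.
    for _ in range(iterations):
        sync_once(primary_slots, standby_slots)
    return standby_slots
```

The illustrative LSNs here are plain integers; real LSNs are 64-bit WAL positions, but the monotonic-advance invariant is the same.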
Re: Synchronizing slots from primary to standby
Hi all,

Peter Eisentraut writes:
> I want to reactivate $subject. I took Petr Jelinek's patch from [0],
> rebased it, added a bit of testing. It basically works, but as
> mentioned in [0], there are various issues to work out.

Thanks for working on that topic, I believe it's an important part of Postgres' HA story.

> The idea is that the standby runs a background worker to periodically fetch
> replication slot information from the primary. On failover, a logical
> subscriber would then ideally find up-to-date replication slots on the new
> publisher and can just continue normally.

Is there a case to be made for doing the same thing for physical replication slots too? That's what pg_auto_failover [1] does by default: it creates replication slots on every node for every other node, so that a standby Postgres instance also maintains a replication slot for the primary. This ensures that after a promotion, the standby knows to retain any and all WAL segments that the primary might need when rejoining, at pg_rewind time.

> The previous thread didn't have a lot of discussion, but I have gathered
> from off-line conversations that there is a wider agreement on this
> approach. So the next steps would be to make it more robust and
> configurable and documented.

I suppose part of the configuration would then include taking care of physical slots. Some people might want to turn that off and use the Postgres 13+ ability to use the remote primary's restore_command to fetch missing WAL files instead. Well, people who have set up an archiving system, anyway.

> As I said, I added a small test case to
> show that it works at all, but I think a lot more tests should be added. I
> have also found that this breaks some seemingly unrelated tests in the
> recovery test suite. I have disabled these here. I'm not sure if the patch
> actually breaks anything or if these are just differences in timing or
> implementation dependencies.
> This patch adds a LIST_SLOTS replication
> command, but I think this could be replaced with just a SELECT FROM
> pg_replication_slots query now. (This patch is originally older than when
> you could run SELECT queries over the replication protocol.)

Given the admitted state of the patch, I didn't focus on tests. I could successfully apply the patch on top of the current master branch, and cleanly compile and `make check`. Then I also updated pg_auto_failover to support Postgres 15devel [2] so that I could then `make NODES=3 cluster` there and play with the new replication command:

  $ psql -d "port=5501 replication=1" -c "LIST_SLOTS;"
  psql:/Users/dim/.psqlrc:24: ERROR:  XX000: cannot execute SQL commands in WAL sender for physical replication
  LOCATION:  exec_replication_command, walsender.c:1830
  ...

I'm not too sure about this idea of running SQL in a replication protocol connection that you're mentioning, but I suppose that's just me needing to brush up on the topic.

> So, again, this isn't anywhere near ready, but there is already a lot here
> to gather feedback about how it works, how it should work, how to configure
> it, and how it fits into an overall replication and HA architecture.

Maybe the first question about configuration would be about selecting which slots a standby should maintain from the primary. Is it all of the slots that exist on both nodes, or a sublist of that? Is it possible to have a slot with the same name on a primary and a standby node, in a way that the standby's slot would be a completely separate entity from the primary's slot? If yes (I just don't know at the moment), well then, should we continue to allow that?

Other aspects of the configuration might include a list of databases in which to make the new background worker active, the polling delay, etc. Also, do we want to even consider having the slot management on a primary node depend on the ability to sync the advancing on one or more standby nodes?
I'm not sure that one is a good idea, but maybe we want to kill it publicly very early then ;-)

Regards,
--
dim

Author of “The Art of PostgreSQL”, see https://theartofpostgresql.com

[1]: https://github.com/citusdata/pg_auto_failover
[2]: https://github.com/citusdata/pg_auto_failover/pull/838
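The "which slots should a standby maintain" question later materialized in this thread as a GUC along the lines of synchronize_slot_names (mentioned at the top). A minimal sketch of the selection step, assuming a hypothetical comma-separated value where `*` means every slot on the primary (the exact GUC semantics are not settled in the discussion above):

```python
def parse_slot_names(guc_value):
    """Parse a synchronize_slot_names-style setting: a comma-separated
    list of slot names. Whitespace around names is ignored."""
    return [name.strip() for name in guc_value.split(",") if name.strip()]

def slots_to_sync(guc_value, primary_slots):
    """Return the subset of the primary's slots that the standby's
    background worker should create and keep in sync, given the
    configured name list. '*' selects all of them."""
    wanted = parse_slot_names(guc_value)
    if "*" in wanted:
        return sorted(primary_slots)
    return sorted(s for s in primary_slots if s in wanted)
```

A filter like this also answers the testing concern raised later in the thread: setting the list empty effectively disables synchronization for a given standby.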
Re: Synchronizing slots from primary to standby
On Sun, Oct 31, 2021 at 7:08 PM Peter Eisentraut wrote:
>
> I want to reactivate $subject. I took Petr Jelinek's patch from [0],
> rebased it, added a bit of testing. It basically works, but as
> mentioned in [0], there are various issues to work out.

Thank you for working on this feature!

> The idea is that the standby runs a background worker to periodically
> fetch replication slot information from the primary. On failover, a
> logical subscriber would then ideally find up-to-date replication slots
> on the new publisher and can just continue normally.
>
> I have also found that this breaks some seemingly unrelated tests in
> the recovery test suite. I have disabled these here. I'm not sure if
> the patch actually breaks anything or if these are just differences in
> timing or implementation dependencies.

I haven't looked at the patch deeply, but regarding 007_sync_rep.pl, the tests seem to fail because they rely on the order of the walsender array in shared memory. Since the background worker for synchronizing replication slots periodically connects to the walsender on the primary and disconnects, it breaks that assumption about the order.

Regarding 010_logical_decoding_timelines.pl, I guess the patch breaks the test because the background worker for synchronizing slots on the replica periodically advances the replica's slot.

I think we need a way to disable the slot synchronization or to specify the slot names to sync with the primary. I'm not sure we already discussed this topic, but I think we need it at least for testing purposes.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Re: Synchronizing slots from primary to standby
On Mon, Dec 31, 2018 at 10:23 AM Petr Jelinek wrote:
> As Andres has mentioned over at the minimal decoding on standby thread [1],
> that functionality can be used to add a simple worker which periodically
> synchronizes the slot state from the primary to a standby.
>
> The attached patch is a rough implementation of such a worker. It's nowhere
> near committable in the current state; it serves primarily two purposes - to
> have something over which we can agree on the approach (and if we do, to
> serve as a base for that) and to demonstrate that the patch in [1] can
> indeed be used for this functionality. All this means that this patch
> depends on [1] to work.

Hi Petr,

Do I understand correctly that this depends on the "logical decoding on standby" patch, but that isn't in the Commitfest? Seems like an oversight, since that thread has a recently posted v11 patch that applies OK, and there was recent review. Your patches no longer apply on top, though. Would it make sense to post a patch set here including logical-decoding-on-standby_v11.patch + your two patches (rebased), since this is currently marked as "Needs review"?

--
Thomas Munro
https://enterprisedb.com