On Sun, Jun 8, 2025 at 8:57 PM shveta malik <shveta.ma...@gmail.com> wrote:
>
> On Sat, Jun 7, 2025 at 2:44 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > On Fri, Jun 6, 2025 at 3:02 AM shveta malik <shveta.ma...@gmail.com> wrote:
> > >
> > > On Wed, Jun 4, 2025 at 3:40 PM shveta malik <shveta.ma...@gmail.com>
> > > wrote:
> > > >
> > > > On Wed, Jun 4, 2025 at 6:41 AM Masahiko Sawada <sawada.m...@gmail.com>
> > > > wrote:
> > > > >
> > > > > On Tue, May 20, 2025 at 9:54 PM Amit Kapila <amit.kapil...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > Yeah, I find the idea that the presence of a logical slot will allow
> > > > > > the user to enable logical decoding/replication more appealing than
> > > > > > this new alternative, leaving aside the challenges of realizing it.
> > > >
> > > > +1. This idea appears more user-friendly and easier to understand
> > > > compared to other approaches, such as having multiple GUCs or using
> > > > ALTER SYSTEM.
> > > >
> > > > > I've drafted this idea. Here are summary for attached two patches:
> > > > >
> > > > > 0001 patch allows us to create a logical slot without WAL reservation.
> > > > >
> > > > > 0002 patch is the main patch for dynamically enabling/disabling
> > > > > logical decoding when wal_level is 'replica'.
> > > >
> > > > Thank You for the patches. I have done some initial testing, it seems
> > > > to be working well. I will do more testing and review and will share
> > > > further feedback.
> > >
> > > I reviewed further and had few concerns:
> >
> > Thank you for reviewing this feature!
> >
> > >
> > > 1)
> > > We now invalidate slots on standby if the primary (with
> > > wal_level=replica) has dropped the last logical slot and internally
> > > reverted its runtime (effective) wal_level back to replica. Consider
> > > the following scenario involving a cascaded logical replication setup:
> > >
> > > a) The publisher is configured with wal_level = replica and has
> > > created a publication (pub1).
> > > b) A subscriber server creates a subscription (sub1) to pub1. As part
> > > of the slot creation for sub1, the publisher's effective wal_level is
> > > switched to logical.
> > > c) The publisher also has a physical standby, which in turn has its
> > > own logical subscriber, named standby_sub1.
> > >
> > > At this point, everything works as expected i.e. changes from the
> > > publisher flow through the physical standby and are replicated to
> > > standby_sub1. Now if the user drops sub1, the replication slot on the
> > > primary is also dropped. Since this was the last logical slot, the
> > > primary automatically switches its effective wal_level back to
> > > replica. This change propagates to the standby, causing it to
> > > invalidate the slot for standby_sub1. As a result, the standby logs
> > > the following error:
> > >
> > > STATEMENT: START_REPLICATION SLOT "standby_sub1" LOGICAL 0/0 (...)
> > > ERROR: logical decoding needs to be enabled on the primary
> > >
> > > Even if we manually recreate a logical slot on the primary afterward,
> > > the standby_sub1 subscriber is not able to proceed:
> > > ERROR: can no longer access replication slot "standby_sub1"
> > > DETAIL: This replication slot has been invalidated due to
> > > "wal_level_insufficient".
> > >
> > > So the removal of the logical subscriber for the publisher has somehow
> > > restricted the logical subscriber of standby to work. Is this
> > > behaviour acceptable?
> > >
> > > Without this feature, if I manually switch back wal_level to replica
> > > on primary, then it will fail to start. This makes the issue obvious
> > > and prevents misconfiguration.
> > > FATAL: logical replication slot "sub2" exists, but "wal_level" <
> > > "logical"
> > > HINT: Change "wal_level" to be "logical" or higher.
> > >
> > > But the current behaviour is harder to diagnose, as the problem is
> > > effectively hidden behind subscription/slot creation/deletion.
> >
> > The most upstream server in replication configuration would carefully
> > need to keep having at least one logical slot. One way to keep
> > effective_wal_level 'logical' on the publisher where wal_level =
> > 'replica' is to have a logical slot without WAL reservation that is
> > not relevant with any subscriptions. It could require an extra logical
> > slot but seems workable. Does it resolve this concern?
> >
>
> Yes, I agree that publishers should have a separate slot (not related
> with any subscription) without WAL reservation to retain
> effective_wal_level as logical when wal_level is replica. But the
> question is how can that be ensured? Will it be user's responsibility
> to always create that slot? If user has already some subscriptions
> subscribing to most upstream server, then while setting up logical
> replication on physical standby at a later stage, user will not even
> encounter the error:
> ERROR: logical decoding needs to be enabled on the primary,
> HINT: Set wal_level >= logical or create at least one logical slot on
> the primary.
>
> And in lack of such error, users may always end up in the above
> explained situation.
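
To make the scenario quoted above concrete, here is a toy model in
Python. It is only a sketch of the behaviour described in this thread,
not PostgreSQL code: the class names, method names, and the
"keep_logical" placeholder slot are all hypothetical, and the only
rules it encodes are (a) the primary's effective WAL level is 'logical'
while at least one logical slot exists, and (b) the standby invalidates
its logical slots when the primary falls back to 'replica'.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    # Maps slot name -> invalidation reason (None while the slot is usable).
    slots: dict = field(default_factory=dict)

    def create_slot(self, name):
        self.slots[name] = None

    def on_primary_logical_decoding_disabled(self):
        # Invalidation is permanent in this model, matching the
        # "wal_level_insufficient" error reported in the thread.
        for name in self.slots:
            self.slots[name] = "wal_level_insufficient"

    def usable(self, name):
        return self.slots[name] is None


@dataclass
class Primary:
    wal_level: str = "replica"  # configured value, never changed here
    logical_slots: set = field(default_factory=set)
    standbys: list = field(default_factory=list)

    @property
    def effective_wal_level(self):
        # 'logical' if configured so, or if any logical slot exists.
        if self.wal_level == "logical" or self.logical_slots:
            return "logical"
        return self.wal_level

    def create_slot(self, name):
        self.logical_slots.add(name)

    def drop_slot(self, name):
        self.logical_slots.discard(name)
        if self.effective_wal_level != "logical":
            for standby in self.standbys:
                standby.on_primary_logical_decoding_disabled()


def run_scenario(keep_placeholder_slot):
    """Replay steps a) to c) from the report, then drop sub1."""
    standby = Standby()
    primary = Primary(standbys=[standby])
    if keep_placeholder_slot:
        # Hypothetical extra slot that is not tied to any subscription.
        primary.create_slot("keep_logical")
    primary.create_slot("sub1")
    standby.create_slot("standby_sub1")
    primary.drop_slot("sub1")
    level_after_drop = primary.effective_wal_level
    # Recreating a slot on the primary does not revive the standby's slot.
    primary.create_slot("sub1_again")
    return level_after_drop, standby.usable("standby_sub1")


if __name__ == "__main__":
    print(run_scenario(keep_placeholder_slot=False))
    print(run_scenario(keep_placeholder_slot=True))
```

In this model, the run without the placeholder slot leaves standby_sub1
invalidated even after a new slot is created on the primary, while the
run with the placeholder slot keeps it usable.
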

I think it's the user's responsibility to keep at least one logical
slot. It seems that setting wal_level to 'logical' would be the most
reliable solution for this case. We might want to provide a way to keep
the 'logical' WAL level somehow, but I don't have a good idea for now.

> > > 2)
> > > 'show effective_wal_level' shows output as 'logical' if a slot exists
> > > on primary. But on physical standby, it still shows it as 'replica'
> > > even in the presence of slots. Is this intentional?
> >
> > Yes. I think we should disallow the standbys to create a logical slot
> > as long as they use wal_level = 'replica', because otherwise the
> > standby would need to invalidate the logical slot at a promotion.
> > Which could cause a large down time in a failover case.
>
> Do you mean even if primary is running on effective_wal_level=logical,
> we shall disallow slot-creation on standby if standby has
> wal_level=replica? It means the $subject's enhancement is only valid
> on primary?

Thank you for pointing it out; my assumption was wrong. Even if the
standby sets wal_level = 'replica', it should be able to create logical
slots if the primary has logical decoding enabled, and it should be
able to continue using them even after promotion. I've updated the
patch accordingly.

>
> Or the other way could be that we can have 2 trigger points for
> enabling effective_wal_level to logical on primary:
> 1) One is when a logical slot is created on primary.
> 2) Another is when a logical slot is created on any of its physical standby.
>
> We need to maintain these 2 separately as drop of last primary's slot
> should not toggle it back to replica when any of its physical
> standbys still need it. But if a publisher has multiple physical
> standbys, then it will need extra handling i.e. last logical-slot drop
> on standby1 should not end up toggling effective_wal_level to replica
> when standby2 still has some logical slots. I am somehow trying to
> think of a way where we have that extra slot without the user's
> intervention.

Considering cascading replication cases too, 2) could be tricky, as
cascaded standbys need to propagate the information of logical slot
creation up to the most upstream server.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

v2-0001-Allow-to-create-logical-slots-with-no-WAL-reserva.patch
Description: Binary data
v2-0002-Enable-logical-decoding-dynamically-based-on-logi.patch
Description: Binary data