Hi Amit,

If I summarize your scenario:

1. A standby S has a failover slot slot1 synchronized with slot1 on primary P.
2. We promote S.
3. On P, we drop slot1 and create slot1 again with failover enabled (for example, a subscriber exists on another instance).
4. A rewind is performed on P, the former primary, so that it can rejoin S, the former standby.
5. On P, slot1 is automatically dropped and recreated to be synchronized.
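
To make the concern concrete, here is a toy Python model of the five steps above. It is only a sketch and not how PostgreSQL works: the Slot class, the incarnation counter and looks_like_same_slot() are hypothetical names invented for illustration. In particular, PostgreSQL keeps no slot history, so nothing like the incarnation field exists today.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical creation counter, used only to tell slot incarnations apart
# in this model. PostgreSQL does not maintain any such slot history.
_ids = count(1)


@dataclass
class Slot:
    name: str
    failover: bool
    synced: bool = False
    incarnation: int = field(default_factory=lambda: next(_ids))


def looks_like_same_slot(local: Slot, remote: Slot) -> bool:
    """Match on name and failover flag only, as discussed in the thread."""
    return local.name == remote.name and local.failover and remote.failover


# 1. S holds slot1, synchronized from slot1 on P.
p_slot = Slot("slot1", failover=True)
s_slot = Slot("slot1", failover=True, synced=True,
              incarnation=p_slot.incarnation)

# 2. S is promoted; in this model its copy is no longer a synced slot.
s_slot.synced = False

# 3. On P, slot1 is dropped and created again with failover enabled.
p_slot = Slot("slot1", failover=True)

# 4-5. P is rewound and rejoins S; slot sync must decide what to do
# with the slot1 that already exists on P.
print(looks_like_same_slot(p_slot, s_slot))      # name/failover check passes
print(p_slot.incarnation == s_slot.incarnation)  # but it is a different slot
```

In this model the name and failover check passes although the slot on P is a new one, which is the case where overwriting may not be safe.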
In which context could this kind of scenario happen?

Isn't the goal to find a solution for a switchover carried out for maintenance on a Postgres cluster? The aim is to find a compromise that covers the most likely scenarios.

Do you think we must come back to the allow_overwrite flag approach, or to another solution?

Best Regards,
Fabrice

On Mon, Nov 10, 2025 at 1:10 PM Amit Kapila <[email protected]> wrote:

> On Fri, Oct 31, 2025 at 2:58 PM Alexander Kukushkin <[email protected]>
> wrote:
> >
> > Instead of dropping such slots, what we actually need is a way to safely
> > set synced=false->true and continue operating.
> >
> > Operating logical replication setups is already extremely complex and
> > error-prone - this is not theoretical, it’s something many of us face daily.
> > So rather than adding more speculative features or workarounds, I think
> > we should focus on addressing real operational pain points and the
> > inconsistencies in the current design.
> >
> > A slot created on the primary (which later becomes a standby) with
> > failover=true has a very clear purpose. The failover flag already indicates
> > that purpose; synced shouldn’t override it.
> >
>
> I think this is not as clear as you are saying as compared to WAL. In
> failover cases, we bump the WAL timelines on new primary and also have
> facilities like pg_rewind to ensure that old primary can follow the
> new primary after divergence. For slots, there is no such facility,
> now, there is an argument that for slot's it is sufficient to match
> the name and failover to say that it is okay to overwrite the slot on
> old primary. However, it is not clear whether it is always safe to do
> so, for example, if the old primary ran after divergence for sometime
> and one has re-created the slot with same name and failover property,
> it will no longer be the same slot. Unlike WAL, we don't maintain the
> slot's history, so it is not equally clear that we can overwrite old
> primary's slot's as it is.
>
> --
> With Regards,
> Amit Kapila.
>
