Hi Ashutosh,

On Thu, Feb 26, 2026 at 1:11 AM Ashutosh Sharma <[email protected]> wrote:
> Hi,
>
> On Thu, Feb 26, 2026 at 2:15 PM shveta malik <[email protected]> wrote:
> >
> > On Thu, Feb 26, 2026 at 1:54 PM SATYANARAYANA NARLAPURAM
> > <[email protected]> wrote:
> > >
> > > Hi Ashutosh,
> > >
> > > On Wed, Feb 25, 2026 at 11:42 PM Ashutosh Sharma <[email protected]> wrote:
> > >>
> > >>
> > >> I don't think we should be comparing "synchronous_standby_names" with
> > >> "synchronized_standby_slots", even though they appear similar in
> > >> purpose. All values listed in synchronous_standby_names represent
> > >> synchronous standbys exclusively, whereas synchronized_standby_slots
> > >> can hold values for both synchronous and asynchronous standbys. In
> > >> other words, every server referenced by synchronous_standby_names is
> > >> of the same type, but that may not be the case with
> > >> synchronized_standby_slots.
> > >>
> > >> If a GUC can hold values of different types (sync vs. async), does it
> > >> really make sense to use a qualifier like ANY 1 (val1, val2) when val1
> > >> and val2 are different in nature? For example, suppose val1 is a
> > >> synchronous standby and val2 is an asynchronous standby, and we
> > >> configure ANY 1 (val1, val2). It's possible for val2 to get ahead of
> > >> val1 in terms of replication progress, which in turn could mean the
> > >> logical replica is also ahead of val1. So if we were to fail over to
> > >> val1 (since it's the only synchronous standby), we will not be able to
> > >> use the existing logical replication setup.
> > >
> > >
> > > If the failover orchestrator cannot ensure standby1 to not get the
> > > quorum committed WAL (from archive or standby2) then the setting ANY 1
> > > (val1, val2) is invalid.
> > > This setup also has issues because in your scenario, standby2 is ahead
> > > of the new primary (standby1) and standby2 requires now to rewind to be
> > > in sync with the new primary. Additionally, it allowed readers to read
> > > data that was lost at the end of the failover. We ideally need a
> > > mechanism to not send WAL to async replicas before the sync replicas
> > > commit (honoring synchronous_standby_names GUC) feature (similar to
> > > synchronized_standby_slots). It could be a different thread on its own.
> > >
> >
> > +1 on the overall idea of the patch.
> > I understand the concern raised above that one of the standbys in the
> > quorum (synchronized_standby_slots) might lag behind the logical
> > replica, and a user could potentially failover to such a standby. But
> > I also agree with Amit that configuring failover correctly is
> > ultimately the responsibility of failover-solution. And instructions
> > in doc should be followed before deciding if a standby is
> > failover-ready or not.
> >
> > As suggested in [1], IMO, it is a reasonably good idea for
> > 'synchronized_standby_slots' to DEFAULT to the value of
> > 'synchronous_standby_names'. That way, even if the user missed to
> > configure 'synchronized_standby_slots' explicitly, we would still have
> > reasonable protection in place. At the same time, if a user
> > intentionally chooses not to configure it, a NULL/NONE value should
> > remain a valid option.
> >
>
> AFAIU, not all names listed in "synchronous_standby_names" are
> necessarily synchronous standbys. Tools like pg_receivewal, for
> example, can establish a replication connection to the primary and
> appear in that list. Therefore, deriving "synchronized_standby_slots"
> from "synchronous_standby_names", if not set by the user would cause
> logical slots to be synchronized to whatever nodes those names
> represent, including a host running pg_receivewal, which is certainly
> not something the user would have intended to do. Therefore I feel
> this might not just be the good choice.

Agreed. It is not a good idea to have synchronized_standby_slots default
to synchronous_standby_names: the former lists physical replication slot
names and the latter lists application_names, so the two are not
interchangeable, as you noted.

Thanks,
Satya
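
P.S. To make the ANY 1 (val1, val2) scenario discussed above concrete, here is a minimal Python sketch. It is only an illustration, not code from the patch: the helper name quorum_confirmed_lsn, the integer LSNs, and the assumption that "ANY k" lets logical walsenders advance to the highest LSN confirmed by at least k of the listed slots are all mine.

```python
def quorum_confirmed_lsn(confirmed, k):
    """Highest LSN that at least k of the listed standbys have confirmed.

    Assumed semantics for an "ANY k (...)" style setting; hypothetical helper.
    """
    ordered = sorted(confirmed.values(), reverse=True)
    return ordered[k - 1]


# Hypothetical LSNs: val1 is the synchronous standby, val2 an asynchronous
# standby that has run ahead of it.
confirmed = {"val1": 100, "val2": 200}

# ANY 1 (val1, val2): a single confirmation is enough, so val2 alone decides.
logical_limit = quorum_confirmed_lsn(confirmed, k=1)

# Waiting for every listed slot instead: the slowest standby decides.
all_limit = quorum_confirmed_lsn(confirmed, k=len(confirmed))

# WAL a logical subscriber may already have consumed that val1 has not received.
gap_if_failover_to_val1 = max(0, logical_limit - confirmed["val1"])

print(logical_limit, all_limit, gap_if_failover_to_val1)
```

Under these assumptions the logical replica can be ahead of val1 by the printed gap, which is why the orchestrator has to bring val1 up to the quorum-committed WAL before promoting it.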
