On Fri, Jun 5, 2026 at 8:34 AM Zhijie Hou (Fujitsu) <[email protected]> wrote:
>
> On Thursday, June 4, 2026 5:27 PM Ashutosh Sharma <[email protected]> wrote:
> > On Thu, Jun 4, 2026 at 1:54 PM Zhijie Hou (Fujitsu) <[email protected]> wrote:
> > >
> > > On Thursday, June 4, 2026 3:36 PM Ashutosh Sharma <[email protected]> wrote:
> > > > On Thu, Jun 4, 2026 at 9:14 AM shveta malik <[email protected]> wrote:
> > > > > My preference, and original intent, was to accept duplicate entries
> > > > > and skip them internally. Doc can be updated to say 'duplicate entries
> > > > > are skipped'. A server startup failure due to duplicate entries in a
> > > > > GUC does not seem right to me. If the alter-system command fails due
> > > > > to duplicate entries, that is still fine, but a startup failure seems
> > > > > excessive. But let's see what others have to say on this.
> > > > >
> > > >
> > > > Okay, the attached patch adds the capability to automatically remove
> > > > duplicate entries from the synchronized_standby_slots list.
> > >
> > > Thanks for updating the patch.
> > >
> > > I agree with Shveta that reporting an ERROR is not ideal. I also think
> > > it (ERROR) would be inconsistent with existing GUCs, as most of them,
> > > such as synchronous_standby_names, search_path, and
> > > session_preload_libraries, do not enforce uniqueness.
> > >
> > > The most similar GUC, synchronous_standby_names, also clarifies this in
> > > the documentation:
> > >
> > > " There is no mechanism to enforce uniqueness of standby names. In case
> > > of duplicates one of the matching standbys will be considered as higher
> > > priority, though exactly which one is indeterminate."[1]
> > >
> > > > In N of M
> > > > mode, if N > M after removing duplicate entries, an error is raised.
> > >
> > > I'm not entirely sure about this case. It seems similar to when the
> > > number of specified slots is less than N (in ANY N or FIRST N), given
> > > that we want to skip duplicate slots. In that situation, the natural
> > > behavior to me would be to simply block replication rather than raise
> > > an error. And synchronous_standby_names would also simply block the
> > > transaction in this case.
> > >
> >
> > For duplicate entries themselves, I agree with the direction of not
> > raising an error. Silently normalizing duplicates is reasonable for
> > this GUC, especially if we document it clearly. A repeated slot name
> > does not add any new information, so treating it as “same slot listed
> > twice by mistake” is practical.
> >
> > But for N > M after deduplication, I would still lean toward raising an
> > error.
> >
> > Why I’d separate those cases:
> >
> > 1) Duplicate entries looks like a harmless normalization problem. ANY
> > 2 (a, a, b) can be normalized to ANY 2 (a, b) without changing the
> > user’s apparent intent much.
> >
> > 2) N > M after deduplication is not a transient runtime state. ANY 2
> > (a, a) becomes one unique slot. That configuration can never succeed
> > unless the config itself changes. Blocking forever turns a static
> > configuration mistake into an operational liveness problem.
> >
> > 3) N > M after deduplication is different from ordinary “not enough
> > standbys are currently available”. If we configure ANY 2 (a, b) and
> > only a is currently caught up, blocking makes sense because the
> > situation may resolve at runtime. If we configure ANY 2 (a, a) and
> > duplicates are ignored, there is no possible future runtime in which
> > it succeeds without editing the GUC. That is why I think erroring is
> > better.
> >
> > On the synchronous_standby_names comparison, I do not think it is
> > fully analogous. The quoted documentation is about there being no
> > reliable way to enforce uniqueness of standby names in the live
> > system, because those names are matched against runtime standbys and
> > the result can be indeterminate. Here, synchronized_standby_slots
> > names concrete replication slots, which are stable object identifiers.
> > Duplicate config entries are detectable and normalizable
> > deterministically at GUC parse time. That gives us a cleaner option
> > than synchronous_standby_names has.
>
> Thanks for the explanation.
>
> What I was wondering is: ignoring duplicates, what should be the behavior
> of "ANY 2 (standby)" when N > M?
>
> I studied a bit for the behavior of synchronous_standby_names to
> understand the difference. synchronous_standby_names does support syntax
> like "ANY 2 (standby)" where N > M. Because even in that case, a
> transaction can still commit if there are two standbys with the same name
> ("standby" in this example). I'm not sure how common that use case is,
> but it may explain why no error is reported.
>
> Given that, I'm not opposed to reporting an error in
> synchronized_standby_slots when N > M. The situation is different here
> since there cannot be two slots with the same name, making this a
> completely invalid use case.
>

I also think we can report an error when N > M. IIRC, we were already
reporting this error earlier, before duplicate removal was added. After
removing duplicates, we can keep the same behaviour rather than leaving
the walsender stuck indefinitely.

thanks
Shveta
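
To make the behaviour discussed above concrete, here is a minimal sketch of
"skip duplicates, then error out if N > M". It is plain Python for
illustration only, not code from the patch: normalize_slots and its
arguments are hypothetical names, and the actual GUC check hook may be
structured quite differently.

```python
def normalize_slots(num_required, slot_names):
    """Illustrative only: drop duplicate slot names, then reject N > M.

    num_required is the N in "ANY N (...)" / "FIRST N (...)";
    slot_names is the list as written by the user, possibly with repeats.
    """
    # dict.fromkeys keeps the first occurrence of each name, in order.
    unique = list(dict.fromkeys(slot_names))

    # After deduplication, N > M can never be satisfied at runtime,
    # so report it up front instead of blocking indefinitely.
    if num_required > len(unique):
        raise ValueError(
            f"{num_required} slots required but only "
            f"{len(unique)} unique slot(s) listed"
        )
    return num_required, unique
```

With this sketch, ANY 2 (a, a, b) is normalized to ANY 2 (a, b), while
ANY 2 (a, a) is rejected because only one unique slot remains.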
