Hi, On Tue, Dec 30, 2025 at 10:12 AM Xuneng Zhou <[email protected]> wrote: > > Hi, > > On Sat, Dec 27, 2025 at 12:15 AM Xuneng Zhou <[email protected]> wrote: > > > > Hi Chao, > > > > Thanks a lot for your review! > > > > On Fri, Dec 26, 2025 at 4:25 PM Chao Li <[email protected]> wrote: > > > > > > > > > > > > > On Dec 19, 2025, at 10:49, Xuneng Zhou <[email protected]> wrote: > > > > > > > > Hi, > > > > > > > > On Thu, Dec 18, 2025 at 8:25 PM Alexander Korotkov > > > > <[email protected]> wrote: > > > >> > > > >> On Thu, Dec 18, 2025 at 2:24 PM Xuneng Zhou <[email protected]> > > > >> wrote: > > > >>> On Thu, Dec 18, 2025 at 6:38 PM Alexander Korotkov > > > >>> <[email protected]> wrote: > > > >>>> > > > >>>> Hi, Xuneng! > > > >>>> > > > >>>> On Tue, Dec 16, 2025 at 6:46 AM Xuneng Zhou <[email protected]> > > > >>>> wrote: > > > >>>>> Remove the erroneous WAIT_LSN_TYPE_COUNT case from the switch > > > >>>>> statement in v5 patch 1. > > > >>>> > > > >>>> Thank you for your work on this patchset. Generally, it looks like > > > >>>> good and quite straightforward extension of the current > > > >>>> functionality. > > > >>>> But this patch adds 4 new unreserved keywords to our grammar. Do you > > > >>>> think we can put mode into with options clause? > > > >>>> > > > >>> > > > >>> Thanks for pointing this out. Yeah, 4 unreserved keywords add > > > >>> complexity to the parser and it may not be worthwhile since replay is > > > >>> expected to be the common use scenario. Maybe we can do something like > > > >>> this: > > > >>> > > > >>> -- Default (REPLAY mode) > > > >>> WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '1s'); > > > >>> > > > >>> -- Explicit REPLAY mode > > > >>> WAIT FOR LSN '0/306EE20' WITH (MODE 'replay', TIMEOUT '1s'); > > > >>> > > > >>> -- WRITE mode > > > >>> WAIT FOR LSN '0/306EE20' WITH (MODE 'write', TIMEOUT '1s'); > > > >>> > > > >>> If no mode is set explicitly in the options clause, it defaults to > > > >>> replay. I'll update the patch per your suggestion. > > > >> > > > >> This is exactly what I meant. Please, go ahead. > > > >> > > > > > > > > Here is the updated patch set (v7). Please check. > > > > > > > > -- > > > > Best, > > > > Xuneng > > > > <v7-0001-Extend-xlogwait-infrastructure-with-write-and-flu.patch><v7-0004-Use-WAIT-FOR-LSN-in.patch><v7-0003-Add-tab-completion-for-WAIT-FOR-LSN-MODE-option.patch><v7-0002-Add-MODE-option-to-WAIT-FOR-LSN-command.patch> > > > > > > Hi Xuneng, > > > > > > A solid patch! Just a few small comments: > > > > > > 1 - 0001 > > > ``` > > > +XLogRecPtr > > > +GetCurrentLSNForWaitType(WaitLSNType lsnType) > > > +{ > > > + switch (lsnType) > > > + { > > > + case WAIT_LSN_TYPE_STANDBY_REPLAY: > > > + return GetXLogReplayRecPtr(NULL); > > > + > > > + case WAIT_LSN_TYPE_STANDBY_WRITE: > > > + return GetWalRcvWriteRecPtr(); > > > + > > > + case WAIT_LSN_TYPE_STANDBY_FLUSH: > > > + return GetWalRcvFlushRecPtr(NULL, NULL); > > > + > > > + case WAIT_LSN_TYPE_PRIMARY_FLUSH: > > > + return GetFlushRecPtr(NULL); > > > + } > > > + > > > + elog(ERROR, "invalid LSN wait type: %d", lsnType); > > > + pg_unreachable(); > > > +} > > > ``` > > > > > > As you add pg_unreachable() in the new function > > > GetCurrentLSNForWaitType(), I’m thinking if we should just do an > > > Assert(), I saw every existing related function has done such an assert, > > > for example addLSNWaiter(), it does “Assert(i >= 0 && i < > > > WAIT_LSN_TYPE_COUNT);”. I guess we can just following the current > > > mechanism to verify lsnType. So, for GetCurrentLSNForWaitType(), we can > > > just add a default clause and Assert(false). > > > > My take is that Assert(false) alone might not be enough here, since > > assertions vanish in non-assert builds. An unexpected lsnType is a > > real bug even in production, so keeping a hard error plus > > pg_unreachable() seems to be a safer pattern. It also acts as a > > guardrail for future extensions — if new wait types are added without > > updating this code, we’ll fail loudly rather than silently returning > > an incorrect LSN. Assert(i >= 0 && i < WAIT_LSN_TYPE_COUNT) was added > > to the top of the function. > > > > > 2 - 0002 > > > ``` > > > + else > > > + ereport(ERROR, > > > + > > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > > + errmsg("unrecognized > > > value for WAIT option \"%s\": \"%s\"", > > > + "MODE", > > > mode_str), > > > ``` > > > > > > I wonder why don’t we directly put MODE into the error message? > > > > Yeah, putting MODE into the error message is cleaner. It's done in v8. > > > > > 3 - 0002 > > > ``` > > > case WAIT_LSN_RESULT_NOT_IN_RECOVERY: > > > if (throw) > > > { > > > + const WaitLSNTypeDesc *desc = > > > &WaitLSNTypeDescs[lsnType]; > > > + XLogRecPtr currentLSN = > > > GetCurrentLSNForWaitType(lsnType); > > > + > > > if (PromoteIsTriggered()) > > > { > > > ereport(ERROR, > > > > > > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > errmsg("recovery > > > is not in progress"), > > > - > > > errdetail("Recovery ended before replaying target LSN %X/%08X; last > > > replay LSN %X/%08X.", > > > + > > > errdetail("Recovery ended before target LSN %X/%08X was %s; last %s LSN > > > %X/%08X.", > > > > > > LSN_FORMAT_ARGS(lsn), > > > - > > > LSN_FORMAT_ARGS(GetXLogReplayRecPtr(NULL)))); > > > + > > > desc->verb, > > > + > > > desc->noun, > > > + > > > LSN_FORMAT_ARGS(currentLSN))); > > > } > > > else > > > ereport(ERROR, > > > > > > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > errmsg("recovery > > > is not in progress"), > > > - errhint("Waiting > > > for the replay LSN can only be executed during recovery.")); > > > + errhint("Waiting > > > for the %s LSN can only be executed during recovery.", > > > + > > > desc->noun)); > > > } > > > ``` > > > > > > currentLSN is only used in the if clause, thus it can be defined inside > > > the if clause. > > > > + 1. > > > > > 3 - 0002 > > > ``` > > > + /* > > > + * If we wrote an LSN that someone was waiting for then walk over > > > the > > > + * shared memory array and set latches to notify the waiters. > > > + */ > > > + if (waitLSNState && > > > + (LogstreamResult.Write >= > > > + > > > pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_WRITE]))) > > > + WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_WRITE, > > > LogstreamResult.Write); > > > ``` > > > > > > Do we need to mention "walk over the shared memory array and set latches” > > > in the comment? The logic belongs to WaitLSNWakeup(). What about if the > > > wake up logic changes in future, then this comment would become stale. So > > > I think we only need to mention “notify the waiters”. > > > > > > > It makes sense to me. They are incorporated into v8. > > > > > > > > 4 - 0003 > > > ``` > > > + /* > > > + * Handle parenthesized option list. This fires when we're in an > > > + * unfinished parenthesized option list. get_previous_words > > > treats a > > > + * completed parenthesized option list as one word, so the above > > > test is > > > + * correct. mode takes a string value ('replay', 'write', > > > 'flush'), > > > + * timeout takes a string value, no_throw takes no value. > > > + */ > > > else if (HeadMatches("WAIT", "FOR", "LSN", MatchAny, "WITH", > > > "(*") && > > > !HeadMatches("WAIT", "FOR", "LSN", MatchAny, > > > "WITH", "(*)")) > > > { > > > - /* > > > - * This fires if we're in an unfinished parenthesized > > > option list. > > > - * get_previous_words treats a completed parenthesized > > > option list as > > > - * one word, so the above test is correct. > > > - */ > > > if (ends_with(prev_wd, '(') || ends_with(prev_wd, ',')) > > > - COMPLETE_WITH("timeout", "no_throw"); > > > - > > > - /* > > > - * timeout takes a string value, no_throw takes no value. > > > We don't > > > - * offer completions for these values. > > > - */ > > > ``` > > > > > > The new comment has lost the meaning of “We don’t offer completions for > > > these values (timeout and no_throw)”, to be explicit, I feel we can > > > retain the sentence. > > > > The sentence is retained. > > > > > 5 - 0004 > > > ``` > > > + my $isrecovery = > > > + $self->safe_psql('postgres', "SELECT pg_is_in_recovery()"); > > > + chomp($isrecovery); > > > croak "unknown mode $mode for 'wait_for_catchup', valid modes are > > > " > > > . join(', ', keys(%valid_modes)) > > > unless exists($valid_modes{$mode}); > > > @@ -3347,9 +3350,6 @@ sub wait_for_catchup > > > } > > > if (!defined($target_lsn)) > > > { > > > - my $isrecovery = > > > - $self->safe_psql('postgres', "SELECT > > > pg_is_in_recovery()"); > > > - chomp($isrecovery); > > > ``` > > > > > > I wonder why pull up pg_is_in_recovery to an early place and > > > unconditionally call it? > > > > > > > This seems unnecessary. I also realized that my earlier approach in > > patch 4 may have been semantically incorrect — it could end up waiting > > for the LSN to replay/write/flush on the node itself, rather than on > > the downstream standby, which defeats the purpose of > > wait_for_catchup(). Patch 4 attempts to address this by running WAIT > > FOR LSN on the standby itself. > > > > Support for primary-flush waiting and the refactoring of existing > > modes have been also incorporated in v8 following Alexander’s > > feedback. The major updates are in patches 2 and 4. Please check. > > > > Added WaitLSNTypeDesc to typedefs.list in v9 patch 2. >
Run pgindent using the updated typedefs.list. -- Best, Xuneng
v10-0004-Use-WAIT-FOR-LSN-in-PostgreSQL-Test-Cluster-wait.patch
Description: Binary data
v10-0003-Add-tab-completion-for-WAIT-FOR-LSN-MODE-option.patch
Description: Binary data
v10-0002-Add-MODE-option-to-WAIT-FOR-LSN-command.patch
Description: Binary data
v10-0001-Extend-xlogwait-infrastructure-with-write-and-fl.patch
Description: Binary data
