Hi,

On Tue, Dec 30, 2025 at 10:12 AM Xuneng Zhou <[email protected]> wrote:
>
> Hi,
>
> On Sat, Dec 27, 2025 at 12:15 AM Xuneng Zhou <[email protected]> wrote:
> >
> > Hi Chao,
> >
> > Thanks a lot for your review!
> >
> > On Fri, Dec 26, 2025 at 4:25 PM Chao Li <[email protected]> wrote:
> > >
> > >
> > >
> > > > On Dec 19, 2025, at 10:49, Xuneng Zhou <[email protected]> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Thu, Dec 18, 2025 at 8:25 PM Alexander Korotkov 
> > > > <[email protected]> wrote:
> > > >>
> > > >> On Thu, Dec 18, 2025 at 2:24 PM Xuneng Zhou <[email protected]> 
> > > >> wrote:
> > > >>> On Thu, Dec 18, 2025 at 6:38 PM Alexander Korotkov 
> > > >>> <[email protected]> wrote:
> > > >>>>
> > > >>>> Hi, Xuneng!
> > > >>>>
> > > >>>> On Tue, Dec 16, 2025 at 6:46 AM Xuneng Zhou <[email protected]> 
> > > >>>> wrote:
> > > >>>>> Remove the erroneous WAIT_LSN_TYPE_COUNT case from the switch
> > > >>>>> statement in v5 patch 1.
> > > >>>>
> > > >>>> Thank you for your work on this patchset.  Generally, it looks like
> > > >>>> good and quite straightforward extension of the current 
> > > >>>> functionality.
> > > >>>> But this patch adds 4 new unreserved keywords to our grammar.  Do you
> > > >>>> think we can put mode into with options clause?
> > > >>>>
> > > >>>
> > > >>> Thanks for pointing this out. Yeah, 4 unreserved keywords add
> > > >>> complexity to the parser and it may not be worthwhile since replay is
> > > >>> expected to be the common use scenario. Maybe we can do something like
> > > >>> this:
> > > >>>
> > > >>> -- Default (REPLAY mode)
> > > >>> WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '1s');
> > > >>>
> > > >>> -- Explicit REPLAY mode
> > > >>> WAIT FOR LSN '0/306EE20' WITH (MODE 'replay', TIMEOUT '1s');
> > > >>>
> > > >>> -- WRITE mode
> > > >>> WAIT FOR LSN '0/306EE20' WITH (MODE 'write', TIMEOUT '1s');
> > > >>>
> > > >>> If no mode is set explicitly in the options clause, it defaults to
> > > >>> replay. I'll update the patch per your suggestion.
> > > >>
> > > >> This is exactly what I meant.  Please, go ahead.
> > > >>
> > > >
> > > > Here is the updated patch set (v7). Please check.
> > > >
> > > > --
> > > > Best,
> > > > Xuneng
> > > > <v7-0001-Extend-xlogwait-infrastructure-with-write-and-flu.patch><v7-0004-Use-WAIT-FOR-LSN-in.patch><v7-0003-Add-tab-completion-for-WAIT-FOR-LSN-MODE-option.patch><v7-0002-Add-MODE-option-to-WAIT-FOR-LSN-command.patch>
> > >
> > > Hi Xuneng,
> > >
> > > A solid patch! Just a few small comments:
> > >
> > > 1 - 0001
> > > ```
> > > +XLogRecPtr
> > > +GetCurrentLSNForWaitType(WaitLSNType lsnType)
> > > +{
> > > +       switch (lsnType)
> > > +       {
> > > +               case WAIT_LSN_TYPE_STANDBY_REPLAY:
> > > +                       return GetXLogReplayRecPtr(NULL);
> > > +
> > > +               case WAIT_LSN_TYPE_STANDBY_WRITE:
> > > +                       return GetWalRcvWriteRecPtr();
> > > +
> > > +               case WAIT_LSN_TYPE_STANDBY_FLUSH:
> > > +                       return GetWalRcvFlushRecPtr(NULL, NULL);
> > > +
> > > +               case WAIT_LSN_TYPE_PRIMARY_FLUSH:
> > > +                       return GetFlushRecPtr(NULL);
> > > +       }
> > > +
> > > +       elog(ERROR, "invalid LSN wait type: %d", lsnType);
> > > +       pg_unreachable();
> > > +}
> > > ```
> > >
> > > As you add pg_unreachable() in the new function 
> > > GetCurrentLSNForWaitType(), I’m thinking if we should just do an 
> > > Assert(), I saw every existing related function has done such an assert, 
> > > for example addLSNWaiter(), it does “Assert(i >= 0 && i < 
> > > WAIT_LSN_TYPE_COUNT);”. I guess we can just following the current 
> > > mechanism to verify lsnType. So, for GetCurrentLSNForWaitType(), we can 
> > > just add a default clause and Assert(false).
> >
> > My take is that Assert(false) alone might not be enough here, since
> > assertions vanish in non-assert builds. An unexpected lsnType is a
> > real bug even in production, so keeping a hard error plus
> > pg_unreachable() seems to be a safer pattern. It also acts as a
> > guardrail for future extensions — if new wait types are added without
> > updating this code, we’ll fail loudly rather than silently returning
> > an incorrect LSN. Assert(i >= 0 && i < WAIT_LSN_TYPE_COUNT) was added
> > to the top of the function.
> >
> > > 2 - 0002
> > > ```
> > > +                       else
> > > +                               ereport(ERROR,
> > > +                                               
> > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > +                                                errmsg("unrecognized 
> > > value for WAIT option \"%s\": \"%s\"",
> > > +                                                               "MODE", 
> > > mode_str),
> > > ```
> > >
> > > I wonder why don’t we directly put MODE into the error message?
> >
> > Yeah, putting MODE into the error message is cleaner. It's done in v8.
> >
> > > 3 - 0002
> > > ```
> > >                 case WAIT_LSN_RESULT_NOT_IN_RECOVERY:
> > >                         if (throw)
> > >                         {
> > > +                               const           WaitLSNTypeDesc *desc = 
> > > &WaitLSNTypeDescs[lsnType];
> > > +                               XLogRecPtr      currentLSN = 
> > > GetCurrentLSNForWaitType(lsnType);
> > > +
> > >                                 if (PromoteIsTriggered())
> > >                                 {
> > >                                         ereport(ERROR,
> > >                                                         
> > > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > >                                                         errmsg("recovery 
> > > is not in progress"),
> > > -                                                       
> > > errdetail("Recovery ended before replaying target LSN %X/%08X; last 
> > > replay LSN %X/%08X.",
> > > +                                                       
> > > errdetail("Recovery ended before target LSN %X/%08X was %s; last %s LSN 
> > > %X/%08X.",
> > >                                                                           
> > > LSN_FORMAT_ARGS(lsn),
> > > -                                                                         
> > > LSN_FORMAT_ARGS(GetXLogReplayRecPtr(NULL))));
> > > +                                                                         
> > > desc->verb,
> > > +                                                                         
> > > desc->noun,
> > > +                                                                         
> > > LSN_FORMAT_ARGS(currentLSN)));
> > >                                 }
> > >                                 else
> > >                                         ereport(ERROR,
> > >                                                         
> > > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > >                                                         errmsg("recovery 
> > > is not in progress"),
> > > -                                                       errhint("Waiting 
> > > for the replay LSN can only be executed during recovery."));
> > > +                                                       errhint("Waiting 
> > > for the %s LSN can only be executed during recovery.",
> > > +                                                                       
> > > desc->noun));
> > >                         }
> > > ```
> > >
> > > currentLSN is only used in the if clause, thus it can be defined inside 
> > > the if clause.
> >
> > + 1.
> >
> > > 3 - 0002
> > > ```
> > > +       /*
> > > +        * If we wrote an LSN that someone was waiting for then walk over 
> > > the
> > > +        * shared memory array and set latches to notify the waiters.
> > > +        */
> > > +       if (waitLSNState &&
> > > +               (LogstreamResult.Write >=
> > > +                
> > > pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_WRITE])))
> > > +               WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_WRITE, 
> > > LogstreamResult.Write);
> > > ```
> > >
> > > Do we need to mention "walk over the shared memory array and set latches” 
> > > in the comment? The logic belongs to WaitLSNWakeup(). What about if the 
> > > wake up logic changes in future, then this comment would become stale. So 
> > > I think we only need to mention “notify the waiters”.
> > >
> >
> > It makes sense to me. They are incorporated into v8.
> >
> > >
> > > 4 - 0003
> > > ```
> > > +       /*
> > > +        * Handle parenthesized option list.  This fires when we're in an
> > > +        * unfinished parenthesized option list.  get_previous_words 
> > > treats a
> > > +        * completed parenthesized option list as one word, so the above 
> > > test is
> > > +        * correct.  mode takes a string value ('replay', 'write', 
> > > 'flush'),
> > > +        * timeout takes a string value, no_throw takes no value.
> > > +        */
> > >         else if (HeadMatches("WAIT", "FOR", "LSN", MatchAny, "WITH", 
> > > "(*") &&
> > >                          !HeadMatches("WAIT", "FOR", "LSN", MatchAny, 
> > > "WITH", "(*)"))
> > >         {
> > > -               /*
> > > -                * This fires if we're in an unfinished parenthesized 
> > > option list.
> > > -                * get_previous_words treats a completed parenthesized 
> > > option list as
> > > -                * one word, so the above test is correct.
> > > -                */
> > >                 if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
> > > -                       COMPLETE_WITH("timeout", "no_throw");
> > > -
> > > -               /*
> > > -                * timeout takes a string value, no_throw takes no value. 
> > > We don't
> > > -                * offer completions for these values.
> > > -                */
> > > ```
> > >
> > > The new comment has lost the meaning of “We don’t offer completions for 
> > > these values (timeout and no_throw)”, to be explicit, I feel we can 
> > > retain the sentence.
> >
> >  The sentence is retained.
> >
> > > 5 - 0004
> > > ```
> > > +       my $isrecovery =
> > > +         $self->safe_psql('postgres', "SELECT pg_is_in_recovery()");
> > > +       chomp($isrecovery);
> > >         croak "unknown mode $mode for 'wait_for_catchup', valid modes are 
> > > "
> > >           . join(', ', keys(%valid_modes))
> > >           unless exists($valid_modes{$mode});
> > > @@ -3347,9 +3350,6 @@ sub wait_for_catchup
> > >         }
> > >         if (!defined($target_lsn))
> > >         {
> > > -               my $isrecovery =
> > > -                 $self->safe_psql('postgres', "SELECT 
> > > pg_is_in_recovery()");
> > > -               chomp($isrecovery);
> > > ```
> > >
> > > I wonder why pull up pg_is_in_recovery to an early place and 
> > > unconditionally call it?
> > >
> >
> > This seems unnecessary. I also realized that my earlier approach in
> > patch 4 may have been semantically incorrect — it could end up waiting
> > for the LSN to replay/write/flush on the node itself, rather than on
> > the downstream standby, which defeats the purpose of
> > wait_for_catchup(). Patch 4 attempts to address this by running WAIT
> > FOR LSN on the standby itself.
> >
> > Support for primary-flush waiting and the refactoring of existing
> > modes have been also incorporated in v8 following Alexander’s
> > feedback. The major updates are in patches 2 and 4. Please check.
> >
>
> Added WaitLSNTypeDesc to typedefs.list in v9 patch 2.
>

Run pgindent using the updated typedefs.list.

-- 
Best,
Xuneng

Attachment: v10-0004-Use-WAIT-FOR-LSN-in-PostgreSQL-Test-Cluster-wait.patch
Description: Binary data

Attachment: v10-0003-Add-tab-completion-for-WAIT-FOR-LSN-MODE-option.patch
Description: Binary data

Attachment: v10-0002-Add-MODE-option-to-WAIT-FOR-LSN-command.patch
Description: Binary data

Attachment: v10-0001-Extend-xlogwait-infrastructure-with-write-and-fl.patch
Description: Binary data

Reply via email to