Hi,

On 2026-04-06 23:07:45 -0400, Andres Freund wrote:
> But, leaving that aside, looking at this code I'm somewhat concerned - it
> seems to not worry at all about memory ordering?
> 
> 
> static void
> XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, TimeLineID tli)
> ...
>       /* Update shared-memory status */
>       pg_atomic_write_u64(&WalRcv->writtenUpto, LogstreamResult.Write);
> 
>       /*
>        * If we wrote an LSN that someone was waiting for, notify the waiters.
>        */
>       if (waitLSNState &&
>               (LogstreamResult.Write >=
>                pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_WRITE])))
>               WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_WRITE, LogstreamResult.Write);
> 
> There are no memory barriers here, so the CPU would be entirely free to not
> make the writtenUpto write visible to a waiter that's in the process of
> registering and is checking whether it needs to wait in WaitForLSN().
> 
> And WaitForLSN()->GetCurrentLSNForWaitType()->GetWalRcvWriteRecPtr() also has
> no barriers.  That MAYBE is ok, due to addLSNWaiter() providing the barrier at
> loop entry, and maybe you can kinda argue that WaitLatch() will somehow also
> have barrier semantics.  But if so, that would need to be very carefully
> documented.  And it seems completely unnecessary here; it's hard to believe
> using a barrier (via pg_atomic_read_membarrier_u64() or such) would be a
> performance issue.
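
To make the ordering concern concrete, here is a minimal sketch in portable
C11 atomics, not PostgreSQL's pg_atomic_* API. The names written_upto,
min_waited_lsn, publish_and_check() and register_and_check() are hypothetical
stand-ins, a single waiter slot replaces the real waiter bookkeeping, and
UINT64_MAX as the "no waiter" value is an assumption. The point is only that
each side needs a full barrier between its store and its subsequent load; with
both fences in place, at least one side is guaranteed to observe the other's
store, so a wakeup cannot be lost.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for WalRcv->writtenUpto and one minWaitedLSN slot. */
static _Atomic uint64_t written_upto = 0;
static _Atomic uint64_t min_waited_lsn = UINT64_MAX;	/* assumed "no waiter" */

/*
 * Writer side: publish the new position, issue a full barrier, then look
 * for waiters.  Returns true if a wakeup is needed.
 */
static bool
publish_and_check(uint64_t lsn)
{
	atomic_store_explicit(&written_upto, lsn, memory_order_relaxed);
	/* Without this fence the load below may be satisfied before the store
	 * above becomes visible to other CPUs. */
	atomic_thread_fence(memory_order_seq_cst);
	return lsn >= atomic_load_explicit(&min_waited_lsn, memory_order_relaxed);
}

/*
 * Waiter side: register the target, issue a full barrier, then recheck the
 * published position.  Returns true if the waiter still has to sleep.
 */
static bool
register_and_check(uint64_t target)
{
	atomic_store_explicit(&min_waited_lsn, target, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&written_upto, memory_order_relaxed) < target;
}
```

If either fence is dropped, both sides can read the stale value: the writer
sees no waiter, the waiter sees the old position, and the waiter sleeps with
nobody left to wake it.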

Separately from the memory ordering, how can it make sense that there are
at least 5 copies of this

                if (waitLSNState &&
                        (LogstreamResult.Flush >=
                         pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_FLUSH])))
                        WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_FLUSH, LogstreamResult.Flush);

around?  That needs to be encapsulated so that if you have a bug, like the
memory ordering problem I describe above, it can be fixed once, not in
multiple places.
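
As a rough sketch of what such encapsulation could look like, the following
standalone C11 example puts "publish the position, check for waiters, wake
them" behind a single function. Everything here is hypothetical: DemoWaitType,
demo_position, demo_min_waited, demo_wakeup() and demo_advance_and_wake() are
illustration-only names, not existing PostgreSQL symbols, and the sequentially
consistent atomics stand in for whatever barrier primitive would actually be
chosen.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical wait types, standing in for the WAIT_LSN_TYPE_* values. */
typedef enum
{
	DEMO_WAIT_WRITE,
	DEMO_WAIT_FLUSH,
	DEMO_WAIT_COUNT
} DemoWaitType;

/* Published position and smallest awaited LSN, one slot per wait type. */
static _Atomic uint64_t demo_position[DEMO_WAIT_COUNT];
static _Atomic uint64_t demo_min_waited[DEMO_WAIT_COUNT] = {
	UINT64_MAX, UINT64_MAX		/* assumed "no waiter" value */
};

/* Counts wakeups so the example can be checked; a real one would set latches. */
static int	demo_wakeups[DEMO_WAIT_COUNT];

static void
demo_wakeup(DemoWaitType type, uint64_t lsn)
{
	(void) lsn;
	demo_wakeups[type]++;
}

/*
 * The single place that knows the ordering rule.  atomic_store() and
 * atomic_load() default to memory_order_seq_cst, so the store cannot be
 * reordered after the load.
 */
static void
demo_advance_and_wake(DemoWaitType type, uint64_t lsn)
{
	atomic_store(&demo_position[type], lsn);
	if (lsn >= atomic_load(&demo_min_waited[type]))
		demo_wakeup(type, lsn);
}
```

Each call site would then shrink to one line, for example
demo_advance_and_wake(DEMO_WAIT_FLUSH, lsn), and an ordering fix would only
have to be made inside the helper.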

And why do these callers even have that pre-check?  Seems WaitLSNWakeup()
does so itself?

        /*
         * Fast path check.  Skip if currentLSN is InvalidXLogRecPtr, which means
         * "wake all waiters" (e.g., during promotion when recovery ends).
         */
        if (XLogRecPtrIsValid(currentLSN) &&
                pg_atomic_read_u64(&waitLSNState->minWaitedLSN[i]) > currentLSN)
                return;

And why is the code checking if waitLSNState is non-NULL?

Greetings,

Andres Freund

