On Tue, Mar 1, 2022 at 10:35 PM Nathan Bossart <nathandboss...@gmail.com> wrote: > > On Tue, Mar 01, 2022 at 04:34:31PM +0900, Kyotaro Horiguchi wrote: > > At Mon, 28 Feb 2022 22:05:28 -0800, Nathan Bossart > > <nathandboss...@gmail.com> wrote in > >> replicated LSN. TBH there are probably other things that need to be > >> considered (e.g., how do we ensure that the WAL sender sends the rest once > >> it is replicated?), but I still think we should avoid spinning in the WAL > >> sender waiting for WAL to be replicated. > > > > It seems to me it would be something similar to > > SyncRepReleaseWaiters(). Or it could be possible to consolidate this > > feature into the function, I'm not sure, though. > > Yes, perhaps the synchronous replication framework will need to alert WAL > senders when the syncrep LSN advances so that the WAL is sent to the async > standbys. I'm glossing over the details, but I think that should be the > general direction.
It's doable. But we can't avoid async walsenders waiting for the flush LSN even if we take the SyncRepReleaseWaiters() approach right? I'm not sure (at this moment) what's the biggest advantage of this approach i.e. (1) backends waking up walsenders after flush lsn is updated vs (2) walsenders keep looking for the new flush lsn. > >> My feedback is specifically about this behavior. I don't think we should > >> spin in XLogSend*() waiting for an LSN to be synchronously replicated. I > >> think we should just choose the SendRqstPtr based on what is currently > >> synchronously replicated. > > > > Do you mean something like the following? > > > > /* Main loop of walsender process that streams the WAL over Copy messages. > > */ > > static void > > WalSndLoop(WalSndSendDataCallback send_data) > > { > > /* > > * Loop until we reach the end of this timeline or the client requests > > to > > * stop streaming. > > */ > > for (;;) > > { > > if (am_async_walsender && there_are_sync_standbys) > > { > > XLogRecPtr SendRqstLSN; > > XLogRecPtr SyncFlushLSN; > > > > SendRqstLSN = GetFlushRecPtr(NULL); > > LWLockAcquire(SyncRepLock, LW_SHARED); > > SyncFlushLSN = walsndctl->lsn[SYNC_REP_WAIT_FLUSH]; > > LWLockRelease(SyncRepLock); > > > > if (SendRqstLSN > SyncFlushLSN) > > continue; > > } > > Not quite. Instead of "continue", I would set SendRqstLSN to SyncFlushLSN > so that the WAL sender only sends up to the current synchronously > replicated LSN. TBH there are probably other things that need to be > considered (e.g., how do we ensure that the WAL sender sends the rest once > it is replicated?), but I still think we should avoid spinning in the WAL > sender waiting for WAL to be replicated. I did some more analysis on the above point: we can let XLogSendPhysical know up to which it can send WAL (SendRqstLSN). But, XLogSendLogical reads the WAL using XLogReadRecord mechanism with read_local_xlog_page page_read callback to which we can't really say SendRqstLSN. May be we have to do something like below: XLogSendPhysical: /* Figure out how far we can safely send the WAL. */ if (am_async_walsender && there_are_sync_standbys) { LWLockAcquire(SyncRepLock, LW_SHARED); SendRqstPtr = WalSndCtl->lsn[SYNC_REP_WAIT_FLUSH]; LWLockRelease(SyncRepLock); } /* Existing code path to determine SendRqstPtr */ else if (sendTimeLineIsHistoric) { } else if (am_cascading_walsender) { } else { /* * Streaming the current timeline on a primary. } XLogSendLogical: if (am_async_walsender && there_are_sync_standbys) { XLogRecPtr SendRqstLSN; XLogRecPtr SyncFlushLSN; SendRqstLSN = GetFlushRecPtr(NULL); LWLockAcquire(SyncRepLock, LW_SHARED); SyncFlushLSN = WalSndCtl->lsn[SYNC_REP_WAIT_FLUSH]; LWLockRelease(SyncRepLock); if (SendRqstLSN > SyncFlushLSN) return; } On Tue, Mar 1, 2022 at 7:35 AM Hsu, John <hsuc...@amazon.com> wrote: > > I too observed this once or twice. It looks like the walsender isn't > > detecting postmaster death in for (;;) with WalSndWait. Not sure if > > this is expected or true with other wait-loops in walsender code. Any > > more thoughts here? > > Unfortunately I haven't had a chance to dig into it more although iirc I hit > it fairly often. I think I got what the issue is. Below does the trick. if (got_STOPPING) proc_exit(0); * If the server is shut down, checkpointer sends us * PROCSIG_WALSND_INIT_STOPPING after all regular backends have exited. I will take care of this in the next patch once the approach we take for this feature gets finalized. Regards, Bharath Rupireddy.