> > What's your PostgreSQL community username?

gordiychuk

> It seems like what you're also trying to allow interruption deeper than
> that, when we're in the middle of processing a reorder buffer commit record
> and streaming that to the client. You're introducing an is_active member
> (actually a callback, though name suggests it's a flag) in struct
> ReorderBuffer to check whether a CopyDone is received, and you're skipping
> ReorderBuffer commit processing when the callback returns false. The
> callback returns "!streamingDoneReceiving && !streamingDoneSending" i.e.
> it's false if either end has sent CopyDone. streamingDoneSending and
> streamingDoneSending are only set in ProcessRepliesIfAny, called by
> WalSndLoop and WalSndWaitForWal. So the idea is, presumably, that if we're
> waiting for WAL from XLogSendLogical we skip processing of any commit
> records and exit.
>
> That seems overcomplicated.
>
> When WalSndWaitForWAL is called
> by logical_read_xlog_page, logical_read_xlog_page can just test
> streamingDoneReceiving and streamingDoneSending. If they're set it can skip
> the page read and return -1, which will cause the xlogreader to return a
> null record to XLogSendLogical. That'll skip the decoding calls and return
> to WalSndLoop, where we'll notice it's time to exit.

ProcessRepliesIfAny is now also called from WalSndWriteData, because while
we are sending data we should also check for messages from the client (the
client can send CopyDone, KeepAlive or Terminate).

@@ -1086,14 +1089,6 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
 	memcpy(&ctx->out->data[1 + sizeof(int64) + sizeof(int64)],
 		   tmpbuf.data, sizeof(int64));
 
-	/* fast path */
-	/* Try to flush pending output to the client */
-	if (pq_flush_if_writable() != 0)
-		WalSndShutdown();
-
-	if (!pq_is_send_pending())
-		return;
-

The main idea is that we can receive CopyDone from the client in any of the
following functions: WalSndLoop, WalSndWaitForWal, WalSndWriteData.
All of these functions can take a long time. WalSndWaitForWal can wait for a
new transaction, which on an inactive database can take quite a while.
WalSndWriteData can send a big transaction, which also means that messages
from the client are ignored for a long time (in my example above, with 1
million object changes, the walsender ignored messages for 13 seconds and did
not allow the connection to be reused). When the client sends CopyDone, it
does not want to receive any more messages for the current LSN. For example,
physical replication can be interrupted in the middle of a transaction that
affects more than one LSN.

Maybe I do not understand the documentation correctly, but I want to reuse
the same connection without reopening it, because opening a new connection
takes too long. Is this a valid use case, or is CopyDone just a side effect
of the copy protocol, so that to finish replication one must always send a
Terminate packet and reopen the connection?
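
To make the behaviour I am asking for concrete, here is a rough standalone
sketch. It is not the walsender code and not the patch: every name in it
(SketchState, sketch_process_replies, sketch_is_active,
sketch_send_transaction) is hypothetical, and the arrival of CopyDone is
simulated with a counter instead of a socket read. It only illustrates the
idea of checking for client messages between changes of a large transaction
and stopping as soon as either side has sent CopyDone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the walsender's streaming flags. */
typedef struct SketchState
{
	bool	done_receiving;	/* client has sent CopyDone */
	bool	done_sending;	/* server has sent CopyDone */
	int		copydone_at;	/* simulated: CopyDone arrives once this many
							 * changes were sent; -1 means never */
	int		polls;			/* number of times we checked for messages */
} SketchState;

/* Simulated check for client messages (assumption: no real I/O here). */
static void
sketch_process_replies(SketchState *st, int changes_sent)
{
	st->polls++;
	if (st->copydone_at >= 0 && changes_sent >= st->copydone_at)
		st->done_receiving = true;
}

/* Same condition as quoted above: false if either end has sent CopyDone. */
static bool
sketch_is_active(const SketchState *st)
{
	return !st->done_receiving && !st->done_sending;
}

/*
 * Send up to total_changes changes of one transaction, polling the client
 * before each change. Returns the number of changes actually sent.
 */
static int
sketch_send_transaction(SketchState *st, int total_changes)
{
	int		sent = 0;

	assert(total_changes >= 0);
	while (sent < total_changes)
	{
		sketch_process_replies(st, sent);
		if (!sketch_is_active(st))
			break;				/* stop streaming, connection stays open */
		sent++;
	}
	return sent;
}
```

With copydone_at set to -1 the whole transaction is sent. With a
non-negative value the loop stops at the first check after CopyDone arrives,
instead of ignoring the client until the transaction has been fully sent.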