On 6 May 2016 at 23:23, Vladimir Gordiychuk <fol...@gmail.com> wrote:
> I prepare small patch that fix problems describe below. Now *WalSndWriteData
> *first check message from consumer and during decode transaction check
> that replication still active.
OK, upon looking closer I'm not sure I agree with this approach.
In WalSndLoop(...) we currently call ProcessRepliesIfAny() which handles a
client-initiated CopyDone by replying with its own CopyDone. WalSndLoop
then just waits until the send buffer is flushed or the client disconnects
and returns. That looks to be pretty much what's needed... but only when we
actually execute that code path.
XLogSendLogical calls WalSndWaitForWal(...) (via xlogreader and
logical_read_xlog_page) when waiting for something to process, when the
client will be blocked on a socket read. It also processes client CopyDone,
but it doesn't seem to test streamingDoneSending or streamingDoneReceiving
like the outer WalSndLoop does. So it looks like it continues waiting for
WAL when we know the client sent CopyDone and doesn't want anything more.
I think we should be testing that in WalSndWaitForWal, like we do in
WalSndLoop, and bailing out. You've done that part, and I agree that's good.
It seems like what you're also trying to allow interruption deeper than
that, when we're in the middle of processing a reorder buffer commit record
and streaming that to the client. You're introducing an is_active member
(actually a callback, though name suggests it's a flag) in struct
ReorderBuffer to check whether a CopyDone is received, and you're skipping
ReorderBuffer commit processing when the callback returns false. The
callback returns "!streamingDoneReceiving && !streamingDoneSending" i.e.
it's false if either end has sent CopyDone. streamingDoneSending and
streamingDoneSending are only set in ProcessRepliesIfAny, called by
WalSndLoop and WalSndWaitForWal. So the idea is, presumably, that if we're
waiting for WAL from XLogSendLogical we skip processing of any commit
records and exit.
That seems overcomplicated.
When WalSndWaitForWAL is called
by logical_read_xlog_page, logical_read_xlog_page can just test
streamingDoneReceiving and streamingDoneSending. If they're set it can skip
the page read and return -1, which will cause the xlogreader to return a
null record to XLogSendLogical. That'll skip the decoding calls and return
to WalSndLoop, where we'll notice it's time to exit.
That way there's no need to modify the reorder buffer structs, add
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services