Re: [HACKERS] Stopping logical replication protocol
On Fri, Nov 4, 2016 at 12:44 AM, Craig Ringerwrote: > On 21 October 2016 at 19:38, Vladimir Gordiychuk wrote: > > Craig, Andres what do you thinks about previous message? > > I haven't had a chance to look further to be honest. > > Since a downstream disconnect works, though it's ugly, it's not > something I can justify spending a lot of time on, and I already did > spend a lot on it in patch review/updating/testing etc. > > I don't know what Andres wants, but I think CopyFail with > ERRCODE_QUERY_CANCELED is fine. > > As for plugins that collect changes in memory and only send them on > commit, I'd call them "broken". Not an interesting case IMO. Don't do > that. > Currently the patch is marked as "returned with feedback" as there are some comments from reviewer. Please free to submit an update patch to the next commitfest. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] Stopping logical replication protocol
On 21 October 2016 at 19:38, Vladimir Gordiychukwrote: > Craig, Andres what do you thinks about previous message? I haven't had a chance to look further to be honest. Since a downstream disconnect works, though it's ugly, it's not something I can justify spending a lot of time on, and I already did spend a lot on it in patch review/updating/testing etc. I don't know what Andres wants, but I think CopyFail with ERRCODE_QUERY_CANCELED is fine. As for plugins that collect changes in memory and only send them on commit, I'd call them "broken". Not an interesting case IMO. Don't do that. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
Craig, Andres what do you thinks about previous message? 2016-10-06 0:35 GMT+03:00 Vladimir Gordiychuk: > > Vladimir? I'm running out of time to spend on this at least until the next > CF. Think you can make these changes? > > Yes, I can. But I thinks It should be discuss first. > > > Terminating COPY BOTH with ERRCODE_QUERY_CANCELED seems fine. If it > > was expecting the error the client can Sync and do whatever it next > > wants to do on the walsender protocol, or bail out nicely. > > Do you want return CopyFail with ERRCODE_QUERY_CANCELED instead of > CopyDone on client initialized stop replication? > > > Hm, I'm *VERY* doubtful about doing this via yet another callback. Why > > don't we just error out in the prepare write callback? > > But what about scenario when output plugin collect changes in memory with > some transformation and send it only inside commit_cb? > It means that OutputPluginPrepareWrite will not execute until end > transaction and we face with too long stops replication when decodes huge > transaction. > > 2016-10-05 13:06 GMT+03:00 Craig Ringer : > >> On 5 October 2016 at 05:14, Andres Freund wrote: >> > Hi, >> > >> > On 2016-09-23 13:01:27 +0800, Craig Ringer wrote: >> >> From f98f2388c57d938ebbe07ccd2dbe02138312858f Mon Sep 17 00:00:00 2001 >> >> From: Vladimir Gordiychuk >> >> Date: Wed, 7 Sep 2016 00:39:18 +0300 >> >> Subject: [PATCH 2/4] Client-initiated CopyDone during transaction >> decoding in >> >> walsender >> >> >> >> The prior patch caused the walsender to react to a client-initiated >> >> CopyDone while it's in the WalSndLoop. That's not enough to let clients >> >> end logical decoding mid-transaction because we loop in >> ReorderBufferCommit >> >> during decoding of a transaction without ever returning to WalSndLoop. >> >> >> >> Allow breaking out of ReorderBufferCommit by detecting that client >> >> input has requested an end to COPY BOTH mode, so clients can abort >> >> decoding mid-transaction. >> > >> > Hm, I'm *VERY* doubtful about doing this via yet another callback. Why >> > don't we just error out in the prepare write callback? >> >> That's sensible. >> >> > I don't think it's a good idea to return a non-error state if a command >> > is interrupted in the middle. We might already have sent a partial >> > transaction or such. Also compare this to interrupting a query - we >> > don't just returning rows, we error out. >> >> Good point. It's not usually visible to the user because if it's a >> client-initiated cancel the client will generally consume the error, >> but there's still an error generated. >> >> Terminating COPY BOTH with ERRCODE_QUERY_CANCELED seems fine. If it >> was expecting the error the client can Sync and do whatever it next >> wants to do on the walsender protocol, or bail out nicely. >> >> Vladimir? I'm running out of time to spend on this at least until the >> next CF. Think you can make these changes? >> >> >> -- >> Craig Ringer http://www.2ndQuadrant.com/ >> PostgreSQL Development, 24x7 Support, Training & Services >> > >
Re: [HACKERS] Stopping logical replication protocol
> Vladimir? I'm running out of time to spend on this at least until the next CF. Think you can make these changes? Yes, I can. But I thinks It should be discuss first. > Terminating COPY BOTH with ERRCODE_QUERY_CANCELED seems fine. If it > was expecting the error the client can Sync and do whatever it next > wants to do on the walsender protocol, or bail out nicely. Do you want return CopyFail with ERRCODE_QUERY_CANCELED instead of CopyDone on client initialized stop replication? > Hm, I'm *VERY* doubtful about doing this via yet another callback. Why > don't we just error out in the prepare write callback? But what about scenario when output plugin collect changes in memory with some transformation and send it only inside commit_cb? It means that OutputPluginPrepareWrite will not execute until end transaction and we face with too long stops replication when decodes huge transaction. 2016-10-05 13:06 GMT+03:00 Craig Ringer: > On 5 October 2016 at 05:14, Andres Freund wrote: > > Hi, > > > > On 2016-09-23 13:01:27 +0800, Craig Ringer wrote: > >> From f98f2388c57d938ebbe07ccd2dbe02138312858f Mon Sep 17 00:00:00 2001 > >> From: Vladimir Gordiychuk > >> Date: Wed, 7 Sep 2016 00:39:18 +0300 > >> Subject: [PATCH 2/4] Client-initiated CopyDone during transaction > decoding in > >> walsender > >> > >> The prior patch caused the walsender to react to a client-initiated > >> CopyDone while it's in the WalSndLoop. That's not enough to let clients > >> end logical decoding mid-transaction because we loop in > ReorderBufferCommit > >> during decoding of a transaction without ever returning to WalSndLoop. > >> > >> Allow breaking out of ReorderBufferCommit by detecting that client > >> input has requested an end to COPY BOTH mode, so clients can abort > >> decoding mid-transaction. > > > > Hm, I'm *VERY* doubtful about doing this via yet another callback. Why > > don't we just error out in the prepare write callback? > > That's sensible. > > > I don't think it's a good idea to return a non-error state if a command > > is interrupted in the middle. We might already have sent a partial > > transaction or such. Also compare this to interrupting a query - we > > don't just returning rows, we error out. > > Good point. It's not usually visible to the user because if it's a > client-initiated cancel the client will generally consume the error, > but there's still an error generated. > > Terminating COPY BOTH with ERRCODE_QUERY_CANCELED seems fine. If it > was expecting the error the client can Sync and do whatever it next > wants to do on the walsender protocol, or bail out nicely. > > Vladimir? I'm running out of time to spend on this at least until the > next CF. Think you can make these changes? > > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Stopping logical replication protocol
On 5 October 2016 at 05:14, Andres Freundwrote: > Hi, > > On 2016-09-23 13:01:27 +0800, Craig Ringer wrote: >> From f98f2388c57d938ebbe07ccd2dbe02138312858f Mon Sep 17 00:00:00 2001 >> From: Vladimir Gordiychuk >> Date: Wed, 7 Sep 2016 00:39:18 +0300 >> Subject: [PATCH 2/4] Client-initiated CopyDone during transaction decoding in >> walsender >> >> The prior patch caused the walsender to react to a client-initiated >> CopyDone while it's in the WalSndLoop. That's not enough to let clients >> end logical decoding mid-transaction because we loop in ReorderBufferCommit >> during decoding of a transaction without ever returning to WalSndLoop. >> >> Allow breaking out of ReorderBufferCommit by detecting that client >> input has requested an end to COPY BOTH mode, so clients can abort >> decoding mid-transaction. > > Hm, I'm *VERY* doubtful about doing this via yet another callback. Why > don't we just error out in the prepare write callback? That's sensible. > I don't think it's a good idea to return a non-error state if a command > is interrupted in the middle. We might already have sent a partial > transaction or such. Also compare this to interrupting a query - we > don't just returning rows, we error out. Good point. It's not usually visible to the user because if it's a client-initiated cancel the client will generally consume the error, but there's still an error generated. Terminating COPY BOTH with ERRCODE_QUERY_CANCELED seems fine. If it was expecting the error the client can Sync and do whatever it next wants to do on the walsender protocol, or bail out nicely. Vladimir? I'm running out of time to spend on this at least until the next CF. Think you can make these changes? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
Hi, On 2016-09-23 13:01:27 +0800, Craig Ringer wrote: > From f98f2388c57d938ebbe07ccd2dbe02138312858f Mon Sep 17 00:00:00 2001 > From: Vladimir Gordiychuk> Date: Wed, 7 Sep 2016 00:39:18 +0300 > Subject: [PATCH 2/4] Client-initiated CopyDone during transaction decoding in > walsender > > The prior patch caused the walsender to react to a client-initiated > CopyDone while it's in the WalSndLoop. That's not enough to let clients > end logical decoding mid-transaction because we loop in ReorderBufferCommit > during decoding of a transaction without ever returning to WalSndLoop. > > Allow breaking out of ReorderBufferCommit by detecting that client > input has requested an end to COPY BOTH mode, so clients can abort > decoding mid-transaction. Hm, I'm *VERY* doubtful about doing this via yet another callback. Why don't we just error out in the prepare write callback? I don't think it's a good idea to return a non-error state if a command is interrupted in the middle. We might already have sent a partial transaction or such. Also compare this to interrupting a query - we don't just returning rows, we error out. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
> Vladimir, can I have your opinion on the latest patch or if you want to make changes, an updated patch? I am satisfied with all our changes and I thinks it enough to complete this PR. 2016-10-03 6:51 GMT+03:00 Craig Ringer: > On 3 Oct. 2016 10:15, "Michael Paquier" wrote: > > > > On Tue, Sep 27, 2016 at 11:05 AM, Craig Ringer > wrote: > > > On 26 September 2016 at 21:52, Vladimir Gordiychuk > wrote: > > >>>You should rely on time I tests as little as possible. Some of the > test > > >>> hosts are VERY slow. If possible something deterministic should be > used. > > >> > > >> That's why this changes difficult to verify. Maybe in that case we > should > > >> write benchmark but not integration test? > > >> In that case we can say, before this changes stopping logical > replication > > >> gets N ms but after apply changes it gets NN ms where NN ms less than > N ms. > > >> Is it available add this kind of benchmark to postgresql? I will be > grateful > > >> for links. > > > > > > It's for that reason that I added a message printed only in verbose > > > mode that pg_recvlogical emits when it's exiting after a > > > client-initiated copydone. > > > > > > You can use the TAP tests, print diag messages, etc. But we generally > > > want them to run fairly quickly, so big benchmark runs aren't > > > desirable. You'll notice that I left diag messages in to report the > > > timing for the results in your tests, I just changed the tests so they > > > didn't depend on very tight timing for success/failure anymore. > > > > > > We don't currently have any automated benchmarking infrastructure. > > > > Which seems like this patch is not complete yet. > > Personally I think it is. I'm just explaining why I adjusted Vladimir's > tests to be less timing sensitive and rely on a qualitative property > instead. > > That said, it's had recent change and it isn't a big intrusive change that > really need attention this cf. > > > I am marking it as > > returned with feedback, but it may be a better idea to move it to next > > CF once a new version with updated tests shows up. > > I'll move it now. I think the tests are fine, if Vladimir agrees, so IMO > it's ready to go. More eyes wouldn't hurt though. > > If Vladimir wants benchmarking based tests that's a whole separate project > IMO. Something that will work robustly on the weird slow machines we have > isn't simple. Probably a new buildfarm option etc since we won't want to > clutter the really slow old niche boxes with it. > > Vladimir, can I have your opinion on the latest patch or if you want to > make changes, an updated patch? > > >
Re: [HACKERS] Stopping logical replication protocol
On 3 Oct. 2016 10:15, "Michael Paquier"wrote: > > On Tue, Sep 27, 2016 at 11:05 AM, Craig Ringer wrote: > > On 26 September 2016 at 21:52, Vladimir Gordiychuk wrote: > >>>You should rely on time I tests as little as possible. Some of the test > >>> hosts are VERY slow. If possible something deterministic should be used. > >> > >> That's why this changes difficult to verify. Maybe in that case we should > >> write benchmark but not integration test? > >> In that case we can say, before this changes stopping logical replication > >> gets N ms but after apply changes it gets NN ms where NN ms less than N ms. > >> Is it available add this kind of benchmark to postgresql? I will be grateful > >> for links. > > > > It's for that reason that I added a message printed only in verbose > > mode that pg_recvlogical emits when it's exiting after a > > client-initiated copydone. > > > > You can use the TAP tests, print diag messages, etc. But we generally > > want them to run fairly quickly, so big benchmark runs aren't > > desirable. You'll notice that I left diag messages in to report the > > timing for the results in your tests, I just changed the tests so they > > didn't depend on very tight timing for success/failure anymore. > > > > We don't currently have any automated benchmarking infrastructure. > > Which seems like this patch is not complete yet. Personally I think it is. I'm just explaining why I adjusted Vladimir's tests to be less timing sensitive and rely on a qualitative property instead. That said, it's had recent change and it isn't a big intrusive change that really need attention this cf. > I am marking it as > returned with feedback, but it may be a better idea to move it to next > CF once a new version with updated tests shows up. I'll move it now. I think the tests are fine, if Vladimir agrees, so IMO it's ready to go. More eyes wouldn't hurt though. If Vladimir wants benchmarking based tests that's a whole separate project IMO. Something that will work robustly on the weird slow machines we have isn't simple. Probably a new buildfarm option etc since we won't want to clutter the really slow old niche boxes with it. Vladimir, can I have your opinion on the latest patch or if you want to make changes, an updated patch?
Re: [HACKERS] Stopping logical replication protocol
On Tue, Sep 27, 2016 at 11:05 AM, Craig Ringerwrote: > On 26 September 2016 at 21:52, Vladimir Gordiychuk wrote: >>>You should rely on time I tests as little as possible. Some of the test >>> hosts are VERY slow. If possible something deterministic should be used. >> >> That's why this changes difficult to verify. Maybe in that case we should >> write benchmark but not integration test? >> In that case we can say, before this changes stopping logical replication >> gets N ms but after apply changes it gets NN ms where NN ms less than N ms. >> Is it available add this kind of benchmark to postgresql? I will be grateful >> for links. > > It's for that reason that I added a message printed only in verbose > mode that pg_recvlogical emits when it's exiting after a > client-initiated copydone. > > You can use the TAP tests, print diag messages, etc. But we generally > want them to run fairly quickly, so big benchmark runs aren't > desirable. You'll notice that I left diag messages in to report the > timing for the results in your tests, I just changed the tests so they > didn't depend on very tight timing for success/failure anymore. > > We don't currently have any automated benchmarking infrastructure. Which seems like this patch is not complete yet. I am marking it as returned with feedback, but it may be a better idea to move it to next CF once a new version with updated tests shows up. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
On 26 September 2016 at 21:52, Vladimir Gordiychukwrote: >>You should rely on time I tests as little as possible. Some of the test >> hosts are VERY slow. If possible something deterministic should be used. > > That's why this changes difficult to verify. Maybe in that case we should > write benchmark but not integration test? > In that case we can say, before this changes stopping logical replication > gets N ms but after apply changes it gets NN ms where NN ms less than N ms. > Is it available add this kind of benchmark to postgresql? I will be grateful > for links. It's for that reason that I added a message printed only in verbose mode that pg_recvlogical emits when it's exiting after a client-initiated copydone. You can use the TAP tests, print diag messages, etc. But we generally want them to run fairly quickly, so big benchmark runs aren't desirable. You'll notice that I left diag messages in to report the timing for the results in your tests, I just changed the tests so they didn't depend on very tight timing for success/failure anymore. We don't currently have any automated benchmarking infrastructure. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
>You should rely on time I tests as little as possible. Some of the test hosts are VERY slow. If possible something deterministic should be used. That's why this changes difficult to verify. Maybe in that case we should write benchmark but not integration test? In that case we can say, before this changes stopping logical replication gets N ms but after apply changes it gets NN ms where NN ms less than N ms. Is it available add this kind of benchmark to postgresql? I will be grateful for links. 2016-09-26 5:32 GMT+03:00 Craig Ringer: > On 26 Sep. 2016 02:16, "Vladimir Gordiychuk" wrote: > > > > >It looks like you didn't import my updated patches, so I've > rebased your new patches on top of them. > > Yes, i forgot it, sorry. Thanks for you fixes. > > > > >I did't see you explain why this was removed: > > > > -/* fast path */ > > -/* Try to flush pending output to the client */ > > -if (pq_flush_if_writable() != 0) > > -WalSndShutdown(); > > - > > -if (!pq_is_send_pending()) > > -return; > > > > I remove it, because during decode long transaction code, we always > exist from function by fast path and not receive messages from client. Now > we always include in 'for' loop and executes > > /* Check for input from the client */ ProcessRepliesIfAny(); > > Makes sense. > I > > >Some of the changes to pg_recvlogical look to be copied from > > >receivelog.c, most notably the functions ProcessKeepalive and > > >ProcessXLogData . > > > > During refactoring huge function I also notice It, but decide not > include additional changes in already huge patch. I only split method that > do everything to few small functions. > > Good decision. This improves things already and makes future refactoring > easier. > > > >I was evaluating the tests and I don't think they actually demonstrate > > >that the patch works. There's nothing done to check that > > >pg_recvlogical exited because of client-initiated CopyDone. While > > >looking into that I found that it also never actually hits > > >ProcessCopyDone(...) because libpq handles the CopyDone reply from the > > >server its self and treats it as end-of-stream. So the loop in > > >StreamLogicalLog will just end and ProcessCopyDone() is dead code. > > > > The main idea was about fast stop logical replication as it available, > because if stop replication gets more them 1 seconds it takes more pain. > > You should rely on time I tests as little as possible. Some of the test > hosts are VERY slow. If possible something deterministic should be used. > > > Increase this timeout to 3 or 5 second hide problems that this PR try > fix, that's why I think this lines should be restore to state from previous > patch. > > Per above I don't agree. > > > > > ``` > > ok($spendTime <= 5, # allow extra time for slow machines > > "pg_recvlogical exited promptly on signal when decoding"); > > > > ok((time() - $cancelTime) <= 3, # allow extra time for slow machines > > "pg_recvlogical exited promptly on sigint when idle" > > ); > > > > ``` > > > > Also I do not understand why we do > > > > $proc->pump while $proc->pumpable; > > > > after send signal to process, why we can not just wait stop replication > slot? > > Because verbose output can produce a lot of writes. If we don't pump the > buffer pg_recvlogical could block writing on its socket. Also it lets us > capture stderr which we need to observe that pg_recvlogical actually ended > copy on user command rather than just receiving all the input available. >
Re: [HACKERS] Stopping logical replication protocol
On 26 Sep. 2016 02:16, "Vladimir Gordiychuk"wrote: > > >It looks like you didn't import my updated patches, so I've rebased your new patches on top of them. > Yes, i forgot it, sorry. Thanks for you fixes. > > >I did't see you explain why this was removed: > > -/* fast path */ > -/* Try to flush pending output to the client */ > -if (pq_flush_if_writable() != 0) > -WalSndShutdown(); > - > -if (!pq_is_send_pending()) > -return; > > I remove it, because during decode long transaction code, we always exist from function by fast path and not receive messages from client. Now we always include in 'for' loop and executes > /* Check for input from the client */ ProcessRepliesIfAny(); Makes sense. I > >Some of the changes to pg_recvlogical look to be copied from > >receivelog.c, most notably the functions ProcessKeepalive and > >ProcessXLogData . > > During refactoring huge function I also notice It, but decide not include additional changes in already huge patch. I only split method that do everything to few small functions. Good decision. This improves things already and makes future refactoring easier. > >I was evaluating the tests and I don't think they actually demonstrate > >that the patch works. There's nothing done to check that > >pg_recvlogical exited because of client-initiated CopyDone. While > >looking into that I found that it also never actually hits > >ProcessCopyDone(...) because libpq handles the CopyDone reply from the > >server its self and treats it as end-of-stream. So the loop in > >StreamLogicalLog will just end and ProcessCopyDone() is dead code. > > The main idea was about fast stop logical replication as it available, because if stop replication gets more them 1 seconds it takes more pain. You should rely on time I tests as little as possible. Some of the test hosts are VERY slow. If possible something deterministic should be used. > Increase this timeout to 3 or 5 second hide problems that this PR try fix, that's why I think this lines should be restore to state from previous patch. Per above I don't agree. > > ``` > ok($spendTime <= 5, # allow extra time for slow machines > "pg_recvlogical exited promptly on signal when decoding"); > > ok((time() - $cancelTime) <= 3, # allow extra time for slow machines > "pg_recvlogical exited promptly on sigint when idle" > ); > > ``` > > Also I do not understand why we do > > $proc->pump while $proc->pumpable; > > after send signal to process, why we can not just wait stop replication slot? Because verbose output can produce a lot of writes. If we don't pump the buffer pg_recvlogical could block writing on its socket. Also it lets us capture stderr which we need to observe that pg_recvlogical actually ended copy on user command rather than just receiving all the input available.
Re: [HACKERS] Stopping logical replication protocol
>It looks like you didn't import my updated patches, so I've rebased your new patches on top of them. Yes, i forgot it, sorry. Thanks for you fixes. >I did't see you explain why this was removed: -/* fast path */ -/* Try to flush pending output to the client */ -if (pq_flush_if_writable() != 0) -WalSndShutdown(); - -if (!pq_is_send_pending()) -return; I remove it, because during decode long transaction code, we always exist from function by fast path and not receive messages from client. Now we always include in 'for' loop and executes /* Check for input from the client */ ProcessRepliesIfAny(); >Some of the changes to pg_recvlogical look to be copied from >receivelog.c, most notably the functions ProcessKeepalive and >ProcessXLogData . During refactoring huge function I also notice It, but decide not include additional changes in already huge patch. I only split method that do everything to few small functions. >I was evaluating the tests and I don't think they actually demonstrate >that the patch works. There's nothing done to check that >pg_recvlogical exited because of client-initiated CopyDone. While >looking into that I found that it also never actually hits >ProcessCopyDone(...) because libpq handles the CopyDone reply from the >server its self and treats it as end-of-stream. So the loop in >StreamLogicalLog will just end and ProcessCopyDone() is dead code. The main idea was about fast stop logical replication as it available, because if stop replication gets more them 1 seconds it takes more pain. That's why my tests not contained check pg_reclogincal output, only time spend on stop replication(CopyDone exchange). Increase this timeout to 3 or 5 second hide problems that this PR try fix, that's why I think this lines should be restore to state from previous patch. ``` ok($spendTime <= 5, # allow extra time for slow machines "pg_recvlogical exited promptly on signal when decoding"); ok((time() - $cancelTime) <= 3, # allow extra time for slow machines "pg_recvlogical exited promptly on sigint when idle" ); ``` Also I do not understand why we do $proc->pump while $proc->pumpable; after send signal to process, why we can not just wait stop replication slot? 2016-09-23 8:01 GMT+03:00 Craig Ringer: > On 19 September 2016 at 07:12, Vladimir Gordiychuk > wrote: > > New patch in attach. 0001 and 0002 without changes. > > 0003 - patch contain improvements for pg_recvloginca, now pg_recvloginca > > after receive SIGINT will send CopyDone package to postgresql and wait > from > > server CopyDone. For backward compatible after repeat send SIGINT > > pg_recvloginca will continue immediately without wait CopyDone from > server. > > 0004 patch contain regression tests on scenarios that fix 0001 and 0002 > > patches. > > Great. > > Thanks for this. > > > First observation (which I'll fix in updated patch): > > > > It looks like you didn't import my updated patches, so I've rebased > your new patches on top of them. > > Whitespace issues: > > $ git am ~/Downloads/0003-Add-ability-for-pg_recvlogical-safe-stop- > replication.patch > Applying: Add ability for pg_recvlogical safe stop replication > /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:140: indent > with spaces. >msgBuf + hdr_len + bytes_written, > /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:550: indent > with spaces. > /* Backward compatible, allow force interrupt logical replication > /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:551: indent > with spaces. > * after second SIGINT without wait CopyDone from server > /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:552: indent > with spaces. > */ > warning: 4 lines add whitespace errors. > > > Remember to use "git log --check" before sending patches. > > Also, comment style, please do > > /* this */ > > or > > /* > * this > */ > > not > > /* this > */ > > > > > I did't see you explain why this was removed: > > -/* fast path */ > -/* Try to flush pending output to the client */ > -if (pq_flush_if_writable() != 0) > -WalSndShutdown(); > - > -if (!pq_is_send_pending()) > -return; > > > > I fixed a warning introduced here: > > > pg_recvlogical.c: In function ‘ProcessXLogData’: > pg_recvlogical.c:289:2: warning: ISO C90 forbids mixed declarations > and code [-Wdeclaration-after-statement] > int bytes_left = msgLength - hdr_len; > ^ > > > Some of the changes to pg_recvlogical look to be copied from > receivelog.c, most notably the functions ProcessKeepalive and > ProcessXLogData . I thought that rather than copying them, shouldn't > the existing ones be modified to meet your needs? But it looks like > the issue is that a fair chunk of the rest of pg_recvlogical doesn't > re-use code from receivelog.c either; for example, pg_recvlogical's > sendFeedback differs from receivelog.c's sendFeedback. So >
Re: [HACKERS] Stopping logical replication protocol
On 19 September 2016 at 07:12, Vladimir Gordiychukwrote: > New patch in attach. 0001 and 0002 without changes. > 0003 - patch contain improvements for pg_recvloginca, now pg_recvloginca > after receive SIGINT will send CopyDone package to postgresql and wait from > server CopyDone. For backward compatible after repeat send SIGINT > pg_recvloginca will continue immediately without wait CopyDone from server. > 0004 patch contain regression tests on scenarios that fix 0001 and 0002 > patches. Great. Thanks for this. First observation (which I'll fix in updated patch): It looks like you didn't import my updated patches, so I've rebased your new patches on top of them. Whitespace issues: $ git am ~/Downloads/0003-Add-ability-for-pg_recvlogical-safe-stop-replication.patch Applying: Add ability for pg_recvlogical safe stop replication /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:140: indent with spaces. msgBuf + hdr_len + bytes_written, /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:550: indent with spaces. /* Backward compatible, allow force interrupt logical replication /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:551: indent with spaces. * after second SIGINT without wait CopyDone from server /home/craig/projects/2Q/postgres/.git/rebase-apply/patch:552: indent with spaces. */ warning: 4 lines add whitespace errors. Remember to use "git log --check" before sending patches. Also, comment style, please do /* this */ or /* * this */ not /* this */ I did't see you explain why this was removed: -/* fast path */ -/* Try to flush pending output to the client */ -if (pq_flush_if_writable() != 0) -WalSndShutdown(); - -if (!pq_is_send_pending()) -return; I fixed a warning introduced here: pg_recvlogical.c: In function ‘ProcessXLogData’: pg_recvlogical.c:289:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] int bytes_left = msgLength - hdr_len; ^ Some of the changes to pg_recvlogical look to be copied from receivelog.c, most notably the functions ProcessKeepalive and ProcessXLogData . I thought that rather than copying them, shouldn't the existing ones be modified to meet your needs? But it looks like the issue is that a fair chunk of the rest of pg_recvlogical doesn't re-use code from receivelog.c either; for example, pg_recvlogical's sendFeedback differs from receivelog.c's sendFeedback. So pg_recvlogical doesn't share the globals that receivelog.c assumes are used. Also, what I thought were copied from receivelog.c were actually extracted from code previously inline in StreamLogicalLog(...) in pg_recvlogical.c . I'm reluctant to try to untangle this in the same patch for risk that it'll get stalled by issues with refactoring. The change you've already made is a useful cleanup without messing with unnecessary code. Your patch 3 does something ... odd: src/test/recovery/t/001_stream_rep.pl| 63 -- src/test/recovery/t/002_archiving.pl | 53 --- src/test/recovery/t/003_recovery_targets.pl | 146 --- src/test/recovery/t/004_timeline_switch.pl | 75 -- src/test/recovery/t/005_replay_delay.pl | 69 src/test/recovery/t/006_logical_decoding.pl | 40 -- src/test/recovery/t/007_sync_rep.pl | 174 src/test/recovery/t/previous/001_stream_rep.pl | 63 ++ src/test/recovery/t/previous/002_archiving.pl| 53 +++ src/test/recovery/t/previous/003_recovery_targets.pl | 146 +++ src/test/recovery/t/previous/004_timeline_switch.pl | 75 ++ src/test/recovery/t/previous/005_replay_delay.pl | 69 src/test/recovery/t/previous/006_logical_decoding.pl | 40 ++ src/test/recovery/t/previous/007_sync_rep.pl | 174 so I've revised it to remove that whole change, which I think you probably did not mean to commit. Reworded commit message. Ensured that we send feedback just before a client-initiated CopyDone so server knows what position we saved up to. We don't discard already buffered data I've modified patch 3 so that it also flushes to disk and sends feedback before sending CopyDone, then discards any xlog data received after it sends CopyDone. This is helpful in ensuring that we get predictable behaviour when cancelling pg_recvxlog and restarting it because the client and server agree on the point at which replay stopped. I was evaluating the tests and I don't think they actually demonstrate that the patch works. There's nothing done to
Re: [HACKERS] Stopping logical replication protocol
New patch in attach. 0001 and 0002 without changes. 0003 - patch contain improvements for pg_recvloginca, now pg_recvloginca after receive SIGINT will send CopyDone package to postgresql and wait from server CopyDone. For backward compatible after repeat send SIGINT pg_recvloginca will continue immediately without wait CopyDone from server. 0004 patch contain regression tests on scenarios that fix 0001 and 0002 patches. 2016-09-16 10:11 GMT+03:00 Vladimir Gordiychuk: > >Have you had a chance to look at adding the tests we discussed? > > Not yet. I plane do it on this weekends > > 2016-09-16 4:37 GMT+03:00 Craig Ringer : > >> On 9 September 2016 at 12:03, Craig Ringer wrote: >> >> > Setting "waiting on author" in CF per discussion of the need for tests. >> >> Have you had a chance to look at adding the tests we discussed? >> >> -- >> Craig Ringer http://www.2ndQuadrant.com/ >> PostgreSQL Development, 24x7 Support, Training & Services >> > > From 97da2c500f8f8edc69bcd520096c686e99e52612 Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Tue, 10 May 2016 10:34:10 +0800 Subject: [PATCH 1/4] Respect client-initiated CopyDone in walsender The walsender never looked for CopyDone sent by the client unless it had already decided it was done sending data and dispatched its own CopyDone message. Check for client-initiated CopyDone when in COPY BOTH mode, returning to command mode. In logical decoding this will allow the client to end a logical decoding session between transactions without just unilaterally closing its connection. This change does not allow a client to end COPY BOTH session in the middle of processing a logical decoding block. TODO effect on physical walsender? --- src/backend/replication/walsender.c | 32 +++- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 1ea2a5c..c43310c 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -759,6 +759,13 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req /* make sure we have enough WAL available */ flushptr = WalSndWaitForWal(targetPagePtr + reqLen); + /* + * If the client sent CopyDone while we were waiting, + * bail out so we can wind up the decoding session. + */ + if (streamingDoneSending) + return -1; + /* more than one block available */ if (targetPagePtr + XLOG_BLCKSZ <= flushptr) count = XLOG_BLCKSZ; @@ -1220,8 +1227,11 @@ WalSndWaitForWal(XLogRecPtr loc) * It's important to do this check after the recomputation of * RecentFlushPtr, so we can send all remaining data before shutting * down. + * + * We'll also exit here if the client sent CopyDone because it wants + * to return to command mode. */ - if (walsender_ready_to_stop) + if (walsender_ready_to_stop || streamingDoneReceiving) break; /* @@ -1787,7 +1797,14 @@ WalSndCheckTimeOut(TimestampTz now) } } -/* Main loop of walsender process that streams the WAL over Copy messages. */ +/* + * Main loop of walsender process that streams the WAL over Copy messages. + * + * The send_data callback must enqueue complete CopyData messages to libpq + * using pq_putmessage_noblock or similar, since the walsender loop may send + * CopyDone then exit and return to command mode in response to a client + * CopyDone between calls to send_data. + */ static void WalSndLoop(WalSndSendDataCallback send_data) { @@ -1850,10 +1867,15 @@ WalSndLoop(WalSndSendDataCallback send_data) * some more. If there is some, we don't bother to call send_data * again until we've flushed it ... but we'd better assume we are not * caught up. + * + * If we're trying to finish sending and exit we shouldn't enqueue more + * data to libpq. We need to finish writing out whatever we already + * have in libpq's send buffer to maintain protocol sync so we still + * need to loop until it's flushed. */ - if (!pq_is_send_pending()) + if (!pq_is_send_pending() && !streamingDoneSending) send_data(); - else + else if (!streamingDoneSending) WalSndCaughtUp = false; /* Try to flush pending output to the client */ @@ -2909,7 +2931,7 @@ WalSndKeepaliveIfNecessary(TimestampTz now) if (wal_sender_timeout <= 0 || last_reply_timestamp <= 0) return; - if (waiting_for_ping_response) + if (waiting_for_ping_response || streamingDoneSending) return; /* -- 1.9.1 From fcdcfe1aedfb3c7ef90c78c7d8acb4ca99a2ffdc Mon Sep 17 00:00:00 2001 From: Vladimir Gordiychuk Date: Wed, 7 Sep 2016 00:39:18 +0300 Subject: [PATCH 2/4] Client-initiated CopyDone during transaction decondig in walsender The walsender never looked for client messages during decode transaction by ReorderBufferCommit. It affect long transaction, even if client try
Re: [HACKERS] Stopping logical replication protocol
>Have you had a chance to look at adding the tests we discussed? Not yet. I plane do it on this weekends 2016-09-16 4:37 GMT+03:00 Craig Ringer: > On 9 September 2016 at 12:03, Craig Ringer wrote: > > > Setting "waiting on author" in CF per discussion of the need for tests. > > Have you had a chance to look at adding the tests we discussed? > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Stopping logical replication protocol
On 9 September 2016 at 12:03, Craig Ringerwrote: > Setting "waiting on author" in CF per discussion of the need for tests. Have you had a chance to look at adding the tests we discussed? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
On 9 September 2016 at 10:37, Craig Ringerwrote: > I'm looking at the revised patch now. Considerably improved. I fixed a typo from decondig to decoding that's throughout all the callback names. This test is wrong: +while ((change = ReorderBufferIterTXNNext(rb, iterstate)) != NULL && rb->continue_decoding_cb != NULL && rb->continue_decoding_cb()) If continue_decoding_cb is non-null, it'll never be called since the and will short-circuit. If it's null the and will be false so we'll call rb->continue_decoding_cb(). It also violates PostgreSQL source formatting coding conventions; it should be wrapped to 80 lines. (Yes, that's archaic, but it's kind of useful when you've got multiple files open side-by-side, and at least it's consistent across the codebase). so it should be: while ((change = ReorderBufferIterTXNNext(rb, iterstate)) != NULL && (rb->continue_decoding_cb == NULL || rb->continue_decoding_cb())) I'd prefer the continue_decoding_cb tests to be on the same line, but it's slightly too wide. Eh, whatever. Same logic issue later; - if(rb->continue_decoding_cb != NULL && rb->continue_decoding_cb()) +if (rb->continue_decoding_cb == NULL || rb->continue_decoding_cb()) Commit message is slightly inaccurate. The walsender DID look for client messages during ReorderBufferCommit decoding, but it never reacted to CopyDone sent by a client when it saw it. Rewrote comment on IsStreamingActive() per my original review advice. I'm not sure why exactly you removed -/* fast path */ -/* Try to flush pending output to the client */ -if (pq_flush_if_writable() != 0) -WalSndShutdown(); - -if (!pq_is_send_pending()) -return; - It should be OK if we flush pending output even after we receive CopyDone. In fact, we have to, since we can't unqueue it and have to send it before we can send our own CopyDone reply. pq_flush_if_writable() should only return EOF if the socket is closed, in which case fast bailout is the right thing to do. Can you explain your thinking here and what the intended outcome is? I've attached updated patches with a number of typo fixes, copy-editing changes, a fix to the test logic, etc as noted above. Setting "waiting on author" in CF per discussion of the need for tests. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 14bbd25cf27c3555126b571ee7cb41524f2a8729 Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Tue, 10 May 2016 10:34:10 +0800 Subject: [PATCH 1/2] Respect client-initiated CopyDone in walsender The walsender never reacted to CopyDone sent by the client unless it had already decided it was done sending data and dispatched its own CopyDone message. It actually noticed CopyDone from the client when WalSndWriteData() called ProcessRepliesIfAny() but it didn't react to it until it separately decided to end streaming from the walsender end. Modify the walsender so it checks for client-initiated CopyDone when in COPY BOTH mode. It now cleans up what it's doing, reples with its own CopyDone and returns to command mode. In logical decoding this will allow the client to end a logical decoding session between transactions without just unilaterally closing its connection. For physical walsender connections this allows the client to end streaming before the end of a timeline. This change does not allow a client to end COPY BOTH session in the middle of processing a logical decoding commit (in ReorderBufferCommit) or while decoding a large WAL record, so there can still be a significant delay before the walsender reacts to the client. --- src/backend/replication/walsender.c | 23 +++ 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 1ea2a5c..ad94c13 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -759,6 +759,13 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req /* make sure we have enough WAL available */ flushptr = WalSndWaitForWal(targetPagePtr + reqLen); + /* + * If the client sent CopyDone while we were waiting, + * bail out so we can wind up the decoding session. + */ + if (streamingDoneSending) + return -1; + /* more than one block available */ if (targetPagePtr + XLOG_BLCKSZ <= flushptr) count = XLOG_BLCKSZ; @@ -1220,8 +1227,11 @@ WalSndWaitForWal(XLogRecPtr loc) * It's important to do this check after the recomputation of * RecentFlushPtr, so we can send all remaining data before shutting * down. + * + * We'll also exit here if the client sent CopyDone because it wants + * to return to command mode. */ - if (walsender_ready_to_stop) + if (walsender_ready_to_stop || streamingDoneReceiving) break; /* @@
Re: [HACKERS] Stopping logical replication protocol
On Fri, Sep 9, 2016 at 11:37 AM, Craig Ringerwrote: > When writing TAP tests for Perl you must ensure you use only Perl > 5.8.8-compatible code and only the core modules plus IPC::Run . You'll > usually be fine if you just avoid importing things you don't see other > modules already using. Don't bother actually installing Perl 5.8.8, we > can run tests in an ancient-perl Docker container before committing > and fix any issues. I don't see any reason you'd need anything special > for this test. On top of that, the perl documentation is your friend if you have a doubt about the versioning of a method: http://perldoc.perl.org/ The important thing is that it is easy to navigate through perl versions down to 5.8.8 so you can easily detect if what you are using would be fit or not for the buildfarm. > If you think some of the code you add would be very reusable for other > test modules in future, feel free to propose new methods for > PostgresNode.pm . Ask if you're not sure. That would be great if things generic enough could be extracted, but I haven't read the patch to make an opinion regarding what could be done. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
On 9 September 2016 at 03:56, Vladimir Gordiychukwrote: >>It'd helpful if you summarize the changes made when posting revisions. > > Can we use as summary about changes first message? I meant "what did you change between the last patch I posted and the one you just posted" ? I'm looking at the revised patch now. > Sounds good. Is it changes should be include in this PR? Yes, probably as a 3rd patch on top of these two that adds the pg_recvlogical tests. I suggest: 1 and 2: same patches as currently 3: send CopyDone on SIGINT in pg_recvlogical, second SIGINT just closes per current behaviour 4: add tests using the code in 1-3 When you're editing such a patch stack, sometimes you'll find you want to change something in patch 1 or 2 or 3 while working on patch 4. The trick to doing this is to make sure to commit your changes in small pieces and avoid mixing up changes for different patches. So you'll land up with lots of small commits like * "fix typo in ReorderBufferIsActive declaration" * "forgot to test streamingDoneReceiving in SomeFunction" etc. Then when you're ready to send a new patch series, you tag your current working branch (optional, but makes cleaning up after mistakes easier), rebase the changes, tag the result, git format-patch it and push it, e.g.: git checkout my-branch git tag my-branch-v4-before-rebase # non-interactive rebase on top of current Pg git master git rebase master # Interactive rebase of all patches on top of master. git rebase -i master This will produce an editor window. Here you choose patches to reorder and squash into existing patches, e.g if you start with: pick 81b3f61 Respect client-initiated CopyDone in walsender pick 43de23d Second part of Vladimir Gordiychuk's patch pick 1231134 fix a typo in part 1 pick 123 forgot to check something in part 2 pick a213111 another typo you might edit it to: pick 81b3f61 Respect client-initiated CopyDone in walsender fixup 1231134 fix a typo in part 1 fixup a213111 another typo pick 43de23d Second part of Vladimir Gordiychuk's patch fixup 123 forgot to check something in part 2 When you save and quit, git will reorder the patches and squash them for you. If there are merge conflicts you fix them at each stage then "git rebase --continue". It takes some getting used to but isn't really hard at all. If you run into trouble, just push your current working branch to a public tree (github or whatever) and ask for help here. Once you're used to it, git has a few helper features like "fixup!" commits and --auto-squash that make this a bit quicker, but it's worth doing manually first since it's easier to figure out problems. There's lots of good reference material on this out there so I won't go on about it endlessly. > I'm not ensure that > I do this before complete this commitefest, I need some time to understand > how this tests works, and how can i write new one. No problem. I don't think it matters much if the patch bumps to the next commitfest. It's not a big intrusive change that needs to get in early in the development cycle. I'm not a committer and can't commit this myself. If I were I don't think I'd do so without a working test anyway. A good starting point for the tests is src/test/perl/README . You can add a test to src/test/recovery/t . Follow the model used in the other tests there. I wrote some tests using logical decoding in the logical decoding timeline following patch; see https://commitfest.postgresql.org/10/779/ for the patch, or https://github.com/2ndQuadrant/postgres/tree/dev/logical-decoding-timeline-following-pg10-v1 . They use the SQL interface not pg_recvlogical though. There's also https://github.com/2ndQuadrant/postgres/blob/dev/failover-slots/src/test/recovery/t/008_failover_slots.pl which has another example of controlling an external process, in this case pg_receivexlog, using IPC::Run. See start_pg_receivexlog(...) and its callers. When writing TAP tests for Perl you must ensure you use only Perl 5.8.8-compatible code and only the core modules plus IPC::Run . You'll usually be fine if you just avoid importing things you don't see other modules already using. Don't bother actually installing Perl 5.8.8, we can run tests in an ancient-perl Docker container before committing and fix any issues. I don't see any reason you'd need anything special for this test. If you think some of the code you add would be very reusable for other test modules in future, feel free to propose new methods for PostgresNode.pm . Ask if you're not sure. Ping me if you're stuck or need a hand. I really want more people to have a clue about how the TAP tests work. I don't use IRC (wrong timezone and life is busy) so email is best. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription:
Re: [HACKERS] Stopping logical replication protocol
>It'd helpful if you summarize the changes made when posting revisions. Can we use as summary about changes first message? If not, summary can be something like that: This parches fix scenarios interrupt logical replication from client side and allow the client to end a logical decoding session between transactions without just unilaterally closing its connection. The client send CopyDone package and wait stop replication as fast as possible. But in two cases walsender interrupt replication too slow. First scenario it's interrupt during waiting new WAL, when replication in streaming state, walsender ignores messages from Client until new WAL record will not generate. The second scenario it is interrupt replication during decoding long transaction that contain many changes. >Yeah, testing it isn't trivial in postgres core, since of course none of the tools know to send a client-initiated CopyDone. >My suggestion is to teach pg_recvlogical to send CopyDone on the first SIGINT (control-C) and wait until the server ends the data-stream and returns to command mode. A second control-C should unilaterally close the connection and it should close after a timeout too. This >will be backward compatible with 9.4/5/6 (just with a delay on exit by sigint). >Then in a TAP test in src/test/recovery set up a logical decoding session with test_decoding and pg_recvlogical. Do a test xact then sigint pg_recvlogical and verify that it exits. To make sure it exited by mutual agreement with server, probably run pg_recvlogival at a higher debug level and text search for a message you print from pg_recvlogical when it gets server CopyDone in the response to client CopyDone. I don't think a different pg_recvlogical numeric exit code could be justified for this. >It sounds like more work than I think it would actually be. Sounds good. Is it changes should be include in this PR? I'm not ensure that I do this before complete this commitefest, I need some time to understand how this tests works, and how can i write new one. 2016-09-08 12:29 GMT+03:00 Craig Ringer: > On 7 September 2016 at 15:44, Vladimir Gordiychuk > wrote: > > > Fixed patch in attach. > > It'd helpful if you summarize the changes made when posting revisions. > > >> * There are no tests. I don't see any really simple way to test this, > >> though. > > > > I will be grateful if you specify the best way how to do it. I tests this > > patches by pgjdbc driver tests that also build on head of postgres. You > can > > check test scenario for both patches in > > PRhttps://github.com/pgjdbc/pgjdbc/pull/550 > > > > org.postgresql.replication.LogicalReplicationTest# > testDuringSendBigTransactionReplicationStreamCloseNotActive > > org.postgresql.replication.LogicalReplicationTest# > testAfterCloseReplicationStreamDBSlotStatusNotActive > > Yeah, testing it isn't trivial in postgres core, since of course none of > the tools know to send a client-initiated CopyDone. > > My suggestion is to teach pg_recvlogical to send CopyDone on the first > SIGINT (control-C) and wait until the server ends the data-stream and > returns to command mode. A second control-C should unilaterally close the > connection and it should close after a timeout too. This will be backward > compatible with 9.4/5/6 (just with a delay on exit by sigint). > > Then in a TAP test in src/test/recovery set up a logical decoding session > with test_decoding and pg_recvlogical. Do a test xact then sigint > pg_recvlogical and verify that it exits. To make sure it exited by mutual > agreement with server, probably run pg_recvlogival at a higher debug level > and text search for a message you print from pg_recvlogical when it gets > server CopyDone in the response to client CopyDone. I don't think a > different pg_recvlogical numeric exit code could be justified for this. > > It sounds like more work than I think it would actually be. > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Stopping logical replication protocol
On 7 September 2016 at 15:44, Vladimir Gordiychukwrote: > Fixed patch in attach. It'd helpful if you summarize the changes made when posting revisions. >> * There are no tests. I don't see any really simple way to test this, >> though. > > I will be grateful if you specify the best way how to do it. I tests this > patches by pgjdbc driver tests that also build on head of postgres. You can > check test scenario for both patches in > PRhttps://github.com/pgjdbc/pgjdbc/pull/550 > > org.postgresql.replication.LogicalReplicationTest#testDuringSendBigTransactionReplicationStreamCloseNotActive > org.postgresql.replication.LogicalReplicationTest#testAfterCloseReplicationStreamDBSlotStatusNotActive Yeah, testing it isn't trivial in postgres core, since of course none of the tools know to send a client-initiated CopyDone. My suggestion is to teach pg_recvlogical to send CopyDone on the first SIGINT (control-C) and wait until the server ends the data-stream and returns to command mode. A second control-C should unilaterally close the connection and it should close after a timeout too. This will be backward compatible with 9.4/5/6 (just with a delay on exit by sigint). Then in a TAP test in src/test/recovery set up a logical decoding session with test_decoding and pg_recvlogical. Do a test xact then sigint pg_recvlogical and verify that it exits. To make sure it exited by mutual agreement with server, probably run pg_recvlogival at a higher debug level and text search for a message you print from pg_recvlogical when it gets server CopyDone in the response to client CopyDone. I don't think a different pg_recvlogical numeric exit code could be justified for this. It sounds like more work than I think it would actually be. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Stopping logical replication protocol
>Review comments on the 2nd patch, i.e. the 2nd half of your original patch: > >* Other places in logical decoding use the CB suffix for callback >types. This should do the same. > >* I'm not too keen on the name is_active for the callback. We >discussed the name continue_decoding_cb in our prior conversation. > >* Instead of having LogicalContextAlwaysActive, would it be better to >test if the callback pointer is null and skip it if so? > >* Lots of typos in LogicalContextAlwaysActive comments, and wording is unclear > >* comment added to arguments of pg_logical_slot_get_changes_guts is >not in PostgreSQL style - it goes way wide on the line. Probably >better to put the comment above the call and mention which argument it >refers to. > >* A comment somewhere is needed - probably on the IsStreamingActive() >callback - to point readers at the fact that WalSndWriteData() calls >ProcessRepliesIfAny() and that's how IsStreamingActive can get set. I >had to look at the email discussion history to remind myself how this >worked when I was re-reading the code. > >* I'm not keen on > >typedef ReorderBufferIsActive LogicalDecondingContextIsActive; > > I think instead a single callback name that encompasses both should >be used. ContinueDecodingCB ? Fixed patch in attach. > * There are no tests. I don't see any really simple way to test this, though. I will be grateful if you specify the best way how to do it. I tests this patches by pgjdbc driver tests that also build on head of postgres. You can check test scenario for both patches in PRhttps://github.com/pgjdbc/ pgjdbc/pull/550 org.postgresql.replication.LogicalReplicationTest#testDuringSendBigTransactionReplicationStreamCloseNotActive org.postgresql.replication.LogicalReplicationTest#testAfterCloseReplicationStreamDBSlotStatusNotActive 2016-09-06 4:10 GMT+03:00 Craig Ringer: > On 25 August 2016 at 13:04, Craig Ringer wrote: > > > By the way, I now think that the second part of your patch, to allow > > interruption during ReorderBufferCommit processing, is also very > > desirable. > > I've updated your patch, rebasing it on top of 10.0 master and > splitting it into two pieces. I thought you were planning on following > up with an update to the second part, but maybe that was unclear from > our prior discussion, so I'm doing this to help get this moving again. > > The first patch is what I posted earlier - the part of your patch that > allows the walsender to exit COPY BOTH mode and return to command mode > in response to a client-initiated CopyDone when it's in the > WalSndLoop. For logical decoding this means that it will notice a > client CopyDone when it's between transactions or while it's ingesting > transaction changes. It won't react to CopyDone while it's in > ReorderBufferCommit and sending a transaction to an output plugin > though. > > The second piece is the other half of your original patch. Currently > unmodified from what you wrote. It adds a hook in ReorderBufferCommit > that calls back into the walsender to check for new client input and > interrupt processing. I haven't yet investigated this in detail. > > > Review comments on the 2nd patch, i.e. the 2nd half of your original patch: > > * Other places in logical decoding use the CB suffix for callback > types. This should do the same. > > * I'm not too keen on the name is_active for the callback. We > discussed the name continue_decoding_cb in our prior conversation. > > * Instead of having LogicalContextAlwaysActive, would it be better to > test if the callback pointer is null and skip it if so? > > * Lots of typos in LogicalContextAlwaysActive comments, and wording is > unclear > > * comment added to arguments of pg_logical_slot_get_changes_guts is > not in PostgreSQL style - it goes way wide on the line. Probably > better to put the comment above the call and mention which argument it > refers to. > > * A comment somewhere is needed - probably on the IsStreamingActive() > callback - to point readers at the fact that WalSndWriteData() calls > ProcessRepliesIfAny() and that's how IsStreamingActive can get set. I > had to look at the email discussion history to remind myself how this > worked when I was re-reading the code. > > * I'm not keen on > > typedef ReorderBufferIsActive LogicalDecondingContextIsActive; > > I think instead a single callback name that encompasses both should > be used. ContinueDecodingCB ? > > * There are no tests. I don't see any really simple way to test this, > though. > > I suggest that you apply these updated/split patches to a fresh copy > of master then see if you can update it based on this feedback. > Something like: > > git checkout master > git pull > git checkout -b stop-decoding-v10 > git am /path/to/0001-Respect-client-initiated-CopyDone-in-walsender.patch > git am /path/to/0002-Second-part-of-Vladimir-Gordiychuk-s-patch.patch > > then modify the code per the above, and "git commit
Re: [HACKERS] Stopping logical replication protocol
On 25 August 2016 at 13:04, Craig Ringerwrote: > By the way, I now think that the second part of your patch, to allow > interruption during ReorderBufferCommit processing, is also very > desirable. I've updated your patch, rebasing it on top of 10.0 master and splitting it into two pieces. I thought you were planning on following up with an update to the second part, but maybe that was unclear from our prior discussion, so I'm doing this to help get this moving again. The first patch is what I posted earlier - the part of your patch that allows the walsender to exit COPY BOTH mode and return to command mode in response to a client-initiated CopyDone when it's in the WalSndLoop. For logical decoding this means that it will notice a client CopyDone when it's between transactions or while it's ingesting transaction changes. It won't react to CopyDone while it's in ReorderBufferCommit and sending a transaction to an output plugin though. The second piece is the other half of your original patch. Currently unmodified from what you wrote. It adds a hook in ReorderBufferCommit that calls back into the walsender to check for new client input and interrupt processing. I haven't yet investigated this in detail. Review comments on the 2nd patch, i.e. the 2nd half of your original patch: * Other places in logical decoding use the CB suffix for callback types. This should do the same. * I'm not too keen on the name is_active for the callback. We discussed the name continue_decoding_cb in our prior conversation. * Instead of having LogicalContextAlwaysActive, would it be better to test if the callback pointer is null and skip it if so? * Lots of typos in LogicalContextAlwaysActive comments, and wording is unclear * comment added to arguments of pg_logical_slot_get_changes_guts is not in PostgreSQL style - it goes way wide on the line. Probably better to put the comment above the call and mention which argument it refers to. * A comment somewhere is needed - probably on the IsStreamingActive() callback - to point readers at the fact that WalSndWriteData() calls ProcessRepliesIfAny() and that's how IsStreamingActive can get set. I had to look at the email discussion history to remind myself how this worked when I was re-reading the code. * I'm not keen on typedef ReorderBufferIsActive LogicalDecondingContextIsActive; I think instead a single callback name that encompasses both should be used. ContinueDecodingCB ? * There are no tests. I don't see any really simple way to test this, though. I suggest that you apply these updated/split patches to a fresh copy of master then see if you can update it based on this feedback. Something like: git checkout master git pull git checkout -b stop-decoding-v10 git am /path/to/0001-Respect-client-initiated-CopyDone-in-walsender.patch git am /path/to/0002-Second-part-of-Vladimir-Gordiychuk-s-patch.patch then modify the code per the above, and "git commit --amend" the changes and send an updated set of patches with "git format-patch -2" . I am setting this patch as "waiting on author" in the commitfest. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 81b3f61b0235e1b936e3336e43ce55c00bc230c4 Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Tue, 10 May 2016 10:34:10 +0800 Subject: [PATCH 1/2] Respect client-initiated CopyDone in walsender The walsender never looked for CopyDone sent by the client unless it had already decided it was done sending data and dispatched its own CopyDone message. Check for client-initiated CopyDone when in COPY BOTH mode, returning to command mode. In logical decoding this will allow the client to end a logical decoding session between transactions without just unilaterally closing its connection. This change does not allow a client to end COPY BOTH session in the middle of processing a logical decoding block. TODO effect on physical walsender? --- src/backend/replication/walsender.c | 32 +++- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 1ea2a5c..c43310c 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -759,6 +759,13 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req /* make sure we have enough WAL available */ flushptr = WalSndWaitForWal(targetPagePtr + reqLen); + /* + * If the client sent CopyDone while we were waiting, + * bail out so we can wind up the decoding session. + */ + if (streamingDoneSending) + return -1; + /* more than one block available */ if (targetPagePtr + XLOG_BLCKSZ <= flushptr) count = XLOG_BLCKSZ; @@ -1220,8 +1227,11 @@ WalSndWaitForWal(XLogRecPtr loc) * It's important to do this check after the recomputation of * RecentFlushPtr, so we can send all
Re: [HACKERS] Stopping logical replication protocol
On 25 August 2016 at 09:22, Craig Ringerwrote: > On 25 August 2016 at 03:26, Vladimir Gordiychuk wrote: >> Hi. It has already passed a few months but patch still have required review >> state. Can I help to speed up the review, or should i wait commitfest? >> I plane complete changes in pgjdbc drive inside PR >> https://github.com/pgjdbc/pgjdbc/pull/550 but PR blocked current problem >> with not available stop logical replication. > > The latest patch is the updated one I posted, which was the part of > yours that allowed return to command mode when logical decoding is > idle. > > If you want to submit an updated version of the second patch, with the > callback to allow logical decoding cancellation during > ReorderBufferCommit, that'd be handy, but I don't think it's as > important as the first one. > > As far as I'm concerned the first patch is ready. Please take a look > and verify that you're happy with my updates and test the updated > patch. I'll mark it ready to go if you are. It's linked to at > https://commitfest.postgresql.org/10/621/ . By the way, I now think that the second part of your patch, to allow interruption during ReorderBufferCommit processing, is also very desirable. Not just for that feature, but because we should be processing client messages during long ReorderBufferCommit executions so that we can respond to feedback from the client. Right now, long running ReorderBufferCommit runs can trigger walsender_timeout because there's no feedback processed. Alternately, ReorderBufferCommit() could call back into the walsender with a progress update, without requiring the client to send feedback. It knows the client is making progress because it's consuming data on the socket. But I'd rather have client feedback processed, since it could be useful later for client=->server messages to output plugin callbacks too. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
On 25 August 2016 at 03:26, Vladimir Gordiychukwrote: > Hi. It has already passed a few months but patch still have required review > state. Can I help to speed up the review, or should i wait commitfest? > I plane complete changes in pgjdbc drive inside PR > https://github.com/pgjdbc/pgjdbc/pull/550 but PR blocked current problem > with not available stop logical replication. The latest patch is the updated one I posted, which was the part of yours that allowed return to command mode when logical decoding is idle. If you want to submit an updated version of the second patch, with the callback to allow logical decoding cancellation during ReorderBufferCommit, that'd be handy, but I don't think it's as important as the first one. As far as I'm concerned the first patch is ready. Please take a look and verify that you're happy with my updates and test the updated patch. I'll mark it ready to go if you are. It's linked to at https://commitfest.postgresql.org/10/621/ . -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stopping logical replication protocol
Hi. It has already passed a few months but patch still have required review state. Can I help to speed up the review, or should i wait commitfest? I plane complete changes in pgjdbc drive inside PR https://github.com/pgjdbc/pgjdbc/pull/550 but PR blocked current problem with not available stop logical replication. 2016-05-11 7:25 GMT+03:00 Craig Ringer: > On 11 May 2016 at 06:47, Vladimir Gordiychuk wrote: > >> Same thread, I just think these are two somewhat separate changes. One is >>> just in the walsender and allows return to command mode during waiting for >>> WAL. The other is more intrusive into the reorder buffer etc and allows >>> aborting decoding during commit processing. So two separate patches make >>> sense here IMO, one on top of the other. >> >> >> About the second part of the patch. What the reason decode and send whole >> transaction? Why we can't process logical decoding via WalSndLoop LSN by >> LSN as it work in physycal replication? For example if transaction contains >> in more them one LSN, first we decode and send "begin", "part data from >> current LSN" and then returns to WalSndLoop on the next iteration we send >> "another part data", "commit". I don't research in this way, because I >> think it will be big changes in comparison callback that stop sending. >> > > There are two parts to that. First, why do we reorder at all, accumulating > a whole transaction in a reorder buffer until we see a commit then sending > it all at once? Second, when sending, why don't we return control to the > walsender between messages? > > For the first: reordering xacts server-side lets the client not worry > about replay order. It just applies them as it receives them. It means the > server can omit uncommitted transactions from the stream entirely and > clients can be kept simple(r). IIRC there are also some issues around > relcache invalidation handling and time travel that make it desirable to > wait until commit before building a snapshot and decoding, but I haven't > dug into the details. Andres is the person who knows that area best. > > As for why we don't return control to the walsender between change > callbacks when processing a reorder buffer at commit time, I'm not really > sure but suspect it's mostly down to easy API and development. If control > returned to the walsender between each change we'd need an async api for > the reorder buffer where you test to see if there's more unprocessed work > and call back into the reorder buffer again if there is. So the reorder > buffer has to keep state for the progress of replaying a commit in a > separate struct, handle repeated calls to process work, etc. Also, since > many individual changes are very small that could lead to a fair bit of > overhead; it'd probably want to process a batch of changes then return. > Which makes it even more complicated. > > If it returned control to the caller between changes then each caller > would also need to have the logic to test for more work and call back into > the reorder buffer. Both the walsender and SQL interface would need it. The > way it is, the loop is just in one place. > > It probably makes more sense to have a callback that can test state and > abort processing, like you've introduced. The callback could probably even > periodically check to see if there's client input to process and consume it. > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Stopping logical replication protocol
On 11 May 2016 at 06:47, Vladimir Gordiychukwrote: > Same thread, I just think these are two somewhat separate changes. One is >> just in the walsender and allows return to command mode during waiting for >> WAL. The other is more intrusive into the reorder buffer etc and allows >> aborting decoding during commit processing. So two separate patches make >> sense here IMO, one on top of the other. > > > About the second part of the patch. What the reason decode and send whole > transaction? Why we can't process logical decoding via WalSndLoop LSN by > LSN as it work in physycal replication? For example if transaction contains > in more them one LSN, first we decode and send "begin", "part data from > current LSN" and then returns to WalSndLoop on the next iteration we send > "another part data", "commit". I don't research in this way, because I > think it will be big changes in comparison callback that stop sending. > There are two parts to that. First, why do we reorder at all, accumulating a whole transaction in a reorder buffer until we see a commit then sending it all at once? Second, when sending, why don't we return control to the walsender between messages? For the first: reordering xacts server-side lets the client not worry about replay order. It just applies them as it receives them. It means the server can omit uncommitted transactions from the stream entirely and clients can be kept simple(r). IIRC there are also some issues around relcache invalidation handling and time travel that make it desirable to wait until commit before building a snapshot and decoding, but I haven't dug into the details. Andres is the person who knows that area best. As for why we don't return control to the walsender between change callbacks when processing a reorder buffer at commit time, I'm not really sure but suspect it's mostly down to easy API and development. If control returned to the walsender between each change we'd need an async api for the reorder buffer where you test to see if there's more unprocessed work and call back into the reorder buffer again if there is. So the reorder buffer has to keep state for the progress of replaying a commit in a separate struct, handle repeated calls to process work, etc. Also, since many individual changes are very small that could lead to a fair bit of overhead; it'd probably want to process a batch of changes then return. Which makes it even more complicated. If it returned control to the caller between changes then each caller would also need to have the logic to test for more work and call back into the reorder buffer. Both the walsender and SQL interface would need it. The way it is, the loop is just in one place. It probably makes more sense to have a callback that can test state and abort processing, like you've introduced. The callback could probably even periodically check to see if there's client input to process and consume it. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Stopping logical replication protocol
> > Same thread, I just think these are two somewhat separate changes. One is > just in the walsender and allows return to command mode during waiting for > WAL. The other is more intrusive into the reorder buffer etc and allows > aborting decoding during commit processing. So two separate patches make > sense here IMO, one on top of the other. About the second part of the patch. What the reason decode and send whole transaction? Why we can't process logical decoding via WalSndLoop LSN by LSN as it work in physycal replication? For example if transaction contains in more them one LSN, first we decode and send "begin", "part data from current LSN" and then returns to WalSndLoop on the next iteration we send "another part data", "commit". I don't research in this way, because I think it will be big changes in comparison callback that stop sending. 2016-05-10 20:22 GMT+03:00 Craig Ringer: > On 11 May 2016 at 01:15, Vladimir Gordiychuk wrote: > >> in which release can be include first part? >> > > Since it's not a bug fix, I don't think it can go in before 9.7. > > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Stopping logical replication protocol
On 11 May 2016 at 01:15, Vladimir Gordiychukwrote: > in which release can be include first part? > Since it's not a bug fix, I don't think it can go in before 9.7. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Stopping logical replication protocol
in which release can be include first part? 2016-05-10 15:15 GMT+03:00 Craig Ringer: > On 10 May 2016 at 19:41, Vladimir Gordiychuk wrote: > >> >> Fair enough. Though I don't understand why you'd be doing this often >>> enough that you'd care about reopening connections. What is the problem you >>> are trying to solve with this? The underlying reason you need this change? >>> >> >> First reason it clear API in pgjdc. Second reason it ability fast enough >> rollack to one of the previous LSN and repeat it. My use case for second >> reason - I have logical decoding extension that prepare only data as >> key-value pair without information about (insert, update, delete) for >> example if it delete as key I use primary key for table and as value null. >> Via >> pgjdc by replication protocol connects receiver data, consumer group >> changes to batch and send it to Kafka. If some problem occurs during >> delivery to kafka consumer, I should stop current replication, go back to >> success LSN and repeat all messages from success LSN. I think it operation >> can be quite common, but reopen connection or not stopping replication >> will increase latency. >> > > It will, but not tons. The biggest cost (at least if there are any long > running xacts) will be replaying since the restart_lsn when you restart the > decoding session, which will happen whether you reconnect or just stop > decoding and return to command mode. > > > >> Anyway, here's a draft patch that does the parts other than the reorder >>> buffer processing stop. It passes 'make check' and the src/test/recovery >>> tests, but I haven't written anything to verify the client-initiated abort >>> handling. You have test code for this and I'd be interested to see the >>> results. >>> >> >> What about keepAlive package when we already send/receive CopyDone? Is It >> really necessary? >> > > No, I don't think it is, and I applied the change you made to suppress it. > > >> I think it's worth making the next step, where you allow reorder buffer >>> commit processing to be interrupted, into a separate patch on top of this >>> one. They're two separate changes IMO. >>> >> >> We will continue in the current thread, or new? I interesting in both >> patch for my solution and pgjbc driver. >> > > Same thread, I just think these are two somewhat separate changes. One is > just in the walsender and allows return to command mode during waiting for > WAL. The other is more intrusive into the reorder buffer etc and allows > aborting decoding during commit processing. So two separate patches make > sense here IMO, one on top of the other. > > I use git and just "git format-patch -2" to prepare a stack of two patches > from two separate commits. > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Stopping logical replication protocol
On 10 May 2016 at 19:41, Vladimir Gordiychukwrote: > > Fair enough. Though I don't understand why you'd be doing this often >> enough that you'd care about reopening connections. What is the problem you >> are trying to solve with this? The underlying reason you need this change? >> > > First reason it clear API in pgjdc. Second reason it ability fast enough > rollack to one of the previous LSN and repeat it. My use case for second > reason - I have logical decoding extension that prepare only data as > key-value pair without information about (insert, update, delete) for > example if it delete as key I use primary key for table and as value null. Via > pgjdc by replication protocol connects receiver data, consumer group > changes to batch and send it to Kafka. If some problem occurs during > delivery to kafka consumer, I should stop current replication, go back to > success LSN and repeat all messages from success LSN. I think it operation > can be quite common, but reopen connection or not stopping replication > will increase latency. > It will, but not tons. The biggest cost (at least if there are any long running xacts) will be replaying since the restart_lsn when you restart the decoding session, which will happen whether you reconnect or just stop decoding and return to command mode. > Anyway, here's a draft patch that does the parts other than the reorder >> buffer processing stop. It passes 'make check' and the src/test/recovery >> tests, but I haven't written anything to verify the client-initiated abort >> handling. You have test code for this and I'd be interested to see the >> results. >> > > What about keepAlive package when we already send/receive CopyDone? Is It > really necessary? > No, I don't think it is, and I applied the change you made to suppress it. > I think it's worth making the next step, where you allow reorder buffer >> commit processing to be interrupted, into a separate patch on top of this >> one. They're two separate changes IMO. >> > > We will continue in the current thread, or new? I interesting in both > patch for my solution and pgjbc driver. > Same thread, I just think these are two somewhat separate changes. One is just in the walsender and allows return to command mode during waiting for WAL. The other is more intrusive into the reorder buffer etc and allows aborting decoding during commit processing. So two separate patches make sense here IMO, one on top of the other. I use git and just "git format-patch -2" to prepare a stack of two patches from two separate commits. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Stopping logical replication protocol
> Fair enough. Though I don't understand why you'd be doing this often > enough that you'd care about reopening connections. What is the problem you > are trying to solve with this? The underlying reason you need this change? > First reason it clear API in pgjdc. Second reason it ability fast enough rollack to one of the previous LSN and repeat it. My use case for second reason - I have logical decoding extension that prepare only data as key-value pair without information about (insert, update, delete) for example if it delete as key I use primary key for table and as value null. Via pgjdc by replication protocol connects receiver data, consumer group changes to batch and send it to Kafka. If some problem occurs during delivery to kafka consumer, I should stop current replication, go back to success LSN and repeat all messages from success LSN. I think it operation can be quite common, but reopen connection or not stopping replication will increase latency. Anyway, here's a draft patch that does the parts other than the reorder > buffer processing stop. It passes 'make check' and the src/test/recovery > tests, but I haven't written anything to verify the client-initiated abort > handling. You have test code for this and I'd be interested to see the > results. > What about keepAlive package when we already send/receive CopyDone? Is It really necessary? I measure current fix without include long transaction, only start replication and stop it, when on postgresql absent active transactions and get next results: *Before:* 12:35:31.403 (2) FE=> StartReplication(query: START_REPLICATION SLOT pgjdbc_logical_replication_slot LOGICAL 0/7287AC00 ("include-xids" 'false', "skip-empty-xacts" 'true')) 12:35:31.403 (2) FE=> Query(CopyStart) 12:35:31.404 (2) <=BE CopyBothResponse 12:35:31.406 (2) FE=> StopReplication 12:35:31.406 (2) FE=> CopyDone 12:35:31.406 (2) <=BE CopyData 12:35:31.407 (2) [107, 0, 0, 0, 0, 114, -121, -84, 0, 0, 1, -43, 120, 106, 53, -85, 21, 0] 12:35:31.407 (2) <=BE CopyDone 12:35:31.407 (2) <=BE CopyData 12:35:31.407 (2) [107, 0, 0, 0, 0, 114, -121, -84, 0, 0, 1, -43, 120, 106, 53, -76, 29, 0] 12:35:52.120 (2) <=BE CommandStatus(COPY 0) 12:35:52.120 (2) <=BE CommandStatus(SELECT) 12:35:52.120 (2) <=BE ReadyForQuery(I) 12:35:52.126 (2) FE=> Terminate *Timing:* Start and stopping time: 20729ms Stopping time: 20720ms *After:* 14:30:02.050 (2) FE=> StartReplication(query: START_REPLICATION SLOT pgjdbc_logical_replication_slot LOGICAL 0/728A06C0 ("include-xids" 'false', "skip-empty-xacts" 'true')) 14:30:02.050 (2) FE=> Query(CopyStart) 14:30:02.051 (2) <=BE CopyBothResponse 14:30:02.056 (2) FE=> StopReplication 14:30:02.057 (2) FE=> CopyDone 14:30:02.057 (2) <=BE CopyData 14:30:02.057 (2) [107, 0, 0, 0, 0, 114, -118, 6, -64, 0, 1, -43, 122, 3, -69, 107, 76, 0] 14:30:02.058 (2) <=BE CopyDone 14:30:02.058 (2) <=BE CommandStatus(COPY 0) 14:30:02.058 (2) <=BE CommandStatus(SELECT) 14:30:02.058 (2) <=BE ReadyForQuery(I) 14:30:02.068 (2) FE=> StandbyStatusUpdate(received: 0/728A06C0, flushed: 0/0, applied: 0/0, clock: Tue May 10 14:30:02 MSK 2016) *Timing:* Start and stopping time: 27ms Stopping time: 11ms So aborting processing and returning between individual changes in > ReorderBufferCommit(...) seems very reasonable. I agree that some kind of > callback is needed because the walsender uses static globals to control its > send/receive stop logic. I don't like the naming "is_active"; maybe reverse > the sense and call it "stop_decoding_cb" ? Or "continue_decoding_cb"? > Unsure. continue_decoding_cb sounds good. I think it's worth making the next step, where you allow reorder buffer > commit processing to be interrupted, into a separate patch on top of this > one. They're two separate changes IMO. > We will continue in the current thread, or new? I interesting in both patch for my solution and pgjbc driver. 2016-05-10 5:49 GMT+03:00 Craig Ringer: > On 10 May 2016 at 09:50, Craig Ringer wrote: > > >> I outlined how I think WalSndWaitForWal() should be handled earlier. >> Test streamingDoneReceiving and streamingDoneSending in >> logical_read_xlog_page >> and return -1. >> > > OK, so thinking about this some more, I see why you've added the callback > within the reorder buffer code. You want to stop processing of a > transaction after we've decoded the commit and are looping over the changes > within ReorderBufferCommit(...), which doesn't know anything about the > walsender. So we could loop for a long time within WalSndLoop -> > XLogSendLogical -> LogicalDecodingProcessRecord if the record is a commit, > as we process each change and send it to the client. > > So aborting processing and returning between individual changes in > ReorderBufferCommit(...) seems very reasonable. I agree that some kind of > callback is needed because the walsender uses static globals to control its > send/receive stop logic. I don't like
Re: [HACKERS] Stopping logical replication protocol
On 10 May 2016 at 09:50, Craig Ringerwrote: > I outlined how I think WalSndWaitForWal() should be handled earlier. Test > streamingDoneReceiving > and streamingDoneSending in logical_read_xlog_page and return -1. > OK, so thinking about this some more, I see why you've added the callback within the reorder buffer code. You want to stop processing of a transaction after we've decoded the commit and are looping over the changes within ReorderBufferCommit(...), which doesn't know anything about the walsender. So we could loop for a long time within WalSndLoop -> XLogSendLogical -> LogicalDecodingProcessRecord if the record is a commit, as we process each change and send it to the client. So aborting processing and returning between individual changes in ReorderBufferCommit(...) seems very reasonable. I agree that some kind of callback is needed because the walsender uses static globals to control its send/receive stop logic. I don't like the naming "is_active"; maybe reverse the sense and call it "stop_decoding_cb" ? Or "continue_decoding_cb"? Unsure. Anyway, here's a draft patch that does the parts other than the reorder buffer processing stop. It passes 'make check' and the src/test/recovery tests, but I haven't written anything to verify the client-initiated abort handling. You have test code for this and I'd be interested to see the results. This patch doesn't attempt to allow decoding to be aborted during processing of an xlog record, including a commit. So it'll still attempt to send whole transactions. But it should allow clients to send CopyDone when we're waiting for new WAL to be generated and return to command mode then. I think it's worth making the next step, where you allow reorder buffer commit processing to be interrupted, into a separate patch on top of this one. They're two separate changes IMO. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 14c0ac1aacfacb3b6b42bf0c5e453ab4e989aa2c Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Tue, 10 May 2016 10:34:10 +0800 Subject: [PATCH] Respect client-initiated CopyDone in walsender --- src/backend/replication/walsender.c | 32 +++- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 926a247..4e2164e 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -759,6 +759,13 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req /* make sure we have enough WAL available */ flushptr = WalSndWaitForWal(targetPagePtr + reqLen); + /* + * If the client sent CopyDone while we were waiting, + * bail out so we can wind up the decoding session. + */ + if (streamingDoneSending) + return -1; + /* more than one block available */ if (targetPagePtr + XLOG_BLCKSZ <= flushptr) count = XLOG_BLCKSZ; @@ -1220,8 +1227,11 @@ WalSndWaitForWal(XLogRecPtr loc) * It's important to do this check after the recomputation of * RecentFlushPtr, so we can send all remaining data before shutting * down. + * + * We'll also exit here if the client sent CopyDone because it wants + * to return to command mode. */ - if (walsender_ready_to_stop) + if (walsender_ready_to_stop || streamingDoneReceiving) break; /* @@ -1789,7 +1799,14 @@ WalSndCheckTimeOut(TimestampTz now) } } -/* Main loop of walsender process that streams the WAL over Copy messages. */ +/* + * Main loop of walsender process that streams the WAL over Copy messages. + * + * The send_data callback must enqueue complete CopyData messages to libpq + * using pq_putmessage_noblock or similar, since the walsender loop may send + * CopyDone then exit and return to command mode in response to a client + * CopyDone between calls to send_data. + */ static void WalSndLoop(WalSndSendDataCallback send_data) { @@ -1852,10 +1869,15 @@ WalSndLoop(WalSndSendDataCallback send_data) * some more. If there is some, we don't bother to call send_data * again until we've flushed it ... but we'd better assume we are not * caught up. + * + * If we're trying to finish sending and exit we shouldn't enqueue more + * data to libpq. We need to finish writing out whatever we already + * have in libpq's send buffer to maintain protocol sync so we still + * need to loop until it's flushed. */ - if (!pq_is_send_pending()) + if (!pq_is_send_pending() && !streamingDoneSending) send_data(); - else + else if (!streamingDoneSending) WalSndCaughtUp = false; /* Try to flush pending output to the client */ @@ -2911,7 +2933,7 @@ WalSndKeepaliveIfNecessary(TimestampTz now) if (wal_sender_timeout <= 0 || last_reply_timestamp <= 0) return; - if (waiting_for_ping_response) + if (waiting_for_ping_response || streamingDoneSending) return; /*
Re: [HACKERS] Stopping logical replication protocol
> > ProcessRepliesIfAny also now executes in WalSdnWriteData. Because during > send data we should also check message from client(client can send > CopyDone, KeepAlive, Terminate). > > Ah, I didn't spot that ProcessRepliesIfAny() is already called from WalSndWriteData in the current codebase. I agree that it would be useful to check for client-initiated CopyDone there, clear the walsender write queue, close the decoding session and return to command mode. Nothing gets sent until we've formatted the data we want to send into a StringInfo, though. The comments on WalSndPrepareWrite even note that there's no guarantee anything will get done with the data. There should be no need to test whether we're processing a Datum, since we won't call ProcessRepliesIfAny() or WalSndWriteData() while we're doing that. I think all you should need to do is test streamingDoneSending and streamingDoneReceiving in WalSndWriteData() and WalSndWaitForWal(). I outlined how I think WalSndWaitForWal() should be handled earlier. Test streamingDoneReceiving and streamingDoneSending in logical_read_xlog_page and return -1. For WalSndWriteData() it's more complicated because there's data waiting to flush. I don't see any way to remove data from the send buffer in the libpq async API, so you'll have to send whatever's already been passed to libpq for sending using pq_putmessage_noblock(...). This is necessary for protocol integrity too, since you need to send a complete CopyData packet. The length is specified at the start of the CopyData, so once you start sending it you are committed to finishing it. Otherwise the client has no way to know when to stop expecting CopyData and look for the next protocol message. It can wait until its socket read blocks, but that doesn't mean there isn't more of the CopyData in the upstream server's send buffers, buffered in a router, etc. If you want a way to abort in the middle of sending a datum (or in fact an individual logical change) you'd need a much bigger change to the replication protocol, where we chunk up big Datum values into multiple CopyData messages with continuations. That'd be quite nice to have, especially if it let us stream the Datum out without assembling the whole thing in memory in a StringInfo first, but it's a much bigger change. If you just return from WalSndWriteData(), the data is still queued by libpq, and libpq will still finish sending it when you try to send something else. So I don't see any point taking much action in response to CopyDone in WalSndWriteData(). Maybe you could check if the message we're about to send is large and if it is, check for client input before we start sending and bail out before the pq_putmessage_noblock() call. But once we've done that we're committed to finishing sending that message and have to wait until pq_is_send_pending() return false before we can send our own CopyDone and exit. Since WalSndWriteData() must write a whole message from the decoding plugin before it can return, and when it returns we'll return from LogicalDecodingProcessRecord(..) to XLogSendLogical(...), we can just test for streamingDoneReceiving and streamingDoneSending there. I guess if you add a flag or callback to the logical decoding context you could have the logical decoding plugin interrupt processing of an insert/update/delete row change message between processing of individual datums within it and return without calling OutputPluginWrite(...) so the data is never queued for WalSndWriteData(...). This would make a difference if you have wide rows... but don't currently process client writes during output plugin callbacks. You'd have to add a logical decoding API function clients could call to process client replies and set the flag. (They can't call ProcessRepliesIfAny from walsender.c as it's static and not meant to be called from within a decoding plugin anyway). > I want reuse same connection without reopen it, because open new connection takes too long. Fair enough. Though I don't understand why you'd be doing this often enough that you'd care about reopening connections. What is the problem you are trying to solve with this? The underlying reason you need this change? > Is it correct use case or CopyDone it side effect of copy protocol and for complete replication need use always Terminate package and reopen connection? Client-initiated CopyDone for COPY BOTH should be just fine in protocol terms. There's partial support for it in the walsender already. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Stopping logical replication protocol
> > What's your PostgreSQL community username? gordiychuk It seems like what you're also trying to allow interruption deeper than > that, when we're in the middle of processing a reorder buffer commit record > and streaming that to the client. You're introducing an is_active member > (actually a callback, though name suggests it's a flag) in struct > ReorderBuffer to check whether a CopyDone is received, and you're skipping > ReorderBuffer commit processing when the callback returns false. The > callback returns "!streamingDoneReceiving && !streamingDoneSending" i.e. > it's false if either end has sent CopyDone. streamingDoneSending and > streamingDoneSending are only set in ProcessRepliesIfAny, called by > WalSndLoop and WalSndWaitForWal. So the idea is, presumably, that if we're > waiting for WAL from XLogSendLogical we skip processing of any commit > records and exit. > > That seems overcomplicated. > > When WalSndWaitForWAL is called > by logical_read_xlog_page, logical_read_xlog_page can just test > streamingDoneReceiving and streamingDoneSending. If they're set it can skip > the page read and return -1, which will cause the xlogreader to return a > null record to XLogSendLogical. That'll skip the decoding calls and return > to WalSndLoop, where we'll notice it's time to exit. > ProcessRepliesIfAny also now executes in WalSdnWriteData. Because during send data we should also check message from client(client can send CopyDone, KeepAlive, Terminate). @@ -1086,14 +1089,6 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, memcpy(>out->data[1 + sizeof(int64) + sizeof(int64)], tmpbuf.data, sizeof(int64)); - /* fast path */ - /* Try to flush pending output to the client */ - if (pq_flush_if_writable() != 0) - WalSndShutdown(); - - if (!pq_is_send_pending()) - return; - The main idea is that we can get CopyDone from client in the next functions: WalSdnLoop, WalSndWaitForWal, WalSndWriteData. All of this methods can take a long time, because WalSndWaitForWal can wait new transaction and on not active db it can take long enough, WalSndWriteData can send big transaction that also lead to ignore messages from client until long time(In my example above for 1 million object changes, walsender ignore messages 13 seconds and not allow reuse connection). When client send CopyDone they don't want receive message anymore for current LSN. For example physical replication can be interrupt in the middle of transaction that affect more than one LSN. Maybe I not correct undestand documentation, but I want reuse same connection without reopen it, because open new connection takes too long. Is it correct use case or CopyDOne it side effect of copy protocol and for complete replication need use always Terminate package and reopen connection?
Re: [HACKERS] Stopping logical replication protocol
I've created a CF entry for this patch: https://commitfest.postgresql.org/10/621/ set as waiting-on-author. Vladimir, I didn't find a PostgreSQL community user account for you, so I couldn't set you up as the author. What's your PostgreSQL community username?
Re: [HACKERS] Stopping logical replication protocol
On 6 May 2016 at 23:23, Vladimir Gordiychukwrote: > I prepare small patch that fix problems describe below. Now *WalSndWriteData > *first check message from consumer and during decode transaction check > that replication still active. > OK, upon looking closer I'm not sure I agree with this approach. In WalSndLoop(...) we currently call ProcessRepliesIfAny() which handles a client-initiated CopyDone by replying with its own CopyDone. WalSndLoop then just waits until the send buffer is flushed or the client disconnects and returns. That looks to be pretty much what's needed... but only when we actually execute that code path. XLogSendLogical calls WalSndWaitForWal(...) (via xlogreader and logical_read_xlog_page) when waiting for something to process, when the client will be blocked on a socket read. It also processes client CopyDone, but it doesn't seem to test streamingDoneSending or streamingDoneReceiving like the outer WalSndLoop does. So it looks like it continues waiting for WAL when we know the client sent CopyDone and doesn't want anything more. I think we should be testing that in WalSndWaitForWal, like we do in WalSndLoop, and bailing out. You've done that part, and I agree that's good. It seems like what you're also trying to allow interruption deeper than that, when we're in the middle of processing a reorder buffer commit record and streaming that to the client. You're introducing an is_active member (actually a callback, though name suggests it's a flag) in struct ReorderBuffer to check whether a CopyDone is received, and you're skipping ReorderBuffer commit processing when the callback returns false. The callback returns "!streamingDoneReceiving && !streamingDoneSending" i.e. it's false if either end has sent CopyDone. streamingDoneSending and streamingDoneSending are only set in ProcessRepliesIfAny, called by WalSndLoop and WalSndWaitForWal. So the idea is, presumably, that if we're waiting for WAL from XLogSendLogical we skip processing of any commit records and exit. That seems overcomplicated. When WalSndWaitForWAL is called by logical_read_xlog_page, logical_read_xlog_page can just test streamingDoneReceiving and streamingDoneSending. If they're set it can skip the page read and return -1, which will cause the xlogreader to return a null record to XLogSendLogical. That'll skip the decoding calls and return to WalSndLoop, where we'll notice it's time to exit. That way there's no need to modify the reorder buffer structs, add callbacks, etc. Make sense? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Stopping logical replication protocol
On 6 May 2016 at 23:23, Vladimir Gordiychukwrote: > I prepare small patch that fix problems describe below. Now *WalSndWriteData > *first check message from consumer and during decode transaction check > that replication still active. KeppAlive message now not send if was get > CopyDone package(keep alive now not necessary we preparing to complete). > *WalSndWaitForWal > *after get CopyDone exit from loop. With apply this patch I get next > measurements > I'll review this, but based on the description and concept I agree it's useful. I have been frustrated in the past by the inability to terminate the logical replication stream and return to command mode without a disconnect/reconnect so I'd like to have this. AFAIK there's no particular reason we can't do it, it's just not implemented. It does have to be a clean exit though, where the logical decoding context is destroyed, the xact it was running in is closed, etc. Like for the SQL interface. You still won't want to do it too often because there's a cost to stopping decoding and restarting it. We have to re-read from the restart_lsn and reassemble transactions again up to the point where we can start sending them to the client again. Especially if you're most of the way through decoding a big xact, or if there's a long-running xact holding restart_lsn down this might cost a bit. But no worse than repeatedly calling the SQL logical decoding interface. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Stopping logical replication protocol
Replication work via Copy API, so for stop replication walcender.c wait CopyDone. My communication seems like this FE=> StartReplication(query: START_REPLICATION SLOT pgjdbc_logical_replication_slot LOGICAL 0/18FCFD0 ("include-xids" 'false', "skip-empty-xacts" 'true')) FE=> Query(CopyStart) <=BE CopyBothResponse FE=> StandbyStatusUpdate(received: 0/18FCFD0, flushed: 0/0, applied: 0/0, clock: Tue May 03 22:13:14 MSK 2016) FE=> CopyData(34) <=BE CopyData <=BE CopyData <=BE Keepalive(lastServerWal: 0/18FCFD0, clock: Tue May 03 22:13:14 MSK 2016 needReply: false) <=BE XLogData(currWal: 0/0, lastServerWal: 0/0, clock: 0) <=BE CopyData <=BE XLogData(currWal: 0/18FD0A0, lastServerWal: 0/18FD0A0, clock: 0) <=BE CopyData <=BE XLogData(currWal: 0/18FD1B0, lastServerWal: 0/18FD1B0, clock: 0) FE=> StopReplication <=BE CopyData FE=> CopyDone <=BE CopyDone <=BE CopyData ... after few seconds <=BE CommandStatus(COPY 0) <=BE ReadyForQuery(I) The main idea that i want work with same connection without close it. 2016-05-06 19:45 GMT+03:00 Oleksandr Shulgin: > On Fri, May 6, 2016 at 5:23 PM, Vladimir Gordiychuk > wrote: > >> Hi all, >> >> During implementing logical replication protocol for pgjdbc >> https://github.com/pgjdbc/pgjdbc/pull/550 I faced with strange behavior >> of the *walcender.c*: >> >>1. When WAL consumer catchup master and change his state to >>streaming, not available normally complete replication by send CopyDone >>message until will not generate/create new WAL record. It occurs because >>logical decoding located in *WalSndWaitForWal* until will not >>available next WAL record, and it method receive message from consumer >> even >>reply on CopyDone with CopyDone but ignore exit from loop and we can wait >>many times waiting CommandStatus & ReadyForQuery packages on consumer. >>2. Logical decoding ignore message from consumer during decoding and >>writing transaction in socket(*WalSndWriteData*). It affect long >>transactions with many changes. Because for example if we start decoding >>transaction that insert 1 million records and after consume 1% of it date >>we >>decide stop replication, it will be not available until whole million >>record will not send to consumer. >> >> How exactly are you stopping the replication? If you just stop reading > you'll likely to hit some problems, but what if you also close the > connection? I don't think there is any other supported way to do it. > > I was working last year on adding support for replication protocol to the > Python driver: https://github.com/psycopg/psycopg2/pull/322 It would be > nice if you could skim through this implementation, since it didn't receive > a whole lot of review. > > Cheers. > -- > Alex > >
Re: [HACKERS] Stopping logical replication protocol
On Fri, May 6, 2016 at 5:23 PM, Vladimir Gordiychukwrote: > Hi all, > > During implementing logical replication protocol for pgjdbc > https://github.com/pgjdbc/pgjdbc/pull/550 I faced with strange behavior > of the *walcender.c*: > >1. When WAL consumer catchup master and change his state to streaming, >not available normally complete replication by send CopyDone message until >will not generate/create new WAL record. It occurs because logical decoding >located in *WalSndWaitForWal* until will not available next WAL >record, and it method receive message from consumer even reply on CopyDone >with CopyDone but ignore exit from loop and we can wait many times waiting >CommandStatus & ReadyForQuery packages on consumer. >2. Logical decoding ignore message from consumer during decoding and >writing transaction in socket(*WalSndWriteData*). It affect long >transactions with many changes. Because for example if we start decoding >transaction that insert 1 million records and after consume 1% of it date >we >decide stop replication, it will be not available until whole million >record will not send to consumer. > > How exactly are you stopping the replication? If you just stop reading you'll likely to hit some problems, but what if you also close the connection? I don't think there is any other supported way to do it. I was working last year on adding support for replication protocol to the Python driver: https://github.com/psycopg/psycopg2/pull/322 It would be nice if you could skim through this implementation, since it didn't receive a whole lot of review. Cheers. -- Alex