Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-05 Thread Andres Freund
Hi, On 2017-06-05 15:30:38 +0900, Michael Paquier wrote: > + * This will trigger walsenders to send the remaining WAL, prevent them from > + * accepting further commands. After that they'll wait till the last WAL is > + * written. > s/prevent/preventing/? > I would rephrase the last sentence a

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-05 Thread Michael Paquier
On Tue, Jun 6, 2017 at 9:47 AM, Andres Freund wrote: > On 2017-06-05 15:30:38 +0900, Michael Paquier wrote: >> I think that it would be interesting to be able to >> trigger a feedback message using SIGHUP in WAL receivers, refactoring >> at the same time SIGHUP handling for

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-05 Thread Andres Freund
On 2017-06-05 15:30:38 +0900, Michael Paquier wrote: > I have looked at all those patches. The set looks solid to me. Thanks! > Here are some comments about 0003. > + /* > +* Have WalSndLoop() terminate the connection in an orderly > +* manner, after writing

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-05 Thread Michael Paquier
On Mon, Jun 5, 2017 at 10:29 AM, Andres Freund wrote: > On 2017-06-02 17:20:23 -0700, Andres Freund wrote: >> Attached is a *preliminary* patch series implementing this. I've first >> reverted the previous patch, as otherwise backpatchable versions of the >> necessary patches

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-04 Thread Andres Freund
Hi, On 2017-06-05 10:31:12 +0900, Michael Paquier wrote: > On Mon, Jun 5, 2017 at 10:29 AM, Andres Freund wrote: > > Michael, Peter, Fujii, is either of you planning to review this? I'm > > planning to commit this tomorrow morning PST, unless somebody protests > > till then. >

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-04 Thread Michael Paquier
On Mon, Jun 5, 2017 at 10:29 AM, Andres Freund wrote: > Michael, Peter, Fujii, is either of you planning to review this? I'm > planning to commit this tomorrow morning PST, unless somebody protests > till then. Yes, I am. It would be nice if you could give me 24 hours to look

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-04 Thread Andres Freund
On 2017-06-02 17:20:23 -0700, Andres Freund wrote: > Attached is a *preliminary* patch series implementing this. I've first > reverted the previous patch, as otherwise backpatchable versions of the > necessary patches would get too complicated, due to the signals used and > such. I went again

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-02 Thread Andres Freund
On 2017-06-01 17:29:12 -0700, Andres Freund wrote: > On 2017-06-02 08:38:51 +0900, Michael Paquier wrote: > > On Fri, Jun 2, 2017 at 7:05 AM, Andres Freund wrote: > > > I'm unhappy with how this is reusing SIGINT for WalSndLastCycleHandler. > > > Normally INT is used to cancel

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-01 Thread Robert Haas
On Thu, Jun 1, 2017 at 6:05 PM, Andres Freund wrote: > I'm unhappy with how this is reusing SIGINT for WalSndLastCycleHandler. > Normally INT is used to cancel interrupts, and since walsender is now also > working as a normal backend, this overlap is bad. Yep, that's bad.

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-01 Thread Andres Freund
On 2017-06-02 10:05:21 +0900, Michael Paquier wrote: > On Fri, Jun 2, 2017 at 9:29 AM, Andres Freund wrote: > > On 2017-06-02 08:38:51 +0900, Michael Paquier wrote: > >> On Fri, Jun 2, 2017 at 7:05 AM, Andres Freund wrote: > >> > I'm unhappy with how this is

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-01 Thread Michael Paquier
On Fri, Jun 2, 2017 at 9:29 AM, Andres Freund wrote: > On 2017-06-02 08:38:51 +0900, Michael Paquier wrote: >> On Fri, Jun 2, 2017 at 7:05 AM, Andres Freund wrote: >> > I'm unhappy with how this is reusing SIGINT for WalSndLastCycleHandler. >> > Normally INT

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-01 Thread Andres Freund
On 2017-06-02 08:38:51 +0900, Michael Paquier wrote: > On Fri, Jun 2, 2017 at 7:05 AM, Andres Freund wrote: > > I'm unhappy with how this is reusing SIGINT for WalSndLastCycleHandler. > > Normally INT is used to cancel interrupts, and since walsender is now also > > working as a

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-01 Thread Michael Paquier
On Fri, Jun 2, 2017 at 7:05 AM, Andres Freund wrote: > I'm unhappy with how this is reusing SIGINT for WalSndLastCycleHandler. > Normally INT is used to cancel interrupts, and since walsender is now also > working as a normal backend, this overlap is bad. Even for plain > walsender

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-01 Thread Andres Freund
On 2017-05-05 10:50:11 -0400, Peter Eisentraut wrote: > On 5/5/17 01:26, Michael Paquier wrote: > > The only code path doing HOT-pruning and generating WAL is > > heap_page_prune(). Do you think that we need to worry about FPWs as > > well? > > > > Attached is an updated patch, which also forbids

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-27 Thread Michael Paquier
On Fri, May 26, 2017 at 4:47 PM, Peter Eisentraut wrote: > On 5/26/17 14:16, Michael Paquier wrote: >> So, now that the last round of minor releases has happened and that >> some dust has settled on this patch, shouldn't there be a backpatch? >> If yes, do you

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-26 Thread Peter Eisentraut
On 5/26/17 14:16, Michael Paquier wrote: > So, now that the last round of minor releases has happened and that > some dust has settled on this patch, shouldn't there be a backpatch? > If yes, do you need patches for all branches? This problem goes back > to 9.2 anyway as BASE_BACKUP can generate

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-26 Thread Michael Paquier
On Sat, May 6, 2017 at 6:40 AM, Michael Paquier wrote: > Agreed. Just adding an ERROR message in XLogInsert() is not going to > help much as this also leads to PANIC for critical sections :( > So a patch really needs to be a no-op for all WAL-related operations > within

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-06 Thread Michael Paquier
On Fri, May 5, 2017 at 11:50 PM, Peter Eisentraut wrote: > On 5/5/17 01:26, Michael Paquier wrote: >> The only code path doing HOT-pruning and generating WAL is >> heap_page_prune(). Do you think that we need to worry about FPWs as >> well? >> >> Attached is an

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-05 Thread Peter Eisentraut
On 5/5/17 01:26, Michael Paquier wrote: > The only code path doing HOT-pruning and generating WAL is > heap_page_prune(). Do you think that we need to worry about FPWs as > well? > > Attached is an updated patch, which also forbids the run of any > replication commands when the stopping state is

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-05 Thread Michael Paquier
On Fri, May 5, 2017 at 5:33 PM, Pavan Deolasee wrote: > > > On Fri, May 5, 2017 at 10:56 AM, Michael Paquier > wrote: >> >> On Wed, May 3, 2017 at 12:25 AM, Peter Eisentraut >> >> >> >>> Can we prevent HOT pruning during logical decoding? >>

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-05 Thread Pavan Deolasee
On Fri, May 5, 2017 at 10:56 AM, Michael Paquier wrote: > On Wed, May 3, 2017 at 12:25 AM, Peter Eisentraut > > > >>> Can we prevent HOT pruning during logical decoding? > >> > >> It does not sound very difficult to do, couldn't you just make it a > >> no-op with

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-04 Thread Michael Paquier
On Wed, May 3, 2017 at 12:25 AM, Peter Eisentraut wrote: > On 5/2/17 10:08, Michael Paquier wrote: >> On Tue, May 2, 2017 at 9:30 PM, Peter Eisentraut >> wrote: >>> On 5/2/17 03:11, Petr Jelinek wrote: >>>> logical decoding can

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-02 Thread Peter Eisentraut
On 5/2/17 10:08, Michael Paquier wrote: > On Tue, May 2, 2017 at 9:30 PM, Peter Eisentraut > wrote: >> On 5/2/17 03:11, Petr Jelinek wrote: >>> logical decoding can theoretically >>> do HOT pruning (even if the chance is really small) so it's not safe to >>>

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-02 Thread Michael Paquier
On Tue, May 2, 2017 at 9:30 PM, Peter Eisentraut wrote: > On 5/2/17 03:11, Petr Jelinek wrote: >> logical decoding can theoretically >> do HOT pruning (even if the chance is really small) so it's not safe to >> start logical replication either. > > This seems a

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-02 Thread Michael Paquier
On Tue, May 2, 2017 at 9:27 PM, Peter Eisentraut wrote: > On 5/2/17 03:43, Michael Paquier wrote: >>> I don't think the code covers all because a) the SQL queries are not >>> covered at all that I can see and b) logical decoding can theoretically >>> do HOT

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-02 Thread Peter Eisentraut
On 5/2/17 03:11, Petr Jelinek wrote: > logical decoding can theoretically > do HOT pruning (even if the chance is really small) so it's not safe to > start logical replication either. This seems a bit impossible to resolve. On the one hand, we want to allow streaming until after the shutdown

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-02 Thread Peter Eisentraut
On 5/2/17 03:43, Michael Paquier wrote: >> I don't think the code covers all because a) the SQL queries are not >> covered at all that I can see and b) logical decoding can theoretically >> do HOT pruning (even if the chance is really small) so it's not safe to >> start logical replication either.

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-02 Thread Michael Paquier
On Tue, May 2, 2017 at 4:11 PM, Petr Jelinek wrote: > On 02/05/17 05:35, Michael Paquier wrote: >> On Tue, May 2, 2017 at 7:07 AM, Peter Eisentraut >> wrote: >>> On 4/25/17 21:47, Michael Paquier wrote: >>>> Attached is an updated

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-02 Thread Petr Jelinek
On 02/05/17 05:35, Michael Paquier wrote: > On Tue, May 2, 2017 at 7:07 AM, Peter Eisentraut > wrote: >> On 4/25/17 21:47, Michael Paquier wrote: >>> Attached is an updated patch to reflect that. >> >> I edited this a bit, here is a new version. > > Thanks,

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-01 Thread Michael Paquier
On Tue, May 2, 2017 at 7:07 AM, Peter Eisentraut wrote: > On 4/25/17 21:47, Michael Paquier wrote: >> Attached is an updated patch to reflect that. > > I edited this a bit, here is a new version. Thanks, looks fine for me. > A variant approach would be to

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-01 Thread Peter Eisentraut
On 4/25/17 21:47, Michael Paquier wrote: > Attached is an updated patch to reflect that. I edited this a bit, here is a new version. A variant approach would be to prohibit *all* new commands after entering the "stopping" state, just let running commands run. That way we don't have to pick

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-25 Thread Michael Paquier
On Wed, Apr 26, 2017 at 3:17 AM, Peter Eisentraut wrote: > On 4/21/17 00:11, Michael Paquier wrote: >> Hmm. I have been actually looking at this solution and I am having >> doubts regarding its robustness. In short this would need to be >> roughly a two-step

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-25 Thread Peter Eisentraut
On 4/21/17 00:11, Michael Paquier wrote: > Hmm. I have been actually looking at this solution and I am having > doubts regarding its robustness. In short this would need to be > roughly a two-step process: > - In PostmasterStateMachine(), SIGUSR2 is sent to the checkpoint to > make it call

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-23 Thread Michael Paquier
On Sun, Apr 23, 2017 at 10:15 AM, Petr Jelinek wrote: > On 21/04/17 06:11, Michael Paquier wrote: >> On Fri, Apr 21, 2017 at 12:29 AM, Peter Eisentraut >> wrote: >> Hmm. I have been actually looking at this solution and I am having

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-22 Thread Petr Jelinek
On 21/04/17 06:11, Michael Paquier wrote: > On Fri, Apr 21, 2017 at 12:29 AM, Peter Eisentraut > wrote: >> On 4/20/17 07:52, Petr Jelinek wrote: >>> On 20/04/17 05:57, Michael Paquier wrote: >>>> 2nd thoughts here... Ah now I see your point. True that there is no

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-20 Thread Michael Paquier
On Fri, Apr 21, 2017 at 12:29 AM, Peter Eisentraut wrote: > On 4/20/17 07:52, Petr Jelinek wrote: >> On 20/04/17 05:57, Michael Paquier wrote: >>> 2nd thoughts here... Ah now I see your point. True that there is no >>> way to ensure that an unwanted command is

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-20 Thread Peter Eisentraut
On 4/20/17 07:52, Petr Jelinek wrote: > On 20/04/17 05:57, Michael Paquier wrote: >> 2nd thoughts here... Ah now I see your point. True that there is no >> way to ensure that an unwanted command is not running when SIGUSR2 is >> received as the shutdown checkpoint may have already begun. Here is

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-20 Thread Petr Jelinek
On 20/04/17 05:57, Michael Paquier wrote: > On Thu, Apr 20, 2017 at 12:40 PM, Michael Paquier > wrote: >> On Thu, Apr 20, 2017 at 4:57 AM, Peter Eisentraut >> wrote: >>> I think the problem with a signal-based solution is that there is

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-19 Thread Michael Paquier
On Thu, Apr 20, 2017 at 12:40 PM, Michael Paquier wrote: > On Thu, Apr 20, 2017 at 4:57 AM, Peter Eisentraut > wrote: >> I think the problem with a signal-based solution is that there is no >> feedback. Ideally, you would wait for all

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-19 Thread Michael Paquier
On Thu, Apr 20, 2017 at 4:57 AM, Peter Eisentraut wrote: > On 4/19/17 01:45, Michael Paquier wrote: >> On Tue, Apr 18, 2017 at 3:27 AM, Peter Eisentraut >> wrote: >>> I'd imagine the postmaster would tell the walsender that it

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-19 Thread Peter Eisentraut
On 4/19/17 01:45, Michael Paquier wrote: > On Tue, Apr 18, 2017 at 3:27 AM, Peter Eisentraut > wrote: >> I'd imagine the postmaster would tell the walsender that it has started >> shutdown, and then the walsender would reject $certain_things. But I >> don't see

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-18 Thread Michael Paquier
On Tue, Apr 18, 2017 at 3:27 AM, Peter Eisentraut wrote: > I'd imagine the postmaster would tell the walsender that it has started > shutdown, and then the walsender would reject $certain_things. But I > don't see an existing way for the walsender to know that

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-17 Thread Peter Eisentraut
On 4/17/17 12:30, Andres Freund wrote: So I guess that CREATE_REPLICATION_SLOT code calls LogStandbySnapshot() and which generates WAL record about snapshot of running transactions. >>> >>> Erroring out in these cases sounds easy enough. Wonder if there's not a >>> bigger problem with

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-17 Thread Andres Freund
On 2017-04-17 18:28:16 +0200, Petr Jelinek wrote: > On 17/04/17 18:02, Andres Freund wrote: > > On 2017-04-15 02:33:59 +0900, Fujii Masao wrote: > >> On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek > >> wrote: > >>> On 12/04/17 15:55, Fujii Masao wrote: > Hi, >

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-17 Thread Petr Jelinek
On 17/04/17 18:02, Andres Freund wrote: > On 2017-04-15 02:33:59 +0900, Fujii Masao wrote: >> On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek >> wrote: >>> On 12/04/17 15:55, Fujii Masao wrote: >>>> Hi, >>>> >>>> When I shut down the publisher while I repeated

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-17 Thread Andres Freund
On 2017-04-15 02:33:59 +0900, Fujii Masao wrote: > On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek > wrote: > > On 12/04/17 15:55, Fujii Masao wrote: > >> Hi, > >> > >> When I shut down the publisher while I repeated creating and dropping > >> the subscription in the

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Petr Jelinek
On 14/04/17 21:05, Peter Eisentraut wrote: > On 4/14/17 14:23, Petr Jelinek wrote: >> Ah yes looking at the code, it does exactly that (on master only). Means >> that backport will be necessary. > > I think these two sentences are contradicting each other. > Hehe, didn't realize master will be

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Peter Eisentraut
On 4/14/17 14:23, Petr Jelinek wrote: > Ah yes looking at the code, it does exactly that (on master only). Means > that backport will be necessary. I think these two sentences are contradicting each other.

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Petr Jelinek
On 14/04/17 19:33, Fujii Masao wrote: > On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek > wrote: >> On 12/04/17 15:55, Fujii Masao wrote: >>> Hi, >>> >>> When I shut down the publisher while I repeated creating and dropping >>> the subscription in the subscriber, the

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Fujii Masao
On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek wrote: > On 12/04/17 15:55, Fujii Masao wrote: >> Hi, >> >> When I shut down the publisher while I repeated creating and dropping >> the subscription in the subscriber, the publisher emitted the following >> PANIC error

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Petr Jelinek
On 12/04/17 15:55, Fujii Masao wrote: > Hi, > > When I shut down the publisher while I repeated creating and dropping > the subscription in the subscriber, the publisher emitted the following > PANIC error during shutdown checkpoint. > > PANIC: concurrent transaction log activity while database

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-13 Thread Michael Paquier
On Fri, Apr 14, 2017 at 3:03 AM, Fujii Masao wrote: > On Thu, Apr 13, 2017 at 12:36 PM, Michael Paquier > wrote: >> On Thu, Apr 13, 2017 at 12:28 PM, Fujii Masao wrote: >>> On Thu, Apr 13, 2017 at 5:25 AM, Peter Eisentraut

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-13 Thread Fujii Masao
On Thu, Apr 13, 2017 at 12:36 PM, Michael Paquier wrote: > On Thu, Apr 13, 2017 at 12:28 PM, Fujii Masao wrote: >> On Thu, Apr 13, 2017 at 5:25 AM, Peter Eisentraut >> wrote: >>> On 4/12/17 09:55, Fujii Masao

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Michael Paquier
On Thu, Apr 13, 2017 at 12:28 PM, Fujii Masao wrote: > On Thu, Apr 13, 2017 at 5:25 AM, Peter Eisentraut > wrote: >> On 4/12/17 09:55, Fujii Masao wrote: >>> To fix this issue, we should terminate walsender for logical replication >>>

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Fujii Masao
On Thu, Apr 13, 2017 at 5:25 AM, Peter Eisentraut wrote: > On 4/12/17 09:55, Fujii Masao wrote: >> To fix this issue, we should terminate walsender for logical replication >> before shutdown checkpoint starts. Of course walsender for physical >> replication still

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Peter Eisentraut
On 4/12/17 09:55, Fujii Masao wrote: > To fix this issue, we should terminate walsender for logical replication > before shutdown checkpoint starts. Of course walsender for physical > replication still needs to keep running until shutdown checkpoint ends, > though. Can we turn it into a kind of

[HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Fujii Masao
Hi, When I shut down the publisher while I repeated creating and dropping the subscription in the subscriber, the publisher emitted the following PANIC error during shutdown checkpoint. PANIC: concurrent transaction log activity while database system is shutting down The cause of this problem