Re: Physical replication slot advance is not persistent

2020-07-09 Thread Michael Paquier
On Thu, Jul 09, 2020 at 04:12:49PM +0530, Amit Kapila wrote: > On Fri, Jun 19, 2020 at 12:16 AM Alexey Kondratov > wrote: >> 1. Both ReplicationSlotsComputeRequiredXmin() and >> ReplicationSlotsComputeRequiredLSN() may have already been done in the >> LogicalConfirmReceivedLocation() if it was a

Re: Physical replication slot advance is not persistent

2020-07-09 Thread Amit Kapila
On Fri, Jun 19, 2020 at 12:16 AM Alexey Kondratov wrote: > > On 2020-06-16 10:27, Michael Paquier wrote: > > On Wed, Jun 10, 2020 at 08:57:17PM +0300, Alexey Kondratov wrote: > >> New test reproduces this issue well. Left it running for a couple of > >> hours > >> in repeat and it seems to be

Re: Physical replication slot advance is not persistent

2020-06-18 Thread Alexey Kondratov
On 2020-06-16 10:27, Michael Paquier wrote: On Wed, Jun 10, 2020 at 08:57:17PM +0300, Alexey Kondratov wrote: New test reproduces this issue well. Left it running for a couple of hours in repeat and it seems to be stable. Thanks for testing. I have been thinking about the minimum xmin and

Re: Physical replication slot advance is not persistent

2020-06-18 Thread Michael Paquier
On Tue, Jun 16, 2020 at 04:27:27PM +0900, Michael Paquier wrote: > We could do that. Now I found cleaner the direct comparison of > pg_replication_slots.restart before and after the restart. So I have > kept it. And done. There were conflicts in 001_stream_rep.pl for 11 and 12 but I have

Re: Physical replication slot advance is not persistent

2020-06-16 Thread Michael Paquier
On Wed, Jun 10, 2020 at 08:57:17PM +0300, Alexey Kondratov wrote: > New test reproduces this issue well. Left it running for a couple of hours > in repeat and it seems to be stable. Thanks for testing. I have been thinking about the minimum xmin and LSN computations on advancing, and actually I

Re: Physical replication slot advance is not persistent

2020-06-10 Thread Alexey Kondratov
On 2020-06-10 11:38, Kyotaro Horiguchi wrote: At Wed, 10 Jun 2020 15:53:53 +0900, Michael Paquier wrote in > > I find it really depressing how much obviously untested stuff gets > > added in this area. > > Prior to this patch pg_replication_slot_advance was not being tested > at all. >

Re: Physical replication slot advance is not persistent

2020-06-10 Thread Kyotaro Horiguchi
At Wed, 10 Jun 2020 15:53:53 +0900, Michael Paquier wrote in > On Tue, Jun 09, 2020 at 09:01:13PM +0300, Alexey Kondratov wrote: > > Yes, there was a ReplicationSlotsComputeRequiredLSN() call inside > > pg_physical_replication_slot_advance() in the v5 of the patch: > > > > But later it has been

Re: Physical replication slot advance is not persistent

2020-06-10 Thread Michael Paquier
On Tue, Jun 09, 2020 at 09:01:13PM +0300, Alexey Kondratov wrote: > Yes, there was a ReplicationSlotsComputeRequiredLSN() call inside > pg_physical_replication_slot_advance() in the v5 of the patch: > > But later it has been missed and we have not noticed that. > > I think that adding it back as

Re: Physical replication slot advance is not persistent

2020-06-09 Thread Alexey Kondratov
On 2020-06-09 20:19, Andres Freund wrote: Hi, On 2020-01-28 17:01:14 +0900, Michael Paquier wrote: So attached is an updated patch which addresses the problem just by marking a physical slot as dirty if any advancing is done. Some documentation is added about the fact that an advance is

Re: Physical replication slot advance is not persistent

2020-06-09 Thread Andres Freund
Hi, On 2020-01-28 17:01:14 +0900, Michael Paquier wrote: > So attached is an updated patch which addresses the problem just by > marking a physical slot as dirty if any advancing is done. Some > documentation is added about the fact that an advance is persistent > only at the follow-up

Re: Physical replication slot advance is not persistent

2020-01-31 Thread Alexey Kondratov
On 30.01.2020 05:19, Michael Paquier wrote: On Wed, Jan 29, 2020 at 05:10:20PM +0900, Kyotaro Horiguchi wrote: Looks perfect. Thanks Horiguchi-san and others. Applied and back-patched down to 11. Great! Thanks for getting this done. -- Alexey Kondratov Postgres Professional

Re: Physical replication slot advance is not persistent

2020-01-29 Thread Michael Paquier
On Wed, Jan 29, 2020 at 05:10:20PM +0900, Kyotaro Horiguchi wrote: > Looks perfect. Thanks Horiguchi-san and others. Applied and back-patched down to 11. -- Michael signature.asc Description: PGP signature

Re: Physical replication slot advance is not persistent

2020-01-29 Thread Kyotaro Horiguchi
At Wed, 29 Jan 2020 15:45:56 +0900, Michael Paquier wrote in > On Tue, Jan 28, 2020 at 06:06:06PM +0300, Alexey Kondratov wrote: > > On 28.01.2020 15:14, Kyotaro Horiguchi wrote: > >> But the doc part looks a bit too detailed to me. Couldn't we explain > >> that without the word 'dirty'? .. >

Re: Physical replication slot advance is not persistent

2020-01-28 Thread Michael Paquier
On Tue, Jan 28, 2020 at 06:06:06PM +0300, Alexey Kondratov wrote: > On 28.01.2020 15:14, Kyotaro Horiguchi wrote: >> I agree not to save slots immediately. The code is wrtten as described >> above. The TAP test is correct. > > +1, removing this broken saving code path from

Re: Physical replication slot advance is not persistent

2020-01-28 Thread Alexey Kondratov
On 28.01.2020 15:14, Kyotaro Horiguchi wrote: At Tue, 28 Jan 2020 17:45:47 +0800, Craig Ringer wrote in On Tue, 28 Jan 2020 at 16:01, Michael Paquier wrote: So attached is an updated patch which addresses the problem just by marking a physical slot as dirty if any advancing is done. Some

Re: Physical replication slot advance is not persistent

2020-01-28 Thread Kyotaro Horiguchi
At Tue, 28 Jan 2020 17:45:47 +0800, Craig Ringer wrote in > On Tue, 28 Jan 2020 at 16:01, Michael Paquier wrote: > > > > On Tue, Jan 21, 2020 at 02:07:30PM +0900, Michael Paquier wrote: > > > On Mon, Jan 20, 2020 at 11:00:14AM -0800, Andres Freund wrote: > > >> Hm. I'm think testing this with

Re: Physical replication slot advance is not persistent

2020-01-28 Thread Craig Ringer
On Tue, 28 Jan 2020 at 16:01, Michael Paquier wrote: > > On Tue, Jan 21, 2020 at 02:07:30PM +0900, Michael Paquier wrote: > > On Mon, Jan 20, 2020 at 11:00:14AM -0800, Andres Freund wrote: > >> Hm. I'm think testing this with real LSNs is a better idea. What if the > >> node actually already is

Re: Physical replication slot advance is not persistent

2020-01-28 Thread Michael Paquier
On Tue, Jan 21, 2020 at 02:07:30PM +0900, Michael Paquier wrote: > On Mon, Jan 20, 2020 at 11:00:14AM -0800, Andres Freund wrote: >> Hm. I'm think testing this with real LSNs is a better idea. What if the >> node actually already is past FF/ at this point? Quite unlikely, >> I know, but

Re: Physical replication slot advance is not persistent

2020-01-20 Thread Craig Ringer
((On Tue, 21 Jan 2020 at 11:06, Michael Paquier wrote: > > The replication interface should not immediately flush changes to the > > slot replay position on advance. It should be marked dirty and left to > > be flushed by the next checkpoint. Doing otherwise potentially > > introduces a lot of

Re: Physical replication slot advance is not persistent

2020-01-20 Thread Michael Paquier
On Mon, Jan 20, 2020 at 11:00:14AM -0800, Andres Freund wrote: > On 2020-01-20 15:45:40 +0900, Michael Paquier wrote: >> 1) The slot advancing has to mark the slot as dirty, but should we >> make the change persistent at the end of the function or should we >> wait for a checkpoint to do the work,

Re: Physical replication slot advance is not persistent

2020-01-20 Thread Michael Paquier
On Tue, Jan 21, 2020 at 09:44:12AM +0800, Craig Ringer wrote: > PLEASE do not make the streaming replication interface force flushes! Yeah, that's a bad idea. FWIW, my understanding is that this has been only proposed in v3, and this has been discarded:

Re: Physical replication slot advance is not persistent

2020-01-20 Thread Craig Ringer
On Fri, 17 Jan 2020 at 01:09, Alexey Kondratov wrote: > > > I think we shouldn't touch the paths used by replication protocol. And > > don't we focus on how we make a change of a replication slot from SQL > > interface persistent? It seems to me that generaly we don't need to > > save dirty

Re: Physical replication slot advance is not persistent

2020-01-20 Thread Kyotaro Horiguchi
Thanks for looking this. At Mon, 20 Jan 2020 11:00:14 -0800, Andres Freund wrote in > > Here is a summary of the points raised (please correct me if that > > does not sound right to you, Andres): > > > 1) The slot advancing has to mark the slot as dirty, but should we > > make the change

Re: Physical replication slot advance is not persistent

2020-01-20 Thread Andres Freund
Hi, On 2020-01-20 15:45:40 +0900, Michael Paquier wrote: > On Thu, Jan 16, 2020 at 08:09:09PM +0300, Alexey Kondratov wrote: > > OK, I have definitely overthought that, thanks. This looks like a minimal > > subset of changes that actually solves the bug. I would only prefer to keep > > some

Re: Physical replication slot advance is not persistent

2020-01-20 Thread a . kondratov
On 20 Jan 2020, 09:45 +0300, Michael Paquier , wrote: > > So, I have been looking at this patch by myself, and updated it so as > the extra slot save is done only if any advancing has been done, on > top of the other computations that had better be around for > consistency. The patch includes TAP

Re: Physical replication slot advance is not persistent

2020-01-19 Thread Michael Paquier
On Thu, Jan 16, 2020 at 08:09:09PM +0300, Alexey Kondratov wrote: > OK, I have definitely overthought that, thanks. This looks like a minimal > subset of changes that actually solves the bug. I would only prefer to keep > some additional comments (something like the attached), otherwise after half

Re: Physical replication slot advance is not persistent

2020-01-16 Thread Alexey Kondratov
On 09.01.2020 09:36, Kyotaro Horiguchi wrote: Hello. At Sun, 29 Dec 2019 15:12:16 +0300, Alexey Kondratov wrote in On 2019-12-26 16:35, Alexey Kondratov wrote: Another concern is that ReplicationSlotIsDirty is added with the only one user. It also cannot be used by SaveSlotToPath due to the

Re: Physical replication slot advance is not persistent

2020-01-08 Thread Kyotaro Horiguchi
Hello. At Sun, 29 Dec 2019 15:12:16 +0300, Alexey Kondratov wrote in > On 2019-12-26 16:35, Alexey Kondratov wrote: > > Another concern is that ReplicationSlotIsDirty is added with the only > > one user. It also cannot be used by SaveSlotToPath due to the > > simultaneous usage of both flags

Re: Physical replication slot advance is not persistent

2019-12-29 Thread Alexey Kondratov
On 2019-12-26 16:35, Alexey Kondratov wrote: Another concern is that ReplicationSlotIsDirty is added with the only one user. It also cannot be used by SaveSlotToPath due to the simultaneous usage of both flags dirty and just_dirtied there. In that way, I hope that we should call

Re: Physical replication slot advance is not persistent

2019-12-26 Thread Alexey Kondratov
On 26.12.2019 11:33, Kyotaro Horiguchi wrote: At Wed, 25 Dec 2019 20:28:04 +0300, Alexey Kondratov wrote in Yep, it helps with physical replication slot persistence after advance, but the whole validation (moveto <= endlsn) does not make sense for me. The value of moveto should be >= than

Re: Physical replication slot advance is not persistent

2019-12-26 Thread Kyotaro Horiguchi
At Wed, 25 Dec 2019 20:28:04 +0300, Alexey Kondratov wrote in > > Yep, it helps with physical replication slot persistence after > > advance, but the whole validation (moveto <= endlsn) does not make > > sense for me. The value of moveto should be >= than minlsn == > > confirmed_flush /

Re: Physical replication slot advance is not persistent

2019-12-25 Thread Alexey Kondratov
On 25.12.2019 16:51, Alexey Kondratov wrote: On 25.12.2019 07:03, Kyotaro Horiguchi wrote: As the result I think what is needed there is just checking if the returned lsn is equal or larger than moveto. Doen't the following change work? -    if (XLogRecPtrIsInvalid(endlsn)) +    if (moveto <=

Re: Physical replication slot advance is not persistent

2019-12-25 Thread Alexey Kondratov
On 25.12.2019 07:03, Kyotaro Horiguchi wrote: At Tue, 24 Dec 2019 20:12:32 +0300, Alexey Kondratov wrote in I dig into the code and it happens because of this if statement:     /* Update the on disk state when lsn was updated. */     if (XLogRecPtrIsInvalid(endlsn))     {        

Re: Physical replication slot advance is not persistent

2019-12-24 Thread Kyotaro Horiguchi
At Tue, 24 Dec 2019 20:12:32 +0300, Alexey Kondratov wrote in > I dig into the code and it happens because of this if statement: > >     /* Update the on disk state when lsn was updated. */ >     if (XLogRecPtrIsInvalid(endlsn)) >     { >         ReplicationSlotMarkDirty(); >        

Physical replication slot advance is not persistent

2019-12-24 Thread Alexey Kondratov
Hi Hackers, I have accidentally noticed that pg_replication_slot_advance only changes in-memory state of the slot when its type is physical. Its new value does not survive restart. Reproduction steps: 1) Create new slot and remember its restart_lsn SELECT