Re: Synchronizing slots from primary to standby

2024-05-08 Thread Bertrand Drouvot
Hi, On Mon, Apr 29, 2024 at 11:58:09AM +, Zhijie Hou (Fujitsu) wrote: > On Monday, April 29, 2024 5:11 PM shveta malik wrote: > > > > On Mon, Apr 29, 2024 at 11:38 AM shveta malik > > wrote: > > > > > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > > > wrote: > > > > > > > > On

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 5:28 PM Zhijie Hou (Fujitsu) wrote: > > On Monday, April 29, 2024 5:11 PM shveta malik wrote: > > > > On Mon, Apr 29, 2024 at 11:38 AM shveta malik > > wrote: > > > > > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > > > wrote: > > > > > > > > On Friday, March

RE: Synchronizing slots from primary to standby

2024-04-29 Thread Zhijie Hou (Fujitsu)
On Monday, April 29, 2024 5:11 PM shveta malik wrote: > > On Mon, Apr 29, 2024 at 11:38 AM shveta malik > wrote: > > > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > > wrote: > > > > > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > wrote: > > > > > > > > Hi, > > > > > > >

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 11:38 AM shveta malik wrote: > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > > wrote: > > > > > > Hi, > > > > > > On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote: > > >

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 11:38 AM shveta malik wrote: > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > > wrote: > > > > > > Hi, > > > > > > On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote: > > >

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) wrote: > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote: > > > Hi, > > > > > > Since the standby_slot_names patch has been committed, I

RE: Synchronizing slots from primary to standby

2024-04-28 Thread Zhijie Hou (Fujitsu)
On Friday, March 15, 2024 10:45 PM Bertrand Drouvot wrote: > > Hi, > > On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote: > > Hi, > > > > Since the standby_slot_names patch has been committed, I am attaching > > the last doc patch for review. > > > > Thanks! > > 1 === > >

RE: Synchronizing slots from primary to standby

2024-04-12 Thread Zhijie Hou (Fujitsu)
On Friday, April 12, 2024 11:31 AM Amit Kapila wrote: > > On Thu, Apr 11, 2024 at 5:04 PM Zhijie Hou (Fujitsu) > wrote: > > > > On Thursday, April 11, 2024 12:11 PM Amit Kapila > wrote: > > > > > > > > 2. > > > - if (remote_slot->restart_lsn < slot->data.restart_lsn) > > > + if

Re: Synchronizing slots from primary to standby

2024-04-11 Thread Amit Kapila
On Thu, Apr 11, 2024 at 5:04 PM Zhijie Hou (Fujitsu) wrote: > > On Thursday, April 11, 2024 12:11 PM Amit Kapila > wrote: > > > > > 2. > > - if (remote_slot->restart_lsn < slot->data.restart_lsn) > > + if (remote_slot->confirmed_lsn < slot->data.confirmed_flush) > > elog(ERROR, > > "cannot

RE: Synchronizing slots from primary to standby

2024-04-11 Thread Zhijie Hou (Fujitsu)
On Thursday, April 11, 2024 12:11 PM Amit Kapila wrote: > > On Wed, Apr 10, 2024 at 5:28 PM Zhijie Hou (Fujitsu) > wrote: > > > > On Thursday, April 4, 2024 5:37 PM Amit Kapila > wrote: > > > > > > BTW, while thinking on this one, I > > > noticed that in the function

Re: Synchronizing slots from primary to standby

2024-04-10 Thread Amit Kapila
On Wed, Apr 10, 2024 at 5:28 PM Zhijie Hou (Fujitsu) wrote: > > On Thursday, April 4, 2024 5:37 PM Amit Kapila > wrote: > > > > BTW, while thinking on this one, I > > noticed that in the function LogicalConfirmReceivedLocation(), we first > > update > > the disk copy, see comment [1] and then

RE: Synchronizing slots from primary to standby

2024-04-10 Thread Zhijie Hou (Fujitsu)
On Thursday, April 4, 2024 5:37 PM Amit Kapila wrote: > > BTW, while thinking on this one, I > noticed that in the function LogicalConfirmReceivedLocation(), we first update > the disk copy, see comment [1] and then in-memory whereas the same is not > true in > update_local_synced_slot() for

RE: Synchronizing slots from primary to standby

2024-04-09 Thread Zhijie Hou (Fujitsu)
On Thursday, April 4, 2024 4:25 PM Masahiko Sawada wrote: Hi, > On Wed, Apr 3, 2024 at 7:06 PM Amit Kapila > wrote: > > > > On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila > wrote: > > > > > > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy > > > wrote: > > > > > > > I quickly looked at v8,

Re: Synchronizing slots from primary to standby

2024-04-08 Thread Amit Kapila
On Mon, Apr 8, 2024 at 7:01 PM Zhijie Hou (Fujitsu) wrote: > > Thanks for pushing. > > I checked the BF status, and noticed one BF failure, which I think is related > to > a miss in the test code. > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-04-08%2012%3A04%3A27 > > From

Re: Synchronizing slots from primary to standby

2024-04-08 Thread Amit Kapila
On Mon, Apr 8, 2024 at 9:49 PM Andres Freund wrote: > > On 2024-04-08 16:01:41 +0530, Amit Kapila wrote: > > Pushed. > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-04-08%2012%3A04%3A27 > > This unfortunately is a commit after > Right, and thanks for the report. Hou-San

Re: Synchronizing slots from primary to standby

2024-04-08 Thread Andres Freund
Hi, On 2024-04-08 16:01:41 +0530, Amit Kapila wrote: > Pushed. https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-04-08%2012%3A04%3A27 This unfortunately is a commit after commit 6f3d8d5e7cc Author: Amit Kapila Date: 2024-04-08 13:21:55 +0530 Fix the intermittent

RE: Synchronizing slots from primary to standby

2024-04-08 Thread Zhijie Hou (Fujitsu)
On Monday, April 8, 2024 6:32 PM Amit Kapila wrote: > > On Mon, Apr 8, 2024 at 12:19 PM Zhijie Hou (Fujitsu) > wrote: > > > > On Saturday, April 6, 2024 12:43 PM Amit Kapila > wrote: > > > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot > > > wrote: > > > > > > Yeah, that could be the first

Re: Synchronizing slots from primary to standby

2024-04-08 Thread Amit Kapila
On Mon, Apr 8, 2024 at 12:19 PM Zhijie Hou (Fujitsu) wrote: > > On Saturday, April 6, 2024 12:43 PM Amit Kapila > wrote: > > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot > > wrote: > > > > Yeah, that could be the first step. We can probably add an injection point > > to > > control the

RE: Synchronizing slots from primary to standby

2024-04-08 Thread Zhijie Hou (Fujitsu)
On Saturday, April 6, 2024 12:43 PM Amit Kapila wrote: > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot > wrote: > > > > On Fri, Apr 05, 2024 at 06:23:10PM +0530, Amit Kapila wrote: > > > On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila > wrote: > > > Thinking more on this, it doesn't seem related

Re: Synchronizing slots from primary to standby

2024-04-07 Thread Amit Kapila
On Sun, Apr 7, 2024 at 3:06 AM Andres Freund wrote: > > On 2024-04-06 10:58:32 +0530, Amit Kapila wrote: > > On Sat, Apr 6, 2024 at 10:13 AM Amit Kapila wrote: > > > > > > > There are still a few pending issues to be fixed in this feature but > > otherwise, we have committed all the main

Re: Synchronizing slots from primary to standby

2024-04-06 Thread Andres Freund
Hi, On 2024-04-06 10:58:32 +0530, Amit Kapila wrote: > On Sat, Apr 6, 2024 at 10:13 AM Amit Kapila wrote: > > > > There are still a few pending issues to be fixed in this feature but > otherwise, we have committed all the main patches, so I marked the CF > entry corresponding to this work as

Re: Synchronizing slots from primary to standby

2024-04-06 Thread Bertrand Drouvot
Hi, On Sat, Apr 06, 2024 at 10:13:00AM +0530, Amit Kapila wrote: > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot > wrote: > > I think the new LSN can be visible only when the corresponding WAL is > written by XLogWrite(). I don't know what in XLogSetAsyncXactLSN() can > make it visible. In

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Amit Kapila
On Sat, Apr 6, 2024 at 10:13 AM Amit Kapila wrote: > There are still a few pending issues to be fixed in this feature but otherwise, we have committed all the main patches, so I marked the CF entry corresponding to this work as committed. -- With Regards, Amit Kapila.

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Amit Kapila
On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot wrote: > > On Fri, Apr 05, 2024 at 06:23:10PM +0530, Amit Kapila wrote: > > On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila wrote: > > Thinking more on this, it doesn't seem related to > > c9920a9068eac2e6c8fb34988d18c0b42b9bf811 as that commit doesn't

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 02:35:42PM +, Bertrand Drouvot wrote: > I think that maybe as a first step we should move the "elog(DEBUG2," message > as > proposed above to help debugging (that could help to confirm the above > theory). If you agree and think that makes sense, please find

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 06:23:10PM +0530, Amit Kapila wrote: > On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila wrote: > Thinking more on this, it doesn't seem related to > c9920a9068eac2e6c8fb34988d18c0b42b9bf811 as that commit doesn't change > any locking or something like that which impacts

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Amit Kapila
On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila wrote: > > On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > > There is an intermittent BF failure observed at [1] after this commit > > (2ec005b). > > > > Thanks for analyzing and providing the patch. I'll look into it. There > is another BF

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Amit Kapila
On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > There is an intermittent BF failure observed at [1] after this commit > (2ec005b). > Thanks for analyzing and providing the patch. I'll look into it. There is another BF failure [1] which I have analyzed. The main reason for failure is the

Re: Synchronizing slots from primary to standby

2024-04-05 Thread shveta malik
On Fri, Apr 5, 2024 at 4:31 PM Bertrand Drouvot wrote: > > BTW, I just realized that the LSN I used in my example in the > LSN_FORMAT_ARGS() > are not the right ones. Noted. Thanks. Please find v3 with the comments addressed. thanks Shveta

Re: Synchronizing slots from primary to standby

2024-04-05 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 04:09:01PM +0530, shveta malik wrote: > On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot > wrote: > > > > What about something like? > > > > ereport(LOG, > > errmsg("synchronized confirmed_flush_lsn for slot \"%s\" differs > > from remote slot", > >

Re: Synchronizing slots from primary to standby

2024-04-05 Thread shveta malik
On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot wrote: > > What about something like? > > ereport(LOG, > errmsg("synchronized confirmed_flush_lsn for slot \"%s\" differs from > remote slot", > remote_slot->name), > errdetail("Remote slot has LSN %X/%X but local slot has

Re: Synchronizing slots from primary to standby

2024-04-04 Thread Bertrand Drouvot
Hi, On Fri, Apr 05, 2024 at 09:43:35AM +0530, shveta malik wrote: > On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote: > > > On Thu, Apr 4, 2024 at 2:59 PM shveta malik > > > wrote: > > 2 === > > > > +

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot wrote: > > Hi, > > On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote: > > On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > > > > > > > Prior to commit 2ec005b, this check was okay, as we did not expect > > > restart_lsn of the

Re: Synchronizing slots from primary to standby

2024-04-04 Thread Bertrand Drouvot
Hi, On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote: > On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > > > > Prior to commit 2ec005b, this check was okay, as we did not expect > > restart_lsn of the synced slot to be ahead of remote since we were > > directly copying the

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > Prior to commit 2ec005b, this check was okay, as we did not expect > restart_lsn of the synced slot to be ahead of remote since we were > directly copying the lsns. But now when we use 'advance' to do logical > decoding on standby, there is

Re: Synchronizing slots from primary to standby

2024-04-04 Thread Amit Kapila
On Thu, Apr 4, 2024 at 1:55 PM Masahiko Sawada wrote: > > While testing this change, I realized that it could happen that the > server logs are flooded with the following logical decoding logs that > are written every 200 ms: > > 2024-04-04 16:15:19.270 JST [3838739] LOG: starting logical

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Wed, Apr 3, 2024 at 3:36 PM Amit Kapila wrote: > > On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote: > > > > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy > > wrote: > > > > > I quickly looked at v8, and have a nit, rest all looks good. > > > > > > +if (DecodingContextReady(ctx)

Re: Synchronizing slots from primary to standby

2024-04-04 Thread Masahiko Sawada
On Wed, Apr 3, 2024 at 7:06 PM Amit Kapila wrote: > > On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote: > > > > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy > > wrote: > > > > > I quickly looked at v8, and have a nit, rest all looks good. > > > > > > +if (DecodingContextReady(ctx)

Re: Synchronizing slots from primary to standby

2024-04-03 Thread Amit Kapila
On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote: > > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy > wrote: > > > I quickly looked at v8, and have a nit, rest all looks good. > > > > +if (DecodingContextReady(ctx) && found_consistent_snapshot) > > +

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Amit Kapila
On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy wrote: > > On Wed, Apr 3, 2024 at 9:04 AM Amit Kapila wrote: > > > > > I'd just rename LogicalSlotAdvanceAndCheckSnapState(XLogRecPtr > > > moveto, bool *found_consistent_snapshot) to > > > pg_logical_replication_slot_advance(XLogRecPtr moveto,

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Bharath Rupireddy
On Wed, Apr 3, 2024 at 9:04 AM Amit Kapila wrote: > > > I'd just rename LogicalSlotAdvanceAndCheckSnapState(XLogRecPtr > > moveto, bool *found_consistent_snapshot) to > > pg_logical_replication_slot_advance(XLogRecPtr moveto, bool > > *found_consistent_snapshot) and use it. If others don't like

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Amit Kapila
On Tue, Apr 2, 2024 at 7:42 PM Bharath Rupireddy wrote: > > On Tue, Apr 2, 2024 at 7:25 PM Zhijie Hou (Fujitsu) > wrote: > > > > > 1. Can we just remove pg_logical_replication_slot_advance and use > > > LogicalSlotAdvanceAndCheckSnapState instead? If worried about the > > > function naming,

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Bharath Rupireddy
On Tue, Apr 2, 2024 at 7:25 PM Zhijie Hou (Fujitsu) wrote: > > > 1. Can we just remove pg_logical_replication_slot_advance and use > > LogicalSlotAdvanceAndCheckSnapState instead? If worried about the > > function naming, LogicalSlotAdvanceAndCheckSnapState can be renamed to > >

RE: Synchronizing slots from primary to standby

2024-04-02 Thread Zhijie Hou (Fujitsu)
On Tuesday, April 2, 2024 8:49 PM Bharath Rupireddy wrote: > > On Tue, Apr 2, 2024 at 2:11 PM Zhijie Hou (Fujitsu) > wrote: > > > > CFbot[1] complained about one query result's order in the tap-test, so I am > > attaching a V7 patch set which fixed this. There are no changes in 0001. > > > >

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Bharath Rupireddy
On Tue, Apr 2, 2024 at 2:11 PM Zhijie Hou (Fujitsu) wrote: > > CFbot[1] complained about one query result's order in the tap-test, so I am > attaching a V7 patch set which fixed this. There are no changes in 0001. > > [1] https://cirrus-ci.com/task/6375962162495488 Thanks. Here are some

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Bertrand Drouvot
Hi, On Tue, Apr 02, 2024 at 02:19:30PM +0530, Amit Kapila wrote: > On Tue, Apr 2, 2024 at 1:54 PM Bertrand Drouvot > wrote: > > What about adding a "wait" injection point in LogStandbySnapshot() to > > prevent > > checkpointer/bgwriter to log a standby snapshot? Something among those > >

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Amit Kapila
On Tue, Apr 2, 2024 at 1:54 PM Bertrand Drouvot wrote: > > On Tue, Apr 02, 2024 at 07:20:46AM +, Zhijie Hou (Fujitsu) wrote: > > I added one test in 040_standby_failover_slots_sync.pl in 0002 patch, which > > can > > reproduce the data loss issue consistently on my machine. > > Thanks! > > >

RE: Synchronizing slots from primary to standby

2024-04-02 Thread Zhijie Hou (Fujitsu)
On Tuesday, April 2, 2024 3:21 PM Zhijie Hou (Fujitsu) wrote: > On Tuesday, April 2, 2024 8:35 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Monday, April 1, 2024 7:30 PM Amit Kapila > > wrote: > > > > > > On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu) > > > > > > wrote: > > > > > > > > On

Re: Synchronizing slots from primary to standby

2024-04-02 Thread Bertrand Drouvot
Hi, On Tue, Apr 02, 2024 at 07:20:46AM +, Zhijie Hou (Fujitsu) wrote: > I added one test in 040_standby_failover_slots_sync.pl in 0002 patch, which > can > reproduce the data loss issue consistently on my machine. Thanks! > It may not reproduce > in some rare cases if concurrent

RE: Synchronizing slots from primary to standby

2024-04-02 Thread Zhijie Hou (Fujitsu)
On Tuesday, April 2, 2024 8:35 AM Zhijie Hou (Fujitsu) wrote: > > On Monday, April 1, 2024 7:30 PM Amit Kapila > wrote: > > > > On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu) > > > > wrote: > > > > > > On Friday, March 29, 2024 2:50 PM Amit Kapila > > > > > wrote: > > > > > > > > > > >

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bertrand Drouvot
Hi, On Tue, Apr 02, 2024 at 04:24:49AM +, Zhijie Hou (Fujitsu) wrote: > On Monday, April 1, 2024 9:28 PM Bertrand Drouvot > wrote: > > > > On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote: > > > On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot > > > > > > > > > 2 === > > > > > >

Re: Synchronizing slots from primary to standby

2024-04-01 Thread shveta malik
On Mon, Apr 1, 2024 at 5:05 PM Amit Kapila wrote: > > > 2 === > > > > + { > > + if (SnapBuildSnapshotExists(remote_slot->restart_lsn)) > > + { > > > > That could call SnapBuildSnapshotExists() multiple times for the same > > "restart_lsn" (for example in case of

RE: Synchronizing slots from primary to standby

2024-04-01 Thread Zhijie Hou (Fujitsu)
On Tuesday, April 2, 2024 8:43 AM Bharath Rupireddy wrote: > > On Mon, Apr 1, 2024 at 11:36 AM Zhijie Hou (Fujitsu) > wrote: > > > > Attach the V4 patch which includes the optimization to skip the > > decoding if the snapshot at the syncing restart_lsn is already > > serialized. It can avoid

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Amit Kapila
On Mon, Apr 1, 2024 at 6:58 PM Bertrand Drouvot wrote: > > On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote: > > On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot > > wrote: > > > Then there is no need to call WaitForStandbyConfirmation() as it could go > > > until > > > the

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bharath Rupireddy
On Mon, Apr 1, 2024 at 11:36 AM Zhijie Hou (Fujitsu) wrote: > > Attach the V4 patch which includes the optimization to skip the decoding if > the snapshot at the syncing restart_lsn is already serialized. It can avoid > most > of the duplicate decoding in my test, and I am doing some more tests

RE: Synchronizing slots from primary to standby

2024-04-01 Thread Zhijie Hou (Fujitsu)
On Monday, April 1, 2024 7:30 PM Amit Kapila wrote: > > On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Friday, March 29, 2024 2:50 PM Amit Kapila > wrote: > > > > > > > > > > > > > > 2. > > > +extern XLogRecPtr pg_logical_replication_slot_advance(XLogRecPtr > moveto, >

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bertrand Drouvot
Hi, On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote: > On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot > wrote: > > Then there is no need to call WaitForStandbyConfirmation() as it could go > > until > > the RecoveryInProgress() in StandbySlotsHaveCaughtup() for nothing (as we > >

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bharath Rupireddy
On Mon, Apr 1, 2024 at 10:40 AM Amit Kapila wrote: > > After this step and before the next, did you ensure that the slot sync > has synced the latest confirmed_flush/restart LSNs? You can query: > "select slot_name,restart_lsn, confirmed_flush_lsn from > pg_replication_slots;" to ensure the same

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Amit Kapila
On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot wrote: > > Hi, > > On Mon, Apr 01, 2024 at 06:05:34AM +, Zhijie Hou (Fujitsu) wrote: > > On Monday, April 1, 2024 8:56 AM Zhijie Hou (Fujitsu) > > wrote: > > Attach the V4 patch which includes the optimization to skip the decoding if > > the

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Amit Kapila
On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu) wrote: > > On Friday, March 29, 2024 2:50 PM Amit Kapila wrote: > > > > > > > > > 2. > > +extern XLogRecPtr pg_logical_replication_slot_advance(XLogRecPtr moveto, > > + bool *found_consistent_point); > > + > > > > This API looks a bit awkward

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Nisha Moond
Did performance test on optimization patch (v2-0001-optimize-the-slot-advancement.patch). Please find the results: Setup: - One primary node with 100 failover-enabled logical slots - 20 DBs, each having 5 failover-enabled logical replication slots - One physical standby node with

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Bertrand Drouvot
Hi, On Mon, Apr 01, 2024 at 06:05:34AM +, Zhijie Hou (Fujitsu) wrote: > On Monday, April 1, 2024 8:56 AM Zhijie Hou (Fujitsu) > wrote: > Attach the V4 patch which includes the optimization to skip the decoding if > the snapshot at the syncing restart_lsn is already serialized. It can avoid

Re: Synchronizing slots from primary to standby

2024-04-01 Thread shveta malik
On Mon, Apr 1, 2024 at 10:40 AM Amit Kapila wrote: > > On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy > wrote: > > > > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) > > wrote: > > > > > > [2] The steps to reproduce the data miss issue on a primary->standby > > > setup: > > > > I'm

RE: Synchronizing slots from primary to standby

2024-04-01 Thread Zhijie Hou (Fujitsu)
On Monday, April 1, 2024 8:56 AM Zhijie Hou (Fujitsu) wrote: > > On Friday, March 29, 2024 2:50 PM Amit Kapila > wrote: > > > > On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu) > > > > wrote: > > > > > > > > > Attach a new version patch which fixed an un-initialized variable > > > issue

Re: Synchronizing slots from primary to standby

2024-03-31 Thread Amit Kapila
On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy wrote: > > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) > wrote: > > > > [2] The steps to reproduce the data miss issue on a primary->standby setup: > > I'm trying to reproduce the problem with [1], but I can see the > changes after the

Re: Synchronizing slots from primary to standby

2024-03-31 Thread Bharath Rupireddy
On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) wrote: > > [2] The steps to reproduce the data miss issue on a primary->standby setup: I'm trying to reproduce the problem with [1], but I can see the changes after the standby is promoted. Am I missing anything here?

RE: Synchronizing slots from primary to standby

2024-03-31 Thread Zhijie Hou (Fujitsu)
On Friday, March 29, 2024 2:50 PM Amit Kapila wrote: > > On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu) > wrote: > > > > > > Attach a new version patch which fixed an un-initialized variable > > issue and added some comments. > > > > The other approach to fix this issue could be that the

Re: Synchronizing slots from primary to standby

2024-03-29 Thread Bertrand Drouvot
Hi, On Fri, Mar 29, 2024 at 02:35:22PM +0530, Amit Kapila wrote: > On Fri, Mar 29, 2024 at 1:08 PM Bertrand Drouvot > wrote: > > > > On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote: > > > On Friday, March 29, 2024 2:48 PM Bertrand Drouvot > > > wrote: > > > > > > > > Hi, >

Re: Synchronizing slots from primary to standby

2024-03-29 Thread Amit Kapila
On Fri, Mar 29, 2024 at 1:08 PM Bertrand Drouvot wrote: > > On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote: > > On Friday, March 29, 2024 2:48 PM Bertrand Drouvot > > wrote: > > > > > > Hi, > > > > > > On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote: >

Re: Synchronizing slots from primary to standby

2024-03-29 Thread Amit Kapila
On Fri, Mar 29, 2024 at 9:34 AM Hayato Kuroda (Fujitsu) wrote: > > Thanks for updating the patch! Here is a comment for it. > > ``` > +/* > + * By advancing the restart_lsn, confirmed_lsn, and xmin using > + * fast-forward logical decoding, we can verify whether a

Re: Synchronizing slots from primary to standby

2024-03-29 Thread Bertrand Drouvot
Hi, On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote: > On Friday, March 29, 2024 2:48 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote: > > > Attach a new version patch which fixed an un-initialized

RE: Synchronizing slots from primary to standby

2024-03-29 Thread Zhijie Hou (Fujitsu)
On Friday, March 29, 2024 2:48 PM Bertrand Drouvot wrote: > > Hi, > > On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote: > > Attach a new version patch which fixed an un-initialized variable > > issue and added some comments. Also, temporarily enable DEBUG2 for the > > 040

Re: Synchronizing slots from primary to standby

2024-03-29 Thread Amit Kapila
On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu) wrote: > > > Attach a new version patch which fixed an un-initialized variable issue and > added some comments. > The other approach to fix this issue could be that the slotsync worker get the serialized snapshot using pg_read_binary_file()

Re: Synchronizing slots from primary to standby

2024-03-29 Thread Bertrand Drouvot
Hi, On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote: > Attach a new version patch which fixed an un-initialized variable issue and > added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so > that > we can analyze the possible CFbot failures easily. >

RE: Synchronizing slots from primary to standby

2024-03-28 Thread Hayato Kuroda (Fujitsu)
Dear Hou, Thanks for updating the patch! Here is a comment for it. ``` +/* + * By advancing the restart_lsn, confirmed_lsn, and xmin using + * fast-forward logical decoding, we can verify whether a consistent + * snapshot can be built. This process also involves

Re: Synchronizing slots from primary to standby

2024-03-28 Thread shveta malik
On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu) wrote: > > Attach a new version patch which fixed an un-initialized variable issue and > added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so > that > we can analyze the possible CFbot failures easily. As suggested by

RE: Synchronizing slots from primary to standby

2024-03-28 Thread Zhijie Hou (Fujitsu)
On Thursday, March 28, 2024 10:02 PM Zhijie Hou (Fujitsu) wrote: > > On Thursday, March 28, 2024 7:32 PM Amit Kapila > wrote: > > > > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) > > > > wrote: > > > > > > When analyzing one BF error[1], we find an issue of slotsync: Since > > > we

RE: Synchronizing slots from primary to standby

2024-03-28 Thread Zhijie Hou (Fujitsu)
On Thursday, March 28, 2024 7:32 PM Amit Kapila wrote: > > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) > wrote: > > > > When analyzing one BF error[1], we find an issue of slotsync: Since we > > don't perform logical decoding for the synced slots when syncing the > > lsn/xmin of slot,

Re: Synchronizing slots from primary to standby

2024-03-28 Thread Bertrand Drouvot
Hi, On Thu, Mar 28, 2024 at 05:05:35PM +0530, Amit Kapila wrote: > On Thu, Mar 28, 2024 at 3:34 PM Bertrand Drouvot > wrote: > > > > On Thu, Mar 28, 2024 at 04:38:19AM +, Zhijie Hou (Fujitsu) wrote: > > > > > To fix this, we could use the fast forward logical decoding to advance > > > the

Re: Synchronizing slots from primary to standby

2024-03-28 Thread Amit Kapila
On Thu, Mar 28, 2024 at 3:34 PM Bertrand Drouvot wrote: > > On Thu, Mar 28, 2024 at 04:38:19AM +, Zhijie Hou (Fujitsu) wrote: > > > To fix this, we could use the fast forward logical decoding to advance the > > synced > > slot's lsn/xmin when syncing these values instead of directly updating

Re: Synchronizing slots from primary to standby

2024-03-28 Thread Amit Kapila
On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) wrote: > > When analyzing one BF error[1], we find an issue of slotsync: Since we don't > perform logical decoding for the synced slots when syncing the lsn/xmin of > slot, no logical snapshots will be serialized to disk. So, when user starts

Re: Synchronizing slots from primary to standby

2024-03-28 Thread Bertrand Drouvot
Hi, On Thu, Mar 28, 2024 at 04:38:19AM +, Zhijie Hou (Fujitsu) wrote: > Hi, > > When analyzing one BF error[1], we find an issue of slotsync: Since we don't > perform logical decoding for the synced slots when syncing the lsn/xmin of > slot, no logical snapshots will be serialized to disk.

RE: Synchronizing slots from primary to standby

2024-03-27 Thread Zhijie Hou (Fujitsu)
Hi, When analyzing one BF error[1], we find an issue of slotsync: Since we don't perform logical decoding for the synced slots when syncing the lsn/xmin of slot, no logical snapshots will be serialized to disk. So, when user starts to use these synced slots after promotion, it needs to re-build

Re: Synchronizing slots from primary to standby

2024-03-15 Thread Bertrand Drouvot
Hi, On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote: > Hi, > > Since the standby_slot_names patch has been committed, I am attaching the last > doc patch for review. > Thanks! 1 === + continue subscribing to publications now on the new primary server without + any

RE: Synchronizing slots from primary to standby

2024-03-13 Thread Zhijie Hou (Fujitsu)
Hi, Since the standby_slot_names patch has been committed, I am attaching the last doc patch for review. Best Regards, Hou zj Attachment: v109-0001-Document-the-steps-to-check-if-the-standby-is-r.patch

Re: Synchronizing slots from primary to standby

2024-03-07 Thread shveta malik
On Fri, Mar 8, 2024 at 9:56 AM Ajin Cherian wrote: > >> Pushed with minor modifications. I'll keep an eye on BF. >> >> BTW, one thing that we should try to evaluate a bit more is the >> traversal of slots in StandbySlotsHaveCaughtup() where we verify if >> all the slots mentioned in

Re: Synchronizing slots from primary to standby

2024-03-07 Thread Ajin Cherian
On Fri, Mar 8, 2024 at 2:33 PM Amit Kapila wrote: > On Thu, Mar 7, 2024 at 12:00 PM Zhijie Hou (Fujitsu) > wrote: > > > > > > Attach the V108 patch set which addressed above and Peter's comments. > > I also removed the check for "*" in guc check hook. > > > > > Pushed with minor modifications.

Re: Synchronizing slots from primary to standby

2024-03-07 Thread Amit Kapila
On Thu, Mar 7, 2024 at 12:00 PM Zhijie Hou (Fujitsu) wrote: > > > Attach the V108 patch set which addressed above and Peter's comments. > I also removed the check for "*" in guc check hook. > Pushed with minor modifications. I'll keep an eye on BF. BTW, one thing that we should try to evaluate

RE: Synchronizing slots from primary to standby

2024-03-06 Thread Zhijie Hou (Fujitsu)
On Thursday, March 7, 2024 12:46 PM Amit Kapila wrote: > > On Thu, Mar 7, 2024 at 7:35 AM Peter Smith > wrote: > > > > Here are some review comments for v107-0001 > > > > == > > src/backend/replication/slot.c > > > > 1. > > +/* > > + * Struct for the configuration of standby_slot_names. > >

RE: Synchronizing slots from primary to standby

2024-03-06 Thread Zhijie Hou (Fujitsu)
On Thursday, March 7, 2024 10:05 AM Peter Smith wrote: > > Here are some review comments for v107-0001 Thanks for the comments. > > == > src/backend/replication/slot.c > > 1. > +/* > + * Struct for the configuration of standby_slot_names. > + * > + * Note: this must be a flat

Re: Synchronizing slots from primary to standby

2024-03-06 Thread Amit Kapila
On Thu, Mar 7, 2024 at 8:37 AM shveta malik wrote: > I thought about whether we can make standby_slot_names as USERSET instead of SIGHUP and it doesn't sound like a good idea as that can lead to inconsistent standby replicas even after configuring the correct value of standby_slot_names. One can

Re: Synchronizing slots from primary to standby

2024-03-06 Thread Amit Kapila
On Thu, Mar 7, 2024 at 7:35 AM Peter Smith wrote: > > Here are some review comments for v107-0001 > > == > src/backend/replication/slot.c > > 1. > +/* > + * Struct for the configuration of standby_slot_names. > + * > + * Note: this must be a flat representation that can be held in a single >

Re: Synchronizing slots from primary to standby

2024-03-06 Thread Masahiko Sawada
On Wed, Mar 6, 2024 at 5:53 PM Amit Kapila wrote: > > On Wed, Mar 6, 2024 at 12:07 PM Masahiko Sawada wrote: > > > > On Fri, Mar 1, 2024 at 3:22 PM Peter Smith wrote: > > > > > > On Fri, Mar 1, 2024 at 5:11 PM Masahiko Sawada > > > wrote: > > > > > > > ... > > > > +/* > > > > +

Re: Synchronizing slots from primary to standby

2024-03-06 Thread shveta malik
On Wed, Mar 6, 2024 at 6:54 PM Zhijie Hou (Fujitsu) wrote: > > On Wednesday, March 6, 2024 9:13 PM Zhijie Hou (Fujitsu) > wrote: > > > > On Wednesday, March 6, 2024 11:04 AM Zhijie Hou (Fujitsu) > > wrote: > > > > > > On Wednesday, March 6, 2024 9:30 AM Masahiko Sawada > > > wrote: > > > > >

Re: Synchronizing slots from primary to standby

2024-03-06 Thread Peter Smith
Here are some review comments for v107-0001 == src/backend/replication/slot.c 1. +/* + * Struct for the configuration of standby_slot_names. + * + * Note: this must be a flat representation that can be held in a single chunk + * of guc_malloc'd memory, so that it can be stored as the "extra"

RE: Synchronizing slots from primary to standby

2024-03-06 Thread Zhijie Hou (Fujitsu)
On Wednesday, March 6, 2024 9:13 PM Zhijie Hou (Fujitsu) wrote: > > On Wednesday, March 6, 2024 11:04 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Wednesday, March 6, 2024 9:30 AM Masahiko Sawada > > wrote: > > > > Hi, > > > > > On Fri, Mar 1, 2024 at 4:21 PM Zhijie Hou (Fujitsu) > > > > > >

RE: Synchronizing slots from primary to standby

2024-03-06 Thread Zhijie Hou (Fujitsu)
On Wednesday, March 6, 2024 11:04 AM Zhijie Hou (Fujitsu) wrote: > > On Wednesday, March 6, 2024 9:30 AM Masahiko Sawada > wrote: > > Hi, > > > On Fri, Mar 1, 2024 at 4:21 PM Zhijie Hou (Fujitsu) > > > > wrote: > > > > > > On Friday, March 1, 2024 2:11 PM Masahiko Sawada > > wrote: > > > >

Re: Synchronizing slots from primary to standby

2024-03-06 Thread Amit Kapila
On Wed, Mar 6, 2024 at 12:07 PM Masahiko Sawada wrote: > > On Fri, Mar 1, 2024 at 3:22 PM Peter Smith wrote: > > > > On Fri, Mar 1, 2024 at 5:11 PM Masahiko Sawada > > wrote: > > > > > ... > > > +/* > > > + * "*" is not accepted as in that case primary will not be able > > >

Re: Synchronizing slots from primary to standby

2024-03-05 Thread Masahiko Sawada
On Fri, Mar 1, 2024 at 3:22 PM Peter Smith wrote: > > On Fri, Mar 1, 2024 at 5:11 PM Masahiko Sawada wrote: > > > ... > > +/* > > + * "*" is not accepted as in that case primary will not be able to > > know > > + * for which all standbys to wait for. Even if we have

Re: Synchronizing slots from primary to standby

2024-03-05 Thread Masahiko Sawada
On Wed, Mar 6, 2024 at 12:47 PM Amit Kapila wrote: > > On Wed, Mar 6, 2024 at 7:36 AM Masahiko Sawada wrote: > > > > On Tue, Mar 5, 2024 at 4:21 PM Zhijie Hou (Fujitsu) > > wrote: > > > > I have one question about PhysicalWakeupLogicalWalSnd(): > > > > +/* > > + * Wake up the logical walsender
