Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-21 Thread Bertrand Drouvot
and via > create/alter subscription respectively, and implement > pg_alter_replication_slot(). +1 on this. > FWIW, adding the new SQL API pg_alter_replication_slot() isn't that hard. Also I think we should ensure that one could "only" alter the timeout property for the time being (if not that could lead to the subscription inconsistency mentioned above). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-20 Thread Bertrand Drouvot
Hi, On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote: > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot > wrote: > > > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > > > > > > 2. last_inactive_at and inactiv

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-20 Thread Bertrand Drouvot
Hi, On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot > wrote: > > > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > > similar t

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-20 Thread Bertrand Drouvot
Hi, On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote: > On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot > wrote: > > > > > > Hm. Are you suggesting inactive_timeout to be a slot level parameter > > > > similar t

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-20 Thread Bertrand Drouvot
, so that queries would return the right invalidation status. > > 4. New invalidation mechanisms interaction with slot sync feature. > > > > Yeah, this is important. My initial thoughts are that synced slots > shouldn't be invalidated on the standby due to timeout. +1 Rega

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-19 Thread Bertrand Drouvot
Hi, On Tue, Mar 19, 2024 at 04:20:35PM +0530, Amit Kapila wrote: > On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot > wrote: > > > > On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > > > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > > >

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-19 Thread Bertrand Drouvot
Hi, On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote: > On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot > wrote: > > Agree. While it makes sense to invalidate slots for wal removal in > > CreateCheckPoint() (because this is the place where wal is removed), I 'm

Re: Autogenerate some wait events code and documentation

2024-03-19 Thread Bertrand Drouvot
Hi, On Tue, Mar 19, 2024 at 09:59:35AM +0900, Michael Paquier wrote: > On Mon, Mar 18, 2024 at 05:57:02PM +0000, Bertrand Drouvot wrote: > > Thanks for looking at it! > > Oh right, the comment is wrong, re-worded in v2 attached. > > I've added a couple of fake

Re: Autogenerate some wait events code and documentation

2024-03-18 Thread Bertrand Drouvot
Hi, On Mon, Mar 18, 2024 at 08:49:34AM -0700, Noah Misch wrote: > On Mon, Mar 18, 2024 at 09:04:44AM +0000, Bertrand Drouvot wrote: > > --- a/src/backend/utils/activity/wait_event_names.txt > > +++ b/src/backend/utils/activity/wait_event_names.txt > > @@ -24,7 +24,12 @@ &g

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-18 Thread Bertrand Drouvot
em when?: - their usage resume - in pg_get_replication_slots() The idea is to invalidate the slot when one resumes activity on it or wants to get information about it (and among other things wants to know if the slot is valid or not). Thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-18 Thread Bertrand Drouvot
replication slot to s/a invalidation/an invalidation/? 4 === While at it, shouldn't we also rename "conflict" to say "invalidation_cause" in InvalidatePossiblyObsoleteSlot()? 5 === + * rows_removed and wal_level_insufficient are only two reasons s/are only two/are the only t

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-18 Thread Bertrand Drouvot
to some plugins). As far max_slot_xid_age I've the feeling that a new GUC is good enough. > > Alternatively, we can go the route of making GUC a list of key-value > > pairs of {slot_name, inactive_timeout}, but this kind of GUC for > > setting slot level parameters is going to be the firs

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-18 Thread Bertrand Drouvot
Hi, On Sat, Mar 16, 2024 at 09:29:01AM +0530, Bharath Rupireddy wrote: > I've also responded to Bertrand's comments here. Thanks! > > On Wed, Mar 6, 2024 at 3:56 PM Bertrand Drouvot > wrote: > > > > 5 === > > > > -# Check conflict_reason is

Re: Autogenerate some wait events code and documentation

2024-03-18 Thread Bertrand Drouvot
ce that "Backpatch" is at > the end of a section, meaning that we would need a second keyword like > a "Section: EndBackpatch" or something like that. That's not strictly > necessary IMO as the format of the txt is easy to follow. I gave it a try in the POC patc

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-15 Thread Bertrand Drouvot
we just mentioning in the > > docs that "if invalidation_reason is rows_removed or > > wal_level_insufficient it's the reason for conflict with recovery". > > > > Fair point. I think we can go either way. Bertrand, Nathan, and > others, do you have an opinio

Re: Synchronizing slots from primary to standby

2024-03-15 Thread Bertrand Drouvot
and standby_slot_names is set - new data is inserted into the primary - then not replicated to subscriber (due to standby_slot_names) Then I think the both above steps will return true but data would be lost in case of failover. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open S

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-13 Thread Bertrand Drouvot
Hi, On Tue, Mar 12, 2024 at 09:19:35PM +0530, Bharath Rupireddy wrote: > On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot > wrote: > > > > > AFAIR, we don't prevent similar invalidations due to > > > 'max_slot_wal_keep_size' for sync slots, > > > >

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bertrand Drouvot
Hi, On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote: > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot > wrote: > > > > On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote: > > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila > > > w

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bertrand Drouvot
"synced" slot on standby? I mean, do they make sense given the fact that those slots are not usable until the standby is promoted? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Spurious pgstat_drop_replslot() call

2024-03-12 Thread Bertrand Drouvot
t; ReplicationSlotAllocationLock? Yeah, makes fully sense and looks good to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Statistics Import and Export

2024-03-11 Thread Bertrand Drouvot
hrowing an error? (if ok, then I think we can get rid of returning a bool). 7 === +* If this relation is an index and that index has expressions in +* it, and the attnum specified s/is an index and that index has/is an index that has/? Regards, -- Bertrand Drouvot PostgreSQL Cont

Re: Spurious pgstat_drop_replslot() call

2024-03-11 Thread Bertrand Drouvot
kes sense to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Spurious pgstat_drop_replslot() call

2024-03-08 Thread Bertrand Drouvot
Hi, On Fri, Mar 08, 2024 at 07:55:39PM +0900, Michael Paquier wrote: > On Fri, Mar 08, 2024 at 10:19:11AM +0000, Bertrand Drouvot wrote: > > Indeed, it does not seem appropriate to remove stats during slot > > invalidation as > > one could still be interested to look at the

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-08 Thread Bertrand Drouvot
Hi, On Thu, Mar 07, 2024 at 02:17:53PM +0900, Michael Paquier wrote: > On Wed, Mar 06, 2024 at 09:05:59AM +0000, Bertrand Drouvot wrote: > > Yeah, all of the above done in v3 attached. > > In passing.. pgstat_create_replslot() and pgstat_drop_replslot() rely > on the assumpti

Spurious pgstat_drop_replslot() call

2024-03-08 Thread Bertrand Drouvot
, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com >From 56ade7f561c180ee0120cb8c33d1c39e32ab7863 Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot Date: Fri, 8 Mar 2024 09:29:29 + Subject: [PATCH v1] Remove spuri

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-03-06 Thread Bertrand Drouvot
Hi, On Wed, Mar 06, 2024 at 05:45:56PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 6, 2024 at 4:51 PM Michael Paquier wrote: > > > > On Wed, Mar 06, 2024 at 09:17:58AM +, Bertrand Drouvot wrote: > > > Right, somehow out of context here. > > > > We're

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-06 Thread Bertrand Drouvot
Hi, On Wed, Mar 06, 2024 at 02:46:57PM +0530, Bharath Rupireddy wrote: > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > wrote: > > Yeah, I'm okay with one column. > > Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change. Thanks!

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-03-06 Thread Bertrand Drouvot
Hi, On Wed, Mar 06, 2024 at 02:47:55PM +0900, Michael Paquier wrote: > On Tue, Mar 05, 2024 at 10:17:03AM +0000, Bertrand Drouvot wrote: > > On Tue, Mar 05, 2024 at 09:42:20AM +0900, Michael Paquier wrote: > > The issue is that then the new ASSERT is > > triggered leading to

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-06 Thread Bertrand Drouvot
Hi, On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot > > wrote: > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart w

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-06 Thread Bertrand Drouvot
Hi, On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote: > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot > wrote: > Thanks. Can we try to get rid of multiple LwLockRelease in > pgstat_reset_replslot(). Is this any better? > > /* > -* Nothing to

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread Bertrand Drouvot
anged > > to false. > > > > Also don't we need to release the lock when we return here: > > /* > * Nothing to do for physical slots as we collect stats only for logical > * slots. > */ > if (SlotIsPhysical(slot)) > return; D'oh! Thanks! Fixed in v2 shared up-thread

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread Bertrand Drouvot
Hi, On Tue, Mar 05, 2024 at 09:55:32AM +0200, Heikki Linnakangas wrote: > On 01/03/2024 12:15, Bertrand Drouvot wrote: > > Hi hackers, > > > > I think that pgstat_reset_replslot() is missing LWLock protection. Indeed, > > we > > don't have any guarantee that th

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-03-05 Thread Bertrand Drouvot
Hi, On Tue, Mar 05, 2024 at 09:42:20AM +0900, Michael Paquier wrote: > On Mon, Feb 26, 2024 at 02:01:45PM +0000, Bertrand Drouvot wrote: > > Though [1] mentioned up-thread is not pushed yet, I'm Sharing the POC patch > > now > > (see the attached). > > I have l

Re: Synchronizing slots from primary to standby

2024-03-04 Thread Bertrand Drouvot
ted = (slot->data.invalidated != RS_INVAL_NONE); inactive = (slot->active_pid == 0); instead? I think it's easier to read and it looks like this is the way it's written in other places (at least the few I checked). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source D

Re: Statistics Import and Export

2024-03-04 Thread Bertrand Drouvot
t; > Leaving the export/import scripts off for the time being, as they haven't > > changed and the next likely change is to fold them into pg_dump. > > > > > > > v6 posted below. Thanks! I had in mind to look at it but it looks like a rebase is needed. Regards, -- B

Re: Synchronizing slots from primary to standby

2024-03-04 Thread Bertrand Drouvot
r target_lsn, XLogRecPtr flushed_lsn, +uint32 *wait_event) Same questions as for NeedToWaitForStandby(). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-03-04 Thread Bertrand Drouvot
; you have to read the accompanying comment or documentation to have any > > idea of its purpose. > > > > Yeah, one has to read the description but that is true for other > parameters like "temp_tablespaces". I don't have any better ideas but > open to suggestions. What ab

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-04 Thread Bertrand Drouvot
l slots. I'm > not seeing a strong need for another column. Yeah having two columns was more for convenience purpose. Without the "conflict" one, a slot conflicting with recovery would be "a logical slot having a non NULL invalidation_reason". I'm also fine with one col

Re: Injection points: some tools to wait and wake

2024-03-03 Thread Bertrand Drouvot
Hi, On Mon, Mar 04, 2024 at 10:44:34AM +0900, Michael Paquier wrote: > On Fri, Mar 01, 2024 at 06:52:45AM +0000, Bertrand Drouvot wrote: > > + if (defined($backend_type)) > > + { > > + $backend_type = qq('$backend_type'); > > + $die_

Re: Regardign RecentFlushPtr in WalSndWaitForWal()

2024-03-01 Thread Bertrand Drouvot
f (!RecoveryInProgress()) > RecentFlushPtr = GetFlushRecPtr(NULL); > else > RecentFlushPtr = GetXLogReplayRecPtr(NULL); > It seems to me that [2] alone could be sufficient. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Missing LWLock protection in pgstat_reset_replslot()

2024-03-01 Thread Bertrand Drouvot
Hi hackers, I think that pgstat_reset_replslot() is missing LWLock protection. Indeed, we don't have any guarantee that the slot is active (then preventing it to be dropped/recreated) when this function is executed. Attached a patch to add the missing protection. Regards, -- Bertrand Drouvot

Re: Synchronizing slots from primary to standby

2024-02-29 Thread Bertrand Drouvot
nes that rely on pg_logical_slot_get_changes_guts(), means: - pg_logical_slot_get_changes - pg_logical_slot_peek_changes - pg_logical_slot_get_binary_changes - pg_logical_slot_peek_binary_changes Not sure it's worth to mention the "binary" ones though as their doc mention they behave as their &q

Re: Injection points: some tools to wait and wake

2024-02-29 Thread Bertrand Drouvot
> > The log should provide some context about the caller failing, meaning > that the backend type and the injection point name should be mentioned > in these logs to help in debugging issues. Yeah, done in v3 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS O

Re: Injection points: some tools to wait and wake

2024-02-29 Thread Bertrand Drouvot
Hi, On Thu, Feb 29, 2024 at 02:34:35PM +0500, Andrey M. Borodin wrote: > > > > On 29 Feb 2024, at 13:35, Bertrand Drouvot > > wrote: > > > > Something like the attached? > > There's extraneous print "done\n"; doh! bad copy/paste ;-) > Also

Re: Injection points: some tools to wait and wake

2024-02-29 Thread Bertrand Drouvot
Hi, On Thu, Feb 29, 2024 at 02:56:23PM +0900, Michael Paquier wrote: > On Wed, Feb 28, 2024 at 06:20:41AM +0000, Bertrand Drouvot wrote: > > On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote: > >> On Tue, Feb 27, 2024 at 01:39:59PM +, Bertrand Drouvot wrote:

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
fer the original proposal but I think we can keep what has been pushed (Peter's proposal). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
ction > > instead of worker to test the operator-redirection using search-patch. > > This has been done to simplify the test case and reduce the added > > time. Thanks! > I have slightly adjusted the comments in the attached, otherwise, LGTM. Same here, LGTM. Regards, -- B

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
bout adding one to test the "new" behavior discussed up-thread? (logical replication will wait if slot mentioned in standby_slot_names is dropped and/or does not exist when the engine starts?) Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-28 Thread Bertrand Drouvot
+# to handle such attacks, it would have failed during Worth to mention the underlying check / function that would get an "unexpected" result? Except for the above (nit) comments the patch looks good to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
Hi, On Wed, Feb 28, 2024 at 06:48:37AM +, Zhijie Hou (Fujitsu) wrote: > On Wednesday, February 28, 2024 2:38 PM Bertrand Drouvot > wrote: > > On Wed, Feb 28, 2024 at 08:49:19AM +0530, Amit Kapila wrote: > > > On Mon, Feb 26, 2024 at 9:13 AM shveta malik > > wr

Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
Hi, On Wed, Feb 28, 2024 at 08:49:19AM +0530, Amit Kapila wrote: > On Mon, Feb 26, 2024 at 9:13 AM shveta malik wrote: > > > > On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot > > wrote: > > > > > > Hi, > > > > I think to set sec

Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
Hi, On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote: > On Tue, Feb 27, 2024 at 01:39:59PM +0000, Bertrand Drouvot wrote: > > So, I'm ok with the new helper too. > > If both of you feel strongly about that, I'm OK with introducing > something like that. Thanks!

Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
Hi, On Tue, Feb 27, 2024 at 04:49:03PM +0500, Andrey M. Borodin wrote: > Hi, > > > On 27 Feb 2024, at 16:08, Bertrand Drouvot > > wrote: > > > > On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote: > >> > >> But, AFAICS, th

Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
cleaner, I think it may make them more difficult to understand as 'await_injection_point' would probably be too generic. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
FilterStandbySlots. Maybe we don't need the double-checking and it is > enough to check in FilterStandbySlots? Good point, I have the feeling that it is enough to check in FilterStandbySlots(). Indeed, if the value is syntactically correct, then I think that its actual value "really&

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-26 Thread Bertrand Drouvot
Hi, On Tue, Feb 20, 2024 at 04:03:53PM +, Bertrand Drouvot wrote: > Hi, > > On Tue, Feb 20, 2024 at 02:33:44PM +0900, Michael Paquier wrote: > > On Tue, Feb 20, 2024 at 08:51:17AM +0900, Michael Paquier wrote: > > > Prefixing these with "initia

Re: Synchronizing slots from primary to standby

2024-02-26 Thread Bertrand Drouvot
syntax before > we return due to NULL ReplicationSlotCtl. We get NULL > ReplicationSlotCtl during instance startup in > check_standby_slot_names() as postmaster first loads GUC-table and > then initializes shared-memory for replication slots. See calls of > InitializeGUCOptions() and Cr

Re: Synchronizing slots from primary to standby

2024-02-26 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 05:18:25PM +0530, Amit Kapila wrote: > On Mon, Feb 26, 2024 at 12:59 PM Bertrand Drouvot > wrote: > > > > 10 === > > > > > > > > + i

Re: Documentation: warn about two_phase when altering a subscription

2024-02-26 Thread Bertrand Drouvot
better when reading. But other > than that I think it's quite clear. Thanks! [1]: https://www.postgresql.org/message-id/CAA4eK1KdZMtJfo%3DK%3DXWsQQu8OHutT_dwJV2D3zs4h9g5zdNz2A%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-26 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 12:57:09PM +0900, Michael Paquier wrote: > On Thu, Feb 22, 2024 at 08:00:24AM +0000, Bertrand Drouvot wrote: > > +/* Maximum number of wait usable in injection points at once */ > > > > s/Maximum number of wait/Maximum number of waits/ ? >

Re: Synchronizing slots from primary to standby

2024-02-25 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 10:48:38AM +0530, Amit Kapila wrote: > On Fri, Feb 23, 2024 at 4:45 PM Bertrand Drouvot > wrote: > > > > On Fri, Feb 23, 2024 at 09:46:00AM +, Zhijie Hou (Fujitsu) wrote: > > > > > > Besides, I'd like

Re: Synchronizing slots from primary to standby

2024-02-25 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 09:13:05AM +0530, shveta malik wrote: > On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot > wrote: > > > > Hi, > > > I think to set secure search path for remote connection, the standard > > > approach > > > could be

Re: Synchronizing slots from primary to standby

2024-02-25 Thread Bertrand Drouvot
Hi, On Mon, Feb 26, 2024 at 02:18:58AM +, Zhijie Hou (Fujitsu) wrote: > On Friday, February 23, 2024 6:12 PM Bertrand Drouvot > wrote: > > + if (!ok) > > + GUC_check_errdetail("List syntax is invalid."); > > + > > + /* &

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 09:30:58AM +, Zhijie Hou (Fujitsu) wrote: > On Friday, February 23, 2024 5:07 PM Bertrand Drouvot > wrote: > > On Fri, Feb 23, 2024 at 02:15:11PM +0530, shveta malik wrote: > > > > > > Thanks for the details. I understand it

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
reason is that setting standby_slot_names to a non empty value means that one wants the walsender to wait until the standby catchup. The way to remove this intentional behavior should be by changing the standby_slot_names value (not the existence or the state of the slot(s) it points too). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
rving the standby. I think we should handle this case differently, thoughts? 11 === +* Specified physical slot have been invalidated, so no point s/have been/has been/? 12 === +++ b/src/backend/replication/slotfuncs.c @@ -22,6 +22,7 @@ #include "replicat

Re: Synchronizing slots from primary to standby

2024-02-23 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 02:15:11PM +0530, shveta malik wrote: > On Fri, Feb 23, 2024 at 1:28 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > Because one could create say the "=" OPERATOR in their own schema, attach a > > function to it do

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 09:43:48AM +0530, shveta malik wrote: > On Fri, Feb 23, 2024 at 8:35 AM shveta malik wrote: > > > > On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot > > wrote: > > > > > > Suppose that in synchronize_slots() the query woul

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
Hi, On Fri, Feb 23, 2024 at 08:35:44AM +0530, shveta malik wrote: > On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot > wrote: > > > > Suppose that in synchronize_slots() the query would be: > > > > const char *query = "SELECT slot_name, plugin, confirmed_fl

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
Hi, On Thu, Feb 22, 2024 at 04:01:34PM +0530, shveta malik wrote: > On Thu, Feb 22, 2024 at 3:44 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > Thanks! > > > > Some random comments about v92_001 (Sorry if it has already been discussed > > up-

Re: Synchronizing slots from primary to standby

2024-02-22 Thread Bertrand Drouvot
* not interact with user tables, eliminating the risk of executing +* arbitrary code within triggers. Right. I did not check but if we are using operators in our remote SPI calls then it would be worth to ensure they are coming from the pg_catalog schema? Using something like "OPER

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-02-22 Thread Bertrand Drouvot
Hi, On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote: > On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot > wrote: > > My initial thought was to put "conflict" value in this new field in case of > > conflict (not to mention the conflict rea

Re: Injection points: some tools to wait and wake

2024-02-22 Thread Bertrand Drouvot
Hi, On Thu, Feb 22, 2024 at 12:02:01PM +0900, Michael Paquier wrote: > On Wed, Feb 21, 2024 at 11:50:21AM +0000, Bertrand Drouvot wrote: > > A few comments: > > > > 1 === > > I think "up" is missing at several places in the patch where "wake" is u

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-02-21 Thread Bertrand Drouvot
e should add a few words about it in the doc? Looking at v5-0001: + + invalidation_reason text + + My initial thought was to put "conflict" value in this new field in case of conflict (not to mention the conflict reason in it). With the current proposal invalidation_r

Re: Injection points: some tools to wait and wake

2024-02-21 Thread Bertrand Drouvot
up(PG_FUNCTION_ARGS) Empty line missing before "Datum"? 6 === Also maybe some comments are missing above injection_point_init_state(), injection_init_shmem() but it's more a Nit. 7 === While at it, should we add a second injection wait point in t/041_invalid_checkpoint_after_promote.pl to check that they are wake up individually? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-21 Thread Bertrand Drouvot
Hi, On Wed, Feb 21, 2024 at 07:08:03AM +0900, Michael Paquier wrote: > On Tue, Feb 20, 2024 at 03:55:08PM +0000, Bertrand Drouvot wrote: > > +PG_FUNCTION_INFO_V1(injection_points_wake); > > +Datum > > +injection_points_wake(PG_FUNCTION_ARGS) > > +{ > > > >

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-20 Thread Bertrand Drouvot
ary). I'll polish and propose my POC test once [1] is pushed (unless you're curious about it before). [1]: https://www.postgresql.org/message-id/flat/ZdLuxBk5hGpol91B%40paquier.xyz Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Injection points: some tools to wait and wake

2024-02-20 Thread Bertrand Drouvot
Hi, On Tue, Feb 20, 2024 at 08:28:28AM +0900, Michael Paquier wrote: > On Mon, Feb 19, 2024 at 02:28:04PM +0000, Bertrand Drouvot wrote: > > If slock_t protects "only" one counter, then what about using > > pg_atomic_uint64 > > or pg_atomic_uint32 instead? And bt

Re: Injection points: some tools to wait and wake

2024-02-19 Thread Bertrand Drouvot
it * condition and otherwise sleeps " but is it needed in our particular case here? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-19 Thread Bertrand Drouvot
k a while for me to understand the above condition, can we > simplify it like below using De Morgan's laws for better readability? > > Assert((conflict_prev == RS_INVAL_NONE) || !terminated || > (conflict_prev == conflict)); I don't have a strong opinon on this, looks like a matter of taste. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: System username in pg_stat_activity

2024-02-19 Thread Bertrand Drouvot
Hi, On Fri, Feb 16, 2024 at 09:41:41PM +0100, Magnus Hagander wrote: > On Fri, Feb 16, 2024 at 8:55 PM Andres Freund wrote: > > > > Hi, > > > > On 2024-01-12 17:16:53 +0100, Magnus Hagander wrote: > > > On Thu, Jan 11, 2024 at 5:55 PM Bertrand Drouvot > &

Re: System username in pg_stat_activity

2024-02-19 Thread Bertrand Drouvot
id/ZdMWux1HpIebkEmd%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Avoid switching between system-user and system-username in the doc

2024-02-19 Thread Bertrand Drouvot
/CAOBaU_Yp08MQOK7_k4QVyxL6sf7TURGpjX3tn1Z%2BWxJo2x7%2BGQ%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com >From 379473a4d7172042107ef587e430f168bc133ad5 Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot Date:

Re: System username in pg_stat_activity

2024-02-19 Thread Bertrand Drouvot
Hi, On Fri, Feb 16, 2024 at 08:17:41PM +0100, Magnus Hagander wrote: > On Fri, Jan 19, 2024 at 7:20 AM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Thu, Jan 18, 2024 at 04:01:33PM +0100, Magnus Hagander wrote: > > > On Mon, Jan 15, 2024 at 11:17 AM

Re: Synchronizing slots from primary to standby

2024-02-18 Thread Bertrand Drouvot
Hi, On Sat, Feb 17, 2024 at 10:10:18AM +0530, Amit Kapila wrote: > On Fri, Feb 16, 2024 at 4:10 PM shveta malik wrote: > > > > On Thu, Feb 15, 2024 at 10:48 PM Bertrand Drouvot > > wrote: > > > > > 5 === > > &

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-18 Thread Bertrand Drouvot
Hi, On Mon, Feb 19, 2024 at 03:14:07PM +0900, Michael Paquier wrote: > On Thu, Feb 15, 2024 at 11:24:59AM +0000, Bertrand Drouvot wrote: > > Thanks for looking at it! > > > > Agree, using an assertion instead in v3 attached. > > And you did not forget the PG_USE

Re: Synchronizing slots from primary to standby

2024-02-16 Thread Bertrand Drouvot
of primary's > slot would precede standby's catalog_xmin and we see this failure. > > As per this theory, we should disable autovacuum on primary to avoid > updates to catalog_xmin values. Makes sense to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
./src/test/recovery/tmp_check/log/040_standby_failover_slots_sync_standby1.log 4.0K ./src/test/recovery/tmp_check/log/040_standby_failover_slots_sync_subscriber1.log 12K ./src/test/recovery/tmp_check/log/regress_log_040_standby_failover_slots_sync Regards, -- Bertrand Drouvot PostgreSQL Co

Re: 035_standby_logical_decoding unbounded hang

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 12:48:16PM -0800, Noah Misch wrote: > On Wed, Feb 14, 2024 at 03:31:16PM +0000, Bertrand Drouvot wrote: > > What about creating a sub, say wait_for_restart_lsn_calculation() in > > Cluster.pm > > and then make use of it in create_logical_slot_o

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 05:58:47PM +0530, Amit Kapila wrote: > On Thu, Feb 15, 2024 at 5:46 PM Bertrand Drouvot > wrote: > > Also I was thinking: what about adding an output to > > pg_sync_replication_slots()? > > The output could be the number of sync slot

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
failover_slots_sync.pl for it? [1]: https://www.postgresql.org/message-id/CAA4eK1JcBG6TJ3o5iUd4z0BuTbciLV3dK4aKgb7OgrNGoLcfSQ%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
ication_slots()? The output could be the number of sync slots that have been created and are not considered as sync-ready during the execution. I think that could be a good addition to v2-0001-Add-a-log-if-remote-slot-didn-t-catch-up-to-local.patch proposed here (should trigger special attention in cas

Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 02:09:45PM +0900, Michael Paquier wrote: > On Thu, Jan 18, 2024 at 02:20:28PM +0000, Bertrand Drouvot wrote: > > On Thu, Jan 18, 2024 at 04:59:39PM +0530, Bharath Rupireddy wrote: > >> I'm not sure if it > >> can happen at all, but I th

Re: Synchronizing slots from primary to standby

2024-02-15 Thread Bertrand Drouvot
Hi, On Thu, Feb 15, 2024 at 02:49:54PM +0530, Amit Kapila wrote: > On Thu, Feb 15, 2024 at 9:05 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Thursday, February 15, 2024 10:49 AM Amit Kapila > > wrote: > > > > > > On Wed, Feb 14, 2024 at 7:26 PM Ber

Re: 035_standby_logical_decoding unbounded hang

2024-02-14 Thread Bertrand Drouvot
e "timed out waiting for logical slot to calculate its > restart_lsn"; > + What about creating a sub, say wait_for_restart_lsn_calculation() in Cluster.pm and then make use of it in create_logical_slot_on_standby() and above? (something like wait_for_restart_lsn_calculation-v1.patch attached). Regards, -- Bert

Re: Synchronizing slots from primary to standby

2024-02-14 Thread Bertrand Drouvot
that this doesn't fix the BF > > failures in > > which case I'll again look into those. > > I have verified that the patch can fix the issue on my machine(after adding > few > more checkpoints before slot invalidation test.) I also added one more check > in > the test to confirm the synced slot is not temp slot. Here is the v2 patch. Thanks! +# To ensure that restart_lsn has moved to a recent WAL position, we need +# to log XLOG_RUNNING_XACTS and make sure the same is processed as well +$primary->psql('postgres', "CHECKPOINT"); Instead of "CHECKPOINT" wouldn't a less heavy "SELECT pg_log_standby_snapshot();" be enough? Not a big deal but maybe we could do the change while modifying 040_standby_failover_slots_sync.pl in the next patch "Add a new slotsync worker". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-13 Thread Bertrand Drouvot
Hi, On Tue, Feb 13, 2024 at 05:20:35PM +0530, Amit Kapila wrote: > On Tue, Feb 13, 2024 at 4:59 PM Bertrand Drouvot > wrote: > > - 84% of the slotsync.c code is covered, the parts that are not are mainly > > related to "errors". > > > > Worth to try to ext

Re: Synchronizing slots from primary to standby

2024-02-13 Thread Bertrand Drouvot
rage? (I've in mind 731, 739, 766, 778, 786, 796, 808) [1]: https://htmlpreview.github.io/?https://raw.githubusercontent.com/bdrouvot/pg_code_coverage/main/src/backend/replication/logical/slotsync.c.gcov.html Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-12 Thread Bertrand Drouvot
Hi, On Mon, Feb 12, 2024 at 04:19:33PM +0530, Amit Kapila wrote: > On Mon, Feb 12, 2024 at 3:33 PM Bertrand Drouvot > wrote: > > > > A few random comments: > > > > > > 003 === > > > > + If, after executing the function, > > +

Re: Synchronizing slots from primary to standby

2024-02-12 Thread Bertrand Drouvot
le). 009 === Regarding 040_standby_failover_slots_sync.pl what about adding tests for? - synced slot invalidation (and ensure it's recreated once pg_sync_replication_slots() is called and when the slot in primary is valid) - cannot enable failover for a temporary replication slot - replication slots can only be synchronized from a standby server Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

<    1   2   3   >