Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-28 Thread Amit Kapila
On Thu, Apr 25, 2024 at 11:11 AM Bharath Rupireddy wrote:
>
> On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada  wrote:
> >
> > > Please find the attached v35 patch.
> >
> > The documentation about both the 'active' and 'inactive_since'
> > columns of pg_replication_slots says:
> >
> > ---
> > active bool
> > True if this slot is currently actively being used
> >
> > inactive_since timestamptz
> > The time since the slot has become inactive. NULL if the slot is
> > currently being used. Note that for slots on the standby that are
> > being synced from a primary server (whose synced field is true), the
> > inactive_since indicates the last synchronization (see Section 47.2.3)
> > time.
> > ---
> >
> > When reading the description, I thought that if 'active' is true,
> > 'inactive_since' would be NULL, but that doesn't seem to apply to
> > temporary slots.
>
> Right.
>
> > Since we don't reset the active_pid field of temporary slots
> > when they are released, 'active' is still true in the view but
> > 'inactive_since' is not NULL.
>
> Right. inactive_since is reset whenever the temporary slot is acquired
> again within the same backend that created the temporary slot.
>
> > Do you think we need to mention it in
> > the documentation?
>
> I think that's the reason we dropped "active" from the statement. It
> was earlier "NULL if the slot is currently actively being used.". But,
> per Bertrand's comment
> https://www.postgresql.org/message-id/ZehE2IJcsetSJMHC%40ip-10-97-1-34.eu-west-3.compute.internal
> changed it to "NULL if the slot is currently being used.".
>
> Temporary slots retain active = true and active_pid = <pid of the
> backend that created it> even when the slot is not being used, for
> the lifetime of the backend process. We haven't tied the active or
> active_pid flags to inactive_since, and doing so now to represent the
> temporary slot behaviour for active and active_pid would confuse
> users more.
>

This is true, and it's probably easy for us to understand as we
developed this feature, but the same may not be true for others. I
wonder if we can be explicit about the difference between active and
inactive_since by adding something like the following for
inactive_since: Note that this field is not related to the active
flag, as temporary slots can remain active till the session ends even
when they are not being used.
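For illustration, a SQL sketch of the confusing case (the slot name
and the output values here are hypothetical):

-- In a session, create a temporary slot; it stays pinned to this
-- backend until the session ends.
SELECT pg_create_logical_replication_slot('tmp_slot', 'test_decoding',
                                          temporary := true);

-- Later in the same session, with the slot not in use:
SELECT slot_name, active, active_pid, inactive_since
FROM pg_replication_slots WHERE slot_name = 'tmp_slot';
-- Hypothetical result: active = t and active_pid = <this backend's
-- PID>, yet inactive_since is non-NULL (set at the last release).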

Sawada-San, do you have any suggestions on the wording?

>
> As far as the inactive_since of a slot is concerned, it is set
> to 0 when the slot is being used (acquired) and set to current
> timestamp when the slot is not being used (released).
>
> > As for the timeout-based slot invalidation feature, we could end up
> > invalidating the temporary slots even if they are shown as active,
> > which could confuse users. Do we want to somehow deal with it?
>
> Yes. As long as the temporary slot is lying unused, holding up
> resources for more than the specified
> replication_slot_inactive_timeout, it is bound to get invalidated.
> This keeps the behaviour consistent and less confusing to users.
>

Agreed. We may want to add something in the docs for this to avoid
confusion with the active flag.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-24 Thread Bharath Rupireddy
On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada  wrote:
>
> > Please find the attached v35 patch.
>
> The documentation about both the 'active' and 'inactive_since'
> columns of pg_replication_slots says:
>
> ---
> active bool
> True if this slot is currently actively being used
>
> inactive_since timestamptz
> The time since the slot has become inactive. NULL if the slot is
> currently being used. Note that for slots on the standby that are
> being synced from a primary server (whose synced field is true), the
> inactive_since indicates the last synchronization (see Section 47.2.3)
> time.
> ---
>
> When reading the description, I thought that if 'active' is true,
> 'inactive_since' would be NULL, but that doesn't seem to apply to
> temporary slots.

Right.

> Since we don't reset the active_pid field of temporary slots
> when they are released, 'active' is still true in the view but
> 'inactive_since' is not NULL.

Right. inactive_since is reset whenever the temporary slot is acquired
again within the same backend that created the temporary slot.

> Do you think we need to mention it in
> the documentation?

I think that's the reason we dropped "active" from the statement. It
was earlier "NULL if the slot is currently actively being used.". But,
per Bertrand's comment
https://www.postgresql.org/message-id/ZehE2IJcsetSJMHC%40ip-10-97-1-34.eu-west-3.compute.internal
changed it to "NULL if the slot is currently being used.".

Temporary slots retain active = true and active_pid = <pid of the
backend that created it> even when the slot is not being used, for
the lifetime of the backend process. We haven't tied the active or
active_pid flags to inactive_since, and doing so now to represent the
temporary slot behaviour for active and active_pid would confuse
users more. As far as the inactive_since of a slot is concerned, it
is set to 0 when the slot is being used (acquired) and set to the
current timestamp when the slot is not being used (released).

> As for the timeout-based slot invalidation feature, we could end up
> invalidating the temporary slots even if they are shown as active,
> which could confuse users. Do we want to somehow deal with it?

Yes. As long as the temporary slot is lying unused, holding up
resources for more than the specified
replication_slot_inactive_timeout, it is bound to get invalidated.
This keeps the behaviour consistent and less confusing to users.
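For instance, with the GUC proposed in this patch series (a sketch;
the timeout value and the output are hypothetical):

-- postgresql.conf: replication_slot_inactive_timeout = '1d'

-- Once any slot (temporary or not) has been unused past the timeout,
-- a checkpoint invalidates it, even if 'active' still shows true:
CHECKPOINT;
SELECT slot_name, active, inactive_since, invalidation_reason
FROM pg_replication_slots;
-- Hypothetical: invalidation_reason = 'inactive_timeout'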

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-22 Thread Masahiko Sawada
Hi,

On Thu, Apr 4, 2024 at 9:23 PM Bharath Rupireddy wrote:
>
> On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila  wrote:
> >
> > > Thanks for the changes. v34-0001 LGTM.
> >
> > I was doing a final review before pushing 0001 and found that
> > 'inactive_since' could be set twice during startup after promotion,
> > once while restoring slots and then via ShutDownSlotSync(). The reason
> > is that ShutDownSlotSync() will be invoked in normal startup on
> > primary though it won't do anything apart from setting inactive_since
> > if we have synced slots. I think you need to check 'StandbyMode' in
> > update_synced_slots_inactive_since() and return if the same is not
> > set. We can't use 'InRecovery' flag as that will be set even during
> > crash recovery.
> >
> > Can you please test this once unless you don't agree with the above theory?
>
> Nice catch. I've verified that update_synced_slots_inactive_since is
> called even for normal server startups/crash recovery. I've added a
> check to exit if the StandbyMode isn't set.
>
> Please find the attached v35 patch.
>

The documentation about both the 'active' and 'inactive_since'
columns of pg_replication_slots says:

---
active bool
True if this slot is currently actively being used

inactive_since timestamptz
The time since the slot has become inactive. NULL if the slot is
currently being used. Note that for slots on the standby that are
being synced from a primary server (whose synced field is true), the
inactive_since indicates the last synchronization (see Section 47.2.3)
time.
---

When reading the description, I thought that if 'active' is true,
'inactive_since' would be NULL, but that doesn't seem to apply to
temporary slots. Since we don't reset the active_pid field of
temporary slots when they are released, 'active' is still true in the
view but 'inactive_since' is not NULL. Do you think we need to mention
it in the documentation?

As for the timeout-based slot invalidation feature, we could end up
invalidating the temporary slots even if they are shown as active,
which could confuse users. Do we want to somehow deal with it?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-12 Thread Bharath Rupireddy
On Sat, Apr 6, 2024 at 5:10 PM Bharath Rupireddy wrote:
>
> Please see the attached v38 patch.

Hi, thanks everyone for reviewing the design and patches so far. Here
are the v39 patches implementing the inactive timeout based (0001)
and XID age based (0002) invalidation mechanisms.

I'm quoting the hackers who are okay with inactive timeout based
invalidation mechanism:
Bertrand Drouvot -
https://www.postgresql.org/message-id/ZgL0N%2BxVJNkyqsKL%40ip-10-97-1-34.eu-west-3.compute.internal
and 
https://www.postgresql.org/message-id/ZgPHDAlM79iLtGIH%40ip-10-97-1-34.eu-west-3.compute.internal
Amit Kapila - 
https://www.postgresql.org/message-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe%3DDa9KUwCcAfC31RNJ1xdJA%40mail.gmail.com
Nathan Bossart -
https://www.postgresql.org/message-id/20240325195443.GA2923888%40nathanxps13
Robert Haas - 
https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com

I'm quoting the hackers who are okay with XID age based invalidation mechanism:
Nathan Bossart -
https://www.postgresql.org/message-id/20240326150918.GB3181099%40nathanxps13
and https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13
Alvaro Herrera -
https://www.postgresql.org/message-id/202403261539.xcjfle7sksz7%40alvherre.pgsql
Bertrand Drouvot -
https://www.postgresql.org/message-id/ZgPHDAlM79iLtGIH%40ip-10-97-1-34.eu-west-3.compute.internal
Amit Kapila - 
https://www.postgresql.org/message-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe%3DDa9KUwCcAfC31RNJ1xdJA%40mail.gmail.com

There was a point raised by Robert
https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com
for XID age based invalidation. An issue related to
vacuum_defer_cleanup_age
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=be504a3e974d75be6f95c8f9b7367126034f2d12
led to the removal of the GUC
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1118cd37eb61e6a2428f457a8b2026a7bb3f801a.
The same issue may not happen for the XID age based invalidation. This
is because the XID age is not calculated using FullTransactionId but
using TransactionId as the slot's xmin and catalog_xmin are tracked as
TransactionId.
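For reference, the XID age a slot holds back can be observed with the
existing age() function on the slot's 32-bit xmin values (a sketch;
the proposed replication_slot_xid_age would be compared against ages
computed this way):

SELECT slot_name, xmin, catalog_xmin,
       age(xmin) AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots;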

There was a point raised by Amit
https://www.postgresql.org/message-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ%2Bg%40mail.gmail.com
on when to do the XID age based invalidation - whether in the
checkpointer, when vacuum is being run, whenever ComputeXIDHorizons
gets called, or in the autovacuum process. For now, I've chosen the
design to do these
new invalidation checks in two places - 1) whenever the slot is
acquired and the slot acquisition errors out if invalidated, 2) during
checkpoint. However, I'm open to suggestions on this.

I've also verified whether the replication_slot_xid_age setting can
help when the server is inching towards XID wraparound. I've created
a primary and streaming standby setup with hot_standby_feedback set
to on (so that the slot gets an xmin). Then, I've set
replication_slot_xid_age to 2 billion on the primary and used the
xid_wraparound extension to reach XID wraparound on the primary. Once
I started receiving the WARNINGs about VACUUM, I did a checkpoint,
after which the slot got invalidated, enabling VACUUM to freeze XIDs
and saving the database from the XID wraparound problem.
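Roughly, the verification went like this (a sketch; the GUC name is
from 0002 and the invalidation reason shown is hypothetical):

-- On the primary:
ALTER SYSTEM SET replication_slot_xid_age = 2000000000;
SELECT pg_reload_conf();

-- The standby runs with hot_standby_feedback = on, so the slot holds
-- an xmin. After consuming XIDs with the xid_wraparound test module
-- until VACUUM starts emitting wraparound WARNINGs, on the primary:
CHECKPOINT;
SELECT slot_name, invalidation_reason FROM pg_replication_slots;
-- Hypothetical: the slot now shows an XID-age invalidation reason,
-- after which VACUUM can freeze XIDs and advance the horizon.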

Thanks a lot Masahiko Sawada for an offlist chat about the XID age
calculation logic.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From f3ba2562ba7d9c4f13e283740260025b8d1c9b0f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 12 Apr 2024 14:52:35 +
Subject: [PATCH v39 1/2] Add inactive_timeout based replication slot
 invalidation.

Till now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for users to set a timeout of, say, 1,
2, or 3 days, after which the inactive slots get invalidated.

To achieve the above, this commit introduces a GUC allowing users
to set an inactive timeout. Replication slots that are inactive
for longer than the specified amount of time get invalidated.

The invalidation check happens at multiple locations so that it is
as timely as possible; these locations include the following:
- Whenever the slot is acquired (the slot acquisition errors out
if the slot is invalidated).
- During checkpoint.

Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby, because such synced slots are typically considered not
active (for them to be later 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-06 Thread Bharath Rupireddy
On Sat, Apr 6, 2024 at 12:18 PM Amit Kapila  wrote:
>
> Why the handling w.r.t active_pid in InvalidatePossiblyInactiveSlot()
> is not similar to InvalidatePossiblyObsoleteSlot(). Won't we need to
> ensure that there is no other active slot user? Is it sufficient to
> check inactive_since for the same? If so, we need some comments to
> explain the same.

I removed the separate functions and with minimal changes, I've now
placed the RS_INVAL_INACTIVE_TIMEOUT logic into
InvalidatePossiblyObsoleteSlot and use that even in
CheckPointReplicationSlots.

> Can we avoid introducing the new functions like
> SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we
> do the required work in the caller?

Hm. Removed them now.

Please see the attached v38 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v38-0001-Add-inactive_timeout-based-replication-slot-inva.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-06 Thread Amit Kapila
On Sat, Apr 6, 2024 at 11:55 AM Bharath Rupireddy wrote:
>

Why the handling w.r.t active_pid in InvalidatePossiblyInactiveSlot()
is not similar to InvalidatePossiblyObsoleteSlot(). Won't we need to
ensure that there is no other active slot user? Is it sufficient to
check inactive_since for the same? If so, we need some comments to
explain the same.

Can we avoid introducing the new functions like
SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we
do the required work in the caller?

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-06 Thread Bharath Rupireddy
On Fri, Apr 5, 2024 at 1:14 PM Bertrand Drouvot wrote:
>
> > Please find the attached v36 patch.
>
> A few comments:
>
> 1 ===
>
> +   
> +The timeout is measured from the time since the slot has become
> +inactive (known from its
> +inactive_since value) until it gets
> +used (i.e., its active is set to true).
> +   
>
> That's right except when it's invalidated during the checkpoint (as the slot
> is not acquired in CheckPointReplicationSlots()).
>
> So, what about adding: "or a checkpoint occurs"? That would also explain that
> the invalidation could occur during checkpoint.

Reworded.

> 2 ===
>
> +   /* If the slot has been invalidated, recalculate the resource limits */
> +   if (invalidated)
> +   {
>
> /If the slot/If a slot/?

Modified it to be like elsewhere.

> 3 ===
>
> + * NB - this function also runs as part of checkpoint, so avoid raising errors
>
> s/NB - this/NB: This function/? (that looks more consistent with other 
> comments
> in the code)

Done.

> 4 ===
>
> + * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead
>
> I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause
> instead" looks weird to me. Maybe it would make sense to reword this a bit.

Reworded.

> 5 ===
>
> +* considered not active as they don't actually perform logical decoding.
>
> Not sure that's 100% accurate as we switched in fast forward logical
> in 2ec005b4e2.
>
> "as they perform only fast forward logical decoding (or not at all)", maybe?

Changed it to "as they don't perform logical decoding to produce the
changes". In fast_forward mode no changes are produced.

> 6 ===
>
> +   if (RecoveryInProgress() && slot->data.synced)
> +   return false;
> +
> +   if (replication_slot_inactive_timeout == 0)
> +   return false;
>
> What about just using one if? It's more a matter of taste but it also probably
> reduces the object file size a bit for non optimized build.

Changed.

> 7 ===
>
> +   /*
> +* Do not invalidate the slots which are currently being 
> synced from
> +* the primary to the standby.
> +*/
> +   if (RecoveryInProgress() && slot->data.synced)
> +   return false;
>
> I think we don't need this check as the exact same one is done just before.

Right. Removed.

> 8 ===
>
> +sub check_for_slot_invalidation_in_server_log
> +{
> +   my ($node, $slot_name, $offset) = @_;
> +   my $invalidated = 0;
> +
> +   for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
> +   {
> +   $node->safe_psql('postgres', "CHECKPOINT");
>
> Wouldn't it be better to wait for the replication_slot_inactive_timeout time
> before instead of triggering all those checkpoints? (it could be passed as an
> extra arg to wait_for_slot_invalidation()).

Done.

> 9 ===
>
> # Synced slot mustn't get invalidated on the standby, it must sync
> # invalidation from the primary. So, we must not see the slot's invalidation
> # message in server log.
> ok( !$standby1->log_contains(
> "invalidating obsolete replication slot \"lsub1_sync_slot\"",
> $standby1_logstart),
> 'check that syned slot has not been invalidated on the standby');
>
> Would that make sense to trigger a checkpoint on the standby before this test?
> I mean I think that without a checkpoint on the standby we should not see the
> invalidation in the log anyway.

Done.

Please find the attached v37 patch for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v37-0001-Add-inactive_timeout-based-replication-slot-inva.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-05 Thread Bertrand Drouvot
Hi,

On Fri, Apr 05, 2024 at 11:21:43AM +0530, Bharath Rupireddy wrote:
> On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot wrote:
> Please find the attached v36 patch.

Thanks!

A few comments:

1 ===

+   
+The timeout is measured from the time since the slot has become
+inactive (known from its
+inactive_since value) until it gets
+used (i.e., its active is set to true).
+   

That's right except when it's invalidated during the checkpoint (as the slot
is not acquired in CheckPointReplicationSlots()).

So, what about adding: "or a checkpoint occurs"? That would also explain that
the invalidation could occur during checkpoint.

2 ===

+   /* If the slot has been invalidated, recalculate the resource limits */
+   if (invalidated)
+   {

/If the slot/If a slot/?

3 ===

+ * NB - this function also runs as part of checkpoint, so avoid raising errors

s/NB - this/NB: This function/? (that looks more consistent with other comments
in the code)

4 ===

+ * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead

I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause
instead" looks weird to me. Maybe it would make sense to reword this a bit.

5 ===

+* considered not active as they don't actually perform logical decoding.

Not sure that's 100% accurate as we switched in fast forward logical
in 2ec005b4e2.

"as they perform only fast forward logical decoding (or not at all)", maybe?

6 ===

+   if (RecoveryInProgress() && slot->data.synced)
+   return false;
+
+   if (replication_slot_inactive_timeout == 0)
+   return false;

What about just using one if? It's more a matter of taste but it also probably
reduces the object file size a bit for non optimized build.

7 ===

+   /*
+* Do not invalidate the slots which are currently being synced from
+* the primary to the standby.
+*/
+   if (RecoveryInProgress() && slot->data.synced)
+   return false;

I think we don't need this check as the exact same one is done just before.

8 ===

+sub check_for_slot_invalidation_in_server_log
+{
+   my ($node, $slot_name, $offset) = @_;
+   my $invalidated = 0;
+
+   for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+   {
+   $node->safe_psql('postgres', "CHECKPOINT");

Wouldn't it be better to wait for the replication_slot_inactive_timeout time
before instead of triggering all those checkpoints? (it could be passed as an
extra arg to wait_for_slot_invalidation()).

9 ===

# Synced slot mustn't get invalidated on the standby, it must sync invalidation
# from the primary. So, we must not see the slot's invalidation message in
# server log.
ok( !$standby1->log_contains(
"invalidating obsolete replication slot \"lsub1_sync_slot\"",
$standby1_logstart),
'check that syned slot has not been invalidated on the standby');

Would that make sense to trigger a checkpoint on the standby before this test?
I mean I think that without a checkpoint on the standby we should not see the
invalidation in the log anyway.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread Bharath Rupireddy
On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot wrote:
>
> > > shouldn't the slot be dropped/recreated instead of updating 
> > > inactive_since?
> >
> > The sync slots that are invalidated on the primary aren't dropped and
> > recreated on the standby.
>
> Yeah, right (I was confused with synced slots that are invalidated locally).
>
> > However, I
> > found that the synced slot is acquired and released unnecessarily
> > after the invalidation_reason is synced from the primary. I added a
> > skip check in synchronize_one_slot to skip acquiring and releasing the
> > slot if it's locally found inactive. With this, inactive_since won't
> > get updated for invalidated sync slots on the standby as we don't
> > acquire and release the slot.
>
> CR1 ===
>
> Yeah, I can see:
>
> @@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
>   " name slot \"%s\" already exists on the standby",
>   remote_slot->name));
>
> +   /*
> +* Skip the sync if the local slot is already invalidated. We do this
> +* beforehand to save on slot acquire and release.
> +*/
> +   if (slot->data.invalidated != RS_INVAL_NONE)
> +   return false;
>
> Thanks to the drop_local_obsolete_slots() call I think we are not missing the
> case where the slot has been invalidated on the primary, invalidation reason
> has been synced on the standby and later the slot is dropped/ recreated
> manually on the primary (then it should be dropped/recreated on the standby
> too).
>
> Also it seems we are not missing the case where a sync slot is invalidated
> locally due to wal removal (it should be dropped/recreated).

Right.

> > > CR5 ===
> > >
> > > +   /*
> > > +* This function isn't expected to be called for inactive timeout 
> > > based
> > > +* invalidation. A separate function 
> > > InvalidateInactiveReplicationSlot is
> > > +* to be used for that.
> > >
> > > Do you think it's worth explaining why?
> >
> > Hm, I just wanted to point out the actual function here. I modified it
> > to something like the following, if others feel we don't need that, I
> > can remove it.
>
> Sorry if I was not clear but I meant to say "Do you think it's worth
> explaining why we decided to create a dedicated function"? (currently we
> "just" explain that we created one).

We added a new function (InvalidateInactiveReplicationSlot) to
invalidate a slot based on inactive timeout because 1) we do the
inactive timeout invalidation at the slot level as opposed to
InvalidateObsoleteReplicationSlots, which loops over all the slots,
2) InvalidatePossiblyObsoleteSlot releases the lock in some cases and
has a lot of code unneeded for the inactive timeout invalidation
check, 3) we want some control over saving the slot to disk because
we hook the inactive timeout invalidation into the loop that
checkpoints the slot info to disk in CheckPointReplicationSlots.

I've added a comment atop InvalidateInactiveReplicationSlot.

Please find the attached v36 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v36-0001-Add-inactive_timeout-based-replication-slot-inva.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 5:53 PM Bharath Rupireddy wrote:
>
> On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila  wrote:
> >
> > > Thanks for the changes. v34-0001 LGTM.
> >
> > I was doing a final review before pushing 0001 and found that
> > 'inactive_since' could be set twice during startup after promotion,
> > once while restoring slots and then via ShutDownSlotSync(). The reason
> > is that ShutDownSlotSync() will be invoked in normal startup on
> > primary though it won't do anything apart from setting inactive_since
> > if we have synced slots. I think you need to check 'StandbyMode' in
> > update_synced_slots_inactive_since() and return if the same is not
> > set. We can't use 'InRecovery' flag as that will be set even during
> > crash recovery.
> >
> > Can you please test this once unless you don't agree with the above theory?
>
> Nice catch. I've verified that update_synced_slots_inactive_since is
> called even for normal server startups/crash recovery. I've added a
> check to exit if the StandbyMode isn't set.
>
> Please find the attached v35 patch.

Thanks for the patch. Tested it, works well. A few cosmetic changes are needed:

in 040 test file:
1)
# Capture the inactive_since of the slot from the primary. Note that the slot
# will be inactive since the corresponding subscription is disabled..

Two dots at the end; replace with one.

2)
# Synced slot on the standby must get its own inactive_since.

The dot is not needed in a single-line comment (to be consistent with
neighbouring comments).


3)
update_synced_slots_inactive_since():

if (!StandbyMode)
return;

It will be good to add comments here.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread Bharath Rupireddy
On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila  wrote:
>
> > Thanks for the changes. v34-0001 LGTM.
>
> I was doing a final review before pushing 0001 and found that
> 'inactive_since' could be set twice during startup after promotion,
> once while restoring slots and then via ShutDownSlotSync(). The reason
> is that ShutDownSlotSync() will be invoked in normal startup on
> primary though it won't do anything apart from setting inactive_since
> if we have synced slots. I think you need to check 'StandbyMode' in
> update_synced_slots_inactive_since() and return if the same is not
> set. We can't use 'InRecovery' flag as that will be set even during
> crash recovery.
>
> Can you please test this once unless you don't agree with the above theory?

Nice catch. I've verified that update_synced_slots_inactive_since is
called even for normal server startups/crash recovery. I've added a
check to exit if the StandbyMode isn't set.

Please find the attached v35 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 03a08bd5ab3a305d199d9427491be37d1a1fd61b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 4 Apr 2024 12:06:17 +
Subject: [PATCH v35] Allow synced slots to have their inactive_since.

This commit does two things:
1) Maintains inactive_since for sync slots whenever the slot is released
just like any other regular slot.

2) Ensures the value is set to the current timestamp during the promotion
of the standby to help correctly interpret the time after promotion. Whoever
acquires the slot, i.e., makes the slot active, will reset it to NULL.

Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik, Masahiko Sawada
Discussion: https://postgr.es/m/CAA4eK1KrPGwfZV9LYGidjxHeW+rxJ=E2ThjXvwRGLO=iLNuo=Q@mail.gmail.com
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/CA+Tgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA@mail.gmail.com
---
 doc/src/sgml/system-views.sgml|  7 +++
 src/backend/replication/logical/slotsync.c| 51 +
 src/backend/replication/slot.c| 22 +++-
 src/test/perl/PostgreSQL/Test/Cluster.pm  | 31 ++
 src/test/recovery/t/019_replslot_limit.pl | 26 +
 .../t/040_standby_failover_slots_sync.pl  | 56 +++
 6 files changed, 154 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7ed617170f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
 The time since the slot has become inactive.
 NULL if the slot is currently being used.
+Note that for slots on the standby that are being synced from a
+primary server (whose synced field is
+true), the
+inactive_since indicates the last
+synchronization (see
+)
+time.
   
  
 
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 9ac847b780..75e67b966d 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -150,6 +150,7 @@ typedef struct RemoteSlot
 } RemoteSlot;
 
 static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
 
 /*
  * If necessary, update the local synced slot's metadata based on the data
@@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		 * overwriting 'invalidated' flag to remote_slot's value. See
 		 * InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
 		 * if the slot is not acquired by other processes.
+		 *
+		 * XXX: If it ever turns out that slot acquire/release is costly for
+		 * cases when none of the slot properties is changed then we can do a
+		 * pre-check to ensure that at least one of the slot properties is
+		 * changed before acquiring the slot.
 		 */
 		ReplicationSlotAcquire(remote_slot->name, true);
 
@@ -1355,6 +1361,48 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 	Assert(false);
 }
 
+/*
+ * Update the inactive_since property for synced slots.
+ *
+ * Note that this function is currently called when we shutdown the slot sync
+ * machinery during standby promotion. This helps correctly interpret the
+ * inactive_since if the standby gets promoted without a restart.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+	TimestampTz now = 0;
+
+	if (!StandbyMode)
+		return;
+
+	/* The slot sync worker mustn't be running by now */
+	Assert(SlotSyncCtx->pid == InvalidPid);
+
+	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+	for (int i = 0; i < 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread Amit Kapila
On Thu, Apr 4, 2024 at 11:12 AM Bharath Rupireddy wrote:
>
> On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila  wrote:
> >
> > The v33-0001 looks good to me. I have made minor changes in the
> > comments/commit message and removed one part of the test which was a
> > bit confusing and didn't seem to add much value. Let me know what you
> > think of the attached?
>
> Thanks for the changes. v34-0001 LGTM.
>

I was doing a final review before pushing 0001 and found that
'inactive_since' could be set twice during startup after promotion,
once while restoring slots and then via ShutDownSlotSync(). The reason
is that ShutDownSlotSync() will be invoked in normal startup on
primary though it won't do anything apart from setting inactive_since
if we have synced slots. I think you need to check 'StandbyMode' in
update_synced_slots_inactive_since() and return if the same is not
set. We can't use 'InRecovery' flag as that will be set even during
crash recovery.

Can you please test this once unless you don't agree with the above theory?

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread Masahiko Sawada
On Thu, Apr 4, 2024 at 5:36 PM Amit Kapila  wrote:
>
> On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada  wrote:
> >
> > On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy wrote:
> > >
> > > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada wrote:
> > > >
> > > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
> > > > if (SlotSyncCtx->pid == InvalidPid)
> > > > {
> > > > SpinLockRelease(&SlotSyncCtx->mutex);
> > > > +   update_synced_slots_inactive_since();
> > > > return;
> > > > }
> > > > SpinLockRelease(&SlotSyncCtx->mutex);
> > > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
> > > > }
> > > >
> > > > SpinLockRelease(&SlotSyncCtx->mutex);
> > > > +
> > > > +   update_synced_slots_inactive_since();
> > > >  }
> > > >
> > > > Why do we want to update all synced slots' inactive_since values at
> > > > shutdown in spite of updating the value every time when releasing the
> > > > slot? It seems to contradict the fact that inactive_since is updated
> > > > when releasing or restoring the slot.
> > >
> > > It is to get the inactive_since right for the cases where the standby
> > > is promoted without a restart similar to when a standby is promoted
> > > with restart in which case the inactive_since is set to current time
> > > in RestoreSlotFromDisk.
> > >
> > > Imagine the slot is synced last time at time t1 and then a few hours
> > > passed, the standby is promoted without a restart. If we don't set
> > > inactive_since to current time in this case in ShutDownSlotSync, the
> > > inactive timeout invalidation mechanism can kick in immediately.
> > >
> >
> > Thank you for the explanation! I understood the needs.
> >
>
> Do you want to review the v34_0001* further or shall I proceed with
> the commit of the same?

Thanks for asking. The v34-0001 patch looks good to me.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread Amit Kapila
On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada  wrote:
>
> On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy wrote:
> >
> > On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada wrote:
> > >
> > > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
> > > if (SlotSyncCtx->pid == InvalidPid)
> > > {
> > > SpinLockRelease(&SlotSyncCtx->mutex);
> > > +   update_synced_slots_inactive_since();
> > > return;
> > > }
> > > SpinLockRelease(&SlotSyncCtx->mutex);
> > > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
> > > }
> > >
> > > SpinLockRelease(&SlotSyncCtx->mutex);
> > > +
> > > +   update_synced_slots_inactive_since();
> > >  }
> > >
> > > Why do we want to update all synced slots' inactive_since values at
> > > shutdown in spite of updating the value every time when releasing the
> > > slot? It seems to contradict the fact that inactive_since is updated
> > > when releasing or restoring the slot.
> >
> > It is to get the inactive_since right for the cases where the standby
> > is promoted without a restart similar to when a standby is promoted
> > with restart in which case the inactive_since is set to current time
> > in RestoreSlotFromDisk.
> >
> > Imagine the slot is synced last time at time t1 and then a few hours
> > passed, the standby is promoted without a restart. If we don't set
> > inactive_since to current time in this case in ShutDownSlotSync, the
> > inactive timeout invalidation mechanism can kick in immediately.
> >
>
> Thank you for the explanation! I understood the needs.
>

Do you want to review the v34_0001* further or shall I proceed with
the commit of the same?

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread Masahiko Sawada
On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy wrote:
>
> On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada  wrote:
> >
> > @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
> > if (SlotSyncCtx->pid == InvalidPid)
> > {
> > SpinLockRelease(&SlotSyncCtx->mutex);
> > +   update_synced_slots_inactive_since();
> > return;
> > }
> > SpinLockRelease(&SlotSyncCtx->mutex);
> > @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
> > }
> >
> > SpinLockRelease(&SlotSyncCtx->mutex);
> > +
> > +   update_synced_slots_inactive_since();
> >  }
> >
> > Why do we want to update all synced slots' inactive_since values at
> > shutdown in spite of updating the value every time when releasing the
> > slot? It seems to contradict the fact that inactive_since is updated
> > when releasing or restoring the slot.
>
> It is to get the inactive_since right for the cases where the standby
> is promoted without a restart similar to when a standby is promoted
> with restart in which case the inactive_since is set to current time
> in RestoreSlotFromDisk.
>
> Imagine the slot is synced last time at time t1 and then a few hours
> passed, the standby is promoted without a restart. If we don't set
> inactive_since to current time in this case in ShutDownSlotSync, the
> inactive timeout invalidation mechanism can kick in immediately.
>

Thank you for the explanation! I understood the needs.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bharath Rupireddy
On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila  wrote:
>
> The v33-0001 looks good to me. I have made minor changes in the
> comments/commit message and removed one part of the test which was a
> bit confusing and didn't seem to add much value. Let me know what you
> think of the attached?

Thanks for the changes. v34-0001 LGTM.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Amit Kapila
On Wed, Apr 3, 2024 at 8:28 PM Bharath Rupireddy wrote:
>
> On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot wrote:
> >
> > Just one comment on v32-0001:
> >
> > +# Synced slot on the standby must get its own inactive_since.
> > +is( $standby1->safe_psql(
> > +   'postgres',
> > +   "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
> > +   '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
> > +   ),
> > +   "t",
> > +   'synchronized slot has got its own inactive_since');
> > +
> >
> > By using <= we are not testing that it must get its own inactive_since (as
> > we allow them to be equal in the test). I think we should just add some
> > usleep() where appropriate and deny equality during the tests on
> > inactive_since.
>
> Thanks. It looks like we can ignore the equality in all of the
> inactive_since comparisons. IIUC, all the TAP tests do run with
> primary and standbys on the single BF animals. And, it looks like
> assigning the inactive_since timestamps to perl variables is giving
> the microseconds precision level
> (./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since
> 2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL
> tests relying on stats_reset timestamps without equality. So, I've
> left the equality for the inactive_since tests.
>
> > Except for the above, v32-0001 LGTM.
>
> Thanks. Please see the attached v33-0001 patch after removing equality
> on inactive_since TAP tests.
>

The v33-0001 looks good to me. I have made minor changes in the
comments/commit message and removed one part of the test which was a
bit confusing and didn't seem to add much value. Let me know what you
think of the attached?

-- 
With Regards,
Amit Kapila.


v34-0001-Allow-synced-slots-to-have-their-inactive_since.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bharath Rupireddy
On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada  wrote:
>
> @@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
> if (SlotSyncCtx->pid == InvalidPid)
> {
> SpinLockRelease(&SlotSyncCtx->mutex);
> +   update_synced_slots_inactive_since();
> return;
> }
> SpinLockRelease(&SlotSyncCtx->mutex);
> @@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
> }
>
> SpinLockRelease(&SlotSyncCtx->mutex);
> +
> +   update_synced_slots_inactive_since();
>  }
>
> Why do we want to update all synced slots' inactive_since values at
> shutdown in spite of updating the value every time when releasing the
> slot? It seems to contradict the fact that inactive_since is updated
> when releasing or restoring the slot.

It is to get the inactive_since right for the cases where the standby
is promoted without a restart similar to when a standby is promoted
with restart in which case the inactive_since is set to current time
in RestoreSlotFromDisk.

Imagine the slot is synced last time at time t1 and then a few hours
passed, the standby is promoted without a restart. If we don't set
inactive_since to current time in this case in ShutDownSlotSync, the
inactive timeout invalidation mechanism can kick in immediately.
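As a concrete sketch of the scenario (timestamps hypothetical):

-- On the standby, before promotion, a synced slot's inactive_since is
-- the last sync time, possibly hours in the past:
SELECT slot_name, synced, inactive_since FROM pg_replication_slots;

-- Promote without restarting the server:
SELECT pg_promote();

-- Afterwards, ShutDownSlotSync() has stamped the synced slots with
-- the promotion time, so any timeout measured from inactive_since
-- starts fresh rather than firing immediately:
SELECT slot_name, synced, inactive_since FROM pg_replication_slots;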

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Masahiko Sawada
On Wed, Apr 3, 2024 at 11:58 PM Bharath Rupireddy wrote:
>
>
> Please find the attached v33 patches.

@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+   update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
}

SpinLockRelease(&SlotSyncCtx->mutex);
+
+   update_synced_slots_inactive_since();
 }

Why do we want to update all synced slots' inactive_since values at
shutdown in spite of updating the value every time when releasing the
slot? It seems to contradict the fact that inactive_since is updated
when releasing or restoring the slot.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bertrand Drouvot
Hi,

On Wed, Apr 03, 2024 at 08:28:04PM +0530, Bharath Rupireddy wrote:
> On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot wrote:
> >
> > Just one comment on v32-0001:
> >
> > +# Synced slot on the standby must get its own inactive_since.
> > +is( $standby1->safe_psql(
> > +   'postgres',
> > +   "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
> > +   '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
> > +   ),
> > +   "t",
> > +   'synchronized slot has got its own inactive_since');
> > +
> >
> > By using <= we are not testing that it must get its own inactive_since (as
> > we allow them to be equal in the test). I think we should just add some
> > usleep() where appropriate and deny equality during the tests on
> > inactive_since.
> 
> > Except for the above, v32-0001 LGTM.
> 
> Thanks. Please see the attached v33-0001 patch after removing equality
> on inactive_since TAP tests.

Thanks! v33-0001 LGTM.

> On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot wrote:
> > Some comments regarding v31-0002:
> >
> > T2 ===
> >
> > In case the slot is invalidated on the primary,
> >
> > primary:
> >
> > postgres=# select slot_name, inactive_since, invalidation_reason from 
> > pg_replication_slots where slot_name = 's1';
> >  slot_name |inactive_since | invalidation_reason
> > ---+---+-
> >  s1| 2024-04-03 06:56:28.075637+00 | inactive_timeout
> >
> > then on the standby we get:
> >
> > standby:
> >
> > postgres=# select slot_name, inactive_since, invalidation_reason from 
> > pg_replication_slots where slot_name = 's1';
> >  slot_name |inactive_since| invalidation_reason
> > ---+--+-
> >  s1| 2024-04-03 07:06:43.37486+00 | inactive_timeout
> >
> > shouldn't the slot be dropped/recreated instead of updating inactive_since?
> 
> The sync slots that are invalidated on the primary aren't dropped and
> recreated on the standby.

Yeah, right (I was confused with synced slots that are invalidated locally).

> However, I
> found that the synced slot is acquired and released unnecessarily
> after the invalidation_reason is synced from the primary. I added a
> skip check in synchronize_one_slot to skip acquiring and releasing the
> slot if it's locally found inactive. With this, inactive_since won't
> get updated for invalidated sync slots on the standby as we don't
> acquire and release the slot.

CR1 ===

Yeah, I can see:

@@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
   " name slot \"%s\" already exists on the standby",
   remote_slot->name));

+   /*
+* Skip the sync if the local slot is already invalidated. We do this
+* beforehand to save on slot acquire and release.
+*/
+   if (slot->data.invalidated != RS_INVAL_NONE)
+   return false;

Thanks to the drop_local_obsolete_slots() call I think we are not missing the
case where the slot has been invalidated on the primary, invalidation reason
has been synced on the standby and later the slot is dropped/ recreated
manually on the primary (then it should be dropped/recreated on the standby
too).

Also it seems we are not missing the case where a sync slot is invalidated
locally due to wal removal (it should be dropped/recreated).

> 
> > CR5 ===
> >
> > +   /*
> > +* This function isn't expected to be called for inactive timeout 
> > based
> > +* invalidation. A separate function 
> > InvalidateInactiveReplicationSlot is
> > +* to be used for that.
> >
> > Do you think it's worth explaining why?
> 
> Hm, I just wanted to point out the actual function here. I modified it
> to something like the following, if others feel we don't need that, I
> can remove it.

Sorry if I was not clear but I meant to say "Do you think it's worth
explaining why we decided to create a dedicated function"? (currently we
"just" explain that we created one).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bharath Rupireddy
On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot wrote:
>
> Just one comment on v32-0001:
>
> +# Synced slot on the standby must get its own inactive_since.
> +is( $standby1->safe_psql(
> +   'postgres',
> +   "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
> +   '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
> +   ),
> +   "t",
> +   'synchronized slot has got its own inactive_since');
> +
>
> By using <= we are not testing that it must get its own inactive_since (as we
> allow them to be equal in the test). I think we should just add some usleep()
> where appropriate and deny equality during the tests on inactive_since.

Thanks. It looks like we can ignore the equality in all of the
inactive_since comparisons. IIUC, all the TAP tests do run with
primary and standbys on the single BF animals. And, it looks like
assigning the inactive_since timestamps to perl variables is giving
the microseconds precision level
(./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since
2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL
tests relying on stats_reset timestamps without equality. So, I've
left the equality for the inactive_since tests.

> Except for the above, v32-0001 LGTM.

Thanks. Please see the attached v33-0001 patch after removing equality
on inactive_since TAP tests.

On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot wrote:
>
> Some comments regarding v31-0002:
>
> === testing the behavior
>
> T1 ===
>
> > - synced slots don't get invalidated due to inactive timeout because
> > such slots not considered active at all as they don't perform logical
> > decoding (of course, they will perform in fast_forward mode to fix the
> > other data loss issue, but they don't generate changes for them to be
> > called as *active* slots)
>
> It behaves as described. OTOH non synced logical slots on the standby and
> physical slots on the standby are invalidated which is what is expected.

Right.
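A quick way to observe this on the standby (a sketch; slot names and
the reason string are hypothetical):

-- Synced slots are skipped by the timeout check, while locally
-- created logical and physical slots are candidates for invalidation:
SELECT slot_name, synced, invalidation_reason
FROM pg_replication_slots;
-- Hypothetical: rows with synced = t keep invalidation_reason NULL,
-- while a local slot past the timeout shows 'inactive_timeout' after
-- a checkpoint.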

> T2 ===
>
> In case the slot is invalidated on the primary,
>
> primary:
>
> postgres=# select slot_name, inactive_since, invalidation_reason from 
> pg_replication_slots where slot_name = 's1';
>  slot_name |inactive_since | invalidation_reason
> ---+---+-
>  s1| 2024-04-03 06:56:28.075637+00 | inactive_timeout
>
> then on the standby we get:
>
> standby:
>
> postgres=# select slot_name, inactive_since, invalidation_reason from 
> pg_replication_slots where slot_name = 's1';
>  slot_name |inactive_since| invalidation_reason
> ---+--+-
>  s1| 2024-04-03 07:06:43.37486+00 | inactive_timeout
>
> shouldn't the slot be dropped/recreated instead of updating inactive_since?

The sync slots that are invalidated on the primary aren't dropped and
recreated on the standby. There's no point in doing so because
invalidated slots on the primary can't be made useful. However, I
found that the synced slot is acquired and released unnecessarily
after the invalidation_reason is synced from the primary. I added a
skip check in synchronize_one_slot to skip acquiring and releasing the
slot if it's locally found inactive. With this, inactive_since won't
get updated for invalidated sync slots on the standby as we don't
acquire and release the slot.

> === code
>
> CR1 ===
>
> +Invalidates replication slots that are inactive for longer the
> +specified amount of time
>
> s/for longer the/for longer that/?

Fixed.

> CR2 ===
>
> +true) as such synced slots don't actually perform
> +logical decoding.
>
> We're switching in fast forward logical due to [1], so I'm not sure that's 
> 100%
> accurate here. I'm not sure we need to specify a reason.

Fixed.

> CR3 ===
>
> + errdetail("This slot has been invalidated because it was inactive for more 
> than the time specified by replication_slot_inactive_timeout parameter.")));
>
> I think we can remove "parameter" (see for example the error message in
> validate_remote_info()) and reduce it a bit, something like?
>
> "This slot has been invalidated because it was inactive for more than 
> replication_slot_inactive_timeout"?

Done.

> CR4 ===
>
> + appendStringInfoString(_detail, _("The slot has been inactive for more 
> than the time specified by replication_slot_inactive_timeout parameter."));
>
> Same.

Done. Changed it to "The slot has been inactive for more than
replication_slot_inactive_timeout."

> CR5 ===
>
> +   /*
> +* This function isn't expected to be called for inactive timeout 
> based
> +* invalidation. A separate function 
> InvalidateInactiveReplicationSlot is
> +* to be used for that.
>
> Do you think it's worth explaining why?

Hm, I just wanted to point out the actual function here. I modified 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bertrand Drouvot
Hi,

On Wed, Apr 03, 2024 at 05:12:12PM +0530, Bharath Rupireddy wrote:
> On Wed, Apr 3, 2024 at 4:19 PM Amit Kapila  wrote:
> >
> > + 'postgres',
> > + "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz AND
> > + '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;"
> >
> > Shall we do <= check as we are doing in the main function
> > get_slot_inactive_since_value as the time duration is less so it can
> > be the same as well? Similarly, please check other tests.
> 
I get you. The tests may run so fast that losing a bit of precision
could cause them to fail. So, I've added an equality check for all the
tests.

> Please find the attached v32-0001 patch with the above review comments
> addressed.

Thanks!

Just one comment on v32-0001:

+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+   'postgres',
+   "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
+   '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
+   ),
+   "t",
+   'synchronized slot has got its own inactive_since');
+

By using <= we are not testing that it must get its own inactive_since (as we
allow them to be equal in the test). I think we should just add some usleep()
where appropriate and deny equality during the tests on inactive_since.

Except for the above, v32-0001 LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bharath Rupireddy
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot wrote:
>
> > Please find the attached v31 patches implementing the above idea:
>
> Some comments related to v31-0001:
>
> === testing the behavior
>
> T1 ===
>
> > - synced slots get their own inactive_since just like any other slot
>
> It behaves as described.
>
> T2 ===
>
> > - synced slots inactive_since is set to current timestamp after the
> > standby gets promoted to help inactive_since interpret correctly just
> > like any other slot.
>
> It behaves as described.

Thanks for testing.

> CR1 ===
>
> +inactive_since value will get updated
> +after every synchronization
>
> indicates the last synchronization time? (I think that after every 
> synchronization
> could lead to confusion).

Done.

> CR2 ===
>
> +   /*
> +* Set the time since the slot has become inactive 
> after shutting
> +* down slot sync machinery. This helps correctly 
> interpret the
> +* time if the standby gets promoted without a 
> restart.
> +*/
>
> It looks to me that this comment is not at the right place because there is
> nothing after the comment that indicates that we shutdown the "slot sync 
> machinery".
>
> Maybe a better place is before the function definition and mention that this 
> is
> currently called when we shutdown the "slot sync machinery"?

Done.

> CR3 ===
>
> +* We get the current time beforehand and only once 
> to avoid
> +* system calls overhead while holding the lock.
>
> s/avoid system calls overhead while holding the lock/avoid system calls while 
> holding the spinlock/?

Done.

> CR4 ===
>
> +* Set the time since the slot has become inactive. We get the current
> +* time beforehand to avoid system call overhead while holding the 
> lock
>
> Same.

Done.

> CR5 ===
>
> +   # Check that the captured time is sane
> +   if (defined $reference_time)
> +   {
>
> s/Check that the captured time is sane/Check that the inactive_since is sane/?
>
> Sorry if some of those comments could have been done while I did review 
> v29-0001.

Done.

On Wed, Apr 3, 2024 at 2:58 PM shveta malik  wrote:
>
> Thanks for the patches, please find few comments:
>
> v31-001:
>
> 1)
> system-views.sgml:
> value will get updated  after every synchronization from the
> corresponding remote slot on the primary.
>
> --This is confusing. It will be good to rephrase it.

Done as per Bertrand's suggestion.

> 2)
> update_synced_slots_inactive_since()
>
> --May be, we should mention in the header that this function is called
> only during promotion.

Done as per Bertrand's suggestion.

> 3) 040_standby_failover_slots_sync.pl:
> We capture inactive_since_on_primary when we do this for the first time at 
> #175
> ALTER SUBSCRIPTION regress_mysub1 DISABLE"
>
> But we again recreate the sub and disable it at line #280.
> Do you think we shall get inactive_since_on_primary again here, to be
> compared with inactive_since_on_new_primary later?

Hm. Done that. Recaptured both slot_creation_time_on_primary and
inactive_since_on_primary before and after CREATE SUBSCRIPTION, which creates
the slot again on the primary/publisher.

On Wed, Apr 3, 2024 at 3:32 PM Amit Kapila  wrote:
>
> > CR2 ===
> >
> > +   /*
> > +* Set the time since the slot has become inactive 
> > after shutting
> > +* down slot sync machinery. This helps correctly 
> > interpret the
> > +* time if the standby gets promoted without a 
> > restart.
> > +*/
> >
> > It looks to me that this comment is not at the right place because there is
> > nothing after the comment that indicates that we shutdown the "slot sync 
> > machinery".
> >
> > Maybe a better place is before the function definition and mention that 
> > this is
> > currently called when we shutdown the "slot sync machinery"?
> >
> Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we
> have some existing issues where we don't ensure that no one is running
> sync API before shutdown is complete but I think we can deal with that
> separately and here we can still have an Assert.

That can work to ensure the slot sync worker isn't running as
SlotSyncCtx->pid gets updated only for the slot sync worker. I added
this assertion for now.
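
For reference, the assertion looks roughly like this at the top of
update_synced_slots_inactive_since() (a sketch; exact placement may differ in
the attached patch):

    /* The slot sync worker itself must have exited by this point. */
    Assert(SlotSyncCtx->pid == InvalidPid);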

We need to ensure (in a separate patch and thread) there is no backend
acquiring it and performing sync while the slot sync worker is
shutting down. Otherwise, some of the slots can get resynced and some
are not while we are shutting down the slot sync worker as part of the
standby promotion which might leave the slots in an inconsistent
state.

> > CR3 ===
> >
> > +* We get the current time beforehand and only once 
> > to avoid
> > +* system calls overhead while holding 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Amit Kapila
On Wed, Apr 3, 2024 at 2:58 PM shveta malik  wrote:
>
> On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy
>  wrote:
> >
> > On Wed, Apr 3, 2024 at 8:38 AM shveta malik  wrote:
> > >
> > > > > Or a simple solution is that the slotsync worker updates
> > > > > inactive_since as it does for non-synced slots, and disables
> > > > > timeout-based slot invalidation for synced slots.
> > >
> > > I like this idea better, it takes care of such a case too when the
> > > user is relying on sync-function rather than worker and does not want
> > > to get the slots invalidated in between 2 sync function calls.
> >
> > Please find the attached v31 patches implementing the above idea:
> >
>
> Thanks for the patches, please find few comments:
>
> v31-001:
>
> 1)
> system-views.sgml:
> value will get updated  after every synchronization from the
> corresponding remote slot on the primary.
>
> --This is confusing. It will be good to rephrase it.
>
> 2)
> update_synced_slots_inactive_since()
>
> --May be, we should mention in the header that this function is called
> only during promotion.
>
> 3) 040_standby_failover_slots_sync.pl:
> We capture inactive_since_on_primary when we do this for the first time at 
> #175
> ALTER SUBSCRIPTION regress_mysub1 DISABLE"
>
> But we again recreate the sub and disable it at line #280.
> Do you think we shall get inactive_since_on_primary again here, to be
> compared with inactive_since_on_new_primary later?
>

I think so.

Few additional comments on tests:
1.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz <
'$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;"

Shall we do a <= check, as we are doing in the main function
get_slot_inactive_since_value, since the time duration is small and the two
values can be the same? Similarly, please check other tests.

2.
+=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value

I see that all callers validate against reference time. It is better
to name it validate_slot_inactive_since rather than using get_* as the
main purpose is to validate the passed value.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Amit Kapila
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot
 wrote:
>
> On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote:
> > On Wed, Apr 3, 2024 at 8:38 AM shveta malik  wrote:
> > >
> > > > > Or a simple solution is that the slotsync worker updates
> > > > > inactive_since as it does for non-synced slots, and disables
> > > > > timeout-based slot invalidation for synced slots.
> > >
> > > I like this idea better, it takes care of such a case too when the
> > > user is relying on sync-function rather than worker and does not want
> > > to get the slots invalidated in between 2 sync function calls.
> >
> > Please find the attached v31 patches implementing the above idea:
>
> Thanks!
>
> Some comments related to v31-0001:
>
> === testing the behavior
>
> T1 ===
>
> > - synced slots get their own inactive_since just like any other slot
>
> It behaves as described.
>
> T2 ===
>
> > - synced slots inactive_since is set to current timestamp after the
> > standby gets promoted to help interpret inactive_since correctly just
> > like any other slot.
>
> It behaves as described.
>
> CR1 ===
>
> +inactive_since value will get updated
> +after every synchronization
>
> indicates the last synchronization time? (I think that after every 
> synchronization
> could lead to confusion).
>

+1.

> CR2 ===
>
> +   /*
> +* Set the time since the slot has become inactive 
> after shutting
> +* down slot sync machinery. This helps correctly 
> interpret the
> +* time if the standby gets promoted without a 
> restart.
> +*/
>
> It looks to me that this comment is not at the right place because there is
> nothing after the comment that indicates that we shutdown the "slot sync 
> machinery".
>
> Maybe a better place is before the function definition and mention that this 
> is
> currently called when we shutdown the "slot sync machinery"?
>

Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we
have some existing issues where we don't ensure that no one is running
sync API before shutdown is complete but I think we can deal with that
separately and here we can still have an Assert.

> CR3 ===
>
> +* We get the current time beforehand and only once 
> to avoid
> +* system calls overhead while holding the lock.
>
> s/avoid system calls overhead while holding the lock/avoid system calls while 
> holding the spinlock/?
>

Is it valid to say that there is overhead of this call while holding
spinlock? Because I don't think at the time of promotion we expect any
other concurrent slot activity. The first reason seems good enough.

One other observation:
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
 #include "access/transam.h"
 #include "access/xlog_internal.h"
 #include "access/xlogrecovery.h"
+#include "access/xlogutils.h"

Is there a reason for this inclusion? I don't see any change which
should need this one.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread shveta malik
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy
 wrote:
>
> On Wed, Apr 3, 2024 at 8:38 AM shveta malik  wrote:
> >
> > > > Or a simple solution is that the slotsync worker updates
> > > > inactive_since as it does for non-synced slots, and disables
> > > > timeout-based slot invalidation for synced slots.
> >
> > I like this idea better, it takes care of such a case too when the
> > user is relying on sync-function rather than worker and does not want
> > to get the slots invalidated in between 2 sync function calls.
>
> Please find the attached v31 patches implementing the above idea:
>

Thanks for the patches, please find few comments:

v31-001:

1)
system-views.sgml:
value will get updated  after every synchronization from the
corresponding remote slot on the primary.

--This is confusing. It will be good to rephrase it.

2)
update_synced_slots_inactive_since()

--May be, we should mention in the header that this function is called
only during promotion.

3) 040_standby_failover_slots_sync.pl:
We capture inactive_since_on_primary when we do this for the first time at #175
ALTER SUBSCRIPTION regress_mysub1 DISABLE"

But we again recreate the sub and disable it at line #280.
Do you think we shall get inactive_since_on_primary again here, to be
compared with inactive_since_on_new_primary later?


v31-002:
(I had reviewed v29-002 but missed to post comments,  I think these
are still applicable)

1) I think replication_slot_inactivity_timeout was recommended here
(instead of replication_slot_inactive_timeout, so please give it a
thought):
https://www.postgresql.org/message-id/202403260739.udlp7lxixktx%40alvherre.pgsql

2) Commit msg:
a)
"It is often easy for developers to set a timeout of say 1
or 2 or 3 days at slot level, after which the inactive slots get
dropped."

Shall we say invalidated rather than dropped?

b)
"To achieve the above, postgres introduces a GUC allowing users
set inactive timeout and then a slot stays inactive for this much
amount of time it invalidates the slot."

Broken sentence.



thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bertrand Drouvot
Hi,

On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote:
> On Wed, Apr 3, 2024 at 8:38 AM shveta malik  wrote:
> >
> > > > Or a simple solution is that the slotsync worker updates
> > > > inactive_since as it does for non-synced slots, and disables
> > > > timeout-based slot invalidation for synced slots.
> >
> > I like this idea better, it takes care of such a case too when the
> > user is relying on sync-function rather than worker and does not want
> > to get the slots invalidated in between 2 sync function calls.
> 
> Please find the attached v31 patches implementing the above idea:

Thanks!

Some comments regarding v31-0002:

=== testing the behavior

T1 ===

> - synced slots don't get invalidated due to inactive timeout because
> such slots are not considered active at all as they don't perform logical
> decoding (of course, they will perform in fast_forward mode to fix the
> other data loss issue, but they don't generate changes for them to be
> called as *active* slots)

It behaves as described. OTOH, non-synced logical slots on the standby and
physical slots on the standby are invalidated, which is what is expected.

T2 ===

In case the slot is invalidated on the primary,

primary:

postgres=# select slot_name, inactive_since, invalidation_reason from
pg_replication_slots where slot_name = 's1';
 slot_name |        inactive_since         | invalidation_reason
-----------+-------------------------------+---------------------
 s1        | 2024-04-03 06:56:28.075637+00 | inactive_timeout

then on the standby we get:

standby:

postgres=# select slot_name, inactive_since, invalidation_reason from
pg_replication_slots where slot_name = 's1';
 slot_name |        inactive_since        | invalidation_reason
-----------+------------------------------+---------------------
 s1        | 2024-04-03 07:06:43.37486+00 | inactive_timeout

shouldn't the slot be dropped/recreated instead of updating inactive_since?

=== code

CR1 ===

+Invalidates replication slots that are inactive for longer the
+specified amount of time

s/for longer the/for longer than/?

CR2 ===

+true) as such synced slots don't actually perform
+logical decoding.

We're switching to fast forward logical decoding due to [1], so I'm not sure that's 100%
accurate here. I'm not sure we need to specify a reason.

CR3 ===

+ errdetail("This slot has been invalidated because it was inactive for more 
than the time specified by replication_slot_inactive_timeout parameter.")));

I think we can remove "parameter" (see for example the error message in
validate_remote_info()) and reduce it a bit, something like?

"This slot has been invalidated because it was inactive for more than 
replication_slot_inactive_timeout"?

CR4 ===

+ appendStringInfoString(&err_detail, _("The slot has been inactive for more
than the time specified by replication_slot_inactive_timeout parameter."));

Same.

CR5 ===

+   /*
+* This function isn't expected to be called for inactive timeout based
+* invalidation. A separate function InvalidateInactiveReplicationSlot is
+* to be used for that.

Do you think it's worth explaining why?

CR6 ===

+   if (replication_slot_inactive_timeout == 0)
+   return false;
+   else if (slot->inactive_since > 0)

"else" is not needed here.

CR7 ===

+   SpinLockAcquire(&slot->mutex);
+
+   /*
+* Check if the slot needs to be invalidated due to
+* replication_slot_inactive_timeout GUC. We do this with the spinlock
+* held to avoid race conditions -- for example the inactive_since
+* could change, or the slot could be dropped.
+*/
+   now = GetCurrentTimestamp();

We should not call GetCurrentTimestamp() while holding a spinlock.
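
Something along these lines instead (a sketch only, not the patch's code,
reusing the names from the hunk above): do the system call first, and keep
only cheap field accesses under the spinlock.

    TimestampTz now;
    TimestampTz inactive_since;

    /* System call done outside the critical section. */
    now = GetCurrentTimestamp();

    SpinLockAcquire(&slot->mutex);
    inactive_since = slot->inactive_since;  /* cheap field copy under the lock */
    SpinLockRelease(&slot->mutex);

    /* The timeout comparison against "now" can happen without the lock. */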

CR8 ===

+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to inactive timeout GUC. Also, check the logical

s/inactive timeout GUC/replication_slot_inactive_timeout/?

CR9 ===

+# Start: Helper functions used for this test file
+# End: Helper functions used for this test file

I think that's the first TAP test with this comment. Not saying we should not,
but why did you feel the need to add those?

[1]: 
https://www.postgresql.org/message-id/os0pr01mb5716b3942ae49f3f725aca9294...@os0pr01mb5716.jpnprd01.prod.outlook.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread Bertrand Drouvot
Hi,

On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote:
> On Wed, Apr 3, 2024 at 8:38 AM shveta malik  wrote:
> >
> > > > Or a simple solution is that the slotsync worker updates
> > > > inactive_since as it does for non-synced slots, and disables
> > > > timeout-based slot invalidation for synced slots.
> >
> > I like this idea better, it takes care of such a case too when the
> > user is relying on sync-function rather than worker and does not want
> > to get the slots invalidated in between 2 sync function calls.
> 
> Please find the attached v31 patches implementing the above idea:

Thanks!

Some comments related to v31-0001:

=== testing the behavior

T1 ===

> - synced slots get their own inactive_since just like any other slot

It behaves as described.

T2 ===

> - synced slots inactive_since is set to current timestamp after the
> standby gets promoted to help interpret inactive_since correctly just
> like any other slot.
 
It behaves as described.

CR1 ===

+inactive_since value will get updated
+after every synchronization

indicates the last synchronization time? (I think that "after every
synchronization" could lead to confusion).

CR2 ===

+   /*
+* Set the time since the slot has become inactive after shutting
+* down slot sync machinery. This helps correctly interpret the
+* time if the standby gets promoted without a restart.
+*/

It looks to me that this comment is not at the right place because there is
nothing after the comment that indicates that we shutdown the "slot sync 
machinery".

Maybe a better place is before the function definition and mention that this is
currently called when we shutdown the "slot sync machinery"?

CR3 ===

+* We get the current time beforehand and only once to avoid
+* system calls overhead while holding the lock.

s/avoid system calls overhead while holding the lock/avoid system calls while 
holding the spinlock/?

CR4 ===

+* Set the time since the slot has become inactive. We get the current
+* time beforehand to avoid system call overhead while holding the lock

Same.

CR5 ===

+   # Check that the captured time is sane
+   if (defined $reference_time)
+   {

s/Check that the captured time is sane/Check that the inactive_since is sane/?

Sorry if some of those comments could have been done while I did review 
v29-0001.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread Bharath Rupireddy
On Wed, Apr 3, 2024 at 8:38 AM shveta malik  wrote:
>
> > > Or a simple solution is that the slotsync worker updates
> > > inactive_since as it does for non-synced slots, and disables
> > > timeout-based slot invalidation for synced slots.
>
> I like this idea better, it takes care of such a case too when the
> user is relying on sync-function rather than worker and does not want
> to get the slots invalidated in between 2 sync function calls.

Please find the attached v31 patches implementing the above idea:

- synced slots get their own inactive_since just like any other slot
- synced slots don't get invalidated due to inactive timeout because
such slots are not considered active at all as they don't perform logical
decoding (of course, they will perform in fast_forward mode to fix the
other data loss issue, but they don't generate changes for them to be
called as *active* slots)
- synced slots inactive_since is set to current timestamp after the
standby gets promoted to help interpret inactive_since correctly just
like any other slot.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From fd2cb48726dd4e1932f8809dfb36e0fe9f96 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 3 Apr 2024 05:03:22 +
Subject: [PATCH v31 1/2] Allow synced slots to have their own inactive_since.

---
 doc/src/sgml/system-views.sgml|  7 +++
 src/backend/replication/logical/slotsync.c| 44 +
 src/backend/replication/slot.c| 23 +++--
 src/test/perl/PostgreSQL/Test/Cluster.pm  | 34 +
 src/test/recovery/t/019_replslot_limit.pl | 26 +-
 .../t/040_standby_failover_slots_sync.pl  | 49 +++
 6 files changed, 144 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..b64274a1fb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
 The time since the slot has become inactive.
 NULL if the slot is currently being used.
+Note that for slots on the standby that are being synced from a
+primary server (whose synced field is
+true), the
+inactive_since value will get updated
+after every synchronization (see
+)
+from the corresponding remote slot on the primary.
   
  
 
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..4050bd40f8 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -140,6 +140,7 @@ typedef struct RemoteSlot
 } RemoteSlot;
 
 static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
 
 /*
  * If necessary, update the local synced slot's metadata based on the data
@@ -1296,6 +1297,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 	Assert(false);
 }
 
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+	TimestampTz now = 0;
+
+	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+	for (int i = 0; i < max_replication_slots; i++)
+	{
+		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+		/* Check if it is a synchronized slot */
+		if (s->in_use && s->data.synced)
+		{
+			Assert(SlotIsLogical(s));
+
+			/*
+			 * We get the current time beforehand and only once to avoid
+			 * system calls overhead while holding the lock.
+			 */
+			if (now == 0)
+now = GetCurrentTimestamp();
+
+			/*
+			 * Set the time since the slot has become inactive after shutting
+			 * down slot sync machinery. This helps correctly interpret the
+			 * time if the standby gets promoted without a restart.
+			 */
+			SpinLockAcquire(&s->mutex);
+			s->inactive_since = now;
+			SpinLockRelease(&s->mutex);
+		}
+	}
+
+	LWLockRelease(ReplicationSlotControlLock);
+}
+
 /*
  * Shut down the slot sync worker.
  */
@@ -1309,6 +1350,7 @@ ShutDownSlotSync(void)
 	if (SlotSyncCtx->pid == InvalidPid)
 	{
 		SpinLockRelease(&SlotSyncCtx->mutex);
+		update_synced_slots_inactive_since();
 		return;
 	}
 	SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1383,8 @@ ShutDownSlotSync(void)
 	}
 
 	SpinLockRelease(&SlotSyncCtx->mutex);
+
+	update_synced_slots_inactive_since();
 }
 
 /*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..5549ca9640 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
 #include "access/transam.h"
 #include "access/xlog_internal.h"
 #include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
 #include "common/file_utils.h"
 #include "common/string.h"
 #include "miscadmin.h"
@@ -690,13 +691,10 @@ ReplicationSlotRelease(void)
 	}
 
 	/*
-	 * Set the last inactive time 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread shveta malik
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote:
> > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy
> >
> > FWIW, coming to this thread late, I think that the inactive_since
> > should not be synchronized from the primary. The wall clocks are
> > different on the primary and the standby so having the primary's
> > timestamp on the standby can confuse users, especially when there is a
> > big clock drift. Also, as Amit mentioned, inactive_since seems not to
> > be consistent with other parameters we copy. The
> > replication_slot_inactive_timeout feature should work on the standby
> > independent from the primary, like other slot invalidation mechanisms,
> > and it should be based on its own local clock.
>
> Thanks for sharing your thoughts! So, it looks like most of us agree not to
> sync inactive_since from the primary; I'm fine with that.

+1 on not syncing slots from primary.

> > If we want to invalidate the synced slots due to the timeout, I think
> > we need to define what is "inactive" for synced slots.
> >
> > Suppose that the slotsync worker updates the local (synced) slot's
> > inactive_since whenever releasing the slot, irrespective of the actual
> > LSNs (or other slot parameters) having been updated. I think that this
> > idea cannot handle a slot that is not acquired on the primary. In this
> > case, the remote slot is inactive but the local slot is regarded as
> > active.  WAL files are piled up on the standby (and on the primary) as
> > the slot's LSNs don't move forward. I think we want to regard such a
> > slot as "inactive" also on the standby and invalidate it because of
> > the timeout.
>
> I think that makes sense to somehow link inactive_since on the standby to
> the actual LSNs (or other slot parameters) being updated or not.
>
> > > > Now, the other concern is that calling GetCurrentTimestamp()
> > > > could be costly when the values for the slot are not going to be
> > > > updated but if that happens we can optimize such that before acquiring
> > > > the slot we can have some minimal pre-checks to ensure whether we need
> > > > to update the slot or not.
> >
> > If we use such pre-checks, another problem might happen; it cannot
> > handle a case where the slot is acquired on the primary but its LSNs
> > don't move forward. Imagine a logical replication conflict happened on
> > the subscriber, and the logical replication enters the retry loop. In
> > this case, the remote slot's inactive_since gets updated for every
> > retry, but it looks inactive from the standby since the slot LSNs
> > don't change. Therefore, only the local slot could be invalidated due
> > to the timeout but probably we don't want to regard such a slot as
> > "inactive".
> >
> > Another idea I came up with is that the slotsync worker updates the
> > local slot's inactive_since to the local timestamp only when the
> > remote slot might have got inactive. If the remote slot is acquired by
> > someone, the local slot's inactive_since is also NULL. If the remote
> > slot gets inactive, the slotsync worker sets the local timestamp to
> > the local slot's inactive_since. Since the remote slot could be
> > acquired and released before the slotsync worker gets the remote slot
> > data again, if the remote slot's inactive_since > the local slot's
> > inactive_since, the slotsync worker updates the local one.
>
> Then I think we would need to be careful about time zone comparison.

Yes. Also we need to consider the case when a user is relying on
pg_sync_replication_slots() and has not enabled slot-sync worker. In
such a case if synced slot's inactive_since is derived from inactivity
of remote-slot, it might not be that frequently updated (based on when
the user actually runs the SQL function) and thus may be misleading.
OTOH, if inactive_since of synced slots represents their own
inactivity, then it will give correct info even for the case when the
SQL function is run after a long time and slot-sync worker is
disabled.

> > IOW, we
> > detect whether the remote slot was acquired and released since the
> > last synchronization, by checking the remote slot's inactive_since.
> > This idea seems to handle these cases I mentioned unless I'm missing
> > something, but it requires for the slotsync worker to update
> > inactive_since in a different way than other parameters.
> >
> > Or a simple solution is that the slotsync worker updates
> > inactive_since as it does for non-synced slots, and disables
> > timeout-based slot invalidation for synced slots.

I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.
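
For concreteness, the simple solution could boil down to a guard like this in
the timeout-invalidation path (a sketch under that assumption, not a posted
patch):

    /* Never timeout-invalidate a synced slot on a standby. */
    if (RecoveryInProgress() && slot->data.synced)
        return false;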

> Yeah, I think the main question to help us decide is: do we want to invalidate
> "inactive" synced slots locally (in addition to synchronizing the invalidation
> from the primary)?


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread Bertrand Drouvot
Hi,

On Tue, Apr 02, 2024 at 12:41:35PM +0530, Bharath Rupireddy wrote:
> On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot
>  wrote:
> >
> > > Or a simple solution is that the slotsync worker updates
> > > inactive_since as it does for non-synced slots, and disables
> > > timeout-based slot invalidation for synced slots.
> >
> > Yeah, I think the main question to help us decide is: do we want to 
> > invalidate
> > "inactive" synced slots locally (in addition to synchronizing the 
> > invalidation
> > from the primary)?
> 
> I think this approach looks way simpler than the other one. The other
> approach of linking inactive_since on the standby for synced slots to
> the actual LSNs (or other slot parameters) being updated or not looks
> more complicated, and might not go well with the end user.  However,
> we need to be able to say why we don't invalidate synced slots due to
> inactive timeout unlike the wal_removed invalidation that can happen
> right now on the standby for synced slots. This leads us to define
> actually what a slot being active means. Is syncing the data from the
> remote slot considered as the slot being active?
> 
> On the other hand, it may not sound great if we don't invalidate
> synced slots due to inactive timeout even though they hold resources
> such as WAL and XIDs.

Right and the "only" benefit then would be to give an idea as to when the last
sync did occur on the local slot.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread Bharath Rupireddy
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot
 wrote:
>
> > Or a simple solution is that the slotsync worker updates
> > inactive_since as it does for non-synced slots, and disables
> > timeout-based slot invalidation for synced slots.
>
> Yeah, I think the main question to help us decide is: do we want to invalidate
> "inactive" synced slots locally (in addition to synchronizing the invalidation
> from the primary)?

I think this approach looks way simpler than the other one. The other
approach of linking inactive_since on the standby for synced slots to
the actual LSNs (or other slot parameters) being updated or not looks
more complicated, and might not go well with the end user.  However,
we need to be able to say why we don't invalidate synced slots due to
inactive timeout unlike the wal_removed invalidation that can happen
right now on the standby for synced slots. This leads us to define
actually what a slot being active means. Is syncing the data from the
remote slot considered as the slot being active?

On the other hand, it may not sound great if we don't invalidate
synced slots due to inactive timeout even though they hold resources
such as WAL and XIDs.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread Bertrand Drouvot
Hi,

On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote:
> On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy
> 
> FWIW, coming to this thread late, I think that the inactive_since
> should not be synchronized from the primary. The wall clocks are
> different on the primary and the standby so having the primary's
> timestamp on the standby can confuse users, especially when there is a
> big clock drift. Also, as Amit mentioned, inactive_since seems not to
> be consistent with other parameters we copy. The
> replication_slot_inactive_timeout feature should work on the standby
> independent from the primary, like other slot invalidation mechanisms,
> and it should be based on its own local clock.

Thanks for sharing your thoughts! So, it looks like most of us agree not to
sync inactive_since from the primary; I'm fine with that.

> If we want to invalidate the synced slots due to the timeout, I think
> we need to define what is "inactive" for synced slots.
> 
> Suppose that the slotsync worker updates the local (synced) slot's
> inactive_since whenever releasing the slot, irrespective of the actual
> LSNs (or other slot parameters) having been updated. I think that this
> idea cannot handle a slot that is not acquired on the primary. In this
> case, the remote slot is inactive but the local slot is regarded as
> active.  WAL files are piled up on the standby (and on the primary) as
> the slot's LSNs don't move forward. I think we want to regard such a
> slot as "inactive" also on the standby and invalidate it because of
> the timeout.

I think that makes sense to somehow link inactive_since on the standby to 
the actual LSNs (or other slot parameters) being updated or not.

> > > Now, the other concern is that calling GetCurrentTimestamp()
> > > could be costly when the values for the slot are not going to be
> > > updated but if that happens we can optimize such that before acquiring
> > > the slot we can have some minimal pre-checks to ensure whether we need
> > > to update the slot or not.
> 
> If we use such pre-checks, another problem might happen; it cannot
> handle a case where the slot is acquired on the primary but its LSNs
> don't move forward. Imagine a logical replication conflict happened on
> the subscriber, and the logical replication enters the retry loop. In
> this case, the remote slot's inactive_since gets updated for every
> retry, but it looks inactive from the standby since the slot LSNs
> don't change. Therefore, only the local slot could be invalidated due
> to the timeout but probably we don't want to regard such a slot as
> "inactive".
> 
> Another idea I came up with is that the slotsync worker updates the
> local slot's inactive_since to the local timestamp only when the
> remote slot might have got inactive. If the remote slot is acquired by
> someone, the local slot's inactive_since is also NULL. If the remote
> slot gets inactive, the slotsync worker sets the local timestamp to
> the local slot's inactive_since. Since the remote slot could be
> acquired and released before the slotsync worker gets the remote slot
> data again, if the remote slot's inactive_since > the local slot's
> inactive_since, the slotsync worker updates the local one.

Then I think we would need to be careful about time zone comparison.
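
To make the quoted idea concrete, a rough sketch (hypothetical shapes like
remote_slot->inactive_since, not from any posted patch; the cross-server
comparison below is exactly where the clock/time zone concern bites):

    if (remote_slot->inactive_since == 0)
    {
        /* The remote slot is acquired; mirror it as active locally. */
        slot->inactive_since = 0;
    }
    else if (slot->inactive_since == 0 ||
             remote_slot->inactive_since > slot->inactive_since)
    {
        /* Remote slot was released since our last sync: stamp local time. */
        slot->inactive_since = GetCurrentTimestamp();
    }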

> IOW, we
> detect whether the remote slot was acquired and released since the
> last synchronization, by checking the remote slot's inactive_since.
> This idea seems to handle these cases I mentioned unless I'm missing
> something, but it requires for the slotsync worker to update
> inactive_since in a different way than other parameters.
> 
> Or a simple solution is that the slotsync worker updates
> inactive_since as it does for non-synced slots, and disables
> timeout-based slot invalidation for synced slots.

Yeah, I think the main question to help us decide is: do we want to invalidate
"inactive" synced slots locally (in addition to synchronizing the invalidation
from the primary)? 

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-01 Thread Masahiko Sawada
On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy
 wrote:
>
> On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila  wrote:
> >
> > Commit message states: "why we can't just update inactive_since for
> > synced slots on the standby with the value received from remote slot
> > on the primary. This is consistent with any other slot parameter i.e.
> > all of them are synced from the primary."
> >
> > The inactive_since is not consistent with other slot parameters which
> > we copy. We don't perform anything related to those other parameters
> > like say two_phase phase which can change that property. However, we
> > do acquire the slot, advance the slot (as per recent discussion [1]),
> > and release it. Since these operations can impact inactive_since, it
> > seems to me that inactive_since is not the same as other parameters.
> > It can have a different value than the primary. Why would anyone want
> > to know the value of inactive_since from primary after the standby is
> > promoted?
>
> After thinking about it for a while now, it feels to me that the
> synced slots (slots on the standby that are being synced from the
> primary) can have their own inactive_since value. Fundamentally,
> inactive_since is set to 0 when slot is acquired and set to current
> time when slot is released, no matter who acquires and releases it -
> be it walsenders for replication, or backends for slot advance, or
> backends for slot sync using pg_sync_replication_slots, or backends
> for other slot functions, or background sync worker. Remember the
> earlier patch was updating inactive_since just for walsenders, but
> then the suggestion was to update it unconditionally -
> https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com.
> Whoever syncs the slot, *actually* acquires the slot i.e. makes it
> theirs, syncs it from the primary, and releases it. IMO, no
> differentiation is to be made for synced slots.

FWIW, coming to this thread late, I think that the inactive_since
should not be synchronized from the primary. The wall clocks are
different on the primary and the standby so having the primary's
timestamp on the standby can confuse users, especially when there is a
big clock drift. Also, as Amit mentioned, inactive_since seems not to
be consistent with other parameters we copy. The
replication_slot_inactive_timeout feature should work on the standby
independent from the primary, like other slot invalidation mechanisms,
and it should be based on its own local clock.

> Coming to a future patch for inactive timeout based slot invalidation,
> we can either allow invalidation without any differentiation for
> synced slots or restrict invalidation to avoid more sync work. For
> instance, if inactive timeout is kept low on the standby, the sync
> worker will be doing more work as it drops and recreates a slot
> repeatedly if it keeps getting invalidated. Another thing is that the
> standby takes independent invalidation decisions for synced slots.
> AFAICS, invalidation due to wal_removal is the only sole reason (out
> of all available invalidation reasons) for a synced slot to get
> invalidated independently of the primary. Check
> https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com
> for the suggestion on not differentiating invalidation
> decisions for synced slots.
>
> The assumption of letting synced slots have their own inactive_since
> not only simplifies the code, but also looks less-confusing and more
> meaningful to the user. The only code that we put in on top of the
> committed code is to use InRecovery in place of
> RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised
> by Shveta upthread.

If we want to invalidate the synced slots due to the timeout, I think
we need to define what is "inactive" for synced slots.

Suppose that the slotsync worker updates the local (synced) slot's
inactive_since whenever releasing the slot, irrespective of the actual
LSNs (or other slot parameters) having been updated. I think that this
idea cannot handle a slot that is not acquired on the primary. In this
case, the remote slot is inactive but the local slot is regarded as
active.  WAL files are piled up on the standby (and on the primary) as
the slot's LSNs don't move forward. I think we want to regard such a
slot as "inactive" also on the standby and invalidate it because of
the timeout.

>
> > Now, the other concern is that calling GetCurrentTimestamp()
> > could be costly when the values for the slot are not going to be
> > updated but if that happens we can optimize such that before acquiring
> > the slot we can have some minimal pre-checks to ensure whether we need
> > to update the slot or not.

If we use such pre-checks, another problem might happen; it cannot
handle a case where the slot is acquired on the primary but its LSNs
don't move forward. Imagine a logical replication conflict happened on
the 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-01 Thread Bertrand Drouvot
Hi,

On Sun, Mar 31, 2024 at 10:25:46AM +0530, Bharath Rupireddy wrote:
> On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot
>  wrote:
> > I think in this case it should always reflect the value from the primary (so
> > that one can understand why it is invalidated).
> 
> I'll come back to this as soon as we all agree on inactive_since
> behavior for synced slots.

Makes sense. Also if the majority of us thinks it's not needed for
inactive_since to be an exact copy of the primary, then let's go that way.

> > I think when it is invalidated it should always reflect the value from the
> > primary (so that one can understand why it is invalidated).
> 
> I'll come back to this as soon as we all agree on inactive_since
> behavior for synced slots.

Yeah.

> > T4 ===
> >
> > Also, it looks like querying pg_replication_slots() does not trigger an
> > invalidation: I think it should if the slot is not invalidated yet (and 
> > matches
> > the invalidation criteria).
> 
> There's a different opinion on this, check comment #3 from
> https://www.postgresql.org/message-id/CAA4eK1LLj%2BeaMN-K8oeOjfG%2BUuzTY%3DL5PXbcMJURZbFm%2B_aJSA%40mail.gmail.com.

Oh right, I can see Amit's point too. Let's put pg_replication_slots() out of
the game then.

> > CR6 ===
> >
> > +static bool
> > +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
> > +{
> >
> > InvalidatePossiblyInactiveSlot() maybe?
> 
> I think we will lose the essence i.e. timeout from the suggested
> function name, otherwise just the inactive doesn't give a clearer
> meaning. I kept it that way unless anyone suggests otherwise.

Right. OTOH I think that "Possibly" adds some nuance (like 
InvalidatePossiblyObsoleteSlot()
is already doing).

> Please see the attached v30 patch. 0002 is where all of the above
> review comments have been addressed.

Thanks! FYI, I did not look at the content yet, just replied to the above
comments.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-01 Thread Bertrand Drouvot
Hi,

On Mon, Apr 01, 2024 at 08:47:59AM +0530, Bharath Rupireddy wrote:
> On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila  wrote:
> >
> > Commit message states: "why we can't just update inactive_since for
> > synced slots on the standby with the value received from remote slot
> > on the primary. This is consistent with any other slot parameter i.e.
> > all of them are synced from the primary."
> >
> > The inactive_since is not consistent with other slot parameters which
> > we copy. We don't perform anything related to those other parameters
> > like say two_phase phase which can change that property. However, we
> > do acquire the slot, advance the slot (as per recent discussion [1]),
> > and release it. Since these operations can impact inactive_since, it
> > seems to me that inactive_since is not the same as other parameters.
> > It can have a different value than the primary. Why would anyone want
> > to know the value of inactive_since from primary after the standby is
> > promoted?
> 
> After thinking about it for a while now, it feels to me that the
> synced slots (slots on the standby that are being synced from the
> primary) can have their own inactive_since value. Fundamentally,
> inactive_since is set to 0 when slot is acquired and set to current
> time when slot is released, no matter who acquires and releases it -
> be it walsenders for replication, or backends for slot advance, or
> backends for slot sync using pg_sync_replication_slots, or backends
> for other slot functions, or background sync worker. Remember the
> earlier patch was updating inactive_since just for walsenders, but
> then the suggestion was to update it unconditionally -
> https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com.
> Whoever syncs the slot, *actually* acquires the slot i.e. makes it
> theirs, syncs it from the primary, and releases it. IMO, no
> differentiation is to be made for synced slots.
> 
> There was a suggestion on using inactive_since of the synced slot on
> the standby to know the inactivity of the slot on the primary. If one
> wants to do that, they better look at/monitor the primary slot
> info/logs/pg_replication_slot/whatever.

Yeah but the use case was in case the primary is down for whatever reason.

> I really don't see a point in
> having two different meanings for a single property of a replication
> slot - inactive_since for a regular slot tells since when this slot
> has become inactive, and for a synced slot since when the
> corresponding remote slot has become inactive. I think this will
> confuse users for sure.

I'm not sure as we are speaking about "synced" slots. I can also see some
confusion if this value is not "synced".

> Also, if inactive_since is being changed on the primary so frequently
> while none of the other parameters are changing, and we copy
> inactive_since to the synced slots, then the standby will just be doing
> *sync* work (mark the slots dirty and save to disk) for updating
> inactive_since. I think this is unnecessary behaviour for sure.

Right, I think we should avoid saving the slot to disk in that case (question
raised by Amit in [1]).

> Coming to a future patch for inactive timeout based slot invalidation,
> we can either allow invalidation without any differentiation for
> synced slots or restrict invalidation to avoid more sync work. For
> instance, if inactive timeout is kept low on the standby, the sync
> worker will be doing more work as it drops and recreates a slot
> repeatedly if it keeps getting invalidated. Another thing is that the
> standby takes independent invalidation decisions for synced slots.
> AFAICS, invalidation due to wal_removal is the only sole reason (out
> of all available invalidation reasons) for a synced slot to get
> invalidated independently of the primary. Check
> https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com
> for the suggestion on not differentiating invalidation
> decisions for synced slots.

Yeah, I think the invalidation decision on the standby is highly linked to
what inactive_since on the standby is: synced from primary or not.

> The assumption of letting synced slots have their own inactive_since
> not only simplifies the code, but also looks less-confusing and more
> meaningful to the user.

I'm not sure at all. But if the majority of us thinks it's the case then let's
go that way.

> > Now, the other concern is that calling GetCurrentTimestamp()
> > could be costly when the values for the slot are not going to be
> > updated but if that happens we can optimize such that before acquiring
> > the slot we can have some minimal pre-checks to ensure whether we need
> > to update the slot or not.

Also maybe we could accept a bit less accuracy and use
GetCurrentTransactionStopTimestamp() instead?

> If we are too much concerned about the cost of GetCurrentTimestamp(),
> a possible approach is just don't 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-01 Thread Bertrand Drouvot
Hi,

On Mon, Apr 01, 2024 at 09:04:43AM +0530, Amit Kapila wrote:
> On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot
>  wrote:
> >
> > On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
> > > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
> > >  wrote:
> > > >
> > > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
> > > > >
> > > > > Commit message states: "why we can't just update inactive_since for
> > > > > synced slots on the standby with the value received from remote slot
> > > > > on the primary. This is consistent with any other slot parameter i.e.
> > > > > all of them are synced from the primary."
> > > > >
> > > > > The inactive_since is not consistent with other slot parameters which
> > > > > we copy. We don't perform anything related to those other parameters
> > > > > like say two_phase phase which can change that property. However, we
> > > > > do acquire the slot, advance the slot (as per recent discussion [1]),
> > > > > and release it. Since these operations can impact inactive_since, it
> > > > > seems to me that inactive_since is not the same as other parameters.
> > > > > It can have a different value than the primary. Why would anyone want
> > > > > to know the value of inactive_since from primary after the standby is
> > > > > promoted?
> > > >
> > > > I think it can be useful "before" it is promoted and in case the 
> > > > primary is down.
> > > >
> > >
> > > It is not clear to me what a user is going to do by checking the
> > > inactivity time for slots when the corresponding server is down.
> >
> > Say a failover needs to be done, then it could be useful to know for which
> > slots the activity needs to be resumed (thinking about external logical
> > decoding plugin, not about pub/sub here). If one sees an inactive slot
> > (inactive long "enough"), one can start to reason about what to do with it.
> >
> > > I thought the idea was to check such slots and see if they need to be
> > > dropped or enabled again to avoid excessive disk usage, etc.
> >
> > Yeah that's the case but it does not mean inactive_since can't be useful in 
> > other
> > ways.
> >
> > Also, say the slot has been invalidated on the primary (due to inactivity 
> > timeout),
> > primary is down and there is a failover. By keeping the inactive_since from
> > the primary, one could know when the inactivity that led to the timeout
> > started.
> >
> 
> So, this means at promotion, we won't set the current_time for
> inactive_since which is not what the currently proposed patch is
> doing.

Yeah, that's why I made the comment T2 in [1].

> Moreover, doing the invalidation on promoted standby based on
> inactive_since of the primary node sounds debatable because the
> inactive_timeout could be different on the new node (promoted
> standby).

I think that if the slot is not invalidated before the promotion then we should
erase the value from the primary and use the promotion time.

> > Again, more concerned about external logical decoding plugin than pub/sub 
> > here.
> >
> > > > I agree that tracking the activity time of a synced slot can be useful, 
> > > > why
> > > > not creating a dedicated field for that purpose (and keep 
> > > > inactive_since a
> > > > perfect "copy" of the primary)?
> > > >
> > >
> > > We can have a separate field for this but not sure if it is worth it.
> >
> > OTOH I'm not sure that erasing this information from the primary is useful. 
> > I
> > think that 2 fields would be the best option and would be less subject of
> > misinterpretation.
> >
> > > > > Now, the other concern is that calling GetCurrentTimestamp()
> > > > > could be costly when the values for the slot are not going to be
> > > > > updated but if that happens we can optimize such that before acquiring
> > > > > the slot we can have some minimal pre-checks to ensure whether we need
> > > > > to update the slot or not.
> > > >
> > > > Right, but for a very active slot it is likely that we call 
> > > > GetCurrentTimestamp()
> > > > during almost each sync cycle.
> > > >
> > >
> > > True, but if we have to save a slot to disk each time to persist the
> > > changes (for an active slot) then probably GetCurrentTimestamp()
> > > shouldn't be costly enough to matter.
> >
> > Right, persisting the changes to disk would be even more costly.
> >
> 
> The point I was making is that currently after copying the
> remote_node's values, we always persist the slots to disk, so the cost
> of current_time shouldn't be much.

Oh right, I missed this (was focusing only on inactive_since that we don't
persist to disk IIRC).

BTW, if we are going this way, maybe we could accept a bit less accuracy
and use GetCurrentTransactionStopTimestamp() instead?
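
For illustration, a sketch of that substitution (hedged: in recent PostgreSQL
versions GetCurrentTransactionStopTimestamp() caches the transaction stop time
once it has been taken, so repeated calls can skip a fresh system call at the
price of a slightly stale value):

    TimestampTz now = GetCurrentTransactionStopTimestamp(); /* may reuse cache */

    SpinLockAcquire(&s->mutex);
    s->inactive_since = now;
    SpinLockRelease(&s->mutex);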

> Now, if the values won't change
> then probably there is some cost but in most cases (active slots), the
> values will always change.

Right.

> Also, if all the slots are inactive then we
> will slow down the speed of sync.

Yes.

> We also need to 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-31 Thread Amit Kapila
On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot
 wrote:
>
> On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
> > On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
> >  wrote:
> > >
> > > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
> > > >
> > > > Commit message states: "why we can't just update inactive_since for
> > > > synced slots on the standby with the value received from remote slot
> > > > on the primary. This is consistent with any other slot parameter i.e.
> > > > all of them are synced from the primary."
> > > >
> > > > The inactive_since is not consistent with other slot parameters which
> > > > we copy. We don't perform anything related to those other parameters
> > > > like say two_phase phase which can change that property. However, we
> > > > do acquire the slot, advance the slot (as per recent discussion [1]),
> > > > and release it. Since these operations can impact inactive_since, it
> > > > seems to me that inactive_since is not the same as other parameters.
> > > > It can have a different value than the primary. Why would anyone want
> > > > to know the value of inactive_since from primary after the standby is
> > > > promoted?
> > >
> > > I think it can be useful "before" it is promoted and in case the primary 
> > > is down.
> > >
> >
> > It is not clear to me what a user is going to do by checking the
> > inactivity time for slots when the corresponding server is down.
>
> Say a failover needs to be done, then it could be useful to know for which
> slots the activity needs to be resumed (thinking about external logical
> decoding plugin, not about pub/sub here). If one sees an inactive slot
> (inactive long "enough"), one can start to reason about what to do with it.
>
> > I thought the idea was to check such slots and see if they need to be
> > dropped or enabled again to avoid excessive disk usage, etc.
>
> Yeah that's the case but it does not mean inactive_since can't be useful in 
> other
> ways.
>
> Also, say the slot has been invalidated on the primary (due to inactivity 
> timeout),
> primary is down and there is a failover. By keeping the inactive_since from
> the primary, one could know when the inactivity that led to the timeout
> started.
>

So, this means at promotion, we won't set the current_time for
inactive_since which is not what the currently proposed patch is
doing. Moreover, doing the invalidation on promoted standby based on
inactive_since of the primary node sounds debatable because the
inactive_timeout could be different on the new node (promoted
standby).

> Again, more concerned about external logical decoding plugin than pub/sub 
> here.
>
> > > I agree that tracking the activity time of a synced slot can be useful, 
> > > why
> > > not creating a dedicated field for that purpose (and keep inactive_since a
> > > perfect "copy" of the primary)?
> > >
> >
> > We can have a separate field for this but not sure if it is worth it.
>
> OTOH I'm not sure that erasing this information from the primary is useful. I
> think that 2 fields would be the best option and would be less subject of
> misinterpretation.
>
> > > > Now, the other concern is that calling GetCurrentTimestamp()
> > > > could be costly when the values for the slot are not going to be
> > > > updated but if that happens we can optimize such that before acquiring
> > > > the slot we can have some minimal pre-checks to ensure whether we need
> > > > to update the slot or not.
> > >
> > > Right, but for a very active slot it is likely that we call 
> > > GetCurrentTimestamp()
> > > during almost each sync cycle.
> > >
> >
> > True, but if we have to save a slot to disk each time to persist the
> > changes (for an active slot) then probably GetCurrentTimestamp()
> > shouldn't be costly enough to matter.
>
> Right, persisting the changes to disk would be even more costly.
>

The point I was making is that currently after copying the
remote_node's values, we always persist the slots to disk, so the cost
of current_time shouldn't be much. Now, if the values won't change
then probably there is some cost but in most cases (active slots), the
values will always change. Also, if all the slots are inactive then we
will slow down the speed of sync. We also need to consider if we want
to copy the value of inactive_since from the primary and if that is
the only value changed then shall we persist the slot or not?

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-31 Thread Bharath Rupireddy
On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila  wrote:
>
> Commit message states: "why we can't just update inactive_since for
> synced slots on the standby with the value received from remote slot
> on the primary. This is consistent with any other slot parameter i.e.
> all of them are synced from the primary."
>
> The inactive_since is not consistent with other slot parameters which
> we copy. We don't perform anything related to those other parameters
> like say two_phase phase which can change that property. However, we
> do acquire the slot, advance the slot (as per recent discussion [1]),
> and release it. Since these operations can impact inactive_since, it
> seems to me that inactive_since is not the same as other parameters.
> It can have a different value than the primary. Why would anyone want
> to know the value of inactive_since from primary after the standby is
> promoted?

After thinking about it for a while now, it feels to me that the
synced slots (slots on the standby that are being synced from the
primary) can have their own inactive_since value. Fundamentally,
inactive_since is set to 0 when slot is acquired and set to current
time when slot is released, no matter who acquires and releases it -
be it walsenders for replication, or backends for slot advance, or
backends for slot sync using pg_sync_replication_slots, or backends
for other slot functions, or background sync worker. Remember the
earlier patch was updating inactive_since just for walsenders, but
then the suggestion was to update it unconditionally -
https://www.postgresql.org/message-id/CAJpy0uD64X%3D2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg%40mail.gmail.com.
Whoever syncs the slot, *acutally* acquires the slot i.e. makes it
theirs, syncs it from the primary, and releases it. IMO, no
differentiation is to be made for synced slots.
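To make that concrete, here is a minimal sketch (illustrative only; the
helper name is made up, though the field and the locking pattern follow
slot.c):

/* Hypothetical helper, not the committed code. */
static void
set_slot_inactive_since(ReplicationSlot *slot, TimestampTz ts)
{
	SpinLockAcquire(&slot->mutex);
	slot->inactive_since = ts;
	SpinLockRelease(&slot->mutex);
}

/* On acquire the slot is in use, so the inactivity marker is cleared: */
set_slot_inactive_since(slot, 0);

/* On release it records when the current stretch of inactivity began: */
set_slot_inactive_since(slot, GetCurrentTimestamp());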

There was a suggestion on using inactive_since of the synced slot on
the standby to know the inactivity of the slot on the primary. If one
wants to do that, they better look at/monitor the primary slot
info/logs/pg_replication_slot/whatever. I really don't see a point in
having two different meanings for a single property of a replication
slot - inactive_since for a regular slot tells since when this slot
has become inactive, and for a synced slot since when the
corresponding remote slot has become inactive. I think this will
confuse users for sure.

Also, if inactive_since changes frequently on the primary while none
of the other parameters change, and we copy inactive_since to the
synced slots, then the standby will just be doing *sync* work (marking
the slots dirty and saving them to disk) merely to update
inactive_since. I think this is unnecessary behaviour for sure.

Coming to a future patch for inactive timeout based slot invalidation,
we can either allow invalidation without any differentiation for
synced slots or restrict invalidation to avoid more sync work. For
instance, if inactive timeout is kept low on the standby, the sync
worker will be doing more work as it drops and recreates a slot
repeatedly if it keeps getting invalidated. Another thing is that the
standby takes independent invalidation decisions for synced slots.
AFAICS, invalidation due to wal_removal is the sole reason (out
of all available invalidation reasons) for a synced slot to get
invalidated independently of the primary. Check
https://www.postgresql.org/message-id/CAA4eK1JXBwTaDRD_%3D8t6UB1fhRNjC1C%2BgH4YdDxj_9U6djLnXw%40mail.gmail.com
for the suggestion that we had better not differentiate invalidation
decisions for synced slots.

Letting synced slots have their own inactive_since not only simplifies
the code, but also looks less confusing and more meaningful to the
user. The only code that we put in on top of the
committed code is to use InRecovery in place of
RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised
by Shveta upthread.

> Now, the other concern is that calling GetCurrentTimestamp()
> could be costly when the values for the slot are not going to be
> updated but if that happens we can optimize such that before acquiring
> the slot we can have some minimal pre-checks to ensure whether we need
> to update the slot or not.
>
> [1] - 
> https://www.postgresql.org/message-id/OS0PR01MB571615D35F486080616CA841943A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com

A quick test with a function to measure the cost of
GetCurrentTimestamp [1] on my Ubuntu dev system (an AWS EC2 c5.4xlarge
instance) gives me [2]. It took 0.388 ms, 2.269 ms, 21.144 ms,
209.333 ms, 2091.174 ms, 20908.942 ms for 10K, 100K, 1million,
10million, 100million, 1billion times respectively. Costs might be
different on various systems with different OS, but it gives us a
rough idea.
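The test function itself was attached to the original mail and is not
reproduced in the archive; a sketch of the kind of loop used for such a
measurement could look like this (the function name and the SQL wrapper
are assumptions):

#include "postgres.h"
#include "fmgr.h"
#include "utils/timestamp.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(time_gct);

/* Returns the elapsed time in microseconds for 'loops' calls. */
Datum
time_gct(PG_FUNCTION_ARGS)
{
	int64		loops = PG_GETARG_INT64(0);
	volatile TimestampTz sink = 0;
	TimestampTz start = GetCurrentTimestamp();

	for (int64 i = 0; i < loops; i++)
		sink = GetCurrentTimestamp();	/* volatile keeps the call alive */
	(void) sink;

	PG_RETURN_INT64(GetCurrentTimestamp() - start);
}

It could then be exposed with something like CREATE FUNCTION
time_gct(bigint) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C STRICT;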

If we are too concerned about the cost of GetCurrentTimestamp(),
a possible approach is to just not set inactive_since for slots being
synced on the standby, and let the first acquisition and release
after the promotion do that job. We 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-30 Thread Bharath Rupireddy
On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot
 wrote:
>
> Regarding 0002:

Thanks for reviewing it.

> Some testing:
>
> T1 ===
>
> When the slot is invalidated on the primary, then the reason is propagated to
> the sync slot (if any). That's fine but we are losing the inactive_since on
> the standby:
>
> Primary:
>
> postgres=# select slot_name,inactive_since,conflicting,invalidation_reason
> from pg_replication_slots where slot_name='lsub29_slot';
>   slot_name  |        inactive_since         | conflicting | invalidation_reason
> -------------+-------------------------------+-------------+---------------------
>  lsub29_slot | 2024-03-28 08:24:51.672528+00 | f           | inactive_timeout
> (1 row)
>
> Standby:
>
> postgres=# select slot_name,inactive_since,conflicting,invalidation_reason
> from pg_replication_slots where slot_name='lsub29_slot';
>   slot_name  | inactive_since | conflicting | invalidation_reason
> -------------+----------------+-------------+---------------------
>  lsub29_slot |                | f           | inactive_timeout
> (1 row)
>
> I think in this case it should always reflect the value from the primary (so
> that one can understand why it is invalidated).

I'll come back to this as soon as we all agree on inactive_since
behavior for synced slots.

> T2 ===
>
> And it is set to a value during promotion:
>
> postgres=# select pg_promote();
>  pg_promote
> ------------
>  t
> (1 row)
>
> postgres=# select slot_name,inactive_since,conflicting,invalidation_reason
> from pg_replication_slots where slot_name='lsub29_slot';
>   slot_name  |        inactive_since        | conflicting | invalidation_reason
> -------------+------------------------------+-------------+---------------------
>  lsub29_slot | 2024-03-28 08:30:11.74505+00 | f           | inactive_timeout
> (1 row)
>
> I think when it is invalidated it should always reflect the value from the
> primary (so that one can understand why it is invalidated).

I'll come back to this as soon as we all agree on inactive_since
behavior for synced slots.

> T3 ===
>
> As far the slot invalidation on the primary:
>
> postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, 
> NULL, 'include-xids', '0');
> ERROR:  cannot acquire invalidated replication slot "lsub29_slot"
>
> Can we make the message more consistent with what can be found in 
> CreateDecodingContext()
> for example?

Hm, that makes sense because slot acquisition and release are
internal to the server.

> T4 ===
>
> Also, it looks like querying pg_replication_slots() does not trigger an
> invalidation: I think it should if the slot is not invalidated yet (and 
> matches
> the invalidation criteria).

There's a different opinion on this, check comment #3 from
https://www.postgresql.org/message-id/CAA4eK1LLj%2BeaMN-K8oeOjfG%2BUuzTY%3DL5PXbcMJURZbFm%2B_aJSA%40mail.gmail.com.

> Code review:
>
> CR1 ===
>
> +Invalidate replication slots that are inactive for longer than this
> +amount of time. If this value is specified without units, it is taken
>
> s/Invalidate/Invalidates/?

Done.

> Should we mention the relationship with inactive_since?

Done.

> CR2 ===
>
> + *
> + * If check_for_invalidation is true, the slot is checked for invalidation
> + * based on replication_slot_inactive_timeout GUC and an error is raised 
> after making the slot ours.
>   */
>  void
> -ReplicationSlotAcquire(const char *name, bool nowait)
> +ReplicationSlotAcquire(const char *name, bool nowait,
> +  bool check_for_invalidation)
>
>
> s/check_for_invalidation/check_for_timeout_invalidation/?

Done.

> CR3 ===
>
> +   if (slot->inactive_since == 0 ||
> +   replication_slot_inactive_timeout == 0)
> +   return false;
>
> Better to test replication_slot_inactive_timeout first? (I mean there is no
> point in testing inactive_since if replication_slot_inactive_timeout == 0)
>
> CR4 ===
>
> +   if (slot->inactive_since > 0 &&
> +   replication_slot_inactive_timeout > 0)
> +   {
>
> Same.
>
> So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do
> something like:
>
> if (replication_slot_inactive_timeout == 0)
> return false;
> else if (slot->inactive_since > 0)
> .
> else
> return false;
>
> That would avoid checking replication_slot_inactive_timeout and inactive_since
> multiple times.

Done.

> CR5 ===
>
> +* held to avoid race conditions -- for example the restart_lsn could 
> move
> +* forward, or the slot could be dropped.
>
> Does the restart_lsn example make sense here?

No, it doesn't. Modified that.

> CR6 ===
>
> +static bool
> +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
> +{
>
> InvalidatePossiblyInactiveSlot() maybe?

I think we would lose the essence, i.e. the timeout, from the
suggested function name; "inactive" alone doesn't convey the meaning
clearly. I kept it 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-29 Thread Bertrand Drouvot
Hi,

On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
> On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
>  wrote:
> >
> > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
> > >
> > > Commit message states: "why we can't just update inactive_since for
> > > synced slots on the standby with the value received from remote slot
> > > on the primary. This is consistent with any other slot parameter i.e.
> > > all of them are synced from the primary."
> > >
> > > The inactive_since is not consistent with other slot parameters which
> > > we copy. We don't perform anything related to those other parameters
> > > like say two_phase phase which can change that property. However, we
> > > do acquire the slot, advance the slot (as per recent discussion [1]),
> > > and release it. Since these operations can impact inactive_since, it
> > > seems to me that inactive_since is not the same as other parameters.
> > > It can have a different value than the primary. Why would anyone want
> > > to know the value of inactive_since from primary after the standby is
> > > promoted?
> >
> > I think it can be useful "before" it is promoted and in case the primary is 
> > down.
> >
> 
> It is not clear to me what the user is going to do by checking the
> inactivity time for slots when the corresponding server is down.

Say a failover needs to be done; then it could be useful to know for
which slots the activity needs to be resumed (thinking about an
external logical decoding plugin, not about pub/sub here). If one sees
a slot that has been inactive for long enough, one can start to reason
about what to do with it.

> I thought the idea was to check such slots and see if they need to be
> dropped or enabled again to avoid excessive disk usage, etc.

Yeah, that's the case, but it does not mean inactive_since can't be
useful in other ways.

Also, say the slot has been invalidated on the primary (due to the
inactivity timeout), the primary is down, and there is a failover. By
keeping the inactive_since from the primary, one could know when the
inactivity that led to the timeout started.

Again, more concerned about external logical decoding plugin than pub/sub here.

> > I agree that tracking the activity time of a synced slot can be useful, why
> > not creating a dedicated field for that purpose (and keep inactive_since a
> > perfect "copy" of the primary)?
> >
> 
> We can have a separate field for this but not sure if it is worth it.

OTOH I'm not sure that erasing this information from the primary is useful. I
think that 2 fields would be the best option and would be less subject of
misinterpretation.

> > > Now, the other concern is that calling GetCurrentTimestamp()
> > > could be costly when the values for the slot are not going to be
> > > updated but if that happens we can optimize such that before acquiring
> > > the slot we can have some minimal pre-checks to ensure whether we need
> > > to update the slot or not.
> >
> > Right, but for a very active slot it is likely that we call 
> > GetCurrentTimestamp()
> > during almost each sync cycle.
> >
> 
> True, but if we have to save a slot to disk each time to persist the
> changes (for an active slot) then probably GetCurrentTimestamp()
> shouldn't be costly enough to matter.

Right, persisting the changes to disk would be even more costly.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-29 Thread Amit Kapila
On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
 wrote:
>
> On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
> >
> > Commit message states: "why we can't just update inactive_since for
> > synced slots on the standby with the value received from remote slot
> > on the primary. This is consistent with any other slot parameter i.e.
> > all of them are synced from the primary."
> >
> > The inactive_since is not consistent with other slot parameters which
> > we copy. We don't perform anything related to those other parameters
> > like say two_phase phase which can change that property. However, we
> > do acquire the slot, advance the slot (as per recent discussion [1]),
> > and release it. Since these operations can impact inactive_since, it
> > seems to me that inactive_since is not the same as other parameters.
> > It can have a different value than the primary. Why would anyone want
> > to know the value of inactive_since from primary after the standby is
> > promoted?
>
> I think it can be useful "before" it is promoted and in case the primary is 
> down.
>

It is not clear to me what the user is going to do by checking the
inactivity time for slots when the corresponding server is down. I
thought the idea was to check such slots and see if they need to be
dropped or enabled again to avoid excessive disk usage, etc.

> I agree that tracking the activity time of a synced slot can be useful, why
> not creating a dedicated field for that purpose (and keep inactive_since a
> perfect "copy" of the primary)?
>

We can have a separate field for this but not sure if it is worth it.

> > Now, the other concern is that calling GetCurrentTimestamp()
> > could be costly when the values for the slot are not going to be
> > updated but if that happens we can optimize such that before acquiring
> > the slot we can have some minimal pre-checks to ensure whether we need
> > to update the slot or not.
>
> Right, but for a very active slot it is likely that we call 
> GetCurrentTimestamp()
> during almost each sync cycle.
>

True, but if we have to save a slot to disk each time to persist the
changes (for an active slot) then probably GetCurrentTimestamp()
shouldn't be costly enough to matter.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-29 Thread Bertrand Drouvot
Hi,

On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
> On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy
>  wrote:
> >
> >
> > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
> > standby for sync slots.
> >
> 
> Commit message states: "why we can't just update inactive_since for
> synced slots on the standby with the value received from remote slot
> on the primary. This is consistent with any other slot parameter i.e.
> all of them are synced from the primary."
> 
> The inactive_since is not consistent with other slot parameters which
> we copy. We don't perform anything related to those other parameters
> like say two_phase phase which can change that property. However, we
> do acquire the slot, advance the slot (as per recent discussion [1]),
> and release it. Since these operations can impact inactive_since, it
> seems to me that inactive_since is not the same as other parameters.
> It can have a different value than the primary. Why would anyone want
> to know the value of inactive_since from primary after the standby is
> promoted?

I think it can be useful "before" it is promoted and in case the primary is 
down.
I agree that tracking the activity time of a synced slot can be useful, why
not creating a dedicated field for that purpose (and keep inactive_since a
perfect "copy" of the primary)?

> Now, the other concern is that calling GetCurrentTimestamp()
> could be costly when the values for the slot are not going to be
> updated but if that happens we can optimize such that before acquiring
> the slot we can have some minimal pre-checks to ensure whether we need
> to update the slot or not.

Right, but for a very active slot it is likely that we call 
GetCurrentTimestamp()
during almost each sync cycle.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-28 Thread Amit Kapila
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy
 wrote:
>
>
> Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
> standby for sync slots.
>

Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."

The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted? Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.

[1] - 
https://www.postgresql.org/message-id/OS0PR01MB571615D35F486080616CA841943A2%40OS0PR01MB5716.jpnprd01.prod.outlook.com
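For illustration, the kind of minimal pre-check meant above could look
roughly like this (a sketch only; the helper name is hypothetical, while
the RemoteSlot and ReplicationSlot fields follow slotsync.c and slot.h):

/* Skip the acquire/GetCurrentTimestamp/release cycle if nothing changed. */
static bool
synced_slot_needs_update(RemoteSlot *remote_slot, ReplicationSlot *slot)
{
	return remote_slot->confirmed_lsn != slot->data.confirmed_flush ||
		remote_slot->restart_lsn != slot->data.restart_lsn ||
		remote_slot->catalog_xmin != slot->data.catalog_xmin;
}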

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-28 Thread Bertrand Drouvot
Hi,

On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote:
> standby for sync slots. 0002 implementing inactive timeout GUC based
> invalidation mechanism.
> 
> Please have a look.

Thanks!

Regarding 0002:

Some testing:

T1 ===

When the slot is invalidated on the primary, then the reason is propagated to
the sync slot (if any). That's fine but we are losing the inactive_since on the
standby:

Primary:

postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from
pg_replication_slots where slot_name='lsub29_slot';
  slot_name  |        inactive_since         | conflicting | invalidation_reason
-------------+-------------------------------+-------------+---------------------
 lsub29_slot | 2024-03-28 08:24:51.672528+00 | f           | inactive_timeout
(1 row)

Standby:

postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from
pg_replication_slots where slot_name='lsub29_slot';
  slot_name  | inactive_since | conflicting | invalidation_reason
-------------+----------------+-------------+---------------------
 lsub29_slot |                | f           | inactive_timeout
(1 row)

I think in this case it should always reflect the value from the primary (so
that one can understand why it is invalidated).

T2 ===

And it is set to a value during promotion:

postgres=# select pg_promote();
 pg_promote
------------
 t
(1 row)

postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from
pg_replication_slots where slot_name='lsub29_slot';
  slot_name  |        inactive_since        | conflicting | invalidation_reason
-------------+------------------------------+-------------+---------------------
 lsub29_slot | 2024-03-28 08:30:11.74505+00 | f           | inactive_timeout
(1 row)

I think when it is invalidated it should always reflect the value from the
primary (so that one can understand why it is invalidated).

T3 ===

As far the slot invalidation on the primary:

postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, NULL, 
'include-xids', '0');
ERROR:  cannot acquire invalidated replication slot "lsub29_slot"

Can we make the message more consistent with what can be found in
CreateDecodingContext(), for example?
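For instance, an ereport() shaped like the existing invalidation error
in CreateDecodingContext() might read as follows (a sketch; the errcode
and the exact wording are assumptions, not the patch's actual message):

ereport(ERROR,
		(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
		 errmsg("can no longer get changes from replication slot \"%s\"",
				NameStr(slot->data.name)),
		 errdetail("This slot has been invalidated because it was inactive "
				   "for longer than replication_slot_inactive_timeout.")));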

T4 ===

Also, it looks like querying pg_replication_slots() does not trigger an
invalidation: I think it should if the slot is not invalidated yet (and matches
the invalidation criteria).

Code review:

CR1 ===

+Invalidate replication slots that are inactive for longer than this
+amount of time. If this value is specified without units, it is taken

s/Invalidate/Invalidates/?

Should we mention the relationship with inactive_since?

CR2 ===

+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours.
  */
 void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+  bool check_for_invalidation)


s/check_for_invalidation/check_for_timeout_invalidation/?

CR3 ===

+   if (slot->inactive_since == 0 ||
+   replication_slot_inactive_timeout == 0)
+   return false;

Better to test replication_slot_inactive_timeout first? (I mean there is no
point in testing inactive_since if replication_slot_inactive_timeout == 0)

CR4 ===

+   if (slot->inactive_since > 0 &&
+   replication_slot_inactive_timeout > 0)
+   {

Same.

So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do
something like:

if (replication_slot_inactive_timeout == 0)
return false;
else if (slot->inactive_since > 0)
.
.
.
.
else
return false;

That would avoid checking replication_slot_inactive_timeout and inactive_since
multiple times.
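Filling in the elided middle, the combined check could take roughly this
shape (a sketch assuming the GUC is in seconds, given that
TimestampDifferenceExceeds() takes milliseconds):

static bool
SlotInactiveTimeoutExpired(ReplicationSlot *slot)
{
	if (replication_slot_inactive_timeout == 0)
		return false;
	else if (slot->inactive_since > 0)
		return TimestampDifferenceExceeds(slot->inactive_since,
										  GetCurrentTimestamp(),
										  replication_slot_inactive_timeout * 1000);
	else
		return false;
}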

CR5 ===

+* held to avoid race conditions -- for example the restart_lsn could move
+* forward, or the slot could be dropped.

Does the restart_lsn example make sense here?

CR6 ===

+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
+{

InvalidatePossiblyInactiveSlot() maybe?

CR7 ===

+   /* Make sure the invalidated state persists across server restart */
+   slot->just_dirtied = true;
+   slot->dirty = true;
+   SpinLockRelease(&slot->mutex);

Maybe we could create a new function say MarkGivenReplicationSlotDirty()
with a slot as parameter, that ReplicationSlotMarkDirty could call too?

Then maybe we could set slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT in
InvalidateSlotForInactiveTimeout()? (to avoid multiple 
SpinLockAcquire/SpinLockRelease).
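A sketch of that refactoring (an assumed shape, not the actual patch):

static void
MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
{
	SpinLockAcquire(&slot->mutex);
	slot->just_dirtied = true;
	slot->dirty = true;
	SpinLockRelease(&slot->mutex);
}

void
ReplicationSlotMarkDirty(void)
{
	Assert(MyReplicationSlot != NULL);
	MarkGivenReplicationSlotDirty(MyReplicationSlot);
}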

CR8 ===

+   if (persist_state)
+   {
+   char        path[MAXPGPATH];
+
+   sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+   SaveSlotToPath(slot, path, ERROR);
+   }

Maybe we could create a new function say 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy
 wrote:
>
> Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
> standby for sync slots. 0002 implementing inactive timeout GUC based
> invalidation mechanism.
>
> Please have a look.

Thanks for the patches. v29-001 looks good to me.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread Bertrand Drouvot
Hi,

On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote:
> On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> >
> > On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote:
> > > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
> > > Please see the attached v28 patch.
> >
> > Thanks!
> >
> > 1 === sorry I missed it in the previous review
> >
> > if (!(RecoveryInProgress() && slot->data.synced))
> > +   {
> > now = GetCurrentTimestamp();
> > +   update_inactive_since = true;
> > +   }
> > +   else
> > +   update_inactive_since = false;
> >
> > I think update_inactive_since is not needed, we could rely on (now > 0) 
> > instead.
> 
> I thought of using that, but at the expense of readability. I prefer
> to use a variable instead.

That's fine too.

> However, I changed the variable name to the more meaningful is_slot_being_synced.

Yeah makes sense and even easier to read.

v29-0001 LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread Bharath Rupireddy
On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote:
> > On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
> > Please see the attached v28 patch.
>
> Thanks!
>
> 1 === sorry I missed it in the previous review
>
> if (!(RecoveryInProgress() && slot->data.synced))
> +   {
> now = GetCurrentTimestamp();
> +   update_inactive_since = true;
> +   }
> +   else
> +   update_inactive_since = false;
>
> I think update_inactive_since is not needed, we could rely on (now > 0) 
> instead.

I thought of using that, but at the expense of readability. I prefer
to use a variable instead. However, I changed the variable name to the
more meaningful is_slot_being_synced.

> 2 ===
>
> +=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname)
> +
> +Get inactive_since column value for a given replication slot validating it
> +against optional reference time.
> +
> +=cut
> +
> +sub get_slot_inactive_since_value
> +{
>
> shouldn't it be "=item $node->get_slot_inactive_since_value(self, slot_name,
> reference_time)" instead?

Ugh. Changed.

> Apart from the above, LGTM.

Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
standby for sync slots. 0002 implementing inactive timeout GUC based
invalidation mechanism.

Please have a look.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v29-0001-Maintain-inactive_since-for-synced-slots-correct.patch
Description: Binary data


v29-0002-Add-inactive_timeout-based-replication-slot-inva.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread Bertrand Drouvot
Hi,

On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote:
> On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
> Please see the attached v28 patch.

Thanks!

1 === sorry I missed it in the previous review

if (!(RecoveryInProgress() && slot->data.synced))
+   {
now = GetCurrentTimestamp();
+   update_inactive_since = true;
+   }
+   else
+   update_inactive_since = false;

I think update_inactive_since is not needed, we could rely on (now > 0) instead.

2 ===

+=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{

shouldn't it be "=item $node->get_slot_inactive_since_value(self, slot_name,
reference_time)" instead?

Apart from the above, LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread Bharath Rupireddy
On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
 wrote:
>
> 1 ===
>
> My proposal (in text) but feel free to reword it:
>
> Note that the slots on the standbys that are being synced from a
> primary server (whose synced field is true) will get the inactive_since value
> from the corresponding remote slot on the primary. Also, after the standby
> starts up, the inactive_since (for such synced slots) will remain zero until
> the next synchronization.

WFM.

> 2 ===
>
> +=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname)
>
> get_slot_inactive_since_value instead?

Ugh. Changed.

> 3 ===
>
> +against given reference time.
>
> s/given reference/optional given reference/?

Done.

> Apart from the above, LGTM.

Thanks for reviewing.

On Wed, Mar 27, 2024 at 3:43 PM shveta malik  wrote:
>
> Thanks for the patch. Regarding doc, I have few comments.

Thanks for reviewing.

> a)  "inactive_since will remain  zero"
> Since it is user exposed info and the user finds it NULL in
> pg_replication_slots, shall we mention NULL instead of 0?

Right. Changed.

> b) Since we are referring to the sync cycle here, I feel it will be
> good to give a link to that page.
> +zero until the next slot sync cycle (see
> +<xref linkend="logicaldecoding-replication-slots-synchronization"/> for
> +slot synchronization details).

WFM.

Please see the attached v28 patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v28-0001-Maintain-inactive_since-for-synced-slots-correct.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 2:55 PM Bharath Rupireddy
 wrote:
>
> On Wed, Mar 27, 2024 at 11:39 AM shveta malik  wrote:
> >
> > Thanks for the patch. Few trivial things:
>
> Thanks for reviewing.
>
> > --
> > 1)
> > system-views.sgml:
> >
> > a) "Note that the slots" --> "Note that the slots on the standbys,"
> > --it is good to mention "standbys" as synced could be true on primary
> > as well (promoted standby)
>
> Done.
>
> > b) If you plan to add more info which Bertrand suggested, then it will
> > be better to make a <note> section instead of using "Note"
>
> I added the note that Bertrand specified upthread. But I couldn't
> find an instance of adding <note> ... </note> within a table. Hence I
> went with "Note that ..." statements just like any other notes in
> system-views.sgml. pg_replication_slots in system-views.sgml renders as a
> table, so having <note> ... </note> may not be a great idea.
>
> > 2)
> > commit msg:
> >
> > "The impact of this
> > on a promoted standby inactive_since is always NULL for all
> > synced slots even after server restart.
> > "
> > Sentence looks broken.
> > -
>
> Reworded.
>
> > Apart from the above trivial things, v26-001 looks good to me.
>
> Please check the attached v27 patch which also has Bertrand's comment
> on deduplicating the TAP function. I've now moved it to Cluster.pm.
>

Thanks for the patch. Regarding doc, I have few comments.

+Note that the slots on the standbys that are being synced from a
+primary server (whose synced field is
+true), will get the
+inactive_since value from the
+corresponding remote slot on the primary. Also, note that for the
+synced slots on the standby, after the standby starts up (i.e. after
+the slots are loaded from the disk), the inactive_since will remain
+zero until the next slot sync cycle.

a)  "inactive_since will remain  zero"
Since it is user exposed info and the user finds it NULL in
pg_replication_slots, shall we mention NULL instead of 0?

b) Since we are referring to the sync cycle here, I feel it will be
good to give a link to that page.
+zero until the next slot sync cycle (see
+<xref linkend="logicaldecoding-replication-slots-synchronization"/> for
+slot synchronization details).

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread Bertrand Drouvot
Hi,

On Wed, Mar 27, 2024 at 02:55:17PM +0530, Bharath Rupireddy wrote:
> Please check the attached v27 patch which also has Bertrand's comment
> on deduplicating the TAP function. I've now moved it to Cluster.pm.

Thanks!

1 ===

+Note that the slots on the standbys that are being synced from a
+primary server (whose synced field is
+true), will get the
+inactive_since value from the
+corresponding remote slot on the primary. Also, note that for the
+synced slots on the standby, after the standby starts up (i.e. after
+the slots are loaded from the disk), the inactive_since will remain
+zero until the next slot sync cycle.

Not sure we should mention the "(i.e. after the slots are loaded from the disk)"
and also "cycle" (as that does not sound right in case of manual sync).

My proposal (in text) but feel free to reword it:

Note that the slots on the standbys that are being synced from a
primary server (whose synced field is true) will get the inactive_since value
from the corresponding remote slot on the primary. Also, after the standby
starts up, the inactive_since (for such synced slots) will remain zero until
the next synchronization.

2 ===

+=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname)

get_slot_inactive_since_value instead?

3 ===

+against given reference time.

s/given reference/optional given reference/?


Apart from the above, LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread Bharath Rupireddy
On Wed, Mar 27, 2024 at 11:39 AM shveta malik  wrote:
>
> Thanks for the patch. Few trivial things:

Thanks for reviewing.

> --
> 1)
> system-views.sgml:
>
> a) "Note that the slots" --> "Note that the slots on the standbys,"
> --it is good to mention "standbys" as synced could be true on primary
> as well (promoted standby)

Done.

> b) If you plan to add more info which Bertrand suggested, then it will
> be better to make a <note> section instead of using "Note"

I added the note that Bertrand specified upthread. But I couldn't
find an instance of adding <note> ... </note> within a table. Hence I
went with "Note that ..." statements just like any other notes in
system-views.sgml. pg_replication_slots in system-views.sgml renders as a
table, so having <note> ... </note> may not be a great idea.

> 2)
> commit msg:
>
> "The impact of this
> on a promoted standby inactive_since is always NULL for all
> synced slots even after server restart.
> "
> Sentence looks broken.
> -

Reworded.

> Apart from the above trivial things, v26-001 looks good to me.

Please check the attached v27 patch which also has Bertrand's comment
on deduplicating the TAP function. I've now moved it to Cluster.pm.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From b4f113ef1d3467383913d0f04cc672372133420d Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 27 Mar 2024 09:15:41 +
Subject: [PATCH v27] Maintain inactive_since for synced slots correctly.

The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. Because of this,
inactive_since is always NULL on a promoted standby for all
synced slots even after server restart.

Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.

This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.

2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.

Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
 doc/src/sgml/system-views.sgml|  8 +++
 src/backend/replication/logical/slotsync.c| 61 ++-
 src/backend/replication/slot.c| 41 +
 src/test/perl/PostgreSQL/Test/Cluster.pm  | 34 +++
 src/test/recovery/t/019_replslot_limit.pl | 26 +---
 .../t/040_standby_failover_slots_sync.pl  | 43 +
 6 files changed, 174 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..07f6d177e8 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
 The time since the slot has become inactive.
 NULL if the slot is currently being used.
+Note that the slots on the standbys that are being synced from a
+primary server (whose synced field is
+true), will get the
+inactive_since value from the
+corresponding remote slot on the primary. Also, note that for the
+synced slots on the standby, after the standby starts up (i.e. after
+the slots are loaded from the disk), the inactive_since will remain
+zero until the next slot sync cycle.
   
  
 
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..9c95a4b062 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
 
 	/* RS_INVAL_NONE if valid, or the reason of invalidation */
 	ReplicationSlotInvalidationCause invalidated;
+
+	TimestampTz inactive_since; /* in seconds */
 } RemoteSlot;
 
 static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
 
 /*
  * If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		remote_slot->two_phase == slot->data.two_phase &&
 		remote_slot->failover == slot->data.failover &&
 		remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+		

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 11:05 AM Bharath Rupireddy
 wrote:
>
> Fixed an issue in synchronize_slots where DatumGetLSN is being used in
> place of DatumGetTimestampTz. Found this via CF bot member [1], not on
> my dev system.
>
> Please find the attached v26 patch.

Thanks for the patch. Few trivial things:

--
1)
system-views.sgml:

a) "Note that the slots" --> "Note that the slots on the standbys,"
--it is good to mention "standbys" as synced could be true on primary
as well (promoted standby)

b) If you plan to add more info which Bertrand suggested, then it will
be better to make a <note> section instead of using "Note"

2)
commit msg:

"The impact of this
on a promoted standby inactive_since is always NULL for all
synced slots even after server restart.
"
Sentence looks broken.
-

Apart from the above trivial things, v26-001 looks good to me.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Wed, Mar 27, 2024 at 10:08:33AM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
>  wrote:
> >
> > -   if (!(RecoveryInProgress() && slot->data.synced))
> > +   if (!(InRecovery && slot->data.synced))
> > slot->inactive_since = GetCurrentTimestamp();
> > else
> > slot->inactive_since = 0;
> >
> > Not related to this change but more the way RestoreSlotFromDisk() behaves 
> > here:
> >
> > For a sync slot on standby it will be set to zero and then later will be
> > synchronized with the one coming from the primary. I think that's fine to 
> > have
> > it to zero for this window of time.
> 
> Right.
> 
> > Now, if the standby is down and one sets sync_replication_slots to off,
> > then inactive_since will be set to zero on the standby at startup and not
> > synchronized (unless one triggers a manual sync). I also think that's fine
> > but it might be worth documenting this behavior (that after a standby startup
> > inactive_since is zero until the next sync...).
> 
> Isn't this behaviour applicable for other slot parameters that the
> slot syncs from the remote slot on the primary?

No, they are persisted on disk. If not, we'd not know where to resume
the decoding from on the standby in case the primary is down and/or
sync is off.

> I've added the following note in the comments when we update
> inactive_since in RestoreSlotFromDisk.
> 
>  * Note that for synced slots after the standby starts up (i.e. after
>  * the slots are loaded from the disk), the inactive_since will remain
>  * zero until the next slot sync cycle.
>  */
> if (!(InRecovery && slot->data.synced))
> slot->inactive_since = GetCurrentTimestamp();
> else
> slot->inactive_since = 0;

I think we should add some words in the doc too and also about what the meaning
of inactive_since on the standby is (as suggested by Shveta in [1]).

[1]: 
https://www.postgresql.org/message-id/CAJpy0uDkTW%2Bt1k3oPkaipFBzZePfFNB5DmiA%3D%3DpxRGcAdpF%3DPg%40mail.gmail.com

> > 7 ===
> >
> > +# Capture and validate inactive_since of a given slot.
> > +sub capture_and_validate_slot_inactive_since
> > +{
> > +   my ($node, $slot_name, $slot_creation_time) = @_;
> > +   my $name = $node->name;
> >
> > We now have capture_and_validate_slot_inactive_since in 2 places:
> > 040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.
> >
> > Worth creating a sub in Cluster.pm?
> 
> I'd second that thought for now. We might have to debate first if it's
> useful for all the nodes even without replication, and if yes, the
> naming stuff and all that. Historically, we've had such duplicated
> functions until recently, for instance advance_wal and log_contains.
> We moved them over to a common perl library Cluster.pm very recently.
> I'm sure we can come back later to move it to Cluster.pm.

I thought that would be the right time not to introduce duplicated code.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bharath Rupireddy
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
 wrote:
>
> Please find the attached v25-0001 (made this 0001 patch now as
> inactive_since patch is committed) patch with the above changes.

Fixed an issue in synchronize_slots where DatumGetLSN is being used in
place of DatumGetTimestampTz. Found this via CF bot member [1], not on
my dev system.

Please find the attached v26 patch.
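The gist of the fix, as a sketch (the column position follows the query
visible in the backtrace below; since inactive_since is a timestamptz,
its Datum must be fetched with DatumGetTimestampTz, not DatumGetLSN):

/* inactive_since is the 10th column of the slot-sync query. */
d = slot_getattr(tupslot, 10, &isnull);
remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);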


[1]
[05:14:39.281] #7  DatumGetLSN (X=) at
../src/include/utils/pg_lsn.h:24
[05:14:39.281] No locals.
[05:14:39.281] #8  synchronize_slots (wrconn=wrconn@entry=0x583cd170)
at ../src/backend/replication/logical/slotsync.c:757
[05:14:39.281] isnull = false
[05:14:39.281] remote_slot = 0x583ce1a8
[05:14:39.281] d = 
[05:14:39.281] col = 10
[05:14:39.281] slotRow = {25, 25, 3220, 3220, 28, 16, 16, 25, 25, 1184}
[05:14:39.281] res = 0x583cd1b8
[05:14:39.281] tupslot = 0x583ce11c
[05:14:39.281] remote_slot_list = 0x0
[05:14:39.281] some_slot_updated = false
[05:14:39.281] started_tx = false
[05:14:39.281] query = 0x57692bc4 "SELECT slot_name, plugin,
confirmed_flush_lsn, restart_lsn, catalog_xmin, two_phase, failover,
database, invalidation_reason, inactive_since FROM
pg_catalog.pg_replication_slots WHERE failover and NOT"...
[05:14:39.281] __func__ = "synchronize_slots"
[05:14:39.281] #9  0x56ff9d1e in SyncReplicationSlots
(wrconn=0x583cd170) at
../src/backend/replication/logical/slotsync.c:1504

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 8c5f7d0064e0e01a2d458872ecc1d8e682ddc033 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 27 Mar 2024 05:29:18 +
Subject: [PATCH v26] Maintain inactive_since for synced slots correctly.

The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. The impact of this
on a promoted standby inactive_since is always NULL for all
synced slots even after server restart.

Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.

This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.

2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.

Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
 doc/src/sgml/system-views.sgml|  4 ++
 src/backend/replication/logical/slotsync.c| 61 -
 src/backend/replication/slot.c| 41 
 .../t/040_standby_failover_slots_sync.pl  | 66 +++
 4 files changed, 157 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7713f168e7 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
 The time since the slot has become inactive.
 NULL if the slot is currently being used.
+Note that the slots that are being synced from a primary server
+(whose synced field is true), will get the
+inactive_since value from the corresponding
+remote slot on the primary.
   
  
 
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..9c95a4b062 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
 
 	/* RS_INVAL_NONE if valid, or the reason of invalidation */
 	ReplicationSlotInvalidationCause invalidated;
+
+	TimestampTz inactive_since; /* in seconds */
 } RemoteSlot;
 
 static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
 
 /*
  * If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		remote_slot->two_phase == slot->data.two_phase &&
 		remote_slot->failover == slot->data.failover &&
 		remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+		remote_slot->inactive_since == slot->inactive_since &&
 		

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 6:05 PM Bertrand Drouvot
 wrote:
>
>
> > We can think on that later if we really need another
> > field which give us sync time.
>
> I think that calling GetCurrentTimestamp() so frequently could be too costly, 
> so
> I'm not sure we should.

Agreed.

> > In my second approach, I have tried to
> > avoid updating inactive_since for synced slots during sync process. We
> > update that field during creation of synced slot so that
> > inactive_since reflects correct info even for synced slots (rather
> > than copying from primary).
>
> Yeah, and I think we could create a dedicated field with this information
> if we feel the need.

Okay.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bharath Rupireddy
On Wed, Mar 27, 2024 at 10:24 AM shveta malik  wrote:
>
> On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila  wrote:
> >
> > On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
> >  wrote:
> > >
> > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
> > >  wrote:
> > >
> > > > 3)
> > > > update_synced_slots_inactive_time():
> > > >
> > > > This assert is removed, is it intentional?
> > > > Assert(s->active_pid == 0);
> > >
> > > Yes, the slot can get acquired in the corner case when someone runs
> > > pg_sync_replication_slots concurrently at this time. I'm referring to
> > > the issue reported upthread. We don't prevent one running
> > > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right?
> > > Maybe we should prevent that otherwise some of the slots are synced
> > > and the standby gets promoted while others are yet-to-be-synced.
> > >
> >
> > We should do something about it but that shouldn't be done in this
> > patch. We can handle it separately and then add such an assert.
>
> Agreed. Once this patch is concluded, I can fix the slot sync shutdown
> issue and will also add this 'assert' back.

Agreed. Thanks.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila  wrote:
>
> On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
>  wrote:
> >
> > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
> >  wrote:
> >
> > > 3)
> > > update_synced_slots_inactive_time():
> > >
> > > This assert is removed, is it intentional?
> > > Assert(s->active_pid == 0);
> >
> > Yes, the slot can get acquired in the corner case when someone runs
> > pg_sync_replication_slots concurrently at this time. I'm referring to
> > the issue reported upthread. We don't prevent one running
> > pg_sync_replication_slots in promotion/ShutDownSlotSync phase right?
> > Maybe we should prevent that otherwise some of the slots are synced
> > and the standby gets promoted while others are yet-to-be-synced.
> >
>
> We should do something about it but that shouldn't be done in this
> patch. We can handle it separately and then add such an assert.

Agreed. Once this patch is concluded, I can fix the slot sync shutdown
issue and will also add this 'assert' back.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Amit Kapila
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
 wrote:
>
> On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
>  wrote:
>
> > 3)
> > update_synced_slots_inactive_time():
> >
> > This assert is removed, is it intentional?
> > Assert(s->active_pid == 0);
>
> Yes, the slot can get acquired in the corner case when someone runs
> pg_sync_replication_slots concurrently at this time. I'm referring to
> the issue reported upthread. We don't prevent one running
> pg_sync_replication_slots in promotion/ShutDownSlotSync phase right?
> Maybe we should prevent that otherwise some of the slots are synced
> and the standby gets promoted while others are yet-to-be-synced.
>

We should do something about it but that shouldn't be done in this
patch. We can handle it separately and then add such an assert.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bharath Rupireddy
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
 wrote:
>
> > I'm attaching v24 patches. It implements the above idea proposed
> > upthread for synced slots.
>
>  v24-0002
>
> 1 ===
>
> This commit does two things:
> 1) Updates inactive_since for sync slots with the value
> received from the primary's slot.
>
> Tested it and it does that.

Thanks. I've added a test case for this.

> 2 ===
>
> 2) Ensures the value is set to current timestamp during the
> shutdown of slot sync machinery to help correctly interpret the
> time if the standby gets promoted without a restart.
>
> Tested it and it does that.

Thanks. I've added a test case for this.

> 3 ===
>
> +/*
> + * Reset the synced slots info such as inactive_since after shutting
> + * down the slot sync machinery.
> + */
> +static void
> +update_synced_slots_inactive_time(void)
>
> Looks like the comment "reset" is not matching the name of the function and
> what it does.

Changed. I've also changed the function name to
update_synced_slots_inactive_since to be precise on what it exactly
does.
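For reference, the function being discussed has roughly the following
shape (a sketch; the loop and locking details are assumptions rather
than the committed code):

static void
update_synced_slots_inactive_since(void)
{
	TimestampTz now = 0;

	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

	for (int i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

		/* Only in-use slots synced from the primary are updated. */
		if (s->in_use && s->data.synced)
		{
			/* Fetch the current time once and reuse it for all slots. */
			if (now == 0)
				now = GetCurrentTimestamp();

			SpinLockAcquire(&s->mutex);
			s->inactive_since = now;
			SpinLockRelease(&s->mutex);
		}
	}

	LWLockRelease(ReplicationSlotControlLock);
}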

> 4 ===
>
> +   /*
> +* We get the current time beforehand and only once 
> to avoid
> +* system calls overhead while holding the lock.
> +*/
> +   if (now == 0)
> +   now = GetCurrentTimestamp();
>
> Also +1 of having GetCurrentTimestamp() just called one time within the loop.

Right.

> 5 ===
>
> -   if (!(RecoveryInProgress() && slot->data.synced))
> +   if (!(InRecovery && slot->data.synced))
> slot->inactive_since = GetCurrentTimestamp();
> else
> slot->inactive_since = 0;
>
> Not related to this change but more the way RestoreSlotFromDisk() behaves 
> here:
>
> For a sync slot on standby it will be set to zero and then later will be
> synchronized with the one coming from the primary. I think that's fine to have
> it to zero for this window of time.

Right.

> Now, if the standby is down and one sets sync_replication_slots to off,
> then inactive_since will be set to zero on the standby at startup and not
> synchronized (unless one triggers a manual sync). I also think that's fine but
> it might be worth documenting this behavior (that after a standby startup
> inactive_since is zero until the next sync...).

Isn't this behaviour applicable for other slot parameters that the
slot syncs from the remote slot on the primary?

I've added the following note in the comments when we update
inactive_since in RestoreSlotFromDisk.

 * Note that for synced slots after the standby starts up (i.e. after
 * the slots are loaded from the disk), the inactive_since will remain
 * zero until the next slot sync cycle.
 */
if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;

> 6 ===
>
> +   print "HI  $slot_name $name $inactive_since $slot_creation_time\n";
>
> garbage?

Removed.

> 7 ===
>
> +# Capture and validate inactive_since of a given slot.
> +sub capture_and_validate_slot_inactive_since
> +{
> +   my ($node, $slot_name, $slot_creation_time) = @_;
> +   my $name = $node->name;
>
We now have capture_and_validate_slot_inactive_since in 2 places:
040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.

Worth creating a sub in Cluster.pm?

I'd second that thought for now. We might have to debate first if it's
useful for all the nodes even without replication, and if yes, the
naming stuff and all that. Historically, we've had such duplicated
functions until recently, for instance advance_wal and log_contains.
We moved them over to a common perl library Cluster.pm very recently.
I'm sure we can come back later to move it to Cluster.pm.

On Wed, Mar 27, 2024 at 9:02 AM shveta malik  wrote:
>
> 1)
> slot.c:
> + * data from the remote slot. We use InRecovery flag instead of
> + * RecoveryInProgress() as it always returns true even for normal
> + * server startup.
>
> a) Not clear what 'it' refers to. Better to use 'the latter'
> b) Is it better to mention the primary here:
>  'as the latter always returns true even on the primary server during 
> startup'.

Modified.

> 2)
> update_local_synced_slot():
>
> - strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
> + strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
> + remote_slot->inactive_since == slot->inactive_since)
>
> When this code was written initially, the intent was to do strcmp at
> the end (only if absolutely needed). It will be good if we maintain
> the same and add new checks before strcmp.

Done.

> 3)
> update_synced_slots_inactive_time():
>
> This assert is removed, is it intentional?
> Assert(s->active_pid == 0);

Yes, the slot can get acquired in the corner case when someone 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 9:59 PM Bharath Rupireddy
 wrote:
>
> On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
>  wrote:
> >
> > If we just sync inactive_since value for synced slots while in
> > recovery from the primary, so be it. Why do we need to update it to
> > the current time when the slot is being created? We don't expose slot
> > creation time, no? Aren't we fine if we just sync the value from
> > primary and document that fact? After the promotion, we can reset it
> > to the current time so that it gets its own time.
>
> I'm attaching v24 patches. It implements the above idea proposed
> upthread for synced slots. I've now separated
> s/last_inactive_time/inactive_since and synced slots behaviour. Please
> have a look.

Thanks for the patches. Few trivial comments for v24-002:

1)
slot.c:
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as it always returns true even for normal
+ * server startup.

a) Not clear what 'it' refers to. Better to use 'the latter'
b) Is it better to mention the primary here:
 'as the latter always returns true even on the primary server during startup'.


2)
update_local_synced_slot():

- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_since == slot->inactive_since)

When this code was written initially, the intent was to do strcmp at
the end (only if absolutely needed). It will be good if we maintain
the same and add new checks before strcmp.

3)
update_synced_slots_inactive_time():

This assert is removed, is it intentional?
Assert(s->active_pid == 0);


4)
040_standby_failover_slots_sync.pl:

+# Capture the inactive_since of the slot from the standby the logical failover
+# slots are synced/created on the standby.

The comment is unclear, something seems missing.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 09:59:23PM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
>  wrote:
> >
> > If we just sync inactive_since value for synced slots while in
> > recovery from the primary, so be it. Why do we need to update it to
> > the current time when the slot is being created? We don't expose slot
> > creation time, no? Aren't we fine if we just sync the value from
> > primary and document that fact? After the promotion, we can reset it
> > to the current time so that it gets its own time.
> 
> I'm attaching v24 patches. It implements the above idea proposed
> upthread for synced slots. I've now separated
> s/last_inactive_time/inactive_since and synced slots behaviour. Please
> have a look.

Thanks!

 v24-0001

It's now pure mechanical changes and it looks good to me.

 v24-0002

1 ===

This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.

Tested it and it does that.

2 ===

2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.

Tested it and it does that.

3 ===

+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+update_synced_slots_inactive_time(void)

Looks like the comment "reset" is not matching the name of the function and
what it does.

4 ===

+   /*
+* We get the current time beforehand and only once to avoid
+* system calls overhead while holding the lock.
+*/
+   if (now == 0)
+   now = GetCurrentTimestamp();

Also +1 of having GetCurrentTimestamp() just called one time within the loop.

5 ===

-   if (!(RecoveryInProgress() && slot->data.synced))
+   if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;

Not related to this change but more the way RestoreSlotFromDisk() behaves here:

For a sync slot on the standby it will be set to zero and then later be
synchronized with the one coming from the primary. I think that's fine to have
it at zero for this window of time.

Now, if the standby is down and one sets sync_replication_slots to off,
then inactive_since will be set to zero on the standby at startup and not 
synchronized (unless one triggers a manual sync). I also think that's fine but
it might be worth documenting this behavior (that after a standby startup
inactive_since is zero until the next sync...).

6 ===

+   print "HI  $slot_name $name $inactive_since $slot_creation_time\n";

garbage?

7 ===

+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+   my ($node, $slot_name, $slot_creation_time) = @_;
+   my $name = $node->name;

We now have capture_and_validate_slot_inactive_since at 2 places:
040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.

Worth creating a sub in Cluster.pm?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bharath Rupireddy
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
 wrote:
>
> If we just sync inactive_since value for synced slots while in
> recovery from the primary, so be it. Why do we need to update it to
> the current time when the slot is being created? We don't expose slot
> creation time, no? Aren't we fine if we just sync the value from
> primary and document that fact? After the promotion, we can reset it
> to the current time so that it gets its own time.

I'm attaching v24 patches. It implements the above idea proposed
upthread for synced slots. I've now separated
s/last_inactive_time/inactive_since and synced slots behaviour. Please
have a look.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v24-0001-Use-less-confusing-name-for-slot-s-last_inactive.patch
Description: Binary data


v24-0002-Maintain-inactive_since-for-synced-slots-correct.patch
Description: Binary data


v24-0003-Allow-setting-inactive_timeout-for-replication-s.patch
Description: Binary data


v24-0004-Add-inactive_timeout-based-replication-slot-inva.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 04:17:53PM +0530, shveta malik wrote:
> On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> >
> > > I think there may have been some misunderstanding here.
> >
> > Indeed ;-)
> >
> > > But now if I
> > > rethink this, I am fine with 'inactive_since' getting synced from
> > > primary to standby. But if we do that, we need to add docs stating
> > > "inactive_since" represents primary's inactivity and not standby's
> > > slots inactivity for synced slots.
> >
> > Yeah sure.
> >
> > > The reason for this clarification
> > > is that the synced slot might be generated much later, yet
> > > 'inactive_since' is synced from the primary, potentially indicating a
> > > time considerably earlier than when the synced slot was actually
> > > created.
> >
> > Right.
> >
> > > Another approach could be that "inactive_since" for synced slot
> > > actually gives its own inactivity data rather than giving primary's
> > > slot data. We update inactive_since on standby only at 3 occasions:
> > > 1) at the time of creation of the synced slot.
> > > 2) during standby restart.
> > > 3) during promotion of standby.
> > >
> > > I have attached a sample patch for this idea as.txt file.
> >
> > Thanks!
> >
> > > I am fine with any of these approaches.  One gives data synced from
> > > primary for synced slots, while another gives actual inactivity data
> > > of synced slots.
> >
> > What about another approach?: inactive_since gives data synced from primary 
> > for
> > synced slots and another dedicated field (could be added later...) could
> > represent what you suggest as the other option.
> 
> Yes, okay with me. I think there is some confusion here as well. In my
> second approach above, I have not suggested anything related to
> sync-worker.

Yeah, no confusion, understood that way.

> We can think on that later if we really need another
> field which gives us the sync time.

I think that calling GetCurrentTimestamp() so frequently could be too costly, so
I'm not sure we should.

> In my second approach, I have tried to
> avoid updating inactive_since for synced slots during sync process. We
> update that field during creation of synced slot so that
> inactive_since reflects correct info even for synced slots (rather
> than copying from primary). 

Yeah, and I think we could create a dedicated field with this information
if we feel the need.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 04:49:18PM +0530, shveta malik wrote:
> On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
>  wrote:
> >
> > On Tue, Mar 26, 2024 at 4:18 PM shveta malik  wrote:
> > >
> > > > What about another approach?: inactive_since gives data synced from 
> > > > primary for
> > > > synced slots and another dedicated field (could be added later...) could
> > > > represent what you suggest as the other option.
> > >
> > > Yes, okay with me. I think there is some confusion here as well. In my
> > > second approach above, I have not suggested anything related to
> > > sync-worker. We can think on that later if we really need another
> > > field which gives us the sync time.  In my second approach, I have tried to
> > > avoid updating inactive_since for synced slots during sync process. We
> > > update that field during creation of synced slot so that
> > > inactive_since reflects correct info even for synced slots (rather
> > > than copying from primary). Please have a look at my patch and let me
> > > know your thoughts. I am fine with copying it from primary as well and
> > > documenting this behaviour.
> >
> > I took a look at your patch.
> >
> > --- a/src/backend/replication/logical/slotsync.c
> > +++ b/src/backend/replication/logical/slotsync.c
> > @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
> > remote_dbid)
> >  SpinLockAcquire(&slot->mutex);
> >  slot->effective_catalog_xmin = xmin_horizon;
> >  slot->data.catalog_xmin = xmin_horizon;
> > +slot->inactive_since = GetCurrentTimestamp();
> >  SpinLockRelease(&slot->mutex);
> >
> > If we just sync inactive_since value for synced slots while in
> > recovery from the primary, so be it. Why do we need to update it to
> > the current time when the slot is being created?
> 
> If we update inactive_since at the synced slot's creation or during
> restart (skipping setting it during sync), then this time reflects the
> actual 'inactive_since' for that particular synced slot.  Isn't that
> clearer info for the user and in alignment with what the name
> 'inactive_since' actually suggests?
> 
> > We don't expose slot
> > creation time, no?
> 
> No, we don't. But for a synced slot, that is the time since that slot
> became inactive (unless promoted), so we are exposing inactive_since
> and not creation time.
> 
> >Aren't we fine if we just sync the value from
> > primary and document that fact? After the promotion, we can reset it
> > to the current time so that it gets its own time. Do you see any
> > issues with it?
> 
> Yes, we can do that. But curious to know, do we see any additional
> benefit of reflecting primary's inactive_since at standby which I
> might be missing?

In case the primary goes down, then one could use the value on the standby
to get the value coming from the primary. I think that could be useful info to
have.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
 wrote:
>
> On Tue, Mar 26, 2024 at 4:18 PM shveta malik  wrote:
> >
> > > What about another approach?: inactive_since gives data synced from 
> > > primary for
> > > synced slots and another dedicated field (could be added later...) could
> > > represent what you suggest as the other option.
> >
> > Yes, okay with me. I think there is some confusion here as well. In my
> > second approach above, I have not suggested anything related to
> > sync-worker. We can think on that later if we really need another
> > field which gives us the sync time.  In my second approach, I have tried to
> > avoid updating inactive_since for synced slots during sync process. We
> > update that field during creation of synced slot so that
> > inactive_since reflects correct info even for synced slots (rather
> > than copying from primary). Please have a look at my patch and let me
> > know your thoughts. I am fine with copying it from primary as well and
> > documenting this behaviour.
>
> I took a look at your patch.
>
> --- a/src/backend/replication/logical/slotsync.c
> +++ b/src/backend/replication/logical/slotsync.c
> @@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
> remote_dbid)
>  SpinLockAcquire(&slot->mutex);
>  slot->effective_catalog_xmin = xmin_horizon;
>  slot->data.catalog_xmin = xmin_horizon;
> +slot->inactive_since = GetCurrentTimestamp();
>  SpinLockRelease(&slot->mutex);
>
> If we just sync inactive_since value for synced slots while in
> recovery from the primary, so be it. Why do we need to update it to
> the current time when the slot is being created?

If we update inactive_since at the synced slot's creation or during
restart (skipping setting it during sync), then this time reflects the
actual 'inactive_since' for that particular synced slot.  Isn't that
clearer info for the user and in alignment with what the name
'inactive_since' actually suggests?

> We don't expose slot
> creation time, no?

No, we don't. But for a synced slot, that is the time since that slot
became inactive (unless promoted), so we are exposing inactive_since
and not creation time.

>Aren't we fine if we just sync the value from
> primary and document that fact? After the promotion, we can reset it
> to the current time so that it gets its own time. Do you see any
> issues with it?

Yes, we can do that. But curious to know, do we see any additional
benefit of reflecting primary's inactive_since at standby which I
might be missing?

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bharath Rupireddy
On Tue, Mar 26, 2024 at 4:18 PM shveta malik  wrote:
>
> > What about another approach?: inactive_since gives data synced from primary 
> > for
> > synced slots and another dedicated field (could be added later...) could
> > represent what you suggest as the other option.
>
> Yes, okay with me. I think there is some confusion here as well. In my
> second approach above, I have not suggested anything related to
> sync-worker. We can think on that later if we really need another
> field which gives us the sync time.  In my second approach, I have tried to
> avoid updating inactive_since for synced slots during sync process. We
> update that field during creation of synced slot so that
> inactive_since reflects correct info even for synced slots (rather
> than copying from primary). Please have a look at my patch and let me
> know your thoughts. I am fine with copying it from primary as well and
> documenting this behaviour.

I took a look at your patch.

--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
remote_dbid)
 SpinLockAcquire(&slot->mutex);
 slot->effective_catalog_xmin = xmin_horizon;
 slot->data.catalog_xmin = xmin_horizon;
+slot->inactive_since = GetCurrentTimestamp();
 SpinLockRelease(&slot->mutex);

If we just sync inactive_since value for synced slots while in
recovery from the primary, so be it. Why do we need to update it to
the current time when the slot is being created? We don't expose slot
creation time, no? Aren't we fine if we just sync the value from
primary and document that fact? After the promotion, we can reset it
to the current time so that it gets its own time. Do you see any
issues with it?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> > I think there may have been some misunderstanding here.
>
> Indeed ;-)
>
> > But now if I
> > rethink this, I am fine with 'inactive_since' getting synced from
> > primary to standby. But if we do that, we need to add docs stating
> > "inactive_since" represents primary's inactivity and not standby's
> > slots inactivity for synced slots.
>
> Yeah sure.
>
> > The reason for this clarification
> > is that the synced slot might be generated much later, yet
> > 'inactive_since' is synced from the primary, potentially indicating a
> > time considerably earlier than when the synced slot was actually
> > created.
>
> Right.
>
> > Another approach could be that "inactive_since" for synced slot
> > actually gives its own inactivity data rather than giving primary's
> > slot data. We update inactive_since on standby only at 3 occasions:
> > 1) at the time of creation of the synced slot.
> > 2) during standby restart.
> > 3) during promotion of standby.
> >
> > I have attached a sample patch for this idea as.txt file.
>
> Thanks!
>
> > I am fine with any of these approaches.  One gives data synced from
> > primary for synced slots, while another gives actual inactivity data
> > of synced slots.
>
> What about another approach?: inactive_since gives data synced from primary 
> for
> synced slots and another dedicated field (could be added later...) could
> represent what you suggest as the other option.

Yes, okay with me. I think there is some confusion here as well. In my
second approach above, I have not suggested anything related to
sync-worker. We can think on that later if we really need another
field which gives us the sync time.  In my second approach, I have tried to
avoid updating inactive_since for synced slots during sync process. We
update that field during creation of synced slot so that
inactive_since reflects correct info even for synced slots (rather
than copying from primary). Please have a look at my patch and let me
know your thoughts. I am fine with copying it from primary as well and
documenting this behaviour.

> Another cons of updating inactive_since at the current time during each slot
> sync cycle is that calling GetCurrentTimestamp() very frequently
> (during each sync cycle of very active slots) could be too costly.

Right.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Ajin Cherian
On Tue, Mar 26, 2024 at 7:57 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> Please see the attached v23 patches. I've addressed all the review
> comments received so far from Amit and Shveta.
>
>
In patch 0003:
+ SpinLockAcquire(&slot->mutex);
+ }
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ if (slot->inactive_since > 0 &&
+ slot->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* inactive_since is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+   slot->data.inactive_timeout * 1000))
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_locks)
+ {
+ SpinLockRelease(&slot->mutex);

Here, GetCurrentTimestamp() is still called with SpinLock held. Maybe do
this prior to acquiring the spinlock.
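
To visualize the suggested reordering, a minimal sketch (based on the
quoted hunk; names as in the patch, with the spelling normalized):

	/* Fetch the timestamp before taking the spinlock so that no system
	 * call happens while the lock is held. */
	now = GetCurrentTimestamp();

	SpinLockAcquire(&slot->mutex);
	if (slot->inactive_since > 0 &&
		slot->data.inactive_timeout > 0 &&
		TimestampDifferenceExceeds(slot->inactive_since, now,
								   slot->data.inactive_timeout * 1000))
		invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
	SpinLockRelease(&slot->mutex);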

regards,
Ajin Cherian
Fujitsu Australia


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Amit Kapila
On Tue, Mar 26, 2024 at 3:12 PM Bertrand Drouvot
 wrote:
>
> On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote:
> > Please use the v22 patch set.
>
> Thanks!
>
> 1 ===
>
> +reset_synced_slots_info(void)
>
> I'm not sure "reset" is the right word, what about 
> slot_sync_shutdown_update()?
>

*shutdown_update() sounds generic. How about
update_synced_slots_inactive_time()? I think it is a bit longer but
conveys the meaning.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 03:17:36PM +0530, shveta malik wrote:
> On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> >
> > On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote:
> > > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
> > >  wrote:
> > > >
> > > > 2 ===
> > > >
> > > > It looks like inactive_since is set to the current timestamp on the 
> > > > standby
> > > > each time the sync worker does a cycle:
> > > >
> > > > primary:
> > > >
> > > > postgres=# select slot_name,inactive_since from pg_replication_slots 
> > > > where failover = 't';
> > > >   slot_name  |        inactive_since
> > > > -------------+-------------------------------
> > > >  lsub27_slot | 2024-03-26 07:39:19.745517+00
> > > >  lsub28_slot | 2024-03-26 07:40:24.953826+00
> > > >
> > > > standby:
> > > >
> > > > postgres=# select slot_name,inactive_since from pg_replication_slots 
> > > > where failover = 't';
> > > >   slot_name  |        inactive_since
> > > > -------------+-------------------------------
> > > >  lsub27_slot | 2024-03-26 07:43:56.387324+00
> > > >  lsub28_slot | 2024-03-26 07:43:56.387338+00
> > > >
> > > > I don't think that should be the case.
> > > >
> > >
> > > But why? This is exactly what we discussed in another thread where we
> > > agreed to update inactive_since even for sync slots.
> >
> > Hum, I thought we agreed to "sync" it and to "update it to current time"
> > only at promotion time.
> 
> I think there may have been some misunderstanding here.

Indeed ;-)

> But now if I
> rethink this, I am fine with 'inactive_since' getting synced from
> primary to standby. But if we do that, we need to add docs stating
> "inactive_since" represents primary's inactivity and not standby's
> slots inactivity for synced slots.

Yeah sure.

> The reason for this clarification
> is that the synced slot might be generated much later, yet
> 'inactive_since' is synced from the primary, potentially indicating a
> time considerably earlier than when the synced slot was actually
> created.

Right.

> Another approach could be that "inactive_since" for synced slot
> actually gives its own inactivity data rather than giving primary's
> slot data. We update inactive_since on standby only at 3 occasions:
> 1) at the time of creation of the synced slot.
> 2) during standby restart.
> 3) during promotion of standby.
> 
> I have attached a sample patch for this idea as.txt file.

Thanks!

> I am fine with any of these approaches.  One gives data synced from
> primary for synced slots, while another gives actual inactivity data
> of synced slots.

What about another approach?: inactive_since gives data synced from primary for
synced slots and another dedicated field (could be added later...) could
represent what you suggest as the other option.

Another cons of updating inactive_since at the current time during each slot
sync cycle is that calling GetCurrentTimestamp() very frequently
(during each sync cycle of very active slots) could be too costly.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote:
> > On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
> >  wrote:
> > >
> > > 2 ===
> > >
> > > It looks like inactive_since is set to the current timestamp on the 
> > > standby
> > > each time the sync worker does a cycle:
> > >
> > > primary:
> > >
> > > postgres=# select slot_name,inactive_since from pg_replication_slots 
> > > where failover = 't';
> > >   slot_name  |inactive_since
> > > -+---
> > >  lsub27_slot | 2024-03-26 07:39:19.745517+00
> > >  lsub28_slot | 2024-03-26 07:40:24.953826+00
> > >
> > > standby:
> > >
> > > postgres=# select slot_name,inactive_since from pg_replication_slots 
> > > where failover = 't';
> > >   slot_name  |inactive_since
> > > -+---
> > >  lsub27_slot | 2024-03-26 07:43:56.387324+00
> > >  lsub28_slot | 2024-03-26 07:43:56.387338+00
> > >
> > > I don't think that should be the case.
> > >
> >
> > But why? This is exactly what we discussed in another thread where we
> > agreed to update inactive_since even for sync slots.
>
> Hum, I thought we agreed to "sync" it and to "update it to current time"
> only at promotion time.

I think there may have been some misunderstanding here. But now if I
rethink this, I am fine with 'inactive_since' getting synced from
primary to standby. But if we do that, we need to add docs stating
"inactive_since" represents primary's inactivity and not standby's
slots inactivity for synced slots. The reason for this clarification
is that the synced slot might be generated much later, yet
'inactive_since' is synced from the primary, potentially indicating a
time considerably earlier than when the synced slot was actually
created.

Another approach could be that "inactive_since" for synced slot
actually gives its own inactivity data rather than giving primary's
slot data. We update inactive_since on standby only at 3 occasions:
1) at the time of creation of the synced slot.
2) during standby restart.
3) during promotion of standby.

I have attached a sample patch for this idea as.txt file.

I am fine with any of these approaches.  One gives data synced from
primary for synced slots, while another gives actual inactivity data
of synced slots.

thanks
Shveta
From 7dcd0e95299263187eb1f03812f8321b2612ee5c Mon Sep 17 00:00:00 2001
From: Shveta Malik 
Date: Tue, 26 Mar 2024 14:42:25 +0530
Subject: [PATCH v1] inactive_since for synced slots.

inactive_since is updated for synced slots:
1) at the time of creation of slot.
2) during server restart.
3) during promotion.
---
 src/backend/replication/logical/slotsync.c |  1 +
 src/backend/replication/slot.c | 15 ---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index bbf9a2c485..6114895dca 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid 
remote_dbid)
SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = xmin_horizon;
slot->data.catalog_xmin = xmin_horizon;
+   slot->inactive_since = GetCurrentTimestamp();
SpinLockRelease(&slot->mutex);
ReplicationSlotsComputeRequiredXmin(true);
LWLockRelease(ProcArrayLock);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d0a2f440ef..f2a57a14ec 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -628,7 +628,10 @@ retry:
 * now.
 */
SpinLockAcquire(&s->mutex);
-   s->inactive_since = 0;
+
+   if (!(RecoveryInProgress() && s->data.synced))
+   s->inactive_since = 0;
+
SpinLockRelease(&s->mutex);
 
if (am_walsender)
@@ -704,14 +707,20 @@ ReplicationSlotRelease(void)
 */
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
-   slot->inactive_since = now;
+
+   if (!(RecoveryInProgress() && slot->data.synced))
+   slot->inactive_since = now;
+
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
{
SpinLockAcquire(&slot->mutex);
-   slot->inactive_since = now;
+
+   if (!(RecoveryInProgress() && slot->data.synced))
+   slot->inactive_since = now;
+
SpinLockRelease(&slot->mutex);
}
 
-- 
2.34.1



Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote:
> Please use the v22 patch set.

Thanks!

1 ===

+reset_synced_slots_info(void)

I'm not sure "reset" is the right word, what about slot_sync_shutdown_update()?

2 ===

+   for (int i = 0; i < max_replication_slots; i++)
+   {
+   ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+   /* Check if it is a synchronized slot */
+   if (s->in_use && s->data.synced)
+   {
+   TimestampTz now;
+
+   Assert(SlotIsLogical(s));
+   Assert(s->active_pid == 0);
+
+   /*
+* Set the time since the slot has become inactive after shutting
+* down slot sync machinery. This helps correctly interpret the
+* time if the standby gets promoted without a restart. We get the
+* current time beforehand to avoid a system call while holding
+* the lock.
+*/
+   now = GetCurrentTimestamp();

What about moving "now = GetCurrentTimestamp()" outside of the for loop? (it
would be less costly and probably good enough).
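
A sketch of that shape (assuming the loop as quoted; ReplicationSlotCtl
is the usual shared slot-array pointer):

	TimestampTz now = GetCurrentTimestamp();	/* once, before the loop */

	for (int i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

		if (s->in_use && s->data.synced)
		{
			SpinLockAcquire(&s->mutex);
			s->inactive_since = now;
			SpinLockRelease(&s->mutex);
		}
	}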

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 2:27 PM Bharath Rupireddy
 wrote:
>
> >
> > 1)
> > Commit msg:
> >
> > ensures the value is set to current timestamp during the
> > shutdown to help correctly interpret the time if the standby gets
> > promoted without a restart.
> >
> > shutdown --> shutdown of slot sync worker   (as it was not clear if it
> > is instance shutdown or something else)
>
> Changed it to "shutdown of slot sync machinery" to be consistent with
> the comments.

Thanks for addressing the comments. Just to give more clarity here (so
that you take an informed decision), I am not sure if we actually shut
down the slot-sync machinery. We only shut down the slot sync worker.
The slot-sync machinery can still be used via the
'pg_sync_replication_slots' SQL function. I can easily reproduce a
scenario where the SQL function and reset_synced_slots_info() run in
parallel and the latter hits 'Assert(s->active_pid == 0)' because the
parallel SQL sync function is active on that slot.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bharath Rupireddy
On Tue, Mar 26, 2024 at 11:26 AM Amit Kapila  wrote:
>
> Review comments on v18_0002 and v18_0005
> ===
>
> 1.
> We have decided to update inactive_since for temporary slots. So,
> unless there is some reason, we should allow inactive_timeout to also
> be set for temporary slots.

WFM. A temporary slot that stays inactive for a long time, even before
the server is shut down, can utilize this inactive_timeout based
invalidation mechanism. And, I'd also vote for being consistent between
temporary and synced slots.

>  L.last_inactive_time,
> +L.inactive_timeout,
>
> Shall we keep inactive_timeout before
> last_inactive_time/inactive_since? I don't have any strong reason to
> propose that way apart from that the former is provided by the user.

Done.

> + if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true))
> + invalidated = true;
>
> I don't think we should try to invalidate the slots in
> pg_get_replication_slots. This function's purpose is to get the
> current information on slots and has no intention to perform any work
> for slots. Any error due to invalidation won't be what the user would
> be expecting here.

Agree. Removed.

> 4.
> +static bool
> +InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
> + bool need_control_lock,
> + bool need_mutex)
> {
> ...
> ...
> + if (need_control_lock)
> + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> +
> + Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
> +
> + /*
> + * Check if the slot needs to be invalidated due to inactive_timeout. We
> + * do this with the spinlock held to avoid race conditions -- for example
> + * the restart_lsn could move forward, or the slot could be dropped.
> + */
> + if (need_mutex)
> + SpinLockAcquire(&slot->mutex);
> ...
>
> I find this combination of parameters a bit strange. Because, say if
> need_mutex is false and need_control_lock is true then that means this
> function will acquire the LWLock after acquiring the spinlock, which is
> unacceptable. Now, this may not happen in practice as the callers
> won't pass such a combination but still, this functionality should be
> improved.

Right. Either we need both locks or neither. So, I changed it to use a
single bool need_locks; when it is set, both the control lock and the
spinlock are acquired and released.
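
A rough sketch of that shape (hypothetical signature; the real function
carries the actual invalidation logic):

	static bool
	InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
	{
		bool		invalidated = false;

		if (need_locks)
		{
			/* LWLock first, then spinlock -- never the other way around */
			LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
			SpinLockAcquire(&slot->mutex);
		}

		/* ... the inactive_timeout check and invalidation happen here ... */

		if (need_locks)
		{
			SpinLockRelease(&slot->mutex);
			LWLockRelease(ReplicationSlotControlLock);
		}

		return invalidated;
	}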

On Mon, Mar 25, 2024 at 10:33 AM shveta malik  wrote:
>
> patch 002:
>
> 2)
> slotsync.c:
>
>   ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
> remote_slot->two_phase,
> remote_slot->failover,
> -   true);
> +   true, 0);
>
> + slot->data.inactive_timeout = remote_slot->inactive_timeout;
>
> Is there a reason we are not passing 'remote_slot->inactive_timeout'
> to ReplicationSlotCreate() directly?

The slot there gets created as temporary, for which we were not
supporting setting inactive_timeout. But the latest v22 patch supports
it, so we now pass remote_slot->inactive_timeout directly.
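
Roughly, per the quoted diff (the exact argument list in v22 may
differ):

	ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
						  remote_slot->two_phase,
						  remote_slot->failover,
						  true, remote_slot->inactive_timeout);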

> 3)
> slotfuncs.c
> pg_create_logical_replication_slot():
> + int inactive_timeout = PG_GETARG_INT32(5);
>
> Can we mention here that timeout is in seconds either in comment or
> rename variable to inactive_timeout_secs?
>
> Please do this for create_physical_replication_slot(),
> create_logical_replication_slot(),
> pg_create_physical_replication_slot() as well.

Added /* in seconds */ next to the variable declaration.

> -
> 4)
> + int inactive_timeout; /* The amount of time in seconds the slot
> + * is allowed to be inactive. */
>  } LogicalSlotInfo;
>
>  Do we need to mention "before getting invalidated" like other places
> (in the last patch)?

Done.

>  5)
> Same at these two places. "before getting invalidated" to be added in
> the last patch, otherwise the info is incomplete.
>
> +
> + /* The amount of time in seconds the slot is allowed to be inactive */
> + int inactive_timeout;
>  } ReplicationSlotPersistentData;
>
>
> + * inactive_timeout: The amount of time in seconds the slot is allowed to be
> + * inactive.
>   */
>  void
>  ReplicationSlotCreate(const char *name, bool db_specific,
>  Same here. "before getting invalidated" ?

Done.

On Tue, Mar 26, 2024 at 12:04 PM shveta malik  wrote:
>
> > Please find the attached v21 patch implementing the above idea. It
> > also has changes for renaming last_inactive_time to inactive_since.
>
> Thanks for the patch. I have tested this patch alone, and it does what
> it says. One additional thing which I noticed is that now it sets
> inactive_since for temp slots as well, but that idea looks fine to me.

Right. Let's be consistent by treating all slots the same.

> I could not test 'invalidation on promotion bug' with this change, as
> that needed rebasing of the rest of the patches.

Please use the v22 patch set.

> Few trivial things:
>
> 1)
> Commit msg:
>
> ensures the value is set to current timestamp during the
> shutdown to help correctly interpret the time if the standby gets
> promoted without a restart.
>
> shutdown --> 

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote:
> On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
>  wrote:
> >
> > 2 ===
> >
> > It looks like inactive_since is set to the current timestamp on the standby
> > each time the sync worker does a cycle:
> >
> > primary:
> >
> > postgres=# select slot_name,inactive_since from pg_replication_slots where 
> > failover = 't';
> >   slot_name  |        inactive_since
> > -------------+-------------------------------
> >  lsub27_slot | 2024-03-26 07:39:19.745517+00
> >  lsub28_slot | 2024-03-26 07:40:24.953826+00
> >
> > standby:
> >
> > postgres=# select slot_name,inactive_since from pg_replication_slots where 
> > failover = 't';
> >   slot_name  |        inactive_since
> > -------------+-------------------------------
> >  lsub27_slot | 2024-03-26 07:43:56.387324+00
> >  lsub28_slot | 2024-03-26 07:43:56.387338+00
> >
> > I don't think that should be the case.
> >
> 
> But why? This is exactly what we discussed in another thread where we
> agreed to update inactive_since even for sync slots.

Hum, I thought we agreed to "sync" it and to "update it to current time"
only at promotion time.

I don't think updating inactive_since to the current time during each cycle
makes sense (I mean I understand the use case: being able to say when slots
have been synced, but if this is what we want then we should consider an extra
view or an extra field rather than relying on the inactive_since one).

If the primary goes down, not updating inactive_since to the current time could
also provide a benefit such as knowing the inactive_since of the primary slots
(from the standby) as of the last time they were synced. If we update it to the
current time then this information is lost.

> In each sync
> cycle, we acquire/release the slot, so the inactive_since gets
> updated. See synchronize_one_slot().

Right, and I think we should put an extra condition if in recovery.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Amit Kapila
On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
 wrote:
>
> 2 ===
>
> It looks like inactive_since is set to the current timestamp on the standby
> each time the sync worker does a cycle:
>
> primary:
>
> postgres=# select slot_name,inactive_since from pg_replication_slots where 
> failover = 't';
>   slot_name  |        inactive_since
> -------------+-------------------------------
>  lsub27_slot | 2024-03-26 07:39:19.745517+00
>  lsub28_slot | 2024-03-26 07:40:24.953826+00
>
> standby:
>
> postgres=# select slot_name,inactive_since from pg_replication_slots where 
> failover = 't';
>   slot_name  |        inactive_since
> -------------+-------------------------------
>  lsub27_slot | 2024-03-26 07:43:56.387324+00
>  lsub28_slot | 2024-03-26 07:43:56.387338+00
>
> I don't think that should be the case.
>

But why? This is exactly what we discussed in another thread where we
agreed to update inactive_since even for sync slots. In each sync
cycle, we acquire/release the slot, so the inactive_since gets
updated. See synchronize_one_slot().

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 11:07:51AM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 26, 2024 at 9:30 AM shveta malik  wrote:
> > But immediately after promotion, we can not rely on the above check
> > and thus possibility of synced slots invalidation is there. To
> > maintain consistent behavior regarding the setting of
> > last_inactive_time for synced slots, similar to user slots, one
> > potential solution to prevent this invalidation issue is to update the
> > last_inactive_time of all synced slots within the ShutDownSlotSync()
> > function during FinishWalRecovery(). This approach ensures that
> > promotion doesn't immediately invalidate slots, and henceforth, we
> > possess a correct last_inactive_time as a basis for invalidation going
> > forward. This will be equivalent to updating last_inactive_time during
> > restart (but without actual restart during promotion).
> > The plus point of maintaining last_inactive_time for synced slots
> > could be that it provides data to the user on when the sync was last
> > attempted on that particular slot by the background slot sync worker
> > or the SQL function. Thoughts?
> 
> Please find the attached v21 patch implementing the above idea. It
> also has changes for renaming last_inactive_time to inactive_since.

Thanks!

A few comments:

1 ===

One trailing whitespace:

Applying: Fix review comments for slot's last_inactive_time property
.git/rebase-apply/patch:433: trailing whitespace.
# got a valid inactive_since value representing the last slot sync time.
warning: 1 line adds whitespace errors.

2 ===

It looks like inactive_since is set to the current timestamp on the standby
each time the sync worker does a cycle:

primary:

postgres=# select slot_name,inactive_since from pg_replication_slots where 
failover = 't';
  slot_name  |        inactive_since
-------------+-------------------------------
 lsub27_slot | 2024-03-26 07:39:19.745517+00
 lsub28_slot | 2024-03-26 07:40:24.953826+00

standby:

postgres=# select slot_name,inactive_since from pg_replication_slots where 
failover = 't';
  slot_name  |        inactive_since
-------------+-------------------------------
 lsub27_slot | 2024-03-26 07:43:56.387324+00
 lsub28_slot | 2024-03-26 07:43:56.387338+00

I don't think that should be the case.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy
 wrote:
>
> On Tue, Mar 26, 2024 at 9:30 AM shveta malik  wrote:
> >
> > On Mon, Mar 25, 2024 at 12:43 PM shveta malik  
> > wrote:
> > >
> > > I have one concern, for synced slots on standby, how do we disallow
> > > invalidation due to inactive-timeout immediately after promotion?
> > >
> > > For synced slots, last_inactive_time and inactive_timeout are both
> > > set. Let's say I bring down primary for promotion of standby and then
> > > promote standby, there are chances that it may end up invalidating
> > > synced slots (considering standby is not brought down during promotion
> > > and thus inactive_timeout may already be past 'last_inactive_time').
> > >
> >
> > On standby, if we decide to maintain valid last_inactive_time for
> > synced slots, then invalidation is correctly restricted in
> > InvalidateSlotForInactiveTimeout() for synced slots using the check:
> >
> > if (RecoveryInProgress() && slot->data.synced)
> > return false;
> >
> > But immediately after promotion, we can not rely on the above check
> > and thus possibility of synced slots invalidation is there. To
> > maintain consistent behavior regarding the setting of
> > last_inactive_time for synced slots, similar to user slots, one
> > potential solution to prevent this invalidation issue is to update the
> > last_inactive_time of all synced slots within the ShutDownSlotSync()
> > function during FinishWalRecovery(). This approach ensures that
> > promotion doesn't immediately invalidate slots, and henceforth, we
> > possess a correct last_inactive_time as a basis for invalidation going
> > forward. This will be equivalent to updating last_inactive_time during
> > restart (but without actual restart during promotion).
> > The plus point of maintaining last_inactive_time for synced slots
> > could be that it provides data to the user on when the sync was last
> > attempted on that particular slot by the background slot sync worker
> > or the SQL function. Thoughts?
>
> Please find the attached v21 patch implementing the above idea. It
> also has changes for renaming last_inactive_time to inactive_since.
>

Thanks for the patch. I have tested this patch alone, and it does what
it says. One additional thing which I noticed is that now it sets
inactive_since for temp slots as well, but that idea looks fine to me.

I could not test 'invalidation on promotion bug' with this change, as
that needed rebasing of the rest of the patches.

Few trivial things:

1)
Commit msg:

ensures the value is set to current timestamp during the
shutdown to help correctly interpret the time if the standby gets
promoted without a restart.

shutdown --> shutdown of slot sync worker   (as it was not clear if it
is instance shutdown or something else)

2)
'The time since the slot has became inactive'.

has became-->has become
or just became

Please check it in all the files. There are multiple places.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 11:36 AM Bertrand Drouvot
 wrote:
> >
> > The issue that I can see with your proposal is: what if one synced the slots
> > manually (with pg_sync_replication_slots()) but does not use the sync 
> > worker?
> > Then I think ShutDownSlotSync() is not going to help in that case.
>
> It looks like ShutDownSlotSync() is always called (even if 
> sync_replication_slots = off),
> so that sounds ok to me (I should have checked the code, I was under the 
> impression
> ShutDownSlotSync() was not called if sync_replication_slots = off).

Right, it is called irrespective of sync_replication_slots.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 05:55:11AM +, Bertrand Drouvot wrote:
> Hi,
> 
> On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote:
> > On Mon, Mar 25, 2024 at 12:43 PM shveta malik  
> > wrote:
> > >
> > > I have one concern, for synced slots on standby, how do we disallow
> > > invalidation due to inactive-timeout immediately after promotion?
> > >
> > > For synced slots, last_inactive_time and inactive_timeout are both
> > > set. Let's say I bring down primary for promotion of standby and then
> > > promote standby, there are chances that it may end up invalidating
> > > synced slots (considering standby is not brought down during promotion
> > > and thus inactive_timeout may already be past 'last_inactive_time').
> > >
> > 
> > On standby, if we decide to maintain valid last_inactive_time for
> > synced slots, then invalidation is correctly restricted in
> > InvalidateSlotForInactiveTimeout() for synced slots using the check:
> > 
> > if (RecoveryInProgress() && slot->data.synced)
> > return false;
> 
> Right.
> 
> > But immediately after promotion, we can not rely on the above check
> > and thus possibility of synced slots invalidation is there. To
> > maintain consistent behavior regarding the setting of
> > last_inactive_time for synced slots, similar to user slots, one
> > potential solution to prevent this invalidation issue is to update the
> > last_inactive_time of all synced slots within the ShutDownSlotSync()
> > function during FinishWalRecovery(). This approach ensures that
> > promotion doesn't immediately invalidate slots, and henceforth, we
> > possess a correct last_inactive_time as a basis for invalidation going
> > forward. This will be equivalent to updating last_inactive_time during
> > restart (but without actual restart during promotion).
> > The plus point of maintaining last_inactive_time for synced slots
> > could be that it provides data to the user on when the sync was last
> > attempted on that particular slot by the background slot sync worker
> > or the SQL function. Thoughts?
> 
> Yeah, another plus point is that if the primary is down then one could look
> at the synced "inactive_since" on the standby to get an idea of it (depends
> on the last sync though).
> 
> The issue that I can see with your proposal is: what if one synced the slots
> manually (with pg_sync_replication_slots()) but does not use the sync worker?
> Then I think ShutDownSlotSync() is not going to help in that case.

It looks like ShutDownSlotSync() is always called (even if 
sync_replication_slots = off),
so that sounds ok to me (I should have checked the code, I was under the 
impression
ShutDownSlotSync() was not called if sync_replication_slots = off).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Amit Kapila
On Sun, Mar 24, 2024 at 3:05 PM Bharath Rupireddy
 wrote:
>
> I've attached the v18 patch set here. I've also addressed earlier
> review comments from Amit, Ajin Cherian. Note that I've added new
> invalidation mechanism tests in a separate TAP test file just because
> I don't want to clutter or bloat any of the existing files and spread
> tests for physical slots and logical slots into separate existing TAP
> files.
>

Review comments on v18_0002 and v18_0005
===
1.
 ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
-   bool two_phase, bool failover, bool synced)
+   bool two_phase, bool failover, bool synced,
+   int inactive_timeout)
 {
  ReplicationSlot *slot = NULL;
  int i;
@@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific,
  errmsg("cannot enable failover for a temporary replication slot"));
  }

+ if (inactive_timeout > 0)
+ {
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots will not be saved to the disk.
+ */
+ if (persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+ }

We have decided to update inactive_since for temporary slots. So,
unless there is some reason, we should allow inactive_timeout to also
be set for temporary slots.

2.
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1024,6 +1024,7 @@ CREATE VIEW pg_replication_slots AS
 L.safe_wal_size,
 L.two_phase,
 L.last_inactive_time,
+L.inactive_timeout,

Shall we keep inactive_timeout before
last_inactive_time/inactive_since? I don't have any strong reason to
propose that way apart from that the former is provided by the user.

3.
@@ -287,6 +288,13 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  slot_contents = *slot;
  SpinLockRelease(&slot->mutex);

+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true))
+ invalidated = true;

I don't think we should try to invalidate the slots in
pg_get_replication_slots. This function's purpose is to get the
current information on slots and has no intention to perform any work
for slots. Any error due to invalidation won't be what the user would
be expecting here.

4.
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex)
{
...
...
+ if (need_control_lock)
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout. We
+ * do this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
+ */
+ if (need_mutex)
+ SpinLockAcquire(&slot->mutex);
...

I find this combination of parameters a bit strange. Because, say if
need_mutex is false and need_control_lock is true then that means this
function will acquire the LWLock after acquiring the spinlock, which is
unacceptable. Now, this may not happen in practice as the callers
won't pass such a combination but still, this functionality should be
improved.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Bertrand Drouvot
Hi,

On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote:
> On Mon, Mar 25, 2024 at 12:43 PM shveta malik  wrote:
> >
> > I have one concern, for synced slots on standby, how do we disallow
> > invalidation due to inactive-timeout immediately after promotion?
> >
> > For synced slots, last_inactive_time and inactive_timeout are both
> > set. Let's say I bring down primary for promotion of standby and then
> > promote standby, there are chances that it may end up invalidating
> > synced slots (considering standby is not brought down during promotion
> > and thus inactive_timeout may already be past 'last_inactive_time').
> >
> 
> On standby, if we decide to maintain valid last_inactive_time for
> synced slots, then invalidation is correctly restricted in
> InvalidateSlotForInactiveTimeout() for synced slots using the check:
> 
> if (RecoveryInProgress() && slot->data.synced)
> return false;

Right.

> But immediately after promotion, we can not rely on the above check
> and thus possibility of synced slots invalidation is there. To
> maintain consistent behavior regarding the setting of
> last_inactive_time for synced slots, similar to user slots, one
> potential solution to prevent this invalidation issue is to update the
> last_inactive_time of all synced slots within the ShutDownSlotSync()
> function during FinishWalRecovery(). This approach ensures that
> promotion doesn't immediately invalidate slots, and henceforth, we
> possess a correct last_inactive_time as a basis for invalidation going
> forward. This will be equivalent to updating last_inactive_time during
> restart (but without actual restart during promotion).
> The plus point of maintaining last_inactive_time for synced slots
> could be that it provides data to the user on when the sync was last
> attempted on that particular slot by the background slot sync worker
> or the SQL function. Thoughts?

Yeah, another plus point is that if the primary is down then one could look
at the synced "inactive_since" on the standby to get an idea of it (depends on
the last sync though).

The issue that I can see with your proposal is: what if one synced the slots
manually (with pg_sync_replication_slots()) but does not use the sync worker?
Then I think ShutDownSlotSync() is not going to help in that case.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Bharath Rupireddy
On Tue, Mar 26, 2024 at 9:30 AM shveta malik  wrote:
>
> On Mon, Mar 25, 2024 at 12:43 PM shveta malik  wrote:
> >
> > I have one concern, for synced slots on standby, how do we disallow
> > invalidation due to inactive-timeout immediately after promotion?
> >
> > For synced slots, last_inactive_time and inactive_timeout are both
> > set. Let's say I bring down primary for promotion of standby and then
> > promote standby, there are chances that it may end up invalidating
> > synced slots (considering standby is not brought down during promotion
> > and thus inactive_timeout may already be past 'last_inactive_time').
> >
>
> On standby, if we decide to maintain valid last_inactive_time for
> synced slots, then invalidation is correctly restricted in
> InvalidateSlotForInactiveTimeout() for synced slots using the check:
>
> if (RecoveryInProgress() && slot->data.synced)
> return false;
>
> But immediately after promotion, we can not rely on the above check
> and thus possibility of synced slots invalidation is there. To
> maintain consistent behavior regarding the setting of
> last_inactive_time for synced slots, similar to user slots, one
> potential solution to prevent this invalidation issue is to update the
> last_inactive_time of all synced slots within the ShutDownSlotSync()
> function during FinishWalRecovery(). This approach ensures that
> promotion doesn't immediately invalidate slots, and henceforth, we
> possess a correct last_inactive_time as a basis for invalidation going
> forward. This will be equivalent to updating last_inactive_time during
> restart (but without actual restart during promotion).
> The plus point of maintaining last_inactive_time for synced slots
> could be, this can provide data to the user on when last time the sync
> was attempted on that particular slot by background slot sync worker
> or SQl function. Thoughts?

Please find the attached v21 patch implementing the above idea. It
also has changes for renaming last_inactive_time to inactive_since.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v21-0001-Fix-review-comments-for-slot-s-last_inactive_tim.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Amit Kapila
On Tue, Mar 26, 2024 at 1:24 AM Nathan Bossart  wrote:
>
>
> On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote:
> > This commit particularly lets one specify the inactive_timeout for
> > a slot via SQL functions pg_create_physical_replication_slot and
> > pg_create_logical_replication_slot.
>
> Off-list, Bharath brought to my attention that the current proposal was to
> set the timeout at the slot level.  While I think that is an entirely
> reasonable thing to support, the main use-case I have in mind for this
> feature is for an administrator that wants to prevent inactive slots from
> causing problems (e.g., transaction ID wraparound) on a server or a number
> of servers.  For that use-case, I think a GUC would be much more
> convenient.  Perhaps there could be a default inactive slot timeout GUC
> that would be used in the absence of a slot-level setting.  Thoughts?
>

Yeah, that is a valid point. One of the reasons for keeping it at the
slot level was to allow different subscribers/output plugins to have a
different inactive_timeout setting for their respective slots based
on their usage. Now, having it as a GUC also has some valid use cases
as pointed out by you, but I am not sure having both the slot level
and the GUC level is required. I am a bit inclined to have it at the
slot level for now; then, based on field usage reports, we can later
add a GUC as well.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 12:43 PM shveta malik  wrote:
>
> I have one concern, for synced slots on standby, how do we disallow
> invalidation due to inactive-timeout immediately after promotion?
>
> For synced slots, last_inactive_time and inactive_timeout are both
> set. Let's say I bring down primary for promotion of standby and then
> promote standby, there are chances that it may end up invalidating
> synced slots (considering standby is not brought down during promotion
> and thus inactive_timeout may already be past 'last_inactive_time').
>

On standby, if we decide to maintain valid last_inactive_time for
synced slots, then invalidation is correctly restricted in
InvalidateSlotForInactiveTimeout() for synced slots using the check:

if (RecoveryInProgress() && slot->data.synced)
    return false;

But immediately after promotion, we cannot rely on the above check,
and thus there is a possibility of synced slot invalidation. To
maintain consistent behavior regarding the setting of
last_inactive_time for synced slots, similar to user slots, one
potential solution to prevent this invalidation issue is to update the
last_inactive_time of all synced slots within the ShutDownSlotSync()
function during FinishWalRecovery(). This approach ensures that
promotion doesn't immediately invalidate slots, and henceforth, we
possess a correct last_inactive_time as a basis for invalidation going
forward. This will be equivalent to updating last_inactive_time during
restart (but without actual restart during promotion).
The plus point of maintaining last_inactive_time for synced slots
could be that it tells the user when the sync was last attempted on
that particular slot by the background slot sync worker or the SQL
function. Thoughts?

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Nathan Bossart
I apologize that I haven't been able to keep up with this thread for a
while, but I'm happy to see the continued interest in $SUBJECT.

On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote:
> This commit particularly lets one specify the inactive_timeout for
> a slot via SQL functions pg_create_physical_replication_slot and
> pg_create_logical_replication_slot.

Off-list, Bharath brought to my attention that the current proposal was to
set the timeout at the slot level.  While I think that is an entirely
reasonable thing to support, the main use-case I have in mind for this
feature is for an administrator that wants to prevent inactive slots from
causing problems (e.g., transaction ID wraparound) on a server or a number
of servers.  For that use-case, I think a GUC would be much more
convenient.  Perhaps there could be a default inactive slot timeout GUC
that would be used in the absence of a slot-level setting.  Thoughts?
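
To make the two layers concrete, here is a hedged sketch (the function
argument and GUC names below are assumptions drawn from this thread, not
committed syntax):

-- Per-slot setting via the proposed extra argument to the SQL function:
SELECT pg_create_physical_replication_slot('my_slot',
                                           immediately_reserve => true,
                                           inactive_timeout => 3600);

-- Hypothetical server-wide fallback in postgresql.conf, applied when
-- the slot itself has no timeout:
-- replication_slot_inactive_timeout = '1d'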

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 5:10 PM Amit Kapila  wrote:
>
> I think we should keep pg_alter_replication_slot() as the last
> priority among the remaining patches for this release. Let's try to
> first finish the primary functionality of inactive_timeout patch.
> Otherwise, I agree that the problem reported by you should be fixed.

Noted. Will focus on the v18-0002 patch now.

I was debugging the flow and just noticed that RecoveryInProgress()
always returns 'true' during
StartupReplicationSlots()-->RestoreSlotFromDisk() (even on primary) as
'xlogctl->SharedRecoveryState' is always 'RECOVERY_STATE_CRASH' at
that time. The 'xlogctl->SharedRecoveryState' is changed to
'RECOVERY_STATE_DONE' on primary and to 'RECOVERY_STATE_ARCHIVE' on
standby at a later stage in StartupXLOG() (after we are done loading
slots).

The impact of this is that the condition in RestoreSlotFromDisk() in v20-0001:

if (!(RecoveryInProgress() && slot->data.synced))
    slot->last_inactive_time = GetCurrentTimestamp();

is merely equivalent to:

if (!slot->data.synced)
    slot->last_inactive_time = GetCurrentTimestamp();

Thus on the primary, after restart, last_inactive_time is set correctly,
while on a promoted standby (new primary), last_inactive_time is always
NULL after restart for the synced slots.
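
For reference, the ordering described above, as a simplified sketch
(state names are from xlog.c; the placement of slot loading is the
point, not the exact call sites):

StartupXLOG()
    xlogctl->SharedRecoveryState = RECOVERY_STATE_CRASH   /* initial */
    StartupReplicationSlots()
        RestoreSlotFromDisk()   /* RecoveryInProgress() is true here,
                                 * even on a primary */
    ...
    /* only later, after slots are loaded: */
    xlogctl->SharedRecoveryState = RECOVERY_STATE_DONE    /* primary */
    xlogctl->SharedRecoveryState = RECOVERY_STATE_ARCHIVE /* standby */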

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Amit Kapila
On Mon, Mar 25, 2024 at 2:40 PM shveta malik  wrote:
>
> On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> >
> > Yeah, and I can see last_inactive_time is moving on the standby (while not
> > the case on the primary), probably due to the sync worker slot
> > acquisition/release which does not seem right.
> >
>
> Yes, you are right, last_inactive_time keeps on moving for synced
> slots on standby. Once I disabled the slot-sync worker, it stays
> constant; it then changes only when I call pg_sync_replication_slots().
>
> On a different note, I noticed that we allow altering
> inactive_timeout for synced slots on standby, and then overwrite it
> again with the primary's value in the next sync cycle. Steps:
>
> 
> --Check pg_replication_slots for synced slot on standby, inactive_timeout is 120
>    slot_name   | failover | synced | active | inactive_timeout
> ---------------+----------+--------+--------+------------------
>  logical_slot1 | t        | t      | f      |              120
>
> --Alter on standby
> SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900);
>

I think we should keep pg_alter_replication_slot() as the last
priority among the remaining patches for this release. Let's try to
first finish the primary functionality of inactive_timeout patch.
Otherwise, I agree that the problem reported by you should be fixed.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Amit Kapila
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy
 wrote:
>
> Right. Done that way, i.e., not setting the last_inactive_time for slots
> both while releasing the slot and while restoring from disk.
>
> Also, I've added a TAP function to check if the captured times are
> sane per Bertrand's review comment.
>
> Please see the attached v20 patch.
>

Pushed, after minor changes.

-- 
With Regards,
Amit Kapila.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy
 wrote:
>
> Right. Done that way, i.e., not setting the last_inactive_time for slots
> both while releasing the slot and while restoring from disk.
>
> Also, I've added a TAP function to check if the captured times are
> sane per Bertrand's review comment.
>
> Please see the attached v20 patch.

Thanks for the patch. The issue of unnecessary invalidation of synced
slots on promotion is resolved in this patch.

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Bharath Rupireddy
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
 wrote:
>
> > > I have one concern, for synced slots on standby, how do we disallow
> > > invalidation due to inactive-timeout immediately after promotion?
> > >
> > > For synced slots, last_inactive_time and inactive_timeout are both
> > > set.
>
> Yeah, and I can see last_inactive_time is moving on the standby (while not the
> case on the primary), probably due to the sync worker slot acquisition/release
> which does not seem right.
>
> > > Let's say I bring down primary for promotion of standby and then
> > > promote standby, there are chances that it may end up invalidating
> > > synced slots (considering standby is not brought down during promotion
> > > and thus inactive_timeout may already be past 'last_inactive_time').
> > >
> >
> > This raises the question of whether we need to set
> > 'last_inactive_time' for synced slots on the standby?
>
> Yeah, I think that last_inactive_time should stay at 0 on synced slots on the
> standby because such slots are not usable anyway (until the standby gets 
> promoted).
>
> So, I think that last_inactive_time does not make sense if the slot never had
> the chance to be active.

Right. Done that way, i.e., not setting the last_inactive_time for slots
both while releasing the slot and while restoring from disk.

Also, I've added a TAP function to check if the captured times are
sane per Bertrand's review comment.

Please see the attached v20 patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v20-0001-Track-last_inactive_time-in-pg_replication_slots.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Bertrand Drouvot
Hi,

On Mon, Mar 25, 2024 at 02:39:50PM +0530, shveta malik wrote:
> I am listing the concerns raised by me:
> 3) alter replication slot to alter inactive_timeout for synced slots on
> standby, should this be allowed?

I don't think it should be allowed.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> Yeah, and I can see last_inactive_time is moving on the standby (while not the
> case on the primary), probably due to the sync worker slot acquisition/release
> which does not seem right.
>

Yes, you are right, last_inactive_time keeps on moving for synced
slots on standby. Once I disabled the slot-sync worker, it stays
constant; it then changes only when I call pg_sync_replication_slots().

On a different note, I noticed that we allow altering
inactive_timeout for synced slots on standby, and then overwrite it
again with the primary's value in the next sync cycle. Steps:


--Check pg_replication_slots for synced slot on standby, inactive_timeout is 120
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      |              120

--Alter on standby
SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900);

--Check pg_replication_slots:
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      |              900

--Run sync function
SELECT pg_sync_replication_slots();

--check again, inactive_timeout is set back to primary's value.
   slot_name   | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
 logical_slot1 | t        | t      | f      |              120

 

I feel altering a synced slot's inactive_timeout should be prohibited on
the standby; it should always stay in sync with the primary. Thoughts?
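
If we decide to prohibit it, the guard could be as simple as the
following sketch (the error wording is illustrative only, and "name" is
a placeholder for the slot name variable):

/* Sketch: reject altering a slot that is being synced from the primary. */
if (RecoveryInProgress() && MyReplicationSlot->data.synced)
    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("cannot alter replication slot \"%s\"", name),
             errdetail("This slot is being synced from the primary server.")));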

I am listing the concerns raised by me:
1) create-subscription with create_slot=false overwriting
inactive_timeout of existing slot  ([1])
2) last_inactive_time set for synced slots may result in invalidation
of slot on promotion.  ([2])
3) alter replication slot to alter inactive_timeout for synced slots on
standby, should this be allowed?

[1]: https://www.postgresql.org/message-id/CAJpy0uAqBi%2BGbNn2ngJ-A_Z905CD3ss896bqY2ACUjGiF1Gkng%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAJpy0uCLu%2BmqAwAMum%3DpXE9YYsy0BE7hOSw_Wno5vjwpFY%3D63g%40mail.gmail.com

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Bertrand Drouvot
Hi,

On Mon, Mar 25, 2024 at 02:07:21PM +0530, shveta malik wrote:
> On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> >
> > On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote:
> > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik  
> > > wrote:
> > > >
> > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik  
> > > > wrote:
> > > > >
> > > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik 
> > > > >  wrote:
> > > > > >
> > > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
> > > > > >  wrote:
> > > > > > >
> > > > > > > I've attached the v18 patch set here.
> > > >
> > > > I have one concern, for synced slots on standby, how do we disallow
> > > > invalidation due to inactive-timeout immediately after promotion?
> > > >
> > > > For synced slots, last_inactive_time and inactive_timeout are both
> > > > set.
> >
> > Yeah, and I can see last_inactive_time is moving on the standby (while not
> > the case on the primary), probably due to the sync worker slot
> > acquisition/release which does not seem right.
> >
> > > > Let's say I bring down primary for promotion of standby and then
> > > > promote standby, there are chances that it may end up invalidating
> > > > synced slots (considering standby is not brought down during promotion
> > > > and thus inactive_timeout may already be past 'last_inactive_time').
> > > >
> > >
> > > This raises the question of whether we need to set
> > > 'last_inactive_time' for synced slots on the standby?
> >
> > Yeah, I think that last_inactive_time should stay at 0 on synced slots on
> > the standby because such slots are not usable anyway (until the standby
> > gets promoted).
> >
> > So, I think that last_inactive_time does not make sense if the slot never
> > had the chance to be active.
> >
> > OTOH I think the timeout invalidation (if any) should be synced from primary.
> 
> Yes, I also feel that last_inactive_time makes sense only when the
> slot is available to be used. Synced slots are not available to be
> used until the standby is promoted, and thus setting last_inactive_time
> can be skipped for synced slots. But once the primary's slot is
> invalidated due to inactive-timeout, that invalidation should be
> synced to the standby (which is happening currently).
> 

Yeah, syncing the invalidation and always keeping last_inactive_time at zero
for synced slots looks right to me.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote:
> > On Mon, Mar 25, 2024 at 12:43 PM shveta malik  
> > wrote:
> > >
> > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik  
> > > wrote:
> > > >
> > > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik  
> > > > wrote:
> > > > >
> > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
> > > > >  wrote:
> > > > > >
> > > > > > I've attached the v18 patch set here.
> > >
> > > I have one concern, for synced slots on standby, how do we disallow
> > > invalidation due to inactive-timeout immediately after promotion?
> > >
> > > For synced slots, last_inactive_time and inactive_timeout are both
> > > set.
>
> Yeah, and I can see last_inactive_time is moving on the standby (while not the
> case on the primary), probably due to the sync worker slot acquisition/release
> which does not seem right.
>
> > > Let's say I bring down primary for promotion of standby and then
> > > promote standby, there are chances that it may end up invalidating
> > > synced slots (considering standby is not brought down during promotion
> > > and thus inactive_timeout may already be past 'last_inactive_time').
> > >
> >
> > This raises the question of whether we need to set
> > 'last_inactive_time' for synced slots on the standby?
>
> Yeah, I think that last_inactive_time should stay at 0 on synced slots on the
> standby because such slots are not usable anyway (until the standby gets 
> promoted).
>
> So, I think that last_inactive_time does not make sense if the slot never had
> the chance to be active.
>
> OTOH I think the timeout invalidation (if any) should be synced from primary.

Yes, I also feel that last_inactive_time makes sense only when the
slot is available to be used. Synced slots are not available to be
used until the standby is promoted, and thus setting last_inactive_time
can be skipped for synced slots. But once the primary's slot is
invalidated due to inactive-timeout, that invalidation should be
synced to the standby (which is happening currently).

thanks
Shveta




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Bertrand Drouvot
Hi,

On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote:
> On Mon, Mar 25, 2024 at 12:43 PM shveta malik  wrote:
> >
> > On Mon, Mar 25, 2024 at 11:53 AM shveta malik  
> > wrote:
> > >
> > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik  
> > > wrote:
> > > >
> > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
> > > >  wrote:
> > > > >
> > > > > I've attached the v18 patch set here.
> >
> > I have one concern, for synced slots on standby, how do we disallow
> > invalidation due to inactive-timeout immediately after promotion?
> >
> > For synced slots, last_inactive_time and inactive_timeout are both
> > set.

Yeah, and I can see last_inactive_time is moving on the standby (while not the
case on the primary), probably due to the sync worker slot acquisition/release
which does not seem right.

> > Let's say I bring down primary for promotion of standby and then
> > promote standby, there are chances that it may end up invalidating
> > synced slots (considering standby is not brought down during promotion
> > and thus inactive_timeout may already be past 'last_inactive_time').
> >
> 
> This raises the question of whether we need to set
> 'last_inactive_time' for synced slots on the standby?

Yeah, I think that last_inactive_time should stay at 0 on synced slots on the
standby because such slots are not usable anyway (until the standby gets 
promoted).

So, I think that last_inactive_time does not make sense if the slot never had
the chance to be active.

OTOH I think the timeout invalidation (if any) should be synced from primary.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Bertrand Drouvot
Hi,

On Mon, Mar 25, 2024 at 12:25:21PM +0530, Bharath Rupireddy wrote:
> On Mon, Mar 25, 2024 at 10:28 AM Amit Kapila  wrote:
> >
> > On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila  wrote:
> > >
> > >
> > > Such a test looks reasonable but shall we add equal to in the second
> > > part of the test (like '$last_inactive_time'::timestamptz >=
> > > '$slot_creation_time'::timestamptz;). This is just to be sure that even
> > > if the test ran fast enough to give the same time, the test shouldn't
> > > fail. I think it won't matter for correctness as well.
> 
> Agree. I added that in the v19 patch. I had the same concern in
> mind; that's the reason I wasn't capturing current_timestamp in the
> same statement as below, worrying that it might be the same (or
> nearly the same) as the slot creation time. That's why I ended up
> capturing current_timestamp in a separate query rather than clubbing
> it with pg_create_physical_replication_slot.
> 
> SELECT current_timestamp FROM pg_create_physical_replication_slot('foo');
> 
> > Apart from this, I have made minor changes in the comments. See and
> > let me know what you think of attached.
> 

Thanks!

v19-0001 LGTM, just one Nit comment for 019_replslot_limit.pl:

The code for "Get last_inactive_time value after the slot's creation" and
"Check that the captured time is sane" is somewhat duplicated: is it worth
creating two functions?
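
Something like the following pair of helpers would remove the
duplication (a sketch in the style of the existing TAP tests; the helper
names are placeholders, not the committed test code):

# Fetch the slot's last_inactive_time from pg_replication_slots.
sub get_last_inactive_time
{
    my ($node, $slot) = @_;

    return $node->safe_psql('postgres',
        qq[SELECT last_inactive_time FROM pg_replication_slots
           WHERE slot_name = '$slot';]);
}

# Check that the captured time is sane, i.e. not earlier than the
# reference time taken before the slot was created.
sub check_last_inactive_time
{
    my ($node, $slot, $reference_time) = @_;

    my $ok = $node->safe_psql('postgres',
        qq[SELECT last_inactive_time >= '$reference_time'::timestamptz
           FROM pg_replication_slots WHERE slot_name = '$slot';]);
    is($ok, 't', "last_inactive_time of slot '$slot' is sane");
}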

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread Amit Kapila
On Mon, Mar 25, 2024 at 12:43 PM shveta malik  wrote:
>
> On Mon, Mar 25, 2024 at 11:53 AM shveta malik  wrote:
> >
> > On Mon, Mar 25, 2024 at 10:33 AM shveta malik  
> > wrote:
> > >
> > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
> > >  wrote:
> > > >
> > > > I've attached the v18 patch set here.
>
> I have one concern, for synced slots on standby, how do we disallow
> invalidation due to inactive-timeout immediately after promotion?
>
> For synced slots, last_inactive_time and inactive_timeout are both
> set. Let's say I bring down primary for promotion of standby and then
> promote standby, there are chances that it may end up invalidating
> synced slots (considering standby is not brought down during promotion
> and thus inactive_timeout may already be past 'last_inactive_time').
>

This raises the question of whether we need to set
'last_inactive_time' for synced slots on the standby?

-- 
With Regards,
Amit Kapila.



