Hi,

On 10/31/23 10:37 AM, shveta malik wrote:
On Fri, Oct 27, 2023 at 8:43 PM Drouvot, Bertrand

Agree with your test case, but in my case I was not using pub/sub.

I was not clear, so when I said:

- create logical_slot1 on the primary (and don't start using it)

I meant don't start decoding from it (like using pg_recvlogical() or
pg_logical_slot_get_changes()).

By using pub/sub the "don't start using it" is not satisfied.

My test case is:

"
SELECT * FROM pg_create_logical_replication_slot('logical_slot1', 
'test_decoding', false, true, true);
SELECT * FROM pg_create_logical_replication_slot('logical_slot2', 
'test_decoding', false, true, true);
pg_recvlogical -d postgres -S logical_slot2 --no-loop --start -f -
"


Okay, I am able to reproduce it now. Thanks for clarification. I have
tried to change the algorithm as per suggestion by Amit in [1]

[1]: 
https://www.postgresql.org/message-id/CAA4eK1KBL0110gamQfc62X%3D5JV8-Qjd0dw0Mq0o07cq6kE%2Bq%3Dg%40mail.gmail.com

Thanks!


This is not full proof solution but optimization over first one. Now
in any sync-cycle, we take 2 attempts for slots-creation (if any slots
are available to be created). In first attempt, we do not wait
indefinitely on inactive slots, we wait only for a fixed amount of
time and if remote-slot is still behind, then we add that to the
pending list and move to the next slot. Once we are done with first
attempt, in second attempt, we go for the pending ones and now we wait
on each of them until the primary catches up.

Aren't we "just" postponing the "issue"? I mean if there is really no activity
on, say, the first created slot, then once we move to the second attempt then 
any newly
created slot from that time would wait to be synced forever, no?

Looking at V30:

+       /* Update lsns of slot to remote slot's current position */
+       local_slot_update(remote_slot);
+       ReplicationSlotPersist();
+
+       ereport(LOG, errmsg("created slot \"%s\" locally", remote_slot->name));

I think this message is confusing as the slot has been created before it, here:

+   else
+   {
+       TransactionId xmin_horizon = InvalidTransactionId;
+       ReplicationSlot *slot;
+
+       ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
+                             remote_slot->two_phase, false);

So that it shows up in pg_replication_slots before this message is emitted (and 
that
specially true/worst for non active slots).

Maybe something like "newly locally created slot XXX has been synced..."?

While at it, would that make sense to move

+       slot->data.failover = true;

once we stop waiting for this slot? I think that would avoid confusion if one
query pg_replication_slots while we are still waiting for this slot to be 
synced,
thoughts? (currently we can see pg_replication_slots.synced_slot set to true
while we are still waiting).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


Reply via email to