Hi,

On 10/6/23 6:48 PM, Amit Kapila wrote:
On Wed, Oct 4, 2023 at 5:34 PM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:

On 10/4/23 1:50 PM, shveta malik wrote:
On Wed, Oct 4, 2023 at 5:00 PM Amit Kapila <amit.kapil...@gmail.com> wrote:

On Wed, Oct 4, 2023 at 11:55 AM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:

On 10/4/23 6:26 AM, shveta malik wrote:
On Wed, Oct 4, 2023 at 5:36 AM Amit Kapila <amit.kapil...@gmail.com> wrote:


How about an alternate scheme where we define sync_slot_names on
standby but then store the physical_slot_name in the corresponding
logical slot (ReplicationSlotPersistentData) to be synced? So, the
standby will send the list of 'sync_slot_names' and the primary will
add the physical standby's slot_name in each of the corresponding
sync_slot. Now, if we do this then even after restart, we should be
able to know for which physical slot each logical slot needs to wait.
We can even provide an SQL API to reset the value of
standby_slot_names in logical slots as a way to unblock decoding in
case of emergency (for example, when the corresponding physical standby
never comes up).



Looks like a better approach to me. It solves most of the pain points like:
1) Avoids the need for multiple GUCs
2) Primary and standby need not worry about keeping a sync-slot-names
GUC in sync on both sides

As per my understanding of this approach, we don't want
'sync-slot-names' to be set on the primary. Do you have a different
understanding?


Same understanding. The user does not need to set it on the primary. It
will be a GUC on the standby, and the standby will convey it to the primary.

+1, same understanding here.


At PGConf NYC, I had a brief discussion on this topic with Andres
where yet another approach to achieve this came up.

Great!

Have a parameter
like enable_failover at the slot level (this will be persistent
information). Users can set it during the create/alter subscription or
via pg_create_logical_replication_slot(). Also, on physical standby,
there will be a parameter like enable_syncslot. All the physical
standbys that have set enable_syncslot will receive all the logical
slots that are marked as enable_failover. To me, whether to sync a
particular slot is a slot-level property, so defining it in this new
way seems reasonable.

Yeah, as this is a slot-level property, I agree that this seems reasonable.

Also that sounds more natural to me with this approach. The primary
is really the one that "drives" which slots can be synced. I like it.

One could also set enable_failover while creating a logical slot on a physical
standby (so that cascading standbys could also have "extra slots" to sync
compared to "level 1" standbys).
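
To make the selection rule above concrete, here is a minimal sketch in Python. It is only a model of the idea, not PostgreSQL code: enable_failover and enable_syncslot are the parameter names proposed in this thread, while the classes and the slots_to_sync() helper are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogicalSlot:
    name: str
    enable_failover: bool = False  # proposed slot-level flag (persistent)


@dataclass
class Standby:
    name: str
    enable_syncslot: bool = False  # proposed standby-side parameter


def slots_to_sync(upstream_slots, standby):
    """Return the names of upstream logical slots this standby would pull.

    Hypothetical helper: the upstream decides which slots are eligible
    (enable_failover), the standby only decides whether it syncs at all.
    """
    if not standby.enable_syncslot:
        # No sync worker is started; turning the parameter off also
        # acts as a "pause" without touching the upstream.
        return []
    return [s.name for s in upstream_slots if s.enable_failover]


primary_slots = [LogicalSlot("sub_a", True), LogicalSlot("sub_b", False)]
print(slots_to_sync(primary_slots, Standby("standby1", True)))   # ['sub_a']
print(slots_to_sync(primary_slots, Standby("standby2", False)))  # []
```

In a cascading setup the same function would apply one level down, with the "level 1" standby's own slots (synced ones plus any created locally with enable_failover) passed as upstream_slots.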


I think this will simplify the scheme a bit but still, the list of
physical standbys for which logical slots wait during decoding needs
to be maintained as we thought.

Right.

But, how about with the above two
parameters (enable_failover and enable_syncslot), we have
standby_slot_names defined on the primary. That avoids the need to
store the list of standby_slot_names in logical slots and simplifies
the implementation quite a bit, right?

Agree.

Now, one might ask: if we have a
parameter like 'standby_slot_names', why do we need
enable_syncslot on the physical standby? It is still required to
start the sync worker, which pulls the logical slots' information.

Yes, and enable_syncslot on the standby could also be used to "pause"
the sync on standbys (by disabling the parameter) if one would want to
(without the need to modify anything on the primary).

The
advantage of having standby_slot_names defined on primary is that we
can selectively wait on the subset of physical standbys where we are
syncing the slots.

Yeah, and this flexibility/filtering looks more or less mandatory to me.

I think this will be something similar to
'synchronous_standby_names' in the sense that the physical standbys
mentioned in standby_slot_names will behave as synchronous copies with
respect to slots and after failover the user can switch to one of these
physical standbys and the others can start following the new master/publisher.
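
The waiting behaviour described above can be sketched as follows. This is a simplified model in Python under stated assumptions, not the patch's implementation: LSNs are plain integers, max_decodable_lsn() is a hypothetical helper, and treating a listed standby with no known position as having confirmed nothing is an assumption of this sketch.

```python
def max_decodable_lsn(standby_slot_names, confirmed_lsn, slot_failover, wal_end):
    """Return how far a logical slot may decode on the primary.

    standby_slot_names: physical slot names listed on the primary
    confirmed_lsn:      mapping of physical slot name -> LSN it has confirmed
    slot_failover:      whether the logical slot is marked enable_failover
    wal_end:            current end of WAL on the primary
    """
    # Slots not marked for failover, or an empty list, wait for nobody.
    if not slot_failover or not standby_slot_names:
        return wal_end
    # Only the listed standbys matter; the slowest one sets the limit.
    # Assumption: a listed standby with no known position blocks progress.
    slowest = min(confirmed_lsn.get(name, 0) for name in standby_slot_names)
    return min(wal_end, slowest)


confirmed = {"sb1_slot": 120, "sb2_slot": 90, "sb3_slot": 40}
listed = ["sb1_slot", "sb2_slot"]  # sb3_slot is deliberately not listed

print(max_decodable_lsn(listed, confirmed, True, 150))   # 90
print(max_decodable_lsn(listed, confirmed, False, 150))  # 150
```

Note that sb3_slot lags the most but does not hold back decoding, because it is not in the list. That is the selective waiting that defining standby_slot_names on the primary is meant to give.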

Thoughts?

I like the idea, and I think it is the one that seems the most reasonable
to me. I'd vote for this idea with:

- standby_slot_names on the primary (could also be set on standbys in case of
cascading context)
- enable_failover at logical slot creation + API to enable/disable it at will
- enable_syncslot on the standbys

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

