On Mon, Jul 24, 2023 at 8:03 AM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Fri, Jul 21, 2023 at 5:16 PM shveta malik <shveta.ma...@gmail.com> wrote:
> >
> > Thanks Bharat for letting us know. It is okay to split the patch, it
> > may definitely help to understand the modules better but shall we take
> > a step back and try to reevaluate the design first before moving to
> > other tasks?
>
> Agree that design comes first. FWIW, I'm attaching the v9 patch set
> that I have with me. It can't be a perfect patch set unless the design
> is finalized.
>
> > I analyzed more on the issues stated in [1] for replacing LIST_SLOTS
> > with SELECT query. On rethinking, it might not be a good idea to
> > replace this cmd with SELECT in Launcher code-path
>
> I think there are open fundamental design aspects, before optimizing
> LIST_SLOTS, see below. I'm sure we can come back to this later.
>
> > Secondly, I was thinking if the design proposed in the patch is the
> > best one. No doubt, it is the most simplistic design and thus may
> > .......... Any feedback is appreciated.
>
> Here are my thoughts about this feature:
>
> Current design:
>
> 1. On primary, never allow walsenders associated with logical
> replication slots to go ahead of physical standbys that are candidates
> for future primary after failover. This enables subscribers to connect
> to new primary after failover.
> 2. On all candidate standbys, periodically sync logical slots from
> primary (creating the slots if necessary) with one slot sync worker
> per logical slot.
>
> Important considerations:
>
> 1. Does this design guarantee the row versions required by subscribers
> aren't removed on candidate standbys as raised here -
> https://www.postgresql.org/message-id/20220218222319.yozkbhren7vkjbi5%40alap3.anarazel.de?
>
> It seems safe with logical decoding on standbys feature. Also, a
> test-case from upthread is already in patch sets (in v9 too)
> https://www.postgresql.org/message-id/CAAaqYe9FdKODa1a9n%3Dqj%2Bw3NiB9gkwvhRHhcJNginuYYRCnLrg%40mail.gmail.com.
> However, we need to verify the use cases extensively.
>

Agreed.

> 2. All candidate standbys will start one slot sync worker per logical
> slot which might not be scalable.
>

Yeah, that doesn't sound like a good idea, but IIRC the proposed patch
uses one worker per database (for all slots corresponding to that
database).

> Is having one (or a few more - not
> necessarily one for each logical slot) worker for all logical slots
> enough?
>

I guess for a large number of slots there is a possibility of a large
gap in syncing the slots, which probably means we need to retain the
corresponding WAL for a much longer time on the primary. If we can
prove that the gap won't be large enough to matter, then this would
probably be worth considering; otherwise, I think we should find a way
to scale the number of workers to avoid the large gap.
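
To make the trade-off concrete, here is a rough back-of-the-envelope
sketch. It is not taken from the patch: the function names, the
round-robin model (each worker syncs its share of slots one after
another), and the per-slot sync time and WAL generation rate are all
assumptions chosen only for illustration.

```python
import math


def worst_case_sync_gap(num_slots, num_workers, secs_per_slot):
    """Assumed model: slots are split evenly across workers and each
    worker syncs its slots one at a time, so a given slot is revisited
    only after the worker has gone through all of its other slots."""
    if num_workers < 1:
        raise ValueError("need at least one worker")
    slots_per_worker = math.ceil(num_slots / num_workers)
    return slots_per_worker * secs_per_slot


def extra_wal_retained(gap_secs, wal_bytes_per_sec):
    """Rough upper bound on the extra WAL the primary keeps while a
    slot waits for its next sync (hypothetical constant WAL rate)."""
    return gap_secs * wal_bytes_per_sec


def workers_needed(num_slots, secs_per_slot, max_gap_secs):
    """Smallest worker count that keeps the gap within max_gap_secs
    under the same assumed model."""
    slots_per_worker = int(max_gap_secs // secs_per_slot)
    if slots_per_worker < 1:
        raise ValueError("max_gap_secs is shorter than one slot sync")
    return math.ceil(num_slots / slots_per_worker)


if __name__ == "__main__":
    # Hypothetical numbers: 1000 slots, 0.5s per slot sync.
    for workers in (1, 10):
        gap = worst_case_sync_gap(1000, workers, 0.5)
        print(workers, "worker(s): gap of", gap, "seconds")
    print("workers for a 60s gap:", workers_needed(1000, 0.5, 60))
```

Under these assumed numbers a single worker leaves each slot unsynced
for several minutes, whereas a handful of workers brings the gap under
a minute. Real sync times would have to be measured before drawing any
conclusion.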

-- 
With Regards,
Amit Kapila.

