On Thu, Aug 17, 2023 at 11:44 AM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:
>
> Hi,
>
> On 8/14/23 11:52 AM, shveta malik wrote:
>
> >
> > We (myself and Ajin) performed the tests to compute the lag in standby
> > slots as compared to primary slots with different numbers of slot-sync
> > workers configured.
> >
>
> Thanks!
>
> > 3 DBs were created, each with 30 tables and each table having one
> > logical-pub/sub configured. So this made a total of 90 logical
> > replication slots to be synced. Then the workload was run for approx. 10
> > mins. During this workload, at regular intervals, primary and standby
> > slots' LSNs were captured (from pg_replication_slots) and compared. At
> > each capture, the intent was to know how far each standby slot was
> > lagging behind the corresponding primary slot, by taking the distance
> > between the confirmed_flush_lsn of the primary and standby slots. Then
> > we took the average (integer value) of this distance over the span of
> > the 10 min workload
>
> Thanks for the explanations, makes sense to me.
>
> > and this is what we got:
> >
> > With max_slot_sync_workers=1, average-lag =  42290.3563
> > With max_slot_sync_workers=2, average-lag =  24585.1421
> > With max_slot_sync_workers=3, average-lag =  14964.9215
> >
> > This shows that more workers have better chances to keep logical
> > replication slots in sync for this case.
> >
>
> Agree.
>
> > Another statistic, if it interests you: we ran a frequency test as
> > well (by changing the code, a unit test of sorts) to figure out the
> > 'total number of times synchronization was done' with different numbers
> > of slot-sync workers configured. Same 3 DBs setup with each DB having 30
> > logical replication slots. With 'max_slot_sync_workers' set to 1, 2
> > and 3, the total number of times synchronization was done was 15874,
> > 20205 and 23414 respectively. Note: this is not on the same machine
> > where we captured the lsn-gap data; it is on a slightly less efficient
> > machine, but it gives almost the same picture.
> >
> > Next we are planning to capture this data for a smaller number of slots,
> > like 10, 30, 50, etc. It may happen that the benefit of multiple workers
> > over a single worker in such cases is smaller, but let's have the
> > data to verify that.
> >
>
> Thanks a lot for those numbers and for the testing!
>
> Do you think it would make sense to also get the numbers using
> the pg_failover_slots module (and compare the pg_failover_slots numbers
> with the "one worker" case here)? The idea is to check whether the patch
> introduces some overhead as compared to pg_failover_slots.
>

Yes, definitely. We will work on that and share the numbers soon.
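
For anyone wanting to reproduce the lag figures, the computation described
above can be sketched roughly as follows. This is only an illustration, not
the script used for the tests: it assumes the confirmed_flush_lsn values
have already been fetched as text from pg_replication_slots on both nodes,
and the function names and sample LSNs are made up.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual pg_lsn such as '0/3000060' to a byte position.

    A pg_lsn is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def average_lag(samples):
    """Average distance in bytes between primary and standby slots.

    `samples` is a list of (primary_lsn, standby_lsn) text pairs, one pair
    per slot per capture (hypothetical input format).
    """
    gaps = [lsn_to_int(primary) - lsn_to_int(standby)
            for primary, standby in samples]
    return sum(gaps) / len(gaps)


# Made-up sample captures, not measured data.
samples = [
    ("0/3000060", "0/3000000"),  # standby is 0x60 = 96 bytes behind
    ("0/3000100", "0/30000C0"),  # standby is 0x40 = 64 bytes behind
]
print(average_lag(samples))
```

The same distance can also be computed on the server by subtracting two
pg_lsn values, which yields the difference in bytes.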

thanks
Shveta

