On Fri, Mar 8, 2024 at 9:56 AM Ajin Cherian <itsa...@gmail.com> wrote:
>
>> Pushed with minor modifications. I'll keep an eye on BF.
>>
>> BTW, one thing that we should try to evaluate a bit more is the
>> traversal of slots in StandbySlotsHaveCaughtup() where we verify if
>> all the slots mentioned in standby_slot_names have received the
>> required WAL. Even if the standby_slot_names list is short, the total
>> number of slots can be much larger, which can lead to an increase in
>> CPU usage during traversal. There is an optimization that caches
>> ss_oldest_flush_lsn so that we don't need to traverse the slots each
>> time, so it may not hit frequently, but still there is a chance. I see
>> it is possible to further optimize this area by caching the position
>> of each slot mentioned in standby_slot_names in the replication_slots
>> array, but I am not sure whether it is worth it.
>>
>>
>
> I tried to test this by configuring a large number of logical slots while 
> making sure the standby slots are at the end of the array and checking if 
> there was any performance hit in logical replication from these searches.
>

Thanks, Ajin and Nisha.

We also plan to:
1) Redo the XLogSendLogical time-log related test with
'sync_replication_slots' enabled.
2) Run a pg_recvlogical test to monitor lag in StandbySlotsHaveCaughtup()
for a large number of slots.
3) Profile to see whether StandbySlotsHaveCaughtup() is noticeable in the
report when there are a large number of slots to traverse.
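
To make the two caching ideas quoted above easier to reason about while
testing, here is a rough standalone sketch. It is not PostgreSQL code:
ToySlot, toy_slots, the toy_* functions and the array sizes are all
made up for illustration, and a real version would also need locking
plus cache invalidation on config reload or slot drop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t XLogRecPtr;    /* stand-in for the WAL position type */

#define MAX_SLOTS 1024          /* hypothetical size of the slot array */
#define MAX_NAMES 16            /* hypothetical cap on configured names */

typedef struct ToySlot
{
    bool        in_use;
    char        name[64];
    XLogRecPtr  flush_lsn;
} ToySlot;

static ToySlot    toy_slots[MAX_SLOTS];
static int        cached_idx[MAX_NAMES];    /* array position per name */
static int        n_cached = -1;            /* -1: position cache not built */
static XLogRecPtr cached_oldest_flush = 0;  /* 0: nothing cached yet */
static long       full_scans = 0;           /* counts whole-array searches */

void toy_reset_cache(void)
{
    n_cached = -1;
    cached_oldest_flush = 0;
    full_scans = 0;
}

void toy_set_slot(int idx, const char *name, XLogRecPtr flush_lsn)
{
    assert(idx >= 0 && idx < MAX_SLOTS);
    toy_slots[idx].in_use = true;
    strncpy(toy_slots[idx].name, name, sizeof(toy_slots[idx].name) - 1);
    toy_slots[idx].name[sizeof(toy_slots[idx].name) - 1] = '\0';
    toy_slots[idx].flush_lsn = flush_lsn;
}

long toy_full_scans(void)
{
    return full_scans;
}

/* Slow path: search the whole array once and remember each position. */
static bool build_index_cache(const char *const *names, int n_names)
{
    full_scans++;
    n_cached = -1;
    for (int i = 0; i < n_names; i++)
    {
        cached_idx[i] = -1;
        for (int s = 0; s < MAX_SLOTS; s++)
        {
            if (toy_slots[s].in_use && strcmp(toy_slots[s].name, names[i]) == 0)
            {
                cached_idx[i] = s;
                break;
            }
        }
        if (cached_idx[i] < 0)
            return false;       /* a configured slot does not exist */
    }
    n_cached = n_names;
    return true;
}

bool toy_standby_slots_caught_up(const char *const *names, int n_names,
                                 XLogRecPtr wait_for_lsn)
{
    assert(n_names >= 0 && n_names <= MAX_NAMES);
    if (n_names == 0)
        return true;            /* nothing configured, nothing to wait for */

    /* Fast path 1: the cached oldest flush LSN already covers the request. */
    if (cached_oldest_flush != 0 && cached_oldest_flush >= wait_for_lsn)
        return true;

    /* Fast path 2: reuse cached positions if they still match the names. */
    bool valid = (n_cached == n_names);
    for (int i = 0; valid && i < n_names; i++)
    {
        const ToySlot *slot = &toy_slots[cached_idx[i]];
        valid = slot->in_use && strcmp(slot->name, names[i]) == 0;
    }
    if (!valid && !build_index_cache(names, n_names))
        return false;

    /* Only the configured slots are visited, not the whole array. */
    XLogRecPtr min_flush = UINT64_MAX;
    for (int i = 0; i < n_names; i++)
    {
        XLogRecPtr flush = toy_slots[cached_idx[i]].flush_lsn;
        if (flush < wait_for_lsn)
            return false;       /* this standby has not caught up yet */
        if (flush < min_flush)
            min_flush = flush;
    }
    cached_oldest_flush = min_flush;
    return true;
}
```

With the standby slots placed at the end of the array, only the first
call pays for the full search; later calls either return from the
cached LSN or touch just the configured slots. Whether that saving is
measurable is what the profiling in (3) should tell us.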

thanks
Shveta

