On Fri, Mar 8, 2024 at 9:56 AM Ajin Cherian <itsa...@gmail.com> wrote:
>
>> Pushed with minor modifications. I'll keep an eye on BF.
>>
>> BTW, one thing that we should try to evaluate a bit more is the
>> traversal of slots in StandbySlotsHaveCaughtup() where we verify if
>> all the slots mentioned in standby_slot_names have received the
>> required WAL. Even if the standby_slot_names list is short the total
>> number of slots can be much larger which can lead to an increase in
>> CPU usage during traversal. There is an optimization that allows to
>> cache ss_oldest_flush_lsn and ensures that we don't need to traverse
>> the slots each time so it may not hit frequently but still there is a
>> chance. I see it is possible to further optimize this area by caching
>> the position of each slot mentioned in standby_slot_names in
>> replication_slots array but not sure whether it is worth.
>>
>
> I tried to test this by configuring a large number of logical slots while
> making sure the standby slots are at the end of the array and checking if
> there was any performance hit in logical replication from these searches.
>
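
For reference while evaluating, the "cache the position of each slot" idea
quoted above could look roughly like the sketch below. This is a standalone
toy model, not PostgreSQL code: the Slot struct, the slots[] array,
standby_cache[], and all function names are hypothetical stand-ins, and
locking and slot invalidation are ignored. It only shows the shape of the
idea: remember each standby slot's array index, revalidate it cheaply on
each check, and fall back to the full traversal only when the cached index
is stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS   1024    /* hypothetical size of the slot array */
#define MAX_STANDBY 8       /* hypothetical cap on configured standby slots */

typedef struct Slot {
    bool     in_use;
    char     name[64];
    uint64_t flush_lsn;     /* last WAL position confirmed flushed */
} Slot;

typedef struct StandbyCache {
    const char *name;       /* slot name taken from the configured list */
    int         index;      /* cached position in slots[], -1 if unknown */
} StandbyCache;

Slot         slots[MAX_SLOTS];
StandbyCache standby_cache[MAX_STANDBY];
int          n_standby;

/* Fallback path: linear search over every slot, O(total slots). */
int find_slot(const char *name)
{
    for (int i = 0; i < MAX_SLOTS; i++)
        if (slots[i].in_use && strcmp(slots[i].name, name) == 0)
            return i;
    return -1;
}

/* Called when the configured list changes; drops all cached positions. */
void set_standby_names(const char **names, int n)
{
    assert(n >= 0 && n <= MAX_STANDBY);
    n_standby = n;
    for (int i = 0; i < n; i++) {
        standby_cache[i].name = names[i];
        standby_cache[i].index = -1;
    }
}

/* True only if every configured standby slot has flushed up to wait_lsn. */
bool standby_slots_caught_up(uint64_t wait_lsn)
{
    for (int i = 0; i < n_standby; i++) {
        StandbyCache *c = &standby_cache[i];
        int idx = c->index;

        /* Revalidate: the slot may have been dropped or its entry reused. */
        if (idx < 0 || !slots[idx].in_use ||
            strcmp(slots[idx].name, c->name) != 0) {
            idx = c->index = find_slot(c->name);
            if (idx < 0)
                return false;   /* configured slot does not exist */
        }
        if (slots[idx].flush_lsn < wait_lsn)
            return false;
    }
    return true;
}
```

With valid cached positions, each check costs one string comparison per
configured standby slot, regardless of how many other slots exist. Whether
that saving is measurable in practice is exactly what the tests are meant
to show.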

Thanks Ajin and Nisha.

We also plan:
1) Redoing XLogSendLogical time-log related test with
'sync_replication_slots' enabled.
2) pg_recvlogical test to monitor lag in StandbySlotsHaveCaughtup() for
a large number of slots.
3) Profiling to see if StandbySlotsHaveCaughtup() is noticeable in the
report when there are a large number of slots to traverse.

thanks
Shveta