On Tue, Aug 1, 2023 at 5:01 PM shveta malik <shveta.ma...@gmail.com> wrote:
>
> > The work division amongst the sync workers can
> > be simple, the logical replication launcher builds a shared memory
> > structure based on number of slots to sync and starts the sync workers
> > dynamically, and each sync worker picks {dboid, slot name, conninfo}
> > from the shared memory, syncs it and proceeds with other slots.
>
> Do you mean the logical replication launcher builds a shared memory
> structure based on the number of 'dbs' to sync as I understood from
> your initial comment?

Yes. I haven't looked at the 0003 patch posted upthread. However, the
standby must do the following at a minimum:

- Make the GUCs synchronize_slot_names and max_slot_sync_workers
PGC_POSTMASTER, so that changing them needs a postmaster restart, as
they affect the number of slot sync workers.
- The LR (logical replication) launcher connects to the primary to
fetch the logical slots specified in synchronize_slot_names. This is a
one-time task.
- The LR launcher prepares a dynamic shared memory segment (created via
dsm_create) with some state like locks for IPC and an array of
{slot_name, dboid_associated_with_slot, is_sync_in_progress}. The
maximum number of elements in the array is the number of slots
specified in synchronize_slot_names. This is a one-time task.
- The LR launcher decides the *best* number of slot sync workers. Based
on some perf numbers, it can just launch, say, one worker per 2 or 4
or 8 etc. slots.
- Each slot sync worker then picks up a slot from the DSM, connects to
the primary using the primary conn info, syncs it, and moves on to
another slot.

Not having the capability of on-demand stop/launch of slot sync
workers makes the above design simple IMO.

Thoughts?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
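
P.S. To make the work division above a bit more concrete, here is a
rough, self-contained Python simulation. It is only a sketch and not
PostgreSQL code: threads stand in for slot sync workers, a lock-protected
list stands in for the DSM array, and every name in it (SlotSyncState,
claim_next, worker_count, launch) is made up for illustration. The
"synced" flag is also my own addition so that this single pass
terminates; the proposal itself only has is_sync_in_progress.

```python
import threading
from dataclasses import dataclass


@dataclass
class SlotEntry:
    slot_name: str
    dboid: int
    is_sync_in_progress: bool = False
    synced: bool = False  # hypothetical: lets this one-pass demo terminate


class SlotSyncState:
    """Stand-in for the DSM segment: one lock plus a fixed array of slots."""

    def __init__(self, slots):
        self.lock = threading.Lock()
        self.entries = [SlotEntry(name, dboid) for name, dboid in slots]

    def claim_next(self):
        # A worker claims the first slot nobody else is syncing.
        with self.lock:
            for entry in self.entries:
                if not entry.is_sync_in_progress and not entry.synced:
                    entry.is_sync_in_progress = True
                    return entry
        return None

    def finish(self, entry):
        with self.lock:
            entry.is_sync_in_progress = False
            entry.synced = True


def worker_count(n_slots, slots_per_worker, max_workers):
    # One worker per slots_per_worker slots (rounded up), capped by the
    # max_slot_sync_workers-style limit.
    if n_slots == 0:
        return 0
    return min(max_workers, -(-n_slots // slots_per_worker))


def sync_worker(state, sync_one):
    # Pick a slot, sync it, move to the next one until none are left.
    while (entry := state.claim_next()) is not None:
        sync_one(entry)
        state.finish(entry)


def launch(state, slots_per_worker, max_workers, sync_one):
    # Launcher: decide the worker count once, start the workers, wait.
    n = worker_count(len(state.entries), slots_per_worker, max_workers)
    workers = [
        threading.Thread(target=sync_worker, args=(state, sync_one))
        for _ in range(n)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return n
```

For example, launch(state, 2, 8, sync_one) over five slots starts three
workers. Since no worker is stopped or started on demand, the launcher
only has to size the pool once, which is what keeps the design simple.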