On Tue, Sep 16, 2025 at 11:53 AM shveta malik <shveta.ma...@gmail.com> wrote: > > On Mon, Sep 15, 2025 at 6:17 PM Ajin Cherian <itsa...@gmail.com> wrote: > > > > On Wed, Sep 10, 2025 at 2:45 PM shveta malik <shveta.ma...@gmail.com> wrote: > > > > > > On Tue, Sep 9, 2025 at 5:37 PM Ajin Cherian <itsa...@gmail.com> wrote: > > > > > > > > Attached v11 patch addressing the above comments. > > > > > > > > > > Please find a few comments: > > > > > > 1) > > > > > > + Retry is done after 2 > > > + * sec wait. Exits early if promotion is triggered or certain critical > > > > > > We can say: Retry is done after SLOTSYNC_API_NAPTIME_MS wait. > > > > > > > Changed. > > > > > 2) > > > + /* > > > + * Fetch remote slot info for the given slot_names. If slot_names is NIL, > > > + * fetch all failover-enabled slots. Note that we reuse slot_names from > > > + * the previous iteration; re-fetching all failover slots each time could > > > + * cause an endless loop. > > > + */ > > > > > > a) > > > the previous iteration --> the first iteration. > > > > > > b) Also we can mention the reason why we take names from first > > > iteration instead of going for pending ones alone, something like: > > > > > > Instead of reprocessing only the pending slots in each iteration, it's > > > better to process all the slots received in the first iteration. > > > This ensures that by the time we're done, all slots reflect the latest values. > > > > > > 3) > > > + remote_slots = fetch_remote_slots(wrconn, slot_names); > > > + > > > + > > > + /* Attempt to synchronize slots */ > > > + synchronize_slots(wrconn, remote_slots, &slot_persistence_pending); > > > > > > One extra blank line can be removed > > > > > > > Fixed. > > > > > 4) > > > > > > + /* Clean up slot_names if allocated in TopMemoryContext */ > > > + if (slot_names) > > > + list_free_deep(slot_names); > > > > > > Can we please move it before 'ReplicationSlotCleanup'. > > > > > > > Fixed. > > > > > 5) > > > In case of error in subsequent iteration, slot_names allocated from > > > TopMemoryContext will be left unfreed? > > > > > > > I've changed the logic so that even on error, slot_names are freed. > > I see that you have freed 'slot_names' in config-changed and promotion > case but the ERROR can come from other flows as well. The idea was to > somehow to free it (if possible) in slotsync_failure_callback() by > passing it as an argument, like we do for 'wrconn'. >
Are you suggesting introducing a structure (for example, SlotSyncContext as shown below) that encapsulates both wrconn and slot_names, and then passing a pointer to this structure as the Datum argument to the slotsync_failure_callback cleanup function, so that the callback can handle freeing wrconn and slot_names and maybe some other members within the structure that allocate memory? /* * Extended structure that can hold both connection and slot_names info */ typedef struct SlotSyncContext { WalReceiverConn *wrconn; /* Must be first for compatibility */ List *slot_names; /* Pointer to slot_names list */ bool extended; /* Flag to indicate extended context */ } SlotSyncContext; SyncReplicationSlots(WalReceiverConn *wrconn) { SlotSyncContext sync_ctx; ... ... /* Initialize extended context */ sync_ctx.wrconn = wrconn; sync_ctx.slot_names_ptr = &slot_names; sync_ctx.extended = true; PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, PointerGetDatum(&sync_ctx)); ... } -- With Regards, Ashutosh Sharma.