Hi Shveta, Zhijie, On Fri, May 15, 2026 at 4:41 PM Zhijie Hou (Fujitsu) <[email protected]> wrote: > > On Friday, May 15, 2026 2:36 PM shveta malik <[email protected]> wrote: > > On Fri, May 15, 2026 at 11:03 AM Xuneng Zhou <[email protected]> > > wrote: > > > > > > Hi Hackers, > > > > > > pg_sync_replication_slots() now retries inside a single SQL function > > > call. Unlike the slotsync worker, it does not get a transaction > > > boundary between retry cycles, so allocations made while fetching and > > > synchronizing remote slots can accumulate until the function returns. > > > > > > The existing list_free_deep(remote_slots) is not enough to bound this. > > > It frees the List cells and the RemoteSlot structs stored as list > > > elements, but it does not recursively free allocations owned by those > > > structs, such as the copied slot name, plugin name, and database name. > > > It also does not release unrelated per-cycle allocations made while > > > fetching and processing the remote slots. > > > > > > This is probably modest in normal cases, as the retained memory is > > > roughly proportional to the number of retries times the number of remote > > slots. > > > Still, the function can wait for a long time on a lagging or > > > misconfigured standby, so the growth is unbounded for that call. > > > > > > The attached patch runs each manual retry cycle in a short-lived > > > memory context and resets it before the next attempt. The slot name > > > list needed across retries is copied outside that context. > > > > > > It also adds two local cleanups. Tuple slots created with > > > MakeSingleTupleTableSlot() are explicitly released with > > > ExecDropSingleTupleTableSlot() before clearing the walreceiver result. > > > The memory context would reclaim the storage eventually, but the slot > > > also holds a reference to the result tuple descriptor, so releasing it > > > at the matching ownership boundary seems clearer and follows nearby > > code. > > > > > > drop_local_obsolete_slots() now frees the temporary list container it > > > builds. The retry-cycle context would reclaim this list too, but > > > freeing it locally would make the helper self-contained. > > > > > > Politely CCed the original authors. > > > > > > > > > > I like the core idea of this patch. It makes the flow cleaner and protects > > the > > flow from memory-leaks when dealing with a high number of slots. I think > > during implementation, we considered having a separate memory context, > > but since we were freeing the remote_slots then, we thought allocating a new > > memory context might not be required. But on re-thinking and reading the > > details above, IMO, it is a good improvement.
Yeah, we need to stop the memory accumulation, either by using a dedicated memory context or by freeing the memory manually. > +1 to the general idea of avoiding memory accumulation. > > > > > A couple of minor comments: > > > > 1. I think we need to delete the new memory context in > > slotsync_failure_callback() as well. > > I think we can avoid doing that, because the new memory context should be > destroyed along with its parent context on transaction abort (if any error > occurs). Yeah, normal error/transaction memory context cleanup seems enough. It avoids adding callback state only to delete cycle_ctx. > Just one comment: > > @@ -579,6 +579,8 @@ drop_local_obsolete_slots(List *remote_slot_list) > > local_slot->data.database)); > } > } > + > + list_free(local_slots); > } > > I think we prefer to avoid freeing memory explicitly, since this is a > static function and we already have a memory context in place to prevent > leaks. > (We've seen this discussion about explicit freeing multiple times before, and > the previous conclusion was to rely on the memory context management rather > than > adding more free code.) Thanks for pointing out. I wasn't aware of the prior discussion. I mainly wanted to avoid potential issues in case future callers invoke it without handling the cleanup properly. -- Regards, Xuneng Zhou HighGo Software Co., Ltd.
v2-0001-Bound-memory-usage-during-manual-slot-sync-retrie.patch
Description: Binary data
