On Mon, 23 Mar 2020, Ludovic Courtès <[email protected]> wrote: > Ludovic Courtès <[email protected]> skribis: > >> The size of the weak set keeps growing, even when GC happens.
[...] I can still reproduce in 3.0.11. [...] > So it seems that the trick of ‘scm_i_register_async_gc_callback’ no > longer works as expected. Indeed, I don't see the callback being called more than once or twice. However, allocating a new callback by calling `scm_i_register_async_gc_callback' after executing the callback seems to work. It however introduces some overhead in the GC that I have not look into. Alternatively, I made a patch that rewrite the finalizer logic and allows the vacuum of weak-sets and weak-tables to be done in the finalizer logic, after the GC has been run (via GC hook), without passing throught the finalizer notification of libgc. I suspect that this fix will not be necassary with the move to Whippet.
>From 2adaf872aeb3b84be9cf8af06f958ea4b731f66c Mon Sep 17 00:00:00 2001 From: Olivier Dion <[email protected]> Date: Tue, 17 Feb 2026 11:37:35 -0500 Subject: [PATCH] Fix weak sets not shrinking after GC. The `scm_i_register_async_gc_callback' mechanism used finalizers to trigger post-GC callbacks for vacuuming weak sets and tables. This approach stopped working reliably, causing weak sets to grow indefinitely even after GC. Replace the finalizer-based callback mechanism with direct integration into `scm_after_gc_c_hook'. A new hook callback, `scm_i_notify_finalizer_thread', is registered with `scm_after_gc_c_hook'. When the hook runs after GC, this callback either notifies the finalizer thread (if running) to invoke finalizers, or runs them synchronously and spawns the finalizer thread if allowed. The vacuum functions for weak sets and tables are called from `scm_run_finalizers'. When `(gc)' is called explicitly from Scheme, finalizers are run immediately. An atomic flag prevents double-vacuuming when the GC hook fires afterward. Fixes <https://issues.guix.gnu.org/40194>. * libguile/atomics-internal.h (scm_atomic_swap_pointer): New function. * libguile/finalizers.c: Include weak-set.h and weak-table.h. (queue_finalizer_async): Remove. (async_gc_finalizer): Remove. (scm_i_register_async_gc_callback): Remove. (allow_finalizer_thread): New static variable. (scm_i_finalizer_pre_fork): Use `allow_finalizer_thread' instead of `GC_set_finalizer_notifier'. (scm_i_notify_finalizer_thread): New function. (scm_set_automatic_finalization_enabled): Use `scm_after_gc_c_hook' instead of `GC_set_finalizer_notifier'. (vacuum_off): New static variable. (scm_run_finalizers): Call `scm_i_vacuum_weak_sets' and `scm_i_vacuum_weak_tables' using atomic flag to prevent double-vacuuming. (scm_i_run_finalizers_and_turnoff_vacuums): New function. (scm_init_finalizers): Use `scm_after_gc_c_hook'. (scm_init_finalizer_thread): Simplify to just set `allow_finalizer_thread'. * libguile/finalizers.h (scm_i_register_async_gc_callback): Remove. (scm_i_run_finalizers_and_turnoff_vacuums): New declaration. * libguile/gc.c: Include finalizers.h. (scm_gc): Call `scm_i_run_finalizers_and_turnoff_vacuums' instead of `GC_invoke_finalizers'. (scm_init_gc): Disable `GC_set_finalizer_notifier'. * libguile/weak-set.c (scm_i_vacuum_weak_sets): Rename from `vacuum_all_weak_sets' and make non-static. (scm_init_weak_set): Remove `scm_i_register_async_gc_callback' call. * libguile/weak-set.h (scm_i_vacuum_weak_sets): New declaration. * libguile/weak-table.c (scm_i_vacuum_weak_tables): Rename from `vacuum_all_weak_tables' and make non-static. (scm_init_weak_table): Remove `scm_i_register_async_gc_callback' call. * libguile/weak-table.h (scm_i_vacuum_weak_tables): New declaration. Signed-off-by: Olivier Dion <[email protected]> --- libguile/atomics-internal.h | 17 ++++- libguile/finalizers.c | 138 ++++++++++++++++++------------------ libguile/finalizers.h | 6 +- libguile/gc.c | 4 +- libguile/weak-set.c | 6 +- libguile/weak-set.h | 1 + libguile/weak-table.c | 6 +- libguile/weak-table.h | 1 + 8 files changed, 95 insertions(+), 84 deletions(-) diff --git a/libguile/atomics-internal.h b/libguile/atomics-internal.h index e15ea3564..4c118dff6 100644 --- a/libguile/atomics-internal.h +++ b/libguile/atomics-internal.h @@ -57,6 +57,12 @@ scm_atomic_ref_pointer (void **loc) atomic_uintptr_t *a_loc = (atomic_uintptr_t *) loc; return (void *) atomic_load (a_loc); } +static inline void * +scm_atomic_swap_pointer (void **loc, void *val) +{ + atomic_uintptr_t *a_loc = (atomic_uintptr_t *) loc; + return (void *) atomic_exchange (a_loc, (uintptr_t) val); +} static inline void scm_atomic_set_scm (SCM *loc, SCM val) { @@ -135,7 +141,16 @@ scm_atomic_ref_pointer (void **loc) scm_i_pthread_mutex_unlock (&atomics_lock); return ret; } - +static inline void * +scm_atomic_swap_pointer (void **loc, void *val) +{ + void *ret; + scm_i_pthread_mutex_lock (&atomics_lock); + ret = *loc; + *loc = val; + scm_i_pthread_mutex_unlock (&atomics_lock); + return ret; +} static inline void scm_atomic_set_scm (SCM *loc, SCM val) { diff --git a/libguile/finalizers.c b/libguile/finalizers.c index f6e67d5d7..b8bbe2803 100644 --- a/libguile/finalizers.c +++ b/libguile/finalizers.c @@ -38,6 +38,8 @@ #include "init.h" #include "threads.h" #include "atomics-internal.h" +#include "weak-set.h" +#include "weak-table.h" #include "finalizers.h" @@ -149,25 +151,6 @@ run_finalizers_async_thunk (void) return SCM_UNSPECIFIED; } - -/* The function queue_finalizer_async is run by the GC when there are - * objects to finalize. It will enqueue an asynchronous call to - * GC_invoke_finalizers() at the next SCM_TICK in this thread. - */ -static void -queue_finalizer_async (void) -{ - scm_thread *t = SCM_I_CURRENT_THREAD; - - /* Could be that the current thread is is NULL when we're allocating - in threads.c:guilify_self_1. In that case, rely on the - GC_invoke_finalizers call there after the thread spins up. */ - if (!t) return; - - scm_system_async_mark_for_thread (run_finalizers_subr, t->handle); -} - - #if SCM_USE_PTHREAD_THREADS @@ -340,14 +323,20 @@ scm_i_is_finalizer_thread (struct scm_thread *t) +#if SCM_USE_PTHREAD_THREADS +static int +allow_finalizer_thread; +#endif + void scm_i_finalizer_pre_fork (void) { #if SCM_USE_PTHREAD_THREADS if (automatic_finalization_p) { + allow_finalizer_thread = 0; stop_finalization_thread (); - GC_set_finalizer_notifier (spawn_finalizer_thread); + allow_finalizer_thread = 1; } #endif } @@ -355,49 +344,33 @@ scm_i_finalizer_pre_fork (void) -static void -async_gc_finalizer (void *ptr, void *data) +static void * +scm_i_notify_finalizer_thread (void * hook_data SCM_UNUSED, + void *fn_data SCM_UNUSED, + void *data SCM_UNUSED) { - void **obj = ptr; - void (*callback) (void) = obj[0]; - - callback (); - - scm_i_set_finalizer (ptr, async_gc_finalizer, data); -} - -/* Arrange to call CALLBACK asynchronously after each GC. The callback - will be invoked from a finalizer, which may be from an async or from - another thread. - - As an implementation detail, the way this works is that we allocate a - fresh object and put the callback in the object. We know that this - object should get collected the next time GC is run, so we attach a - finalizer to it to trigger the callback. - - Once the callback runs, we re-attach a finalizer to that fresh object - to prepare for the next GC, and the process repeats indefinitely. - - We could use the scm_after_gc_hook, but using a finalizer has the - advantage of potentially running in another thread, decreasing pause - time. - - Note that libgc currently has a heuristic that adding 500 finalizable - objects will cause GC to collect rather than expand the heap, - drastically reducing performance on workloads that actually need to - expand the heap. Therefore scm_i_register_async_gc_callback is - inappropriate for using on unbounded numbers of callbacks. */ -void -scm_i_register_async_gc_callback (void (*callback) (void)) -{ - void **obj = GC_MALLOC_ATOMIC (sizeof (void*)); - - obj[0] = (void*)callback; - - scm_i_set_finalizer (obj, async_gc_finalizer, NULL); +#if SCM_USE_PTHREAD_THREADS + scm_i_pthread_mutex_lock (&finalization_thread_lock); + if (finalization_thread_is_running) + { + notify_finalizers_to_run (); + scm_i_pthread_mutex_unlock (&finalization_thread_lock); + } + else + { + scm_i_pthread_mutex_unlock (&finalization_thread_lock); + scm_run_finalizers (); + + if (allow_finalizer_thread) + spawn_finalizer_thread (); + } +#else + scm_run_finalizers (); +#endif + + return NULL; } - int scm_set_automatic_finalization_enabled (int enabled_p) { @@ -414,17 +387,21 @@ scm_set_automatic_finalization_enabled (int enabled_p) if (enabled_p) { + scm_c_hook_add (&scm_after_gc_c_hook, + scm_i_notify_finalizer_thread, + NULL, 0); + #if SCM_USE_PTHREAD_THREADS - GC_set_finalizer_notifier (spawn_finalizer_thread); -#else - GC_set_finalizer_notifier (queue_finalizer_async); + allow_finalizer_thread = 1; #endif } else { - GC_set_finalizer_notifier (0); - #if SCM_USE_PTHREAD_THREADS + allow_finalizer_thread = 0; + scm_c_hook_remove (&scm_after_gc_c_hook, + scm_i_notify_finalizer_thread, + NULL); stop_finalization_thread (); #endif } @@ -434,16 +411,38 @@ scm_set_automatic_finalization_enabled (int enabled_p) return was_enabled_p; } +static uintptr_t vacuum_off; + int scm_run_finalizers (void) { - int finalized = GC_invoke_finalizers (); + int finalized = 0; + + if (GC_should_invoke_finalizers ()) + finalized = GC_invoke_finalizers (); finalization_count += finalized; + if (!scm_atomic_swap_pointer((void **)&vacuum_off, (void *)0)) + { + scm_i_vacuum_weak_sets (); + scm_i_vacuum_weak_tables (); + } + return finalized; } +/* Use to run the finalizers but cooling down the vacuum until next run. + This is used when explicitly calling (gc) from Scheme, where the + finalizers are run immediately after the GC, but the GC hook will get + triggered later. So instead of vacuuming twice, just turn off the + vacuums for the next run. */ +void scm_i_run_finalizers_and_turnoff_vacuums (void) +{ + (void) scm_run_finalizers (); + + scm_atomic_set_pointer((void **) &vacuum_off, (void *)1); +} @@ -456,14 +455,15 @@ scm_init_finalizers (void) run_finalizers_async_thunk); if (automatic_finalization_p) - GC_set_finalizer_notifier (queue_finalizer_async); + scm_c_hook_add (&scm_after_gc_c_hook, + scm_i_notify_finalizer_thread, + NULL, 0); } void scm_init_finalizer_thread (void) { #if SCM_USE_PTHREAD_THREADS - if (automatic_finalization_p) - GC_set_finalizer_notifier (spawn_finalizer_thread); + allow_finalizer_thread = 1; #endif } diff --git a/libguile/finalizers.h b/libguile/finalizers.h index a92a74be1..014e770cb 100644 --- a/libguile/finalizers.h +++ b/libguile/finalizers.h @@ -37,17 +37,13 @@ SCM_INTERNAL void scm_i_add_resuscitator (void *obj, scm_t_finalizer_proc, SCM_INTERNAL void scm_i_finalizer_pre_fork (void); -/* CALLBACK will be called after each garbage collection. It will be - called from a finalizer, which may be from an async or from another - thread. */ -SCM_INTERNAL void scm_i_register_async_gc_callback (void (*callback) (void)); - /* Return true if THREAD is the finalizer thread. */ SCM_INTERNAL int scm_i_is_finalizer_thread (struct scm_thread *thread); SCM_API int scm_set_automatic_finalization_enabled (int enabled_p); SCM_API int scm_run_finalizers (void); +SCM_INTERNAL void scm_i_run_finalizers_and_turnoff_vacuums (void); SCM_INTERNAL void scm_init_finalizers (void); SCM_INTERNAL void scm_init_finalizer_thread (void); diff --git a/libguile/gc.c b/libguile/gc.c index 3cbe43ec5..121b91923 100644 --- a/libguile/gc.c +++ b/libguile/gc.c @@ -36,6 +36,7 @@ #include "deprecation.h" #include "dynwind.h" #include "eval.h" +#include "finalizers.h" #include "gen-scmconfig.h" #include "gsubr.h" #include "hashtab.h" @@ -248,7 +249,7 @@ SCM_DEFINE (scm_gc, "gc", 0, 0, 0, scm_i_gc ("call"); /* If you're calling scm_gc(), you probably want synchronous finalization. */ - GC_invoke_finalizers (); + scm_i_run_finalizers_and_turnoff_vacuums (); return SCM_UNSPECIFIED; } #undef FUNC_NAME @@ -618,6 +619,7 @@ scm_init_gc () GC_set_oom_fn (scm_oom_fn); GC_set_warn_proc (scm_gc_warn_proc); GC_set_start_callback (run_before_gc_c_hook); + GC_set_finalizer_notifier (0); #include "gc.x" } diff --git a/libguile/weak-set.c b/libguile/weak-set.c index 93c429346..1fad28b3d 100644 --- a/libguile/weak-set.c +++ b/libguile/weak-set.c @@ -707,8 +707,8 @@ do_vacuum_weak_set (SCM set) static scm_i_pthread_mutex_t all_weak_sets_lock = SCM_I_PTHREAD_MUTEX_INITIALIZER; static SCM all_weak_sets = SCM_EOL; -static void -vacuum_all_weak_sets (void) +void +scm_i_vacuum_weak_sets (void) { scm_i_pthread_mutex_lock (&all_weak_sets_lock); scm_i_visit_weak_list (&all_weak_sets, do_vacuum_weak_set); @@ -902,6 +902,4 @@ void scm_init_weak_set () { #include "weak-set.x" - - scm_i_register_async_gc_callback (vacuum_all_weak_sets); } diff --git a/libguile/weak-set.h b/libguile/weak-set.h index 621bce85f..c8b9d2039 100644 --- a/libguile/weak-set.h +++ b/libguile/weak-set.h @@ -57,6 +57,7 @@ SCM_INTERNAL SCM scm_weak_set_for_each (SCM proc, SCM set); SCM_INTERNAL SCM scm_weak_set_map_to_list (SCM proc, SCM set); SCM_INTERNAL void scm_i_weak_set_print (SCM exp, SCM port, scm_print_state *pstate); +SCM_INTERNAL void scm_i_vacuum_weak_sets (void); SCM_INTERNAL void scm_init_weak_set (void); #endif /* SCM_WEAK_SET_H */ diff --git a/libguile/weak-table.c b/libguile/weak-table.c index 1e4d8d302..a303ca3f0 100644 --- a/libguile/weak-table.c +++ b/libguile/weak-table.c @@ -487,8 +487,8 @@ do_vacuum_weak_table (SCM table) static scm_i_pthread_mutex_t all_weak_tables_lock = SCM_I_PTHREAD_MUTEX_INITIALIZER; static SCM all_weak_tables = SCM_EOL; -static void -vacuum_all_weak_tables (void) +void +scm_i_vacuum_weak_tables (void) { scm_i_pthread_mutex_lock (&all_weak_tables_lock); scm_i_visit_weak_list (&all_weak_tables, do_vacuum_weak_table); @@ -840,6 +840,4 @@ void scm_init_weak_table () { #include "weak-table.x" - - scm_i_register_async_gc_callback (vacuum_all_weak_tables); } diff --git a/libguile/weak-table.h b/libguile/weak-table.h index bcbc94e3f..5a27a6c56 100644 --- a/libguile/weak-table.h +++ b/libguile/weak-table.h @@ -81,6 +81,7 @@ SCM_API SCM scm_doubly_weak_hash_table_p (SCM h); SCM_INTERNAL void scm_i_weak_table_print (SCM exp, SCM port, scm_print_state *pstate); +SCM_INTERNAL void scm_i_vacuum_weak_tables (void); SCM_INTERNAL void scm_weak_table_prehistory (void); SCM_INTERNAL void scm_init_weak_table (void); -- 2.52.0
Thanks, Olivier -- Olivier Dion
