https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113401
Bug ID: 113401
Summary: Memory (resource) leak in -ftrampoline-impl=heap
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: fw at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*, aarch64-*

Consider this test program:

#include <err.h>
#include <errno.h>
#include <pthread.h>

static void *volatile compiler_barrier;

void *
thread_routine (void *ignore)
{
  int local_variable;
  auto int *local_function (void)
  {
    return &local_variable;
  }
  compiler_barrier = local_function;
  return NULL;
}

int
main (void)
{
  for (int i = 0; i < 10000; ++i)
    {
      pthread_t thread;
      int ret = pthread_create (&thread, NULL, thread_routine, NULL);
      if (ret != 0)
        {
          errno = ret;
          err (1, "pthread_create");
        }
      ret = pthread_join (thread, NULL);
      if (ret != 0)
        {
          errno = ret;
          err (1, "pthread_join");
        }
    }
}

With -ftrampoline-impl=stack, I get:

$ \time ./a.out
0.01user 0.16system 0:00.18elapsed 102%CPU (0avgtext+0avgdata 3008maxresident)k
0inputs+0outputs (0major+68minor)pagefaults 0swaps

With -ftrampoline-impl=heap:

$ \time ./a.out
0.03user 0.17system 0:00.20elapsed 101%CPU (0avgtext+0avgdata 41796maxresident)k
0inputs+0outputs (0major+10148minor)pagefaults 0swaps

The reason for the maxresident change is that
__builtin_nested_func_ptr_created and __builtin_nested_func_ptr_deleted
cache the last trampoline page even if it is completely unused. This is
required for performance reasons.

The fix is to register a TLS destructor to deallocate that page, too. On
glibc, that also fixes another memory leak for -fno-exceptions compilations
(the default for C) if pthread_exit is called with an active trampoline.
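
The proposed fix could look roughly like the sketch below. It is not the libgcc implementation: it assumes POSIX thread-specific data (pthread_key_create with a destructor) as the TLS destructor mechanism, uses malloc as a stand-in for the real executable trampoline page, and every name in it (SKETCH_PAGE_SIZE, get_cached_page, count_released_pages, and so on) is hypothetical. It only illustrates the idea that a per-thread cached page is released when the thread exits.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical page size; the real trampoline page size is not
   stated in the report.  */
#define SKETCH_PAGE_SIZE 4096

static pthread_key_t cached_page_key;
static pthread_once_t cached_page_once = PTHREAD_ONCE_INIT;
static atomic_int pages_released;

/* Destructor run at thread exit for a non-NULL key value: release
   the cached page, even if it was never used.  */
static void
release_cached_page (void *page)
{
  free (page);
  atomic_fetch_add (&pages_released, 1);
}

static void
create_key (void)
{
  int rc = pthread_key_create (&cached_page_key, release_cached_page);
  assert (rc == 0);
  (void) rc;
}

/* Return this thread's cached page, allocating it on first use.
   malloc stands in for the allocation of a trampoline page.  */
static void *
get_cached_page (void)
{
  pthread_once (&cached_page_once, create_key);
  void *page = pthread_getspecific (cached_page_key);
  if (page == NULL)
    {
      page = malloc (SKETCH_PAGE_SIZE);
      if (page != NULL && pthread_setspecific (cached_page_key, page) != 0)
        {
          free (page);
          page = NULL;
        }
    }
  return page;
}

static void *
worker (void *ignore)
{
  (void) ignore;
  (void) get_cached_page ();
  return NULL;
}

/* Create and join NTHREADS threads one after another, as the test
   program in the report does, and return how many cached pages the
   destructor released during this call (-1 on thread errors).  */
int
count_released_pages (int nthreads)
{
  int before = atomic_load (&pages_released);
  for (int i = 0; i < nthreads; ++i)
    {
      pthread_t thread;
      if (pthread_create (&thread, NULL, worker, NULL) != 0)
        return -1;
      if (pthread_join (thread, NULL) != 0)
        return -1;
    }
  return atomic_load (&pages_released) - before;
}
```

Under these assumptions, each joined thread should release exactly one page, so resident memory would no longer grow with the number of threads created.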