Re: [Valgrind-users] Timer delete
> Yes, the cache disabling is quite hacky, as mentioned in the doc:
> "Valgrind disables the cache using some internal
> knowledge of the glibc stack cache implementation and by
> examining the debug information of the pthread
> library. This technique is thus somewhat fragile and might
> not work for all glibc versions. This has been successfully
> tested with various glibc versions (e.g. 2.11, 2.16, 2.18)
> on various platforms."
>
> As you indicate, it looks broken on the more recent glibc version you tried.
>
> Philippe

Indeed. Looks like this:

Author: Florian Weimer 2021-05-10 10:31:41
Committer: Florian Weimer 2021-05-10 10:31:41
Parent: d017b0ab5a181dce4145f3a1b3b27e3341abd201 (elf: Introduce __tls_pre_init_tp)
Child: ee07b3a7222746fafc5d5cb2163c9609b81615ef (nptl: Simplify the change_stack_perm calling convention)
Branches: master, remotes/origin/arm/morello/main,
  remotes/origin/arm/morello/v1, remotes/origin/arm/morello/v2,
  remotes/origin/azanella/bz23960-dirent, remotes/origin/azanella/clang,
  remotes/origin/codonell/c-utf8, remotes/origin/codonell/ld-audit,
  remotes/origin/fw/localedef-utf8, remotes/origin/maskray/relr,
  remotes/origin/maskray/x86-mpx, remotes/origin/master,
  remotes/origin/nsz/bug23293, remotes/origin/nsz/bug23293-v5,
  remotes/origin/nsz/bug23293-v6, remotes/origin/release/2.34/master,
  remotes/origin/release/2.35/master, remotes/origin/release/2.36/master,
  remotes/origin/siddhesh/realpath-and-getcwd
Follows: glibc-2.33.9000
Precedes: glibc-2.34

    nptl: Move more stack management variables into _rtld_global

    Permissions of the cached stacks may have to be updated if an object
    is loaded that requires executable stacks, so the dynamic loader
    needs to know about these cached stacks.

    The move of in_flight_stack and stack_cache_actsize is a requirement
    for merging __reclaim_stacks into the fork implementation in libc.

    Tested-by: Carlos O'Donell
    Reviewed-by: Carlos O'Donell

It looks like "stack_cache_actsize" in libc has moved and is now
_dl_stack_cache_actsize in ld-linux-x86-64.so.2.

A+
Paul

___
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
Re: [Valgrind-users] Timer delete
On Sat, Nov 12, 2022 at 12:46:41PM +0100, Philippe Waroquiers wrote:
> On Sat, 2022-11-12 at 12:21 +0100, Paul Floyd wrote:
> > So my conclusion is that there are two problems
> > 1. Some cleanup code missing in __libc_freeres that is causing this leak
> > (libc problem)
> > 2. no-stackcache not working. This is more a Valgrind problem, but it
> > does rely on twiddling libc internals, so it's not too surprising that
> > it breaks. That needs work on the Valgrind side.
> Yes, the cache disabling is quite hacky, as mentioned in the doc:
> "Valgrind disables the cache using some internal
> knowledge of the glibc stack cache implementation and by
> examining the debug information of the pthread
> library. This technique is thus somewhat fragile and might
> not work for all glibc versions. This has been successfully
> tested with various glibc versions (e.g. 2.11, 2.16, 2.18)
> on various platforms."
>
> As you indicate, it looks broken on the more recent glibc version you tried.

This is https://bugs.kde.org/show_bug.cgi?id=88
Use glibc.pthread.stack_cache_size tunable

Since glibc 2.34 the internal/private stack_cache_maxsize variable
isn't available anymore, which causes "sched WARNING: pthread stack
cache cannot be disabled!" when the simhint no_nptl_pthread_stackcache
is set (e.g. in helgrind/tests/tls_threads.vgtest)

Cheers,

Mark
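For anyone who wants to experiment with the tunable named in the bug title, here is a minimal sketch of setting it from a test driver. GLIBC_TUNABLES is a colon-separated list of name=value pairs; the helper name with_tunable is hypothetical, and it is an assumption (not confirmed in this thread) that a value of 0 is enough to stop stacks from being cached.

```python
def with_tunable(env, name, value):
    """Return a copy of env with one glibc tunable set in GLIBC_TUNABLES.

    Hypothetical helper: any existing setting of the same tunable is
    replaced, other tunables already present are kept.
    """
    new_env = dict(env)
    existing = new_env.get("GLIBC_TUNABLES", "")
    # Drop empty fragments and any previous value for this tunable.
    pairs = [p for p in existing.split(":")
             if p and not p.startswith(name + "=")]
    pairs.append(f"{name}={value}")
    new_env["GLIBC_TUNABLES"] = ":".join(pairs)
    return new_env


# Possible use (assumption: 0 means "cache nothing"; check the glibc manual):
#   import os, subprocess
#   env = with_tunable(os.environ, "glibc.pthread.stack_cache_size", 0)
#   subprocess.run(["valgrind", "-q", "--tool=helgrind", "./tls_threads"], env=env)
```

The helper only edits the environment; nothing here depends on glibc internals or debuginfo, which is the attraction of a tunable over poking at private variables.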
Re: [Valgrind-users] Timer delete
On Sat, 2022-11-12 at 12:21 +0100, Paul Floyd wrote:
> Philippe wrote:
> > Possibly --sim-hints=no-nptl-pthread-stackcache might help (if I
> > re-read the manual entry for this sim-hint).
> >
> As the manpage says, the pthread stackcache stuff is mainly for Helgrind.
>
> I don't see how this would affect a leak though.

This sim-hint also influences memcheck behaviour related to __thread
(i.e. TLS) variables. Here is the extract of the doc:
"When using the memcheck tool, disabling the cache ensures the
memory used by glibc to handle __thread variables is directly
released when a thread terminates."
(at least that was likely true in 2014, when the above was written).

>
> I did some tests to check that __libc_freeres is being called (and it is
> being called).
>
> So my conclusion is that there are two problems
> 1. Some cleanup code missing in __libc_freeres that is causing this leak
> (libc problem)
> 2. no-stackcache not working. This is more a Valgrind problem, but it
> does rely on twiddling libc internals, so it's not too surprising that
> it breaks. That needs work on the Valgrind side.

Yes, the cache disabling is quite hacky, as mentioned in the doc:
"Valgrind disables the cache using some internal
knowledge of the glibc stack cache implementation and by
examining the debug information of the pthread
library. This technique is thus somewhat fragile and might
not work for all glibc versions. This has been successfully
tested with various glibc versions (e.g. 2.11, 2.16, 2.18)
on various platforms."

As you indicate, it looks broken on the more recent glibc version you tried.

Philippe
Re: [Valgrind-users] Timer delete
On 11/12/22 01:46, John Reiser wrote:
> It's a bug (or implementation constraint) in glibc timer.
> When I run it under valgrind-3.19.0 with glibc-debuginfo and
> glibc-debugsource installed (2.35-17.fc36.x86_64):
> [Notice the annotation "LOOK HERE"]
>
> ==281161== Command: ./a.out
> ==281161==
> --281161:0: sched WARNING: pthread stack cache cannot be disabled! < LOOK HERE <

And also Philippe wrote:
> Possibly --sim-hints=no-nptl-pthread-stackcache might help (if I
> re-read the manual entry for this sim-hint).

As the manpage says, the pthread stackcache stuff is mainly for Helgrind.

What the code does is use debuginfo to find the GNU libc variable that
describes the size of the stack cache, and force it to some large
value. That causes libpthread to think that the cache is full (when it
is still really empty) and not use the cache. That means that every
time a thread gets created, a new stack is allocated rather than one
being recycled from the cache.

The caching causes problems with Helgrind for applications using thread
local storage in sequences like

  write to TLS var on thread 2
  thread 2 exit
  thread 3 created, recycles thread 2's TLS
  read from TLS var on thread 3

Helgrind just sees unprotected reads and writes from the same address
without knowing that it isn't the same variable.

This test is currently failing for me (Fedora 36 amd64):

paulf> perl tests/vg_regtest helgrind/tests/tls_threads
tls_threads: valgrind -q --sim-hints=no-nptl-pthread-stackcache ./tls_threads
*** tls_threads failed (stderr) ***

(More details here
https://github.com/paulfloyd/freebsd_valgrind/issues/113
since I've looked into how to implement something similar for FreeBSD).

I don't see how this would affect a leak though.

I did some tests to check that __libc_freeres is being called (and it is
being called).

So my conclusion is that there are two problems
1. Some cleanup code missing in __libc_freeres that is causing this leak
(libc problem)
2. no-stackcache not working. This is more a Valgrind problem, but it
does rely on twiddling libc internals, so it's not too surprising that
it breaks. That needs work on the Valgrind side.

FWIW on FreeBSD (no stack cache disable or libc freeres) I also get a
bunch of leaks that I need to suppress.

A+
Paul
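
To make the "force the size to a large value" trick concrete, here is a toy model of a stack cache. It is only a sketch of the idea described above, not glibc's actual data structure: the class ToyStackCache, its fields and the addresses are all hypothetical, and a real allocator could still hand out a previously used address by coincidence.

```python
class ToyStackCache:
    """Toy model of a thread stack cache (NOT glibc's real implementation)."""

    def __init__(self, max_size):
        self.max_size = max_size    # limit on cached bytes
        self.act_size = 0           # bytes the cache believes it holds
        self.cached = []            # addresses of stacks ready for reuse
        self._next_fresh = 0x10000  # next never-used address (made up)

    def allocate(self, size):
        """Give a new thread a stack, recycling a cached one if possible."""
        if self.cached:
            self.act_size -= size
            return self.cached.pop()
        addr = self._next_fresh
        self._next_fresh += size
        return addr

    def release(self, addr, size):
        """Called when a thread exits: cache the stack unless the cache looks full."""
        if self.act_size + size > self.max_size:
            return  # looks full: the stack is dropped instead of cached
        self.cached.append(addr)
        self.act_size += size


def run_threads(cache, n_threads, stack_size=0x1000):
    """Create and exit threads one after another; return each stack address."""
    addrs = []
    for _ in range(n_threads):
        addr = cache.allocate(stack_size)
        addrs.append(addr)
        cache.release(addr, stack_size)
    return addrs


normal = ToyStackCache(max_size=40 * 1024 * 1024)
patched = ToyStackCache(max_size=40 * 1024 * 1024)
patched.act_size = 1 << 60  # the hack: claim the (empty) cache is already full

reused = run_threads(normal, 3)   # every thread gets the same address
fresh = run_threads(patched, 3)   # every thread gets a different address
```

In the unpatched case thread 3 lands on thread 2's old stack (and hence its TLS area), which is the situation in which Helgrind reports a race between two unrelated variables. With the counter forced high, nothing is ever cached, so each thread gets a fresh stack in this model.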