Hi Aaron, On Mon, 2025-09-01 at 20:31 -0400, Aaron Merey wrote: > __libdw_dieabbrev uses the abbrev_lock rwlock to synchronize access to the > Dwarf_Die abbrev field as well as its lazy loading. Calls to rwlock_wrlock > and unlock incur significant performance overhead even in single-threaded > cases. > > This patch implements thread safety in __libdw_dieabbrev with GCC __atomic > builtins instead of an rwlock, improving performance.
configure checks for C11 plus atomics support. So you should be able to just use #include <stdatomic.h> and atomic_compare_exchange* functions. > abbrev_lock has been > changed to a mutex and now synchronizes access to the Dwarf_CU field > last_abbrev_offset in __libdw_findabbrev. > > libdw/ > * dwarf_begin_elf.c (valid_p): Change type of abbrev_lock from > rwlock to mutex. > * dwarf_end.c (cu_free): Ditto. > * dwarf_tag.c (__libdw_findabbrev): Protect > cu->last_abbrev_offset with mutex. > * libdwP.h (struct Dwarf_CU): Change type of abbrev_lock from > rwlock to mutex. > (__libdw_dieabbrev): Use __atomic builtins instead of rwlock. > * libdw_findcu.c (__libdw_intern_next_unit): Change type of > abbrev_lock from rwlock to mutex. Apart from using stdatomic.h and the standard C11 functions I think this is all correct. > One issue we need to address by the time we remove the "experimental" > notice from --enable-thread-safety is communicating to the library > caller that they should not attempt to read Dwarf_Die.abbrev before > it is loaded. Since the Dwarf_Die struct is public and transparent to > library callers, one thread attempting to read the abbrev field while > another thread lazy loads the abbrev is undefined behavior. > > I don't think we can fully eliminate the possibility of UB under a lazy > loading design given that we should not change the Dwarf_Die layout and > its public accessibility. Lazy loading implementation cannot prevent > library callers from reading partially written abbrev bytes no matter > what internal synchronization methods are used. Maybe we just have to say that you cannot share Dwarf_Die references between threads? So if you want to use a Dwarf_Die from another thread you have to copy it instead by passing it by reference? It is a fairly small struct so passing a copy instead of a reference shouldn't be a very high overhead. Or maybe we should say that you should pass the Dwarf and Dwarf_Die.addr to another thread and reconstruct it using dwarf_die_addr_die? That would simplify things a bit on the implementation side I think. But might make parallel algorithm using libdw more awkward? Thanks, Mark