> On Dec 3, 2015, at 10:34 AM, D'Alessandro, Luke K <[email protected]>
> wrote:
>
>>
>> On Dec 3, 2015, at 7:30 AM, D'Alessandro, Luke K <[email protected]>
>> wrote:
>>
>> Hi All,
>>
>> I have a C library that commonly uses a custom setjmp/longjmp for non-local
>> return. I’m trying to add support for intermediate C++ code, which means I
>> need to return through frames that might have RAII destructors that need to
>> run. I’m attempting to use `_Unwind_ForcedUnwind()` to perform this
>> operation. It works fine; however, there is a serious scalability bottleneck
>> that I’m trying to track down.
>>
>> I’m using the 1.1 release and I’ve switched `x86_64_local_addr_space_init()`
>> to set the default caching policy to UNW_CACHE_PER_THREAD. I did this
>> statically because I couldn’t figure out where to insert
>> `unw_set_caching_policy()` to get it to change properly; it appears that the
>> address space is created inside of the call to `_Unwind_ForcedUnwind()`…?
>>
>> In any case, I still see the app hammering away at a lock. I see an init
>> lock in `tdep_init()`, but I doubt that’s an issue. I also see a lock in
>> `trace_cache_get_unthreaded`, which I don’t think I should be hitting. If
>> someone could point me to the likely issue that would be great, or if there
>> is something fundamentally non-scalable about reading the DWARF information
>> and unwinding, that would be useful information too.
>
> Okay, to answer my own question a bit.
>
> Based on the `perf record -g` output below, there appears to be a lock inside
> of `dl_iterate_phdr` that gets hit every time `fetch_proc_info` runs. There
> is also locking in `dwarf_get` that happens occasionally. Is caching supposed
> to elide the `dl` hits? Could this be user error on my part?
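For context, the kind of lookup that sits behind `dl_iterate_phdr()` can be pictured roughly as below. This is only a sketch of the general pattern, not libunwind's actual code: `struct addr_query`, `find_object_cb` and `object_contains` are hypothetical names made up for illustration. As far as I understand glibc, the loader holds an internal lock for the whole walk, which would be consistent with the contention seen when many threads unwind at once.

```c
#define _GNU_SOURCE  /* dl_iterate_phdr() is a GNU extension declared in <link.h> */
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical query state: the address to locate and the result. */
struct addr_query {
  uintptr_t addr;        /* address we want to locate */
  const char *obj_name;  /* name of the containing object, if found */
  int found;             /* set to 1 once a PT_LOAD segment matches */
};

/* Callback invoked by dl_iterate_phdr() once per loaded object. */
static int find_object_cb(struct dl_phdr_info *info, size_t size, void *data)
{
  struct addr_query *q = data;
  (void)size;
  for (int i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
    if (ph->p_type != PT_LOAD)
      continue;
    uintptr_t start = (uintptr_t)info->dlpi_addr + ph->p_vaddr;
    if (q->addr >= start && q->addr < start + ph->p_memsz) {
      q->obj_name = info->dlpi_name;
      q->found = 1;
      return 1;  /* a non-zero return stops the iteration */
    }
  }
  return 0;  /* keep walking the list of loaded objects */
}

/* Returns 1 if addr lies inside a loaded object, 0 otherwise.
   Every call walks the loader's object list from the start. */
static int object_contains(const void *addr)
{
  struct addr_query q = { (uintptr_t)addr, NULL, 0 };
  assert(addr != NULL);
  dl_iterate_phdr(find_object_cb, &q);
  return q.found;
}
```

If the locking assumption holds, concurrent calls to something like `object_contains()` from many threads would serialize inside the loader, regardless of any per-thread caching done further up the stack.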
Okay, I think I see why we keep hitting `dl_iterate_phdr()`. The
`fetch_proc_info()` call uses it, and it’s getting called unconditionally,
because the cache check is compiled out with `#if 0`, from
http://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/dwarf/Gparser.c;h=3a47255c4a1afa217d1ecc99723babdc92cffec9;hb=HEAD#l924.

```
HIDDEN int
dwarf_make_proc_info (struct dwarf_cursor *c)
{
#if 0
  if (c->as->caching_policy == UNW_CACHE_NONE
      || get_cached_proc_info (c) < 0)
#endif
    /* Lookup it up the slow way... */
    return fetch_proc_info (c, c->ip, 0);
  return 0;
}
```

So the question becomes: what is standing in the way of a
`get_cached_proc_info()` implementation? Is it just a “TODO/patches welcome”,
or is there something fundamentally difficult going on here?

Thanks,
Luke

>
> Thanks,
> Luke
>
> ```
> # Children      Self  Command    Shared Object       Symbol
> # ........  ........  .........  ..................  ..............................................
> #
>     84.80%     0.01%  fibonacci  [kernel.kallsyms]   [k] system_call_fastpath
>             |
>             ---system_call_fastpath
>                |
>                |--52.91%-- __lll_unlock_wake
>                |          |
>                |          |--36.55%-- 0x100000000
>                |          |
>                |          |--27.51%-- validate_mem
>                |          |          access_mem
>                |          |          |
>                |          |          |--66.90%-- dwarf_get
>                |          |          |          apply_reg_state
>                |          |          |          _ULx86_64_dwarf_find_save_locs
>                |          |          |          0
>                |          |          |
>                |          |           --33.10%-- dwarf_get
>                |          |                     access_mem
>                |          |                     dwarf_get
>                |          |                     apply_reg_state
>                |          |                     _ULx86_64_dwarf_find_save_locs
>                |          |                     0
>                |          |
>                |          |--26.93%-- fetch_proc_info
>                |          |          _ULx86_64_dwarf_make_proc_info
>                |          |          _ULx86_64_get_proc_info
>                |          |          _Unwind_ForcedUnwind
>                |          |
>                |           --9.01%-- dwarf_get
>                |                     access_mem
>                |                     dwarf_get
>                |                     apply_reg_state
>                |                     _ULx86_64_dwarf_find_save_locs
>                |                     0
>                |
>                |--45.92%-- __lll_lock_wait
>                |          |
>                |          |--99.99%-- fetch_proc_info
>                |          |          _ULx86_64_dwarf_make_proc_info
>                |          |          _ULx86_64_get_proc_info
>                |          |          _Unwind_ForcedUnwind
> ```
>
>>
>> Thanks,
>> Luke
>> _______________________________________________
>> Libunwind-devel mailing list
>> [email protected]
>> https://lists.nongnu.org/mailman/listinfo/libunwind-devel
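To make the caching question concrete, here is a minimal sketch of what a per-thread proc-info cache in front of a slow lookup could look like. Everything in it is an assumption for illustration: `struct proc_info`, `slow_fetch`, `cached_fetch` and the 256-byte "procedures" are hypothetical stand-ins, not libunwind types or functions, and the sketch ignores invalidation when objects are loaded or unloaded (glibc exposes `dlpi_adds` and `dlpi_subs` counters in `struct dl_phdr_info` that could plausibly help there).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the info an unwinder needs per procedure. */
struct proc_info {
  uintptr_t start_ip;  /* first address of the procedure */
  uintptr_t end_ip;    /* one past the last address */
  int valid;           /* non-zero once the slot has been filled */
};

#define CACHE_SLOTS 64u  /* power of two so the hash can use a mask */

/* One cache per thread, so lookups need no lock. */
static _Thread_local struct proc_info cache[CACHE_SLOTS];

/* Counts slow-path lookups so the effect of the cache is observable. */
static _Thread_local unsigned slow_lookups;

/* Hypothetical slow path: pretends every procedure is 256 bytes long
   and 256-byte aligned. A real one would walk the loaded objects. */
static int slow_fetch(uintptr_t ip, struct proc_info *out)
{
  slow_lookups++;
  out->start_ip = ip & ~(uintptr_t)0xff;
  out->end_ip = out->start_ip + 0x100;
  out->valid = 1;
  return 0;
}

/* Direct-mapped lookup: hash the IP to a slot, verify the range,
   and fall back to the slow path on a miss. Hashing by IP means two
   IPs in one real procedure may use different slots; that costs extra
   misses but never returns wrong data, since the range is checked. */
static int cached_fetch(uintptr_t ip, struct proc_info *out)
{
  struct proc_info *slot = &cache[(ip >> 8) & (CACHE_SLOTS - 1)];
  if (slot->valid && ip >= slot->start_ip && ip < slot->end_ip) {
    *out = *slot;  /* hit: no lock, no object walk */
    return 0;
  }
  int ret = slow_fetch(ip, out);
  if (ret == 0)
    *slot = *out;  /* remember the result for this thread */
  return ret;
}
```

With this shape, repeated unwinds through the same frames on one thread would only pay for the slow lookup once per procedure. Whether the real difficulty lies in invalidation or elsewhere is exactly the open question.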
