On Freitag, 2. Juni 2017 15:26:10 CEST Milian Wolff wrote: > On Donnerstag, 1. Juni 2017 22:57:12 CEST Milian Wolff wrote: > > Hey all, > > <snip> > > > heaptrack_gui 2228 135073.400474: 613969 cycles: > > 108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0) > > > > @@ -533,8 +529,6 @@ > > > > 2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0) > > 297c53 QCoreApplicationPrivate::init (/usr/lib/ > > > > libQt5Core.so.5.8.0) > > > > f7cde QGuiApplicationPrivate::init > > > > (/usr/lib/libQt5Gui.so. 5.8.0) > > - 1589e8 QApplicationPrivate::init > > (/usr/lib/libQt5Widgets.so. 5.8.0) > > - 78622 main (/home/milian/projects/compiled/other/bin/ > > heaptrack_gui) > > > > 20439 __libc_start_main (/usr/lib/libc-2.25.so) > > 78299 _start (/home/milian/projects/compiled/other/bin/ > > > > heaptrack_gui) > > > > NOTE: this is super odd, it simply misses two frames in the middle?! > > This is really quite odd - looking at the debug output, the frames in the > middle are really just skipped for some reason: > > unwind: access_mem addr 0x7ffca0a88330, val 4edc50, offset 2808 > unwind: access_mem addr 0x7ffca0a88338, val 7f69bfce443a, offset 2816 > unwind: pc: = 0x7f69c10fecde > found map: 7f69c1007000 7f69c1766000 > dso found: libQt5Gui.so.5.8.0 /usr/lib/libQt5Gui.so.5.8.0 > reported: libQt5Gui.so.5.8.0 /usr/lib/libQt5Gui.so.5.8.0, 1 > unwind: QGuiApplicationPrivate::init():ip = 0x7f69c10fecde (0xf7cde) > > -> so far so good, this frame is properly found inside libQt5Gui, but then: > > unwind: pc: = 0x7f69bfce4439 > found map: 7f69bfcc4000 7f69c0069000 > dso found: libc-2.25.so /usr/lib/libc-2.25.so > reported: libc-2.25.so /usr/lib/libc-2.25.so, 1 > unwind: __libc_start_main:ip = 0x7f69bfce4439 (0x20439) > > -> the next frame is is supposedly the one in libc, but what happened to the > two frames in QApplicationPrivate::init and main? I also note that no calls > to access_mem are occuring - is this maybe some (wrong) caching in libdw or > so that interfers here? > > Any insight would be appreciated, thanks!
Some more debugging and going after my gut feeling brings me to the following
conclusion: The real issue seems to be the on-demand reporting of the elf
file. We used to do:
Dwarf_Addr pc;
bool isactivation;
if (!dwfl_frame_pc(state, &pc, &isactivation)) {
pr_err("%s", dwfl_errmsg(-1));
return DWARF_CB_ABORT;
}
// report the module before we query for isactivation
report_module(pc, ui);
This looks safe and fine and actually works most of the time. But passing a
non-null isactivation flag to dwfl_frame_pc potentially leads to a second
unwind step, before we got the change to report the module! I can workaround
this by instead doing
Dwarf_Addr pc;
bool isactivation;
if (!dwfl_frame_pc(state, &pc, NULL)) {
pr_err("%s", dwfl_errmsg(-1));
return DWARF_CB_ABORT;
}
// report the module before we query for isactivation
report_module(pc, ui);
if (!dwfl_frame_pc(state, &pc, &isactivation)) {
pr_err("%s", dwfl_errmsg(-1));
return DWARF_CB_ABORT;
}
This fixes all the issues in my original email. So sorry for the noise - it
doesn't see as if the unwinding code in elfutils is broken - quite the
contrary! It's just our misuse of the API that is to blame.
May I suggest though to move the isactivation code into a separate function to
prevent this issue from arising in the future? I.e. it would be nice if the
code above could read:
Dwarf_Addr pc;
bool isactivation;
if (!dwfl_frame_pc(state, &pc)) {
pr_err("%s", dwfl_errmsg(-1));
return DWARF_CB_ABORT;
}
// report the module before we query for isactivation
report_module(pc, ui);
if (!dwfl_frame_is_activation(state)) {
--pc;
}
Thanks
--
Milian Wolff
[email protected]
http://milianw.de
signature.asc
Description: This is a digitally signed message part.
