Hi Serhei, On Wed, 2025-08-20 at 18:12 -0400, Serhei Makarov wrote: > The header-only patch below illustrates the planned API change. > The base solution is for dwflst_sample_getframes() to take a pointer > to regs_mapping, which, for each item in regs[] array, specifies its > position in the full register file expected by the DWARF functionality. > (This avoids the need to take a perf_events regs bitmap that can only > be interpreted with the aid of a weird enum header buried in the linux > kernel tree.) > > The regs_mapping array is most likely to stay constant for an entire > profiling session, so this is not a high-overhead interface. > (A bit controversially, I'm allowing n_regs_mapping != n_regs, which > might smoothly deal with cases where dwflst_sample_getframes is passed > a profiling data packet with malformed or truncated regs array.) > > We *keep* the existing dwflst_perf_sample_getframes as a convenience > variant of dwflst_sample_getframes. It clearly says 'perf', and I > think it's fine to have this perf-specific API as long as it's not > the *only* API. > > I believe this is the most significant obstacle to turning > libdwfl_stacktrace into a stable API. > mjw, do any other major issues come to mind? > (I'm sure we will find minor things to fix when we bikeshed this header, of > course.) > > I could add functions to generate a plausible regs_mapping for various > architectures, > though I'd like a non perf_events test case for that. I've looked at BSD > PmcTools, > for one example, but not 100% sure yet how to bolt libdwfl_stacktrace into it.
I have to admit I had to reread things for the above to make sense. So for libdwfl_stacktrace we have a Dwflst_Process_Tracker. You create a Dwflst_Process_Tracker with a set of Dwfl_Callbacks (from libdwfl.h) which sets up the find_elf, find_debuginfo and debuginfo_path. Then from this Dwflst_Process_Tracker you create new Dwfls with dwflst_tracker_dwfl_begin instead of calling dwfl_begin with a new set of callbacks yourself. You would still setup the Dwfl with dwfl_[linux_proc]_report[_*] calls. But not attach Dwfl_Thread_Callbacks state because that is done dynamically by the dwflst_*_getframes functions. Alternatively you use dwflst_tracker_find_pid to get or create a new Dwflst_Process_Tracker owned Dwfl. Then finally you call dwflst_perf_sample_getframes or the new dwflst_sample_getframes to iterate to the frames of a specific Dwfl. Which finally brings us to the topic of this email :) Could you explain again why we have the first set of arguments? Dwfl *dwfl, Elf *elf, pid_t pid, pid_t tid Some of this seems a little repetitive. Ideally I think you would just provide a Dwflst_Process_Tracker and a tid. Assuming you can easily get the pid from a tid (maybe not?). At least giving both Dwfl and pid seems redundant. Also which Elf would you provide? A Dwfl can contain multiple Dwfl_Modules each with their own associated Elf. Side note, there is dwfl_getthreads, but I assume that isn't how you iterate through threads with an Dwflst_Process_Tracker derived Dwfl? Then you get the stack (and size) and registers (as Dwarf_Words and number of regs). So we just assume this is provided by some external mechanism. And finally the n_regs_mapping which for each DWARF register number maps it to an index in the regs array. So that the callback can get a register value from the given Dwfl_Frame. I think this works, but do note that some arches have gaps in the DWARF register numbers and so might have > 100 nregs. Should the regmapping maybe be stored or set with the Dwfl or Dwflst_Process_Tracker once? Cheers, Mark > diff --git a/libdwfl_stacktrace/libdwfl_stacktrace.h > b/libdwfl_stacktrace/libdw\ > fl_stacktrace.h > index b236ddc4..b8987c6f 100644 > --- a/libdwfl_stacktrace/libdwfl_stacktrace.h > +++ b/libdwfl_stacktrace/libdwfl_stacktrace.h > @@ -113,14 +113,28 @@ extern int dwflst_tracker_linux_proc_find_elf > (Dwfl_Modul\ > e *mod, void **userdata > const char *module_name, > Dwarf_A\ > ddr base, > char **file_name, Elf **); > > - > /* Like dwfl_thread_getframes, but iterates through the frames for a > - linux perf_events stack sample rather than a live thread. Calls > - dwfl_attach_state on DWFL, with architecture specified by ELF, ELF > - must remain valid during Dwfl lifetime. Returns zero if all frames > - have been processed by the callback, returns -1 on error, or the > - value of the callback when not DWARF_CB_OK. -1 returned on error > - will set dwfl_errno (). */ > + stack sample rather than a live thread. Calls dwfl_attach_state on > + DWFL, with architecture specified by ELF, ELF must remain vaild > + during Dwfl lifetime. Returns zero if all frames have been > + processed by the callback, returns -1 on error, or the value of the > + callback when not DWARF_CB_OK. -1 returned on error will set > + dwfl_errno (). */ > +int dwflst_sample_getframes (Dwfl *dwfl, Elf *elf, pid_t pid, pid_t tid, > + const void *stack, size_t stack_size, > + const Dwarf_Word *regs, uint32_t n_regs, > + const int *regs_mapping, uint32_t > n_regs_mapping, > + int (*callback) (Dwfl_Frame *state, void > *arg), > + void *arg) > + __nonnull_attribute__ (1, 5, 7, 9, 11); > + > +/* Adapts dwflst_sample_getframes to linux perf_events stack sample > + and register file data format. Calls dwfl_attach_state on DWFL, > + with architecture specified by ELF, ELF must remain valid during > + Dwfl lifetime. Returns zero if all frames have been processed by > + the callback, returns -1 on error, or the value of the callback > + when not DWARF_CB_OK. -1 returned on error will set dwfl_errno > + (). */ > int dwflst_perf_sample_getframes (Dwfl *dwfl, Elf *elf, pid_t pid, pid_t tid, > const void *stack, size_t stack_size, > const Dwarf_Word *regs, uint32_t n_regs,