Hi,

This is my first contribution to elfutils, so I wanted to check if the
approach is okay before sending a formal patch.

I'm following up on Mark's reply to Bogdan back in July 2024
(https://sourceware.org/pipermail/elfutils-devel/2024q3/007266.html)
about serving .dwo files from debuginfod. I implemented this and have
it working locally. This is a significant gap in the protocol right
now: more and more people are compiling with -gsplit-dwarf to speed up
builds and reduce binary sizes, but then they lose the ability to
fetch debug info automatically via debuginfod. I have been interested
in this for a while, and once it lands I'm planning to contribute the
corresponding GDB support to consume it.

For the .dwo vs .dwp question Mark raised, I went with serving the
whole .dwp file since libdw already knows how to parse the index and
extract what it needs. Duplicating that logic server-side would be
complex and clients requesting one DWO ID from a .dwp will likely want
others from the same package anyway.

What I implemented:

On the server side, the scanner now recognizes .dwo and .dwp files
during traversal, iterates through their compilation units, extracts
the DWO IDs, and stores the mappings in a new table. The HTTP handler
serves requests to /dwoid/<DWOID>/debuginfo by looking up the file and
returning it.

On the client side, I added debuginfod_find_dwo(), which works
similarly to the other find functions: it takes the DWO ID bytes,
converts them to hex, queries the servers, and caches the result. I
also added debuginfod-find dwoid <DWOID> for manual testing.
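
As a rough illustration of the "bytes to hex to request path" step (this
is not the code from the patch), the conversion could look like the
sketch below. The helper names dwoid_to_hex() and dwoid_request_path()
are hypothetical, and lowercase hex in byte order is an assumption; only
the /dwoid/<DWOID>/debuginfo path comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): hex-encode LEN raw bytes
   of a DWO ID into OUT as lowercase, NUL-terminated text.
   Returns 0 on success, -1 if OUT is too small.  */
int
dwoid_to_hex (const unsigned char *id, size_t len, char *out, size_t outlen)
{
  static const char digits[] = "0123456789abcdef";

  assert (id != NULL && out != NULL);
  if (outlen < 2 * len + 1)
    return -1;

  for (size_t i = 0; i < len; i++)
    {
      out[2 * i] = digits[id[i] >> 4];        /* high nibble */
      out[2 * i + 1] = digits[id[i] & 0x0f];  /* low nibble */
    }
  out[2 * len] = '\0';
  return 0;
}

/* Hypothetical helper: build the request path "/dwoid/<hex>/debuginfo".
   Returns 0 on success, -1 if the ID is too long or OUT is too small.  */
int
dwoid_request_path (const unsigned char *id, size_t len,
                    char *out, size_t outlen)
{
  char hex[2 * 8 + 1];  /* A DWO ID is 64 bits, so at most 16 hex digits.  */

  if (len > 8 || dwoid_to_hex (id, len, hex, sizeof hex) != 0)
    return -1;

  int n = snprintf (out, outlen, "/dwoid/%s/debuginfo", hex);
  return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

The resulting path would then be appended to each server URL in the
same way the existing find functions build their requests.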

The trickier part was making this automatic. In libdw, I added
dwarf_set_dwo_lookup(), which lets you register a callback that gets
invoked when the existing local lookups fail (same directory,
DW_AT_comp_dir, etc). The callback receives the DWO ID and returns a
file descriptor. This way libdw doesn't need to know anything about
debuginfod; it just calls the callback if one is registered.
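
To make the fallback order concrete, here is a self-contained sketch of
the idea. It deliberately does not use libdw: the type dwo_lookup_cb,
struct dwo_lookup_state, set_dwo_lookup() and resolve_dwo() are
hypothetical stand-ins, and the real dwarf_set_dwo_lookup() signature
may well differ. It only shows that the callback is consulted after the
local search has failed, and that nothing changes when no callback is
registered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: given a DWO ID, return an open file
   descriptor for the matching .dwo/.dwp file, or -1 if unavailable.  */
typedef int (*dwo_lookup_cb) (uint64_t dwo_id, void *arg);

/* Stand-in for the per-Dwarf state that would hold the registration.  */
struct dwo_lookup_state
{
  dwo_lookup_cb cb;
  void *arg;
};

/* Register (or clear, with CB == NULL) the fallback lookup.  */
void
set_dwo_lookup (struct dwo_lookup_state *st, dwo_lookup_cb cb, void *arg)
{
  assert (st != NULL);
  st->cb = cb;
  st->arg = arg;
}

/* LOCAL_FD is the result of the existing local search (-1 on failure).
   The callback runs only if the local search failed and one is set.  */
int
resolve_dwo (const struct dwo_lookup_state *st, int local_fd, uint64_t dwo_id)
{
  assert (st != NULL);
  if (local_fd >= 0)
    return local_fd;
  if (st->cb == NULL)
    return -1;
  return st->cb (dwo_id, st->arg);
}

/* Example callback standing in for a remote fetch: it "knows" exactly
   one DWO ID and counts how often it is asked.  */
struct example_remote
{
  uint64_t known_id;
  int fd;
  int calls;
};

int
example_lookup (uint64_t dwo_id, void *arg)
{
  struct example_remote *remote = arg;
  remote->calls++;
  return dwo_id == remote->known_id ? remote->fd : -1;
}
```

Keeping the hook as a plain function pointer plus an opaque argument is
what lets libdw stay free of any debuginfod dependency.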

Then in libdwfl, when loading DWARF, I set up this callback to point to
a wrapper that calls debuginfod_find_dwo(). So from the user's
perspective everything just works transparently: open a skeleton
binary with libdwfl and the split DWARF gets fetched automatically if
needed.

This is how it looks:

   debuginfod/debuginfod-client.c   | 382 +++++++++++++++++++++++++++++++++++++++
   debuginfod/debuginfod-find.c     |  16 +-
   debuginfod/debuginfod.cxx        | 356 ++++++++++++++++++++++++++++++++++--
   debuginfod/debuginfod.h.in       |  25 +++
   debuginfod/libdebuginfod.map     |   4 +
   libdw/Makefile.am                |   3 +-
   libdw/dwarf_set_dwo_lookup.c     |  49 +++++
   libdw/libdw.h                    |  19 ++
   libdw/libdw.map                  |   1 +
   libdw/libdwP.h                   |   7 +
   libdw/libdw_find_split_unit.c    |  91 ++++++----
   libdwfl/debuginfod-client.c      |  20 +-
   libdwfl/dwfl_module_getdwarf.c   |  16 ++
   libdwfl/libdwflP.h               |   2 +
   tests/Makefile.am                |   7 +-
   tests/debuginfod_dwoid_find.c    | 138 ++++++++++++++
   tests/run-debuginfod-dwoid.sh    | 322 +++++++++++++++++++++++++++++++++

The test script run-debuginfod-dwoid.sh covers:

- DWP file indexing for both DWARF 5 and DWARF 4 packages (each
containing multiple DWO IDs)
- Individual .dwo file indexing for both DWARF 5 and DWARF 4
- Error handling: 404 for non-existent DWO ID, rejection of malformed
hex strings
- The debuginfod-find dwoid client command
- Cache directory structure verification
- End-to-end libdwfl callback integration using skeleton binaries
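
For the malformed-hex case in the list above, the kind of check being
exercised might look like the sketch below. The function name
dwoid_hex_is_valid() is hypothetical, and requiring exactly 16 hex
digits (one 64-bit DWO ID) is an assumption about what counts as well
formed, not a statement of what the server actually enforces.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical validator: accept only a string of exactly 16 hex
   digits, i.e. the text form of one 64-bit DWO ID (assumed format).  */
bool
dwoid_hex_is_valid (const char *s)
{
  assert (s != NULL);

  size_t len = strlen (s);
  if (len != 16)
    return false;

  for (size_t i = 0; i < len; i++)
    if (!isxdigit ((unsigned char) s[i]))
      return false;

  return true;
}
```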

The C test program debuginfod_dwoid_find.c opens a skeleton binary via
dwfl_report_offline() and calls dwfl_module_getdwarf(), which triggers
the callback. It then verifies that the split DIE gets resolved even
though the .dwo files aren't present locally, only on the server. I
have tested DWARF 4 and DWARF 5 on aarch64 and x86_64 (64 bit).

Splitting this into smaller patches is somewhat hard because
everything is tangled together. The server needs the endpoint, the
client needs the API, libdw needs the callback mechanism, libdwfl
needs to wire it all up, and the tests need all of it working.

Does this approach make sense? Happy to send the actual patches if so.

Thanks in advance,

Pablo Galindo Salgado
