Re: [RFC] debuginfod: DWO ID-based lookup for split DWARF

Pablo Galindo Salgado Mon, 05 Jan 2026 15:44:18 -0800

> (Given that you have completed a prototype already, sending a patch
> is fine and makes feedback more responsive.)


I'm happy to send formal patches, though I'd appreciate some guidance
on the preferred workflow for submitting them, as I'm not familiar
with this contribution process. Is there a wiki or document I should
look at?

> For those of us nonexperts, could you give a brief example of the
> dwoid / dwp composition scenario here, so we can see how close it is
> to buildids?  (There is a remote possibility that the existing
> debuginfod machinery can handle it already, with clever enough -Z.)

Build IDs identify executable/library files and are stored in ELF
.note.gnu.build-id sections. They're created at link time. DWO IDs
identify individual compilation units and are stored in DWARF unit
headers (DW_AT_dwo_id). They're computed from compilation unit
content during compilation.

 Concrete Example:

  $ gcc -gsplit-dwarf -o hello hello.c world.c

 This produces:
  - hello (executable with build-id, contains skeleton CUs)
  - hello.dwo (split DWARF for hello.c, DWO ID: c422aa5c31fec205)
  - world.dwo (split DWARF for world.c, DWO ID: b6c8b9d97e6dfdfe)

 The .dwo files have no build-id - they're not linked, just compiler
output. They can be packaged into a .dwp:

  $ dwp -e hello -o hello.dwp

 Now hello.dwp contains both CUs, indexed by their DWO IDs.

 The existing machinery can't handle this because:

  1. No build-id: DWO/DWP files typically lack ELF notes entirely, so
     -Z can't extract a build-id that doesn't exist.
  2. Different identifier space: Even if we added fake build-ids, the
     client needs to query by DWO ID (which it gets from the skeleton
     CU), not by any build-id.
  3. Multiple identifiers per file: A .dwp file contains many CUs,
     each with a different DWO ID. Build-id is 1:1 with files.

The relationship is skeleton CU in executable -> DW_AT_dwo_id -> .dwo/.dwp file
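
To make the difference in shape concrete, here is a rough sketch of the
two lookup tables. This is illustrative only and not taken from the
prototype: the names buildid_index, dwoid_index, register_dwp and
lookup_dwo are made up, and the build-id key is a placeholder rather
than a real value.

```python
# Illustrative sketch only; none of these names come from debuginfod.

# Build-id index: one identifier per linked file (1:1).
# The key is a placeholder, not a real build-id.
buildid_index = {"<build-id of hello>": "hello"}

# DWO ID index: many identifiers can point at the same .dwp file (N:1).
dwoid_index = {}


def register_dwp(path, dwo_ids):
    """Record every DWO ID contained in one package file."""
    for dwo_id in dwo_ids:
        dwoid_index[dwo_id] = path


def lookup_dwo(dwo_id):
    """Return the file that holds the CU with this DWO ID, or None."""
    return dwoid_index.get(dwo_id)


# The two DWO IDs from the hello/world example above.
register_dwp("hello.dwp", [0xC422AA5C31FEC205, 0xB6C8B9D97E6DFDFE])
```

Both IDs resolve to the same hello.dwp, which is the case a 1:1
build-id table has no way to express.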

There's no way for the client to derive a build-id from a DWO ID -
they're computed from completely different inputs. The
/dwoid/<ID>/debuginfo endpoint mirrors /buildid/<ID>/debuginfo but
uses the DWO ID as the key.
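
As a small sketch of what the client side of that could look like: the
helper name dwoid_path is hypothetical, and rendering the 64-bit ID as
16 zero-padded lowercase hex digits is my assumption, chosen to match
the notation in the example above.

```python
def dwoid_path(dwo_id: int) -> str:
    """Build the request path for a DWO ID lookup.

    Assumption: the ID is written as 16 zero-padded lowercase hex digits.
    """
    if not 0 <= dwo_id < 2**64:
        raise ValueError("DWO ID must fit in 64 bits")
    return f"/dwoid/{dwo_id:016x}/debuginfo"


# Path for the hello.c CU from the example above.
hello_path = dwoid_path(0xC422AA5C31FEC205)
```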

I chose to serve the whole .dwp file because DWP files use a complex
index structure (.debug_cu_index) to locate individual CUs, and libdw
already knows how to parse this and extract what it needs.
Reimplementing that extraction logic server-side would add significant
complexity for little benefit. Additionally, when debugging an
executable, you'll typically need debug info for multiple CUs from the
same package (stack frames, stepping, etc.), so fetching and caching
the whole .dwp once is more efficient than many round-trips for
individual CU sections.
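
To give a feel for that index, here is a rough Python sketch of just
the first step, finding the row for a DWO ID in the signature hash
table, following my reading of the package index description in the
DWARF 5 spec. The function name find_row and the list-based table
representation are made up for illustration; this is not how libdw
implements it.

```python
def find_row(signatures, row_indexes, dwo_id):
    """Look up dwo_id in an open-addressed signature table.

    signatures:  list of M 64-bit slots, where 0 marks an unused slot
    row_indexes: parallel list of M row indexes
    Returns the row index for dwo_id, or None if it is not present.
    """
    m = len(signatures)
    if m == 0 or m & (m - 1):
        raise ValueError("slot count must be a power of two")
    mask = m - 1
    h = dwo_id & mask                    # primary hash: low-order bits
    step = ((dwo_id >> 32) & mask) | 1   # secondary hash, forced odd
    for _ in range(m):                   # odd step visits every slot once
        slot = signatures[h]
        if slot == dwo_id:
            return row_indexes[h]
        if slot == 0:
            return None                  # empty slot ends the probe
        h = (h + step) % m
    return None


# Toy 4-slot table holding the two DWO IDs from the example above.
sigs = [0, 0xC422AA5C31FEC205, 0xB6C8B9D97E6DFDFE, 0]
rows = [0, 1, 2, 0]
```

That only yields a row number. Actually extracting a CU still means
reading the offset and size tables for each contribution section and
slicing those sections, which is the part I'd rather leave to libdw.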

Pablo
