I think David is correct: we did not consider LTO, and assumed a .dwo file 
would have a single compilation unit in its .debug_info section.  It does not 
seem hard to fix, but my idea would require an extension to the package-file 
index, and I don't see any provision there for vendor extensions (another 
oversight?).

In a split-DWARF scenario producing multiple CUs, it's clear that each 
split-full unit in the .dwo file would need a corresponding skeleton unit in 
the .o file, with matching unique DWO IDs.  The v5 spec basically already says 
that.  With multiple split-full units in the same .debug_info section, 
DW_FORM_ref_addr can support cross-CU references within the section; the 
producer can supply the correct offset within the section without needing any 
relocations.

How to describe this in the package file?  I'd leave DW_SECT_INFO meaning what 
it does now—describing the base and size of the individual unit.  I'd add a new 
"section identifier" DW_SECT_INFO_FILE or whatever, which describes the base 
and size of the entire .debug_info section contributed by the .dwo *file* that 
the unit came from.  This allows a consumer to find each individual unit by DWO 
ID, as today, and the extra _FILE column describes the base-and-size to use 
when interpreting a DW_FORM_ref_addr from that unit.  For any .dwo file that 
contains only one unit, DW_SECT_INFO and DW_SECT_INFO_FILE would have the same 
values.  The tool that creates the package file can omit DW_SECT_INFO_FILE from 
the index if every input .dwo file has only one unit.
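
To make the lookup concrete, here is a minimal sketch of how a consumer might 
use the extra column.  The names IndexRow, info_file and resolve_ref_addr are 
made up for illustration, and DW_SECT_INFO_FILE is only the proposal above, not 
anything in the spec.  The numbers come from the dwp dump in David's message, 
assuming the two LTO units were adjacent in their original .dwo file.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class IndexRow:
    """One row of the package-file CU index (illustrative layout only)."""
    dwo_id: int
    info: Tuple[int, int]                        # DW_SECT_INFO: (base, size) of this unit
    info_file: Optional[Tuple[int, int]] = None  # proposed DW_SECT_INFO_FILE (hypothetical)


def resolve_ref_addr(row: IndexRow, ref: int) -> int:
    """Map a DW_FORM_ref_addr value found in this unit to a .dwp section offset.

    The value is relative to the .debug_info section of the original .dwo
    file, so it is rebased on the _FILE column when that column is present.
    If the column was omitted, every .dwo had a single unit and DW_SECT_INFO
    describes the same range.
    """
    base, size = row.info_file if row.info_file is not None else row.info
    if not 0 <= ref < size:
        raise ValueError("ref_addr falls outside the .dwo contribution")
    return base + ref


# Second unit of the LTO compile: unit at [0x65, 0x98), whole .dwo at [0x2d, 0x98).
row = IndexRow(dwo_id=0x32DD6D7121DD1D9A,
               info=(0x65, 0x33),
               info_file=(0x2D, 0x6B))
target = resolve_ref_addr(row, 0x10)  # a DIE inside the first LTO unit
```

Rebasing the same reference on DW_SECT_INFO instead would land at 0x75, inside 
the wrong unit, which is the failure David describes.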

This solution avoids the problem of the *consumer* having to scan the 
.debug_info contribution to find the units; that work can be done once up front 
by the packaging tool.
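
A rough sketch of that up-front work, under the assumption that the packaging 
tool has already walked each input .dwo and knows the DWO ID and size of every 
unit, in section order.  The function name and the "INFO"/"INFO_FILE" keys are 
hypothetical; a real tool would write these into the index tables instead.

```python
def build_info_columns(dwo_files):
    """Compute per-unit INFO and (proposed) INFO_FILE ranges for a package file.

    dwo_files: list of input files, each a list of (dwo_id, unit_size) pairs
    in the order the units appear in that file's .debug_info section.
    Returns {dwo_id: {"INFO": (base, size), "INFO_FILE": (base, size)}}.
    The INFO_FILE entry is omitted when every input has exactly one unit.
    """
    need_file_column = any(len(units) > 1 for units in dwo_files)
    rows = {}
    offset = 0  # running offset in the output .debug_info section
    for units in dwo_files:
        file_base = offset
        file_size = sum(size for _, size in units)
        for dwo_id, size in units:
            row = {"INFO": (offset, size)}
            if need_file_column:
                row["INFO_FILE"] = (file_base, file_size)
            rows[dwo_id] = row
            offset += size
    return rows


# The three CUs from David's dump: one standalone .dwo, then one LTO .dwo.
rows = build_info_columns([
    [(0x66F4E160661D2687, 0x2D)],
    [(0x7BD765349B7E7631, 0x38), (0x32DD6D7121DD1D9A, 0x33)],
])
```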

Section identifiers are 32 bits wide, and the defined values are just 1-8; 
surely we can allocate some for vendor extensions!  And then it's no problem to 
have tools produce the new column for the index.  Consumers will just ignore 
section identifiers that they don't recognize, same as any other part of DWARF.
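
For illustration, a consumer that skips unknown columns could look something 
like the sketch below.  The identifier values are the v5 ones as I read the 
table (2 is reserved); the vendor value 0x80000001 is purely an assumption for 
the example, since no vendor range exists today.

```python
# DWARF v5 section identifiers for the package-file index (value 2 is reserved).
KNOWN_SECTIONS = {
    1: "INFO",
    3: "ABBREV",
    4: "LINE",
    5: "LOCLISTS",
    6: "STR_OFFSETS",
    7: "MACRO",
    8: "RNGLISTS",
}


def select_columns(column_ids):
    """Return {section name: column index} for the columns this consumer knows.

    column_ids is the list of section identifiers from the index header, one
    per column.  Identifiers not in KNOWN_SECTIONS are silently skipped.
    """
    return {KNOWN_SECTIONS[sid]: col
            for col, sid in enumerate(column_ids)
            if sid in KNOWN_SECTIONS}


# Hypothetical header with a vendor column in position 2.
columns = select_columns([1, 3, 0x80000001, 4])
```

An older consumer would see only INFO, ABBREV and LINE here and carry on as it 
does today.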

Would that address the problem?
--paulr

From: Dwarf-Discuss [mailto:dwarf-discuss-boun...@lists.dwarfstd.org] On Behalf 
Of David Blaikie
Sent: Tuesday, May 02, 2017 12:10 PM
To: dwarf-discuss@lists.dwarfstd.org
Subject: [Dwarf-Discuss] Fission + cross-CU references (ref_addr)

I've recently been trying to resolve the use of Fission in LLVM's ThinLTO mode 
(though this would apply to plain LTO too).

One of the things that happens here is that cross-CU DIE references 
(DW_FORM_ref_addr) are used to describe inlining a function in one CU into 
another CU.

This format has been implemented in LLVM and GCC for ~years and seems to work 
well outside of Fission.

So the question is: what to do with Fission?

It seemed to me that a good representation would be to emit multiple CUs into 
a single DWO file, which GDB can't yet consume, but I'm working on patches to 
help there. DW_FORM_ref_addr would not use any ELF relocation, but would be 
assumed to be "relative to the chunk of debug_info it was in" (within the .dwo 
file).

But what about DWP files? Currently binutils dwp produces records like the 
following (this dwp contains 3 CUs: two from one LTO compile, and one from a 
standalone compile linked in for comparison):

Index Signature          INFO     ABBR     LINE     STR_OFF
----- ------------------ -------- -------- -------- --------
    2 0x7bd765349b7e7631 [2d, 65) [38, ae) [11, 22) [14, 3c)
    8 0x66f4e160661d2687 [00, 2d) [00, 38) [00, 11) [00, 14)
   11 0x32dd6d7121dd1d9a [65, 98) [38, ae) [11, 22) [14, 3c)

So the ABBR/LINE/STR_OFF sections are kept as-is (no analysis is done to find 
which portions of the dwo file are used by which CUs, etc), but the INFO 
section is fragmented on the CU boundaries. Fragmenting the TYPES section on 
the TU boundaries is necessary/useful for deduplication of types, but this 
fragmenting of the CU makes it impossible (I think) to use ref_addr in a dwp 
file.

If this fragmenting were not done, consumers (GDB, etc.) would need to change 
to account for it, searching through the INFO range to find the CU matching 
the signature rather than knowing it starts at the start of the INFO range. 
This could have a noticeable performance impact, especially in a full LTO build 
(where /all/ the CUs were in the same .dwo - so the index would be entirely 
unhelpful, I think).
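
As a rough illustration of that search, the sketch below walks unit headers in 
an INFO range until it finds a matching ID.  It assumes 32-bit DWARF v5 
split-compile headers, where the DWO ID sits in the header itself, and 
little-endian data; with pre-v5 Fission the ID lives in DW_AT_GNU_dwo_id on the 
CU DIE, so a real scan would also have to decode abbreviations.  make_unit and 
find_unit_by_dwo_id are hypothetical helper names.

```python
import struct

# unit_length(4) version(2) unit_type(1) address_size(1)
# debug_abbrev_offset(4) dwo_id(8): 20 bytes, little-endian, no padding.
HEADER = struct.Struct("<IHBBIQ")
DW_UT_split_compile = 0x05


def make_unit(dwo_id: int, payload: bytes) -> bytes:
    """Build a toy v5 split-compile unit (header plus opaque DIE bytes)."""
    unit_length = HEADER.size - 4 + len(payload)  # excludes the length field
    return HEADER.pack(unit_length, 5, DW_UT_split_compile, 8, 0, dwo_id) + payload


def find_unit_by_dwo_id(info: bytes, start: int, end: int, wanted: int):
    """Linear scan of [start, end) for the unit whose header carries `wanted`.

    Returns the unit's offset, or None.  Cost grows with the number of units
    in the range, which is the concern for a full LTO build.
    """
    off = start
    while off + HEADER.size <= end:
        unit_length, _ver, _type, _asize, _abbrev, dwo_id = HEADER.unpack_from(info, off)
        if dwo_id == wanted:
            return off
        off += 4 + unit_length  # skip to the next unit header
    return None


info = make_unit(0x11, b"\0" * 3) + make_unit(0x22, b"\0" * 5)
found = find_unit_by_dwo_id(info, 0, len(info), 0x22)
```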

Does all this sound right/sane - anyone have ideas/perspectives/thoughts on how 
this should work?

_______________________________________________
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
