> On Mar 18, 2015, at 4:41 PM, David Blaikie <[email protected]> wrote:
>
> On Wed, Mar 18, 2015 at 4:31 PM, Adrian Prantl <[email protected]> wrote:
>
>> On Mar 18, 2015, at 4:02 PM, David Blaikie <[email protected]> wrote:
>>
>> On Wed, Mar 18, 2015 at 3:50 PM, Adrian Prantl <[email protected]> wrote:
>>
>>> On Mar 17, 2015, at 6:44 PM, David Blaikie <[email protected]> wrote:
>>>
>>> On Tue, Mar 17, 2015 at 3:47 PM, Adrian Prantl <[email protected]> wrote:
>>>
>>> > On Mar 17, 2015, at 10:03 AM, Greg Clayton <[email protected]> wrote:
>>> >
>>> >> On Mar 17, 2015, at 9:46 AM, David Blaikie <[email protected]> wrote:
>>> >>
>>> >> On Tue, Mar 17, 2015 at 9:42 AM, Greg Clayton <[email protected]> wrote:
>>> >>
>>> >>> On Mar 16, 2015, at 6:47 PM, David Blaikie <[email protected]> wrote:
>>> >>>
>>> >>> On Mon, Mar 16, 2015 at 5:14 PM, Adrian Prantl <[email protected]> wrote:
>>> >>>
>>> >>> Thanks for the explanation David, I missed that it is entirely the
>>> >>> linker's (or some dwarf post-processor's) responsibility to find the
>>> >>> module files and link in the debug info from the .pcm files, so
>>> >>> debugger doesn’t notice a difference.
>>> >>>
>>> >>> I think there's still some confusion here. Sorry if I'm rehashing
>>> >>> something, but I'll try to explain how this all works.
>>> >>>
>>> >>> Normal split DWARF:
>>> >>>
>>> >>> Compiler generates two files: .o and .dwo.
>>> >>> .dwo has static, non-relocatable debug info.
>>> >>> .o has a skeleton compile_unit that has the name of the .dwo file and a
>>> >>> hash to verify that the .dwo file isn't stale when the debugger reads
>>> >>> it.
>>> >>> The .o files are all linked together, the .dwo files stay where they
>>> >>> are.
>>> >>> The debugger reads the linked executable, finds the skeleton
>>> >>> compile_units contained therein, and find/loads the .dwo files
>>> >>>
>>> >>> The scenario I have in mind for module debug info is this:
>>> >>> Module is compiled as an object file with debug info (this file is
>>> >>> actually a .dwo file, even if it has some other extension - it has the
>>> >>> non-relocatable debug info in it)
>>> >>> .o file has a comdat'd skeleton compile_unit describing the .dwo/module
>>> >>> file
>>> >>> <from here on no extra work is required, the linker and debugger just
>>> >>> act as normal>
>>> >>> The .o files are linked together, the skeleton compile_units get
>>> >>> deduplicated by the linker (comdat sections)
>>> >>
>>> >> One issue I can think of is we will need to figure out a way to make
>>> >> COMDAT work with mach-o. COMDAT requires large number of sections and
>>> >> mach-o can only have 255.
>>> >>
>>> >> Ah, fair enough - how does MachO handle inline functions (the most
>>> >> common use of comdat) currently, then?
>>> >
>>> > Currently mach-o relies on symbols in the symbol table being marked as
>>> > weak and I believe the data for these symbols are in special sections
>>> > that are marked as containing items that can be coalesced.
>>> >
>>> That’s not necessarily an issue that needs to be solved on Darwin, or am I
>>> maybe missing something? The linker leaves all debug info in the .o (as it
>>> currently does) and llvm-dsymutil is resolving all the external module type
>>> references while creating the .dSYM bundle.
>>>
>>> Yeah, with a debug aware linker (or in the case of dsymutil, a debug-only
>>> linker) you would just know that since you're looking at object files,
>>> module references will be redundant across objects and should be
>>> deduplicated (by the dwo hash, most likely).
>>>
>>> If you're not teaching your debugger to read modules, and want to link the
>>> debug info in from the .dwos - at that point you can probably drop the
>>> skeleton stuff entirely (you'd still need to teach your debugger about .dwo
>>> sections and some of the esoteric things there - like str_index and the
>>> extra/special line table just for file names (decl_file, etc, uses this))
>>> and just put the contents of the module debug info straight in the dsym.
>>> It'd be a bit weird, but do-able without too much work, I'd imagine. You
>>> could move them back into the original sections, if you wanted to avoid the
>>> weird .dwo +non-.dwo sections together... *shrug* not sure what exactly
>>> you'd want there.
>>
>> My plan was to have -gmodules to behave like the latter variant unless
>> -gsplit-dwarf is also present; this way there wouldn't be any weird
>> Darwin-specific code paths.
>>
>> Not sure I quite follow (mostly my fault given the rambling paragraph up
>> there) - given the lack of a dsymutil-like tool on other platforms as part
>> of the common tool path for debug info, I'm not sure module debug info
>> without split dwarf is viable in that world. There's no tool to read these
>> extra files at any point.
>
> In theory someone could port llvm-dsymutil to a different platform, but that
> scenario is a little far-fetched. I’m not sure what will happen if LLDB is
> presented with linked, non-split debug info that contains module references.
>
> Linked non-split debug info should come out for free - all the debug info
> would be is a bunch of TUs in a single comdat - no skeleton CU, nothing else.
> It would look just like normal DWARF, except with one comdat instead of
> multiple, for each set of types from a module. (& there would be no real size
> gains - since you'd be redundantly including all the type information in
> every object file)
>
>>
>> I suppose we could be creating one giant comdat for the module's debug info
>> (no skeleton unit, no distinct type unit comdats, just one big comdat). But
>> we'd probably want/need a tool to do the merging at compile time (like the
>> objcopy feature for split-dwarf, but in reverse - we'd compile, then run a
>> tool to smoosh all the comdats from the modules onto the object we just
>> generated). It wouldn't provide much in the way of space savings, a little
>> less stress on the linker (fewer comdats to handle), etc. Not sure if
>> there's a default mode of objcopy that would cope with this straight out, or
>> whether we'd need a new feature there (which wouldn't be a priority for
>> Google to implement, since we use fission, nor a priority for you to
>> implement since you have dsymutil, etc - so I'm not sure anyone would bother)
>>
>> Long story short: maybe just error on -gmodules if -gsplit-dwarf isn't
>> specified or the platform isn't darwin? (& if it's darwin, dsymutil could
>> read the module skeletons to find which modules to link into the .dSYM?)
>
> That’s reasonable, too :-)
> The plan is for llvm-dsymutil to follow the references in the module
> skeletons, copy the module CUs
>
> TUs for now
>
> into the .dSYM, and fixup the external type references to become
> DW_FORM_ref_addrs.
>
> Sounds good for you guys - the fixup work will be a bit non-trivial, since
> it'll need to remove the type skeletons in the CUs, move all the extra
> members from the skeletons into the type unit (& resolve any duplicates),
> etc... - does that make sense? (otherwise I can provide some DWARF snippets
> to explain better)

Or we use a weird Darwin-specific code path to not emit the modules with
-generate-type-units in the first place (bag of DWARF+index mapping hash to
DIE), which would make dsymutil's job really easy. As much as I’d like to get
rid of platform-specific behavior, due to the automatic way that modules are
generated on Darwin I don’t see an elegant way of making this switchable by
the user.

-- adrian
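
For readers following along, here is a minimal sketch of the "bag of DWARF + index mapping hash to DIE" idea, combined with the per-module deduplication discussed earlier in the thread. Everything in it is hypothetical and for illustration only: `ModuleDebugInfo`, `DebugLinker`, `linkModule` and `resolve` are made-up names, not LLVM or llvm-dsymutil APIs, and the DWARF payload is treated as opaque bytes. It assumes each module is identified by a hash and each exported type by a hash that maps to a DIE offset inside the module's debug info.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical model of one module's debug info: an opaque "bag of DWARF"
// plus an index from type hash to DIE offset within that bag.
struct ModuleDebugInfo {
  std::vector<uint8_t> DebugInfo;
  std::unordered_map<uint64_t, uint64_t> TypeIndex;
};

// Hypothetical debug-info linker: copies each module at most once and turns
// (module hash, type hash) pairs into absolute offsets in the output, which
// is the kind of value a DW_FORM_ref_addr fixup would need.
class DebugLinker {
  std::vector<uint8_t> Output;
  std::unordered_map<uint64_t, uint64_t> ModuleBase; // module hash -> base

public:
  // Returns the base offset of the module in the output, copying its debug
  // info only the first time a given module hash is seen.
  uint64_t linkModule(uint64_t ModuleHash, const ModuleDebugInfo &M) {
    auto It = ModuleBase.find(ModuleHash);
    if (It != ModuleBase.end())
      return It->second; // Already linked: deduplicated by hash.
    uint64_t Base = Output.size();
    Output.insert(Output.end(), M.DebugInfo.begin(), M.DebugInfo.end());
    assert(Base + M.DebugInfo.size() == Output.size());
    ModuleBase.emplace(ModuleHash, Base);
    return Base;
  }

  // Resolves an external type reference to an absolute output offset, or
  // returns std::nullopt if the module's index has no such type hash.
  std::optional<uint64_t> resolve(uint64_t ModuleHash,
                                  const ModuleDebugInfo &M,
                                  uint64_t TypeHash) {
    auto T = M.TypeIndex.find(TypeHash);
    if (T == M.TypeIndex.end())
      return std::nullopt;
    return linkModule(ModuleHash, M) + T->second;
  }

  std::size_t size() const { return Output.size(); }
};
```

With an index like this, the linker side never has to parse type skeletons; it only needs one hash lookup per reference and one copy per module, which is what would make the fixup step simple. Needs C++17 for `std::optional`.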
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
