Hey Samy - curious if you ever happened to end up getting further details here.
On Fri, Apr 9, 2021 at 1:05 PM Samy Al Bahra <sba...@repnop.org> wrote:

> Thanks for the detailed response David.
>
> On Fri, Apr 9, 2021 at 2:52 PM David Blaikie <dblai...@gmail.com> wrote:
>
>> I'm not suggesting scanning all of .debug_info - only the CU DIE for
>> DW_AT_ranges or high/low_pc, then skip to the next CU DIE (via the
>> unit header's next unit offset).
>>
>> It sounded like CU ranges couldn't be used to build such an index at
>> all/that your code used quite a different strategy in the absence of
>> aranges? (rather than building the index from the CU ranges - somewhat
>> slower I'm sure, but I wouldn't've thought (& am trying to understand
>> if it is/why) so fundamentally slower that it wouldn't be the next
>> fallback rather than skipping the index entirely or employing some
>> more fundamentally different approach)
>
> This is still significantly less dense than aranges, involves more disk
> I/O and memory pressure. Let me see what optimizations I can implement here
> and get back to you with the results / what I came up with. This would be a
> better basis for apples to apples comparison.
>
>> If you mean building ranges from all the DIEs deep inside a CU - yeah,
>> that's going to be fundamentally slower in a bunch of ways that maybe
>> I could see that would necessitate a totally different approach/that
>> the index wouldn't make sense anymore (though I'd still like to
>> understand it) - but I'm especially curious about the case where the
>> CU DIE itself does have comprehensive address range information.
>
> Will report back on this.
>
>> - Dave
>>
>> >>> (+ complexities Greg mentions later in the thread). In cases where we
>> >>> lack this, we use our own persistent cache which introduces unnecessary
>> >>> complexity. Now I am considering going as far as adding a multi-threaded
>> >>> indexer for cases where a persistent cache / build system modifications
>> >>> aren't an option (work to begin in the next week or two).
>> >>>
>> >>> .debug_aranges would provide a lot of value to our users.
>> >>>
>> >>> On Thu, Mar 11, 2021 at 3:48 PM David Blaikie via Dwarf-Discuss
>> >>> <dwarf-discuss@lists.dwarfstd.org> wrote:
>> >>>>
>> >>>> On Thu, Mar 11, 2021 at 5:48 AM <paul.robin...@sony.com> wrote:
>> >>>>>
>> >>>>> Hopefully not to side-track things too much... maybe wants its own
>> >>>>> thread, if there's more to debate here.
>> >>>>
>> >>>> Yeah, how about we spin it off into another thread (done here)
>> >>>>
>> >>>>> >> For the case you suggested where it would be useful to keep the range
>> >>>>> >> list for the CU in the .o file, I think .debug_aranges is what you're
>> >>>>> >> looking for.
>> >>>>> >
>> >>>>> > aranges has been off by default in LLVM for a while - it adds a lot of
>> >>>>> > overhead (doesn't have all the nice rnglist encodings for instance -
>> >>>>> > nor can it use debug_addr, and if it did it'd still be duplicate with
>> >>>>> > the CU ranges wherever they were).
>> >>>>>
>> >>>>> Did you want to file an issue to improve how .debug_aranges works?
>> >>>>
>> >>>> I don't currently understand the value it provides, and I at least
>> >>>> don't have a use case for it, so I'm not sure I'd be the best person to
>> >>>> advocate/drive that work.
>> >>>>
>> >>>>> Complaining that it duplicates CU ranges is missing the point, though;
>> >>>>> it's an index, like .debug_names, of course it duplicates other info.
>> >>>>> If you want to suggest an improved index, like we did with .debug_names,
>> >>>>> that would be great too.
>> >>>>
>> >>>> .debug_names is quite different though - it collects information
>> >>>> from across the DIE tree - information that is expensive to otherwise
>> >>>> gather (walking the whole DIE tree).
>> >>>>
>> >>>> .debug_aranges is not like that for most producers (producers that
>> >>>> do include the address ranges on the CU DIE) - the data is readily
>> >>>> available immediately on the CU. That does involve reading some of
>> >>>> .debug_abbrev, and interpreting a handful of attributes - but at least for
>> >>>> the use cases I'm aware of, that overhead isn't worth the size increase.
>> >>>>
>> >>>> Do you have numbers on the benefits of .debug_aranges compared to
>> >>>> parsing the ranges from CU DIEs?
>> >>>>
>> >>>> (one possible issue: the CU doesn't /have/ to contain
>> >>>> low/high/ranges if its children DIEs contain addresses - having that as a
>> >>>> guarantee, or some preferred way of encoding zero length (high/low of 0
>> >>>> would be acceptable, I guess) would be nice & make it cheap to skip over
>> >>>> CUs that don't have any address ranges)
>> >>>>
>> >>>> Roughly, a modern debug_aranges to me would look something like:
>> >>>>
>> >>>> <length>
>> >>>> <version>
>> >>>> <CU sec_offset>
>> >>>> <addr_base>
>> >>>> <rnglist sec_offset>
>> >>>>
>> >>>> So it could fully re-use the rnglist encoding. If this was going to
>> >>>> be as compact as possible, it'd need to be configurable which encodings it
>> >>>> uses - ranges V high/low, addrx V addr - at which point it'd probably look
>> >>>> like a small DIE with an inline abbrev (similar to the way DWARFv5 encodes
>> >>>> the file and directory entries now, and how debug_names is self-describing)
>> >>>> - at which point it looks to me a lot like parsing the CU DIEs.
>> >>>>
>> >>>> _______________________________________________
>> >>>> Dwarf-Discuss mailing list
>> >>>> Dwarf-Discuss@lists.dwarfstd.org
>> >>>> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
>> >>>
>> >>> --
>> >>> Samy Al Bahra [http://repnop.org]
>> >
>> > --
>> > Samy Al Bahra [http://repnop.org]
>
> --
> Samy Al Bahra [http://repnop.org]
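
For reference, a minimal sketch of the header walk described in the quoted thread: read each unit header in .debug_info, note where the unit's first DIE starts, and jump straight to the next unit using the length field. The names (UnitHeader, iter_unit_headers) are hypothetical, and the sketch assumes little-endian data already loaded into memory. It stops at the header; actually decoding DW_AT_low_pc / DW_AT_high_pc / DW_AT_ranges on the CU DIE would additionally need .debug_abbrev and is left out here.

```python
import struct
from typing import Iterator, NamedTuple


class UnitHeader(NamedTuple):
    offset: int         # offset of this unit within .debug_info
    version: int        # DWARF version from the header
    unit_type: int      # DW_UT_* code (reported as 0x01 for pre-v5 units)
    abbrev_offset: int  # offset into .debug_abbrev
    address_size: int
    die_offset: int     # offset of the unit's first (top-level) DIE
    next_offset: int    # offset of the following unit header


def iter_unit_headers(debug_info: bytes) -> Iterator[UnitHeader]:
    """Yield one UnitHeader per unit without touching the DIE tree.

    Assumption: little-endian section contents, DWARF versions 2-5.
    """
    pos = 0
    while pos < len(debug_info):
        start = pos
        (length,) = struct.unpack_from("<I", debug_info, pos)
        pos += 4
        is64 = length == 0xFFFFFFFF  # 64-bit DWARF escape value
        if is64:
            (length,) = struct.unpack_from("<Q", debug_info, pos)
            pos += 8
        # unit_length counts the bytes that follow the length field itself.
        next_offset = pos + length
        off_fmt, off_size = ("<Q", 8) if is64 else ("<I", 4)

        (version,) = struct.unpack_from("<H", debug_info, pos)
        pos += 2
        if version >= 5:
            # v5 order: unit_type, address_size, debug_abbrev_offset
            unit_type, address_size = struct.unpack_from("<BB", debug_info, pos)
            pos += 2
            (abbrev_offset,) = struct.unpack_from(off_fmt, debug_info, pos)
            pos += off_size
            if unit_type in (0x04, 0x05):    # skeleton / split compile: dwo_id
                pos += 8
            elif unit_type in (0x02, 0x06):  # type units: signature + type_offset
                pos += 8 + off_size
        else:
            # v2-v4 order: debug_abbrev_offset, address_size
            unit_type = 0x01
            (abbrev_offset,) = struct.unpack_from(off_fmt, debug_info, pos)
            pos += off_size
            (address_size,) = struct.unpack_from("<B", debug_info, pos)
            pos += 1

        yield UnitHeader(start, version, unit_type, abbrev_offset,
                         address_size, pos, next_offset)
        pos = next_offset  # skip every child DIE of this unit
```

A consumer building an address index this way would read only the one DIE at die_offset per unit, which is where the cost comparison against .debug_aranges comes from.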
_______________________________________________
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org