Thanks for the detailed response, David.

On Fri, Apr 9, 2021 at 2:52 PM David Blaikie <dblai...@gmail.com> wrote:

> I'm not suggesting scanning all of .debug_info - only the CU DIE for
> DW_AT_ranges or high/low_pc, then skip to the next CU DIE (via the
> unit header's next unit offset).
>

> It sounded like CU ranges couldn't be used to build such an index at
> all/that your code used quite a different strategy in the absence of
> aranges? (rather than building the index from the CU ranges - somewhat
> slower I'm sure, but I wouldn't've thought (& am trying to understand
> if it is/why) so fundamentally slower that it wouldn't be the next
> fallback rather than skipping the index entirely or employing some
> more fundamentally different approach)


This is still significantly less dense than aranges, and it involves more disk
I/O and memory pressure. Let me see what optimizations I can implement here and
get back to you with the results. That would be a better basis for an
apples-to-apples comparison.
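
For concreteness, the CU-header walk described in the quoted text can be
sketched roughly as follows. This is only a minimal Python illustration and is
not taken from any existing implementation: the function name
iter_unit_offsets is hypothetical, the input is assumed to be the raw bytes of
.debug_info, and decoding the CU DIE itself (the .debug_abbrev lookup and the
DW_AT_low_pc / DW_AT_high_pc / DW_AT_ranges attributes) is left out. It only
shows how each unit's initial length gives the offset of the next unit.

```python
import struct


def iter_unit_offsets(debug_info: bytes, little_endian: bool = True):
    """Yield (unit_offset, next_unit_offset, version) for each unit header.

    Hypothetical helper: only the initial length and version fields are
    read, so no DIE other than the one at the start of a unit needs to be
    touched by a caller that skips from unit to unit.
    """
    order = "<" if little_endian else ">"
    offset = 0
    while offset < len(debug_info):
        (length,) = struct.unpack_from(order + "I", debug_info, offset)
        header = 4  # size of the initial length field in 32-bit DWARF
        if length == 0xFFFFFFFF:
            # 64-bit DWARF: escape value followed by an 8-byte length.
            (length,) = struct.unpack_from(order + "Q", debug_info, offset + 4)
            header = 12
        elif length >= 0xFFFFFFF0:
            raise ValueError(f"reserved initial length at offset {offset:#x}")
        if length < 2:
            raise ValueError(f"unit at offset {offset:#x} is too short")
        # The 2-byte version immediately follows the initial length.
        (version,) = struct.unpack_from(order + "H", debug_info, offset + header)
        next_offset = offset + header + length
        if next_offset > len(debug_info):
            raise ValueError(f"unit at offset {offset:#x} overruns the section")
        yield offset, next_offset, version
        offset = next_offset
```

A consumer would read the CU DIE that follows each header, record whatever
ranges it carries, and then jump straight to next_unit_offset.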


>
> If you mean building ranges from all the DIEs deep inside a CU - yeah,
> that's going to be fundamentally slower in a bunch of ways that maybe
> I could see that would necessitate a totally different approach/that
> the index wouldn't make sense anymore (though I'd still like to
> understand it) - but I'm especially curious about the case where the
> CU DIE itself does have comprehensive address range information.
>

Will report back on this.


>
> - Dave
>
> >
> >>
> >>
> >>>
> >>> (+ complexities Greg mentions later in the thread). In cases where we
> >>> lack this, we use our own persistent cache which introduces unnecessary
> >>> complexity. Now I am considering going as far as adding a multi-threaded
> >>> indexer for cases where a persistent cache / build system modifications
> >>> aren't an option (work to begin in the next week or two).
> >>>
> >>> .debug_aranges would provide a lot of value to our users.
> >>>
> >>> On Thu, Mar 11, 2021 at 3:48 PM David Blaikie via Dwarf-Discuss <dwarf-discuss@lists.dwarfstd.org> wrote:
> >>>>
> >>>> On Thu, Mar 11, 2021 at 5:48 AM <paul.robin...@sony.com> wrote:
> >>>>>
> >>>>> Hopefully not to side-track things too much... maybe wants its own
> >>>>> thread, if there's more to debate here.
> >>>>
> >>>>
> >>>> Yeah, how about we spin it off into another thread (done here)
> >>>>
> >>>>>
> >>>>> >> For the case you suggested where it would be useful to keep the range
> >>>>> >> list for the CU in the .o file, I think .debug_aranges is what you're
> >>>>> >> looking for.
> >>>>> >
> >>>>> > aranges has been off by default in LLVM for a while - it adds a lot of
> >>>>> > overhead (doesn't have all the nice rnglist encodings for instance -
> >>>>> > nor can it use debug_addr, and if it did it'd still be duplicate with
> >>>>> > the CU ranges wherever they were).
> >>>>>
> >>>>> Did you want to file an issue to improve how .debug_aranges works?
> >>>>
> >>>>
> >>>> I don't currently understand the value it provides, and I at least
> >>>> don't have a use case for it, so I'm not sure I'd be the best person to
> >>>> advocate/drive that work.
> >>>>
> >>>>> Complaining that it duplicates CU ranges is missing the point, though;
> >>>>> it's an index, like .debug_names, of course it duplicates other info.
> >>>>> If you want to suggest an improved index, like we did with .debug_names,
> >>>>> that would be great too.
> >>>>
> >>>>
> >>>> .debug_names is quite different though - it collects information from
> >>>> across the DIE tree - information that is expensive to otherwise gather
> >>>> (walking the whole DIE tree).
> >>>>
> >>>> .debug_aranges is not like that for most producers (producers that do
> >>>> include the address ranges on the CU DIE) - the data is readily available
> >>>> immediately on the CU. That does involve reading some of .debug_abbrev, and
> >>>> interpreting a handful of attributes - but at least for the use cases I'm
> >>>> aware of, that overhead isn't worth the size increase.
> >>>>
> >>>> Do you have numbers on the benefits of .debug_aranges compared to
> >>>> parsing the ranges from CU DIEs?
> >>>>
> >>>> (one possible issue: the CU doesn't /have/ to contain low/high/ranges
> >>>> if its children DIEs contain addresses - having that as a guarantee, or
> >>>> some preferred way of encoding zero length (high/low of 0 would be
> >>>> acceptable, I guess) would be nice & make it cheap to skip over CUs that
> >>>> don't have any address ranges)
> >>>>
> >>>> Roughly, a modern debug_aranges to me would look something like:
> >>>>
> >>>> <length>
> >>>> <version>
> >>>> <CU sec_offset>
> >>>> <addr_base>
> >>>> <rnglist sec_offset>
> >>>>
> >>>> So it could fully re-use the rnglist encoding. If this was going to
> >>>> be as compact as possible, it'd need to be configurable which encodings it
> >>>> uses - ranges V high/low, addrx V addr - at which point it'd probably look
> >>>> like a small DIE with an inline abbrev (similar to the way DWARFv5 encodes
> >>>> the file and directory entries now, and how debug_names is self-describing)
> >>>> - at which point it looks to me a lot like parsing the CU DIEs.
> >>>>
> >>>> _______________________________________________
> >>>> Dwarf-Discuss mailing list
> >>>> Dwarf-Discuss@lists.dwarfstd.org
> >>>> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
> >>>
> >>>
> >>>
> >>> --
> >>> Samy Al Bahra [http://repnop.org]
> >
> >
> >
> > --
> > Samy Al Bahra [http://repnop.org]
>


-- 
Samy Al Bahra [http://repnop.org]
