labath abandoned this revision.
labath marked an inline comment as done.
labath added a comment.
Abandoning. I'll create separate patches with the new implementation.
================
Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295
+ EntryIterator Absolute =
+ getAbsoluteLocations(
+ SectionedAddress{BaseAddr, SectionedAddress::UndefSection},
+ LookupPooledAddress)
+ .begin();
----------------
JDevlieghere wrote:
> labath wrote:
> > dblaikie wrote:
> > > labath wrote:
> > > > dblaikie wrote:
> > > > > labath wrote:
> > > > > > dblaikie wrote:
> > > > > > > labath wrote:
> > > > > > > > dblaikie wrote:
> > > > > > > > > labath wrote:
> > > > > > > > > > This parallel iteration is not completely nice, but I think
> > > > > > > > > > it's worth being able to reuse the absolute range
> > > > > > > > > > computation code. I'm open to ideas for improvement though.
> > > > > > > > > Ah, I see - this is what you meant about "In particular it
> > > > > > > > > makes it possible to reuse this stuff in the dumping code,
> > > > > > > > > which would have been pretty hard with callbacks.".
> > > > > > > > >
> > > > > > > > > I'm wondering if that might be worth revisiting somewhat. A
> > > > > > > > > full iterator abstraction for one user here (well, two once
> > > > > > > > > you include lldb - but I assume it's likely going to build
> > > > > > > > > its own data structure from the iteration anyway, right?
> > > > > > > > > (it's not going to keep the iterator around, do anything
> > > > > > > > > interesting like partial iterations, re-iterate/etc - such
> > > > > > > > > that a callback would suffice))
> > > > > > > > >
> > > > > > > > > I could imagine two callback APIs for this - one that gets
> > > > > > > > > entries and locations and one that only gets locations by
> > > > > > > > > filtering on the entry version.
> > > > > > > > >
> > > > > > > > > eg:
> > > > > > > > >
> > > > > > > > > // for non-verbose output:
> > > > > > > > > LL.forEachEntry([&](const Entry &E, Expected<DWARFLocation>
> > > > > > > > > L) {
> > > > > > > > > if (Verbose && actually dumping debug_loc)
> > > > > > > > > print(E) // print any LLE_*, raw parameters, etc
> > > > > > > > > if (L)
> > > > > > > > > print(*L) // print the resulting address range, section
> > > > > > > > > name (if verbose),
> > > > > > > > > else
> > > > > > > > > print(error stuff)
> > > > > > > > > });
> > > > > > > > >
> > > > > > > > > One question would be "when/where do we print the DWARF
> > > > > > > > > expression" - if there's an error computing the address
> > > > > > > > > range, we can still print the expression, so maybe that
> > > > > > > > > happens unconditionally at the end of the callback, using the
> > > > > > > > > expression in the Entry? (then, arguably, the expression
> > > > > > > > > doesn't need to be in the DWARFLocation - and I'd say make
> > > > > > > > > the DWARFLocation a sectioned range, exactly the same type as
> > > > > > > > > for ranges so that part of the dumping code, etc, can be
> > > > > > > > > maximally reused)
> > > > > > > > Actually, what lldb currently does is that it does not build
> > > > > > > > any data structures at all (except storing the pointer to the
> > > > > > > > right place in the debug_loc section. Then, whenever it wants
> > > > > > > > to do something to the loclist, it parses it afresh. I don't
> > > > > > > > know why it does this exactly, but I assume it has something to
> > > > > > > > do with most locations never being used, or being only a couple
> > > > > > > > of times, and the actual parsing being fairly fast. What this
> > > > > > > > means is that lldb is not really a single "user", but there are
> > > > > > > > like four or five places where it iterates through the list,
> > > > > > > > depending on what does it actually want to do with it. It also
> > > > > > > > does partial iteration where it stops as soon as it find the
> > > > > > > > entry it was interested in.
> > > > > > > > Now, all of that is possible with a callback (though I am
> > > > > > > > generally trying to avoid them), but it does resurface the
> > > > > > > > issue of what should be the value of the second argument for
> > > > > > > > DW_LLE_base_address entries (the thing which I originally used
> > > > > > > > a error type for).
> > > > > > > > Maybe this should be actually one callback API, taking two
> > > > > > > > callback functions, with one of them being invoked for
> > > > > > > > base_address entries, and one for others? However, if we stick
> > > > > > > > to the current approaches in both LLE and RLE of making the
> > > > > > > > address pool resolution function a parameter (which I'd like to
> > > > > > > > keep, as it makes my job in lldb easier), then this would
> > > > > > > > actually be three callbacks, which starts to get unwieldy.
> > > > > > > > Though one of those callbacks could be removed with the
> > > > > > > > "DWARFUnit implementing a AddrOffsetResolver interface" idea,
> > > > > > > > which I really like. :)
> > > > > > > Ah, thanks for the details on LLDB's location parsing logic.
> > > > > > > That's interesting indeed!
> > > > > > >
> > > > > > > I can appreciate an iterator-based API if that's the sort of
> > > > > > > usage we've got, though I expect it doesn't have any interest in
> > > > > > > the low-level encoding & just wants the fully processed address
> > > > > > > ranges/locations - it doesn't want base_address or end_of_list
> > > > > > > entries? & I think the dual-iteration is a fairly awkward API
> > > > > > > design, trying to iterate them in lock-step, etc. I'd rather
> > > > > > > avoid that if reasonably possible.
> > > > > > >
> > > > > > > Either having an iterator API that gives only the fully processed
> > > > > > > data/semantic view & a completely different API if you want to
> > > > > > > access the low level primitives (LLE, etc) (this is how ranges
> > > > > > > works - there's an API that gives a collection of ranges &
> > > > > > > abstracts over v4/v5/rnglists/etc - though that's partly
> > > > > > > motivated by a strong multi-client need for that functionality
> > > > > > > for symbolizing, etc - but I think it's a good abstraction/model
> > > > > > > anyway (& one of the reasons the inline range list printing
> > > > > > > doesn't include encoding information, the API it uses is too high
> > > > > > > level to even have access to it))
> > > > > > >
> > > > > > > > Now, all of that is possible with a callback (though I am
> > > > > > > > generally trying to avoid them), but it does resurface the
> > > > > > > > issue of what should be the value of the second argument for
> > > > > > > > DW_LLE_base_address entries (the thing which I originally used
> > > > > > > > a error type for).
> > > > > > >
> > > > > > > Sorry, my intent in the above API was for the second argument to
> > > > > > > be Optional's "None" state when... oh, I see, I did use Expected
> > > > > > > there, rather than Optional, because there are legit error cases.
> > > > > > >
> > > > > > > I know it's sort of awkward, but I might be inclined to use
> > > > > > > Optional<Expected<AddressRange>> there. I realize two layers of
> > > > > > > wrapping is a bit weird, but I think it'd be nicer than having an
> > > > > > > error state for what, I think, isn't erroneous.
> > > > > > >
> > > > > > > > Maybe this should be actually one callback API, taking two
> > > > > > > > callback functions, with one of them being invoked for
> > > > > > > > base_address entries, and one for others? However, if we stick
> > > > > > > > to the current approaches in both LLE and RLE of making the
> > > > > > > > address pool resolution function a parameter (which I'd like to
> > > > > > > > keep, as it makes my job in lldb easier), then this would
> > > > > > > > actually be three callbacks, which starts to get unwieldy.
> > > > > > >
> > > > > > > Don't mind three callbacks too much.
> > > > > > >
> > > > > > > > Though one of those callbacks could be removed with the
> > > > > > > > "DWARFUnit implementing a AddrOffsetResolver interface" idea,
> > > > > > > > which I really like. :)
> > > > > > >
> > > > > > > Sorry, I haven't really looked at where the address resolver
> > > > > > > callback is registered and alternative designs being discussed -
> > > > > > > but yeah, going off just the one-sentence, it seems reasonable to
> > > > > > > have the DWARFUnit own an address resolver/be the thing you
> > > > > > > consult when you want to resolve an address (just through a
> > > > > > > normal function call in DWARFUnit, perhaps - which might,
> > > > > > > internally, use a callback registered when it was constructed).
> > > > > > > I know it's sort of awkward, but I might be inclined to use
> > > > > > > Optional<Expected<AddressRange>> there. I realize two layers of
> > > > > > > wrapping is a bit weird, but I think it'd be nicer than having an
> > > > > > > error state for what, I think, isn't erroneous.
> > > > > > Actually, my very first attempt at this patch used an
> > > > > > `Expected<Optional<Whatever>>`, but then I scrapped it because I
> > > > > > didn't think you'd like it. It's not the friendliest of APIs, but I
> > > > > > think we can go with that.
> > > > > >
> > > > > > > Sorry, I haven't really looked at where the address resolver
> > > > > > > callback is registered and alternative designs being discussed -
> > > > > > > but yeah, going off just the one-sentence, it seems reasonable to
> > > > > > > have the DWARFUnit own an address resolver/be the thing you
> > > > > > > consult when you want to resolve an address (just through a
> > > > > > > normal function call in DWARFUnit, perhaps - which might,
> > > > > > > internally, use a callback registered when it was constructed).
> > > > > >
> > > > > > I think you got that backwards. I don't want the DWARFUnit to be
> > > > > > the source of truth for address pool resolutions, as that would
> > > > > > make it hard to use from lldb (it's far from ready to start using
> > > > > > the llvm version right now). What I wanted was to replace the
> > > > > > lambda/function_ref with a single-method interface. Then both
> > > > > > DWARFUnits could implement that interface so that passing a
> > > > > > DWARFUnit& would "just work" (but you wouldn't be limited to
> > > > > > DWARFUnits as anyone could implement that interface, just like
> > > > > > anyone can write a lambda).
> > > > > As for Expected<Optional<Whatever>> (or Optional<Expected<>>) - yeah,
> > > > > I think this is a non-obvious API (both the general problem and this
> > > > > specific solution). I think it's probably worth discussing this
> > > > > design a bit more to save you time writing/rewriting things a bit. I
> > > > > guess there are a few layers of failure here.
> > > > >
> > > > > There's the possibility that the iteration itself could fail - even
> > > > > for debug_loc style lists (if we reached the end of the section
> > > > > before encountering a terminating {0,0}). That would suggest a
> > > > > fallible iterator idiom:
> > > > > http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges
> > > > >
> > > > > But then, yes, when looking at the "processed"/semantic view, that
> > > > > could fail too in the case of an invalid address index, etc.
> > > > >
> > > > > The generic/processed/abstracted-over-ranges-and-rnglists API for
> > > > > ranges produces a fully computer vector (& then returns
> > > > > Expected<vector> of that range) - is that reasonable? (this does mean
> > > > > manifesting a whole location in memory, which may not be needed so I
> > > > > could understand avoiding that even without fully implementing &
> > > > > demonstrating the vector solution is inadequate).
> > > > >
> > > > > But I /think/ maybe the we could/should have two APIs - one generic
> > > > > API that abstracts over loc/loclists and only provides the fully
> > > > > processed view, and another that is type specific for dumping the
> > > > > underlying representation (only used in dumping debug_loclists).
> > > > If we were computing the final address ranges from scratch (which would
> > > > be the best match for the current lldb usage, but which I am not
> > > > considering now for fear of changing too many things), then I agree
> > > > that we would need the fallible_iterator iterator thingy. But in this
> > > > case we are "interpreting" the already parsed ranges, so we can assume
> > > > some level of correctness here, and the thing that can fail is only the
> > > > computation of a single range, which does not affect our ability to
> > > > process the next entry.
> > > > This indicates to me that either each entry in the list should be an
> > > > Expected<>, or that the invalid entries should be just dropped
> > > > (possibly accompanied by some flag which would tell the caller that the
> > > > result was not exhaustive).
> > > >
> > > > This is connected to one of the issues I have with the debug ranges API
> > > > -- it tries _really_ hard to return *something* -- if resolving the
> > > > indirect base address entry fails, it is perfectly happy to use the
> > > > address _index_ as the base address. This makes sense for dumping,
> > > > where you want to show something (though it would still be good to
> > > > indicate that you're not showing a real address), but it definitely
> > > > does not help consumers which then need to make decisions based on the
> > > > returned data.
> > > >
> > > > Anyway, yes, I agree that we need to APIs, and probably callbacks are
> > > > the easiest way to achieve that. We could have a "base" callback that
> > > > is not particularly nice to use, but provides the full information via
> > > > a combination of `UnparsedLL` and `Optional<Expected<ParsedLL>>`
> > > > arguments. The dumper could use that to print out everything it needs.
> > > > And then we could have a second API, built on top of the first one,
> > > > which ignores base address entries and the raw data and returns just a
> > > > bunch of `Expected<ParsedLL>`. This could be used by users like lldb,
> > > > who just want to see the final data. The `ParsedLL` type would be
> > > > independent of the location list type, so that the debug_loc parser
> > > > could provide the same kind of API (but implemented on top of something
> > > > else, as the `UnparsedLL` types would differ). Also, under the hood,
> > > > the location list dumper for debug_loclists (but not debug_loc) could
> > > > reuse some implementation details with the debug_rnglists dumper via a
> > > > suitable combination of templates and callbacks.
> > > >
> > > > How does that sound?
> > > What sort of things are you concerned about with deeper API changes here?
> > > I think it's probably worth building the "right" thing now - as good a
> > > time as any. LLVM's debug info APIs, as you've pointed out, aren't
> > > exactly "sturdy" (treating address indexes as offsets, etc, etc), so no
> > > time like the present to clean it up.
> > >
> > > I think if we had an abstraction over v4 and v5 location descriptions,
> > > parsing from scratch, fallible iterators, etc - that'd be the ideal thing
> > > to use in the inline dumping code (that dumps inside debug_info) - which
> > > currently uses "parseOneLocationList" - so it is parsing from scratch and
> > > dumping.
> > >
> > > But equally I understand not wanting to make you/me/anyone fix everything
> > > when just trying to get something actually done.
> > I think I am mainly concerned with the scope explosion of doing changes
> > like that. I don't know how well founded is that concern, because I don't
> > know all the use cases for this information right now (but that's a part of
> > the problem). But anyway, let's continue the discussion.
> >
> > The main problem I see with "inline" dumping using some higher-level
> > abstraction is what should one do in case of incomplete data (e.g. dwo
> > files). If this abstraction returns "cooked" data, then the inline dump
> > could not show anything for dwo files dumped standalone. This is perfectly
> > fine for lldb and maybe other tools (which never look at dwo files
> > separately, and mostly don't care about the reason why the "cooking" failed
> > -- if they just need to get the data it can rely on), but for something
> > like llvm-dwarfdump, we'd probably want to display _something_. I guess
> > this is why the ranges api returns indexes as offsets, but I think we agree
> > we don't want that.
> >
> > Then the question is what should that "something" be? If it's going to be a
> > raw dump of the location list entry, then solution would not be fully
> > generic, as they raw entry type will depend on the location list kind.
> > Though, we could still arrange it so that the various location lists can be
> > processed uniformly via a template if they e.g. have a dump() function with
> > the same signature...
> >
> > Suppose we go down that path (this is the path I wanted to go down in my
> > previous comment, modulo the parse-from-scratch part). The question then is
> > what to do with the section-based dumping. Should it use the same
> > mechanism? It probably should, because the first callback/iterator will
> > provide all data it can possibly want. But then, should it also build the
> > parsed representation like it does now?
> > If yes, then what for? Should we also build some mechanism to "cook" that
> > data too.
> > If not, are we ok with having all users reparse from scratch always? (There
> > are only two users I found right now, and they look like they'd be fine
> > with it.)
> > Or should the DWARFDebugLoclists object cache the cooked data instead?
> > llvm-dwarfdump --statistics would actually prefer that, but that might make
> > the DWARF verifier sad (though I guess it could always reparse from scratch
> > if needed).
> I'm very late to the discussion and I'm not as familiar as both of you with
> the details and the API uses, so please ignore my suggestion if it doesn't
> make any sense...
> Could we have an API where we parse a *list* of Uncooked (to reuse Pavel's
> nomenclature) and then have the ability to resolve each Uncooked entry into a
> Cooked entry? Then both LLDB and dwarfdump could get the list of Uncooked
> entries and try to get the cooked variant. If that works, great, we dump/use
> that, if not we move on and have separate failure modes for errors
> originating from parsing and from cooking.
We've discussed this with David last week, and we have hopefully agreed on the
rough direction forward (and I think it's going to roughly correspond to what
you had in mind). I'm preparing a patch (first of many) to implement that and
I'm hoping I'll be able to upload something today.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68270/new/
https://reviews.llvm.org/D68270
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits