dblaikie added a comment. In D68270#1700108 <https://reviews.llvm.org/D68270#1700108>, @probinson wrote:
> Do we care whether llvm-dwarfdump's output bears any similarities to the > output from GNU readelf or objdump? There has been a push lately to get the > LLVM "binutils" to behave more like GNU's, although AFAIK it hasn't gotten to > the DWARF dumping part. Generally I hope not to deal with that until there's a user with a need for it who wants to do the work & has a specific use-case that can help motivate which similarities are desirable and which ones don't matter (& perhaps if there's enough that they start to tradeoff usability - maybe the "compatibility mode" is a separate tool or separate flag to the existing tool). My broader hope is probably that llvm-dwarfdump is more for interactive uses than other dumpers, so fewer people might try to build automated things on top of it & thus expect specific output (this gives us both the freedom not to match the GNU tools, and the freedom not to match previous llvm-dwarfdump behavior (which we've done a fair bit in the past - which seems to support the theory that people don't seem to be building much on top of this)) ================ Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:291-295 + EntryIterator Absolute = + getAbsoluteLocations( + SectionedAddress{BaseAddr, SectionedAddress::UndefSection}, + LookupPooledAddress) + .begin(); ---------------- labath wrote: > dblaikie wrote: > > labath wrote: > > > dblaikie wrote: > > > > labath wrote: > > > > > This parallel iteration is not completely nice, but I think it's > > > > > worth being able to reuse the absolute range computation code. I'm > > > > > open to ideas for improvement though. > > > > Ah, I see - this is what you meant about "In particular it makes it > > > > possible to reuse this stuff in the dumping code, which would have been > > > > pretty hard with callbacks.". > > > > > > > > I'm wondering if that might be worth revisiting somewhat. A full > > > > iterator abstraction for one user here (well, two once you include lldb > > > > - but I assume it's likely going to build its own data structure from > > > > the iteration anyway, right? (it's not going to keep the iterator > > > > around, do anything interesting like partial iterations, re-iterate/etc > > > > - such that a callback would suffice)) > > > > > > > > I could imagine two callback APIs for this - one that gets entries and > > > > locations and one that only gets locations by filtering on the entry > > > > version. > > > > > > > > eg: > > > > > > > > // for non-verbose output: > > > > LL.forEachEntry([&](const Entry &E, Expected<DWARFLocation> L) { > > > > if (Verbose && actually dumping debug_loc) > > > > print(E) // print any LLE_*, raw parameters, etc > > > > if (L) > > > > print(*L) // print the resulting address range, section name (if > > > > verbose), > > > > else > > > > print(error stuff) > > > > }); > > > > > > > > One question would be "when/where do we print the DWARF expression" - > > > > if there's an error computing the address range, we can still print the > > > > expression, so maybe that happens unconditionally at the end of the > > > > callback, using the expression in the Entry? (then, arguably, the > > > > expression doesn't need to be in the DWARFLocation - and I'd say make > > > > the DWARFLocation a sectioned range, exactly the same type as for > > > > ranges so that part of the dumping code, etc, can be maximally reused) > > > Actually, what lldb currently does is that it does not build any data > > > structures at all (except storing the pointer to the right place in the > > > debug_loc section. Then, whenever it wants to do something to the > > > loclist, it parses it afresh. I don't know why it does this exactly, but > > > I assume it has something to do with most locations never being used, or > > > being only a couple of times, and the actual parsing being fairly fast. > > > What this means is that lldb is not really a single "user", but there are > > > like four or five places where it iterates through the list, depending on > > > what does it actually want to do with it. It also does partial iteration > > > where it stops as soon as it find the entry it was interested in. > > > Now, all of that is possible with a callback (though I am generally > > > trying to avoid them), but it does resurface the issue of what should be > > > the value of the second argument for DW_LLE_base_address entries (the > > > thing which I originally used a error type for). > > > Maybe this should be actually one callback API, taking two callback > > > functions, with one of them being invoked for base_address entries, and > > > one for others? However, if we stick to the current approaches in both > > > LLE and RLE of making the address pool resolution function a parameter > > > (which I'd like to keep, as it makes my job in lldb easier), then this > > > would actually be three callbacks, which starts to get unwieldy. Though > > > one of those callbacks could be removed with the "DWARFUnit implementing > > > a AddrOffsetResolver interface" idea, which I really like. :) > > Ah, thanks for the details on LLDB's location parsing logic. That's > > interesting indeed! > > > > I can appreciate an iterator-based API if that's the sort of usage we've > > got, though I expect it doesn't have any interest in the low-level encoding > > & just wants the fully processed address ranges/locations - it doesn't want > > base_address or end_of_list entries? & I think the dual-iteration is a > > fairly awkward API design, trying to iterate them in lock-step, etc. I'd > > rather avoid that if reasonably possible. > > > > Either having an iterator API that gives only the fully processed > > data/semantic view & a completely different API if you want to access the > > low level primitives (LLE, etc) (this is how ranges works - there's an API > > that gives a collection of ranges & abstracts over v4/v5/rnglists/etc - > > though that's partly motivated by a strong multi-client need for that > > functionality for symbolizing, etc - but I think it's a good > > abstraction/model anyway (& one of the reasons the inline range list > > printing doesn't include encoding information, the API it uses is too high > > level to even have access to it)) > > > > > Now, all of that is possible with a callback (though I am generally > > > trying to avoid them), but it does resurface the issue of what should be > > > the value of the second argument for DW_LLE_base_address entries (the > > > thing which I originally used a error type for). > > > > Sorry, my intent in the above API was for the second argument to be > > Optional's "None" state when... oh, I see, I did use Expected there, rather > > than Optional, because there are legit error cases. > > > > I know it's sort of awkward, but I might be inclined to use > > Optional<Expected<AddressRange>> there. I realize two layers of wrapping is > > a bit weird, but I think it'd be nicer than having an error state for what, > > I think, isn't erroneous. > > > > > Maybe this should be actually one callback API, taking two callback > > > functions, with one of them being invoked for base_address entries, and > > > one for others? However, if we stick to the current approaches in both > > > LLE and RLE of making the address pool resolution function a parameter > > > (which I'd like to keep, as it makes my job in lldb easier), then this > > > would actually be three callbacks, which starts to get unwieldy. > > > > Don't mind three callbacks too much. > > > > > Though one of those callbacks could be removed with the "DWARFUnit > > > implementing a AddrOffsetResolver interface" idea, which I really like. :) > > > > Sorry, I haven't really looked at where the address resolver callback is > > registered and alternative designs being discussed - but yeah, going off > > just the one-sentence, it seems reasonable to have the DWARFUnit own an > > address resolver/be the thing you consult when you want to resolve an > > address (just through a normal function call in DWARFUnit, perhaps - which > > might, internally, use a callback registered when it was constructed). > > I know it's sort of awkward, but I might be inclined to use > > Optional<Expected<AddressRange>> there. I realize two layers of wrapping is > > a bit weird, but I think it'd be nicer than having an error state for what, > > I think, isn't erroneous. > Actually, my very first attempt at this patch used an > `Expected<Optional<Whatever>>`, but then I scrapped it because I didn't think > you'd like it. It's not the friendliest of APIs, but I think we can go with > that. > > > Sorry, I haven't really looked at where the address resolver callback is > > registered and alternative designs being discussed - but yeah, going off > > just the one-sentence, it seems reasonable to have the DWARFUnit own an > > address resolver/be the thing you consult when you want to resolve an > > address (just through a normal function call in DWARFUnit, perhaps - which > > might, internally, use a callback registered when it was constructed). > > I think you got that backwards. I don't want the DWARFUnit to be the source > of truth for address pool resolutions, as that would make it hard to use from > lldb (it's far from ready to start using the llvm version right now). What I > wanted was to replace the lambda/function_ref with a single-method interface. > Then both DWARFUnits could implement that interface so that passing a > DWARFUnit& would "just work" (but you wouldn't be limited to DWARFUnits as > anyone could implement that interface, just like anyone can write a lambda). As for Expected<Optional<Whatever>> (or Optional<Expected<>>) - yeah, I think this is a non-obvious API (both the general problem and this specific solution). I think it's probably worth discussing this design a bit more to save you time writing/rewriting things a bit. I guess there are a few layers of failure here. There's the possibility that the iteration itself could fail - even for debug_loc style lists (if we reached the end of the section before encountering a terminating {0,0}). That would suggest a fallible iterator idiom: http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges But then, yes, when looking at the "processed"/semantic view, that could fail too in the case of an invalid address index, etc. The generic/processed/abstracted-over-ranges-and-rnglists API for ranges produces a fully computer vector (& then returns Expected<vector> of that range) - is that reasonable? (this does mean manifesting a whole location in memory, which may not be needed so I could understand avoiding that even without fully implementing & demonstrating the vector solution is inadequate). But I /think/ maybe the we could/should have two APIs - one generic API that abstracts over loc/loclists and only provides the fully processed view, and another that is type specific for dumping the underlying representation (only used in dumping debug_loclists). Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68270/new/ https://reviews.llvm.org/D68270 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits