On Sat, May 01, 2021 at 05:59:31PM +0200, Mark Wielaard wrote: > Hi Omar, > > On Fri, 2021-04-23 at 16:36 -0700, Omar Sandoval wrote: > > Whenever we encounter an attribute with DW_FORM_indirect, we need to > > read its true form from the DIE data. Then, we can continue normally. > > This adds support to the most obvious places: __libdw_find_attr() and > > dwarf_getattrs(). > > Very nice. I wasn't aware anybody actually used DW_FORM_indirect. > > In your patch you only handle one indirection. I think that in theory a > DW_FORM_indirect could be an DW_FORM_indirect itself (but now in the > DIE instead of the Abbrev). But this is probably silly, so not > supporting that seems correct. Same for an indirect > DW_FORM_implicit_const. It doesn't really make sense to do that because > if you wanted to put the value in the DIE instead of an Abbrev you > would have used an actual FORM that did that. > > > There may be more places that need to be updated. > > There is __libdw_form_val_compute_len which already handles > DW_FORM_indirect: > > case DW_FORM_indirect: > get_uleb128 (u128, valp, endp); > // XXX Is this really correct? > result = __libdw_form_val_len (cu, u128, valp); > if (result != (size_t) -1) > result += valp - startp; > else > return (size_t) -1; > break; > > I believe the XXX question can be answered with: Yes, the result is the > size of the actual FORM data plus the size of the uleb128 encoding that > FORM (which is valp - startp). And it probably should check like your > code does that valp != DW_FORM_indirect && valp != > DW_FORM_implicit_const. I'll sent a patch to do that. > > But, now that dwarf_child and dwarf_getattrs handle DW_FORM_indirectly > directly (haha, pun intended) I think __libdw_form_val_compute_len is > never called for DW_FORM_indirect anymore. > > There is dwarf_hasattr, but that doesn't care about the value, just the > attribute name, so it doesn't have to follow any DW_FORM_indirect. > > And there are the "raw" dwarf_getabbrev functions, but those aren't > associated with any DIE/.debug_info data, so leaving DW_FORM_indirect > as is in the Dwarf_Abbrev for those seems correct. > > > I encountered this when inspecting a file that was processed by our BOLT > > tool: https://github.com/facebookincubator/BOLT. This also adds a couple > > of test cases using a file generated by that tool. > > Thanks for that test case. It really helps seeing an actual example of > indirection. I note that all Abbrevs/DIEs using DW_FORM_indirect are > for DW_AT_low_pc and that the indirect DW_AT_addr are all the zero > address. Is that intended?
>From manually inspecting the debug information, I believe this is correct. Looking into it more, BOLT uses DW_FORM_indirect for silly reasons. Something to do with using it to pad out values in place so that .debug_abbrev and .debug_info don't need to be rewritten entirely in the patched binary. > Pushed you patch. Thank you!