Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-09-01 Thread David Blaikie via Dwarf-Discuss
On Tue, Sep 1, 2020 at 10:24 AM David Anderson wrote:
>
> On 8/31/20 8:39 PM, David Blaikie wrote:
> > On Mon, Aug 31, 2020 at 8:22 PM David Anderson wrote:
> >
> > On 8/31/20 1:03 PM, David Blaikie wrote:
> > > I'd rather go with LLVM's existing interpretation - that strx
> > > encodings used in .dwo do not attempt to use str_offsets in the
> > > skeleton.
> > > But I wouldn't mind adding a str_offsets_base to the split full unit
> > > to make it clear - this would be consistent with rnglists, I think?
> > > (I think, in theory a rnglistx in a .dwo with a split full unit
> > > without a rnglists_base would use the rnglists_base (and
> > > .debug_rnglists non-dwo) in the executable, but if the split full
> > > unit has a rnglists_base, then the rnglistx in the split full unit
> > > use that base to find rnglists in debug_rnglists.dwo - arguably I'd
> > > say we might as well say the same thing about loclists, too, for
> > > consistency, though I don't have any use for skeleton location lists
> > > right now)
> >
> > It seems to me that rnglists_base and loclists_base in Split Full
> > always reference the data in .debug_rnglists/.debug_loclists
> >
> > 3.1.3  Split Full Compilation Unit Entries
> > The following attributes are not part of a split full compilation
> > unit entry but instead are inherited (if present) from the
> > corresponding skeleton compilation unit:
> > DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges, DW_AT_stmt_list,
> > DW_AT_comp_dir, DW_AT_str_offsets_base, DW_AT_addr_base and
> > DW_AT_rnglists_base.
> >
> >
> > Hmm... yeah. I guess LLVM implements rnglistx /rnglist_base the same
> > as strx/str_offsets_base. Where it assumes that any *x encoding refers
> > to entities in the .dwo, even in the absence of a
> > rnglists_base/str_offsets_base in the split full unit. I had thought
> > we'd implemented it to emit a rnglists_base in the split full unit,
> > which would've been in contrast to the str_offsets_base - so my
> > mistake/apologies for the previous description.
> Still confused.
>
> Let's say skeleton A is in object file OB.
> And OB.dwp contains the split-full CU DIE.
> Let's say non-empty .debug_rnglists and .debug_rnglists.dwo exist.
>
> The compiler could create the rnglists for A in *either* OB or OB.dwp.

It sounds like you might be talking specifically/only about the CU-level
ranges (in the phrasing "rnglists for A")? Not about ranges attached to,
say, a lexical_block or inlined_subroutine? Is that the case?

> And could pick and choose, for each split-able Compilation Unit,
> which place to put rnglists
> independently of all other CUs.

FWIW, I'm not objecting to the DWARF spec's requirement that the CU-level
ranges must go in the skeleton CU (though I wouldn't've minded if that was
a "quality of implementation" thing - some producers might want to scrape
those extra few bytes out of the skeleton at the cost of consumers needing
to do the indirection/read the dwo/dwp to find the CU's ranges)

> Meaning both OB and OB.dwp could have rnglists, but only
> one of them has the rnglists entry for any given CU.
>
> How do we know  which .debug_rnglists section  to look at
> given Skeleton A and split-full A?
> Which does the DW_AT_rnglists_base apply to?

The way I think of it - reading some parts of the spec and ignoring others,
and the way it's implemented in LLVM based on my thinking (model (1) in my
previous email) - any rnglistx encoding used in the split full unit (on the
CU DIE or any child DIEs) would be resolved into debug_rnglists.dwo - no
matter the presence/absence of a rnglists_base on the skeleton CU DIE (and
there would never be a rnglists_base on the split full CU DIE). If the
skeleton unit used a rnglistx encoding for anything, it would need a
rnglists_base and the rnglistx would be resolved relative to that.
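
A minimal sketch of the lookup this implies, assuming 32-bit DWARF v5,
little-endian data, and a hand-built section rather than real compiler
output. The helper name rnglistx_to_section_offset is made up for the
example. The base is the position of the first entry of the offsets table:
under model (1) a consumer would take it to be the end of the contribution
header in debug_rnglists.dwo, and for a skeleton unit it would come from
DW_AT_rnglists_base.

```python
import struct


def rnglistx_to_section_offset(section: bytes, base: int, index: int) -> int:
    """Map a DW_FORM_rnglistx index to a section offset (32-bit DWARF, LE)."""
    # Each offsets-table entry is 4 bytes in 32-bit DWARF and is relative
    # to the start of the offsets table, i.e. relative to `base`.
    (rel,) = struct.unpack_from("<I", section, base + 4 * index)
    return base + rel


# Hand-built contribution: 12-byte header, two offsets, two empty lists.
# Fields: unit_length, version, address_size, segment_selector_size,
# offset_entry_count.
header = struct.pack("<IHBBI", 18, 5, 8, 0, 2)
offsets = struct.pack("<II", 8, 9)   # relative to the offsets table
lists = b"\x00\x00"                  # two DW_RLE_end_of_list entries
section = header + offsets + lists

base = len(header)                   # 12: first entry after the header
print(rnglistx_to_section_offset(section, base, 0))  # 20
print(rnglistx_to_section_offset(section, base, 1))  # 21
```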

I guess a few things I'd say:
  I don't think I'd ever want to suggest that a rnglistx on a skeleton DIE
should refer to rnglists.dwo (if that's the case, just move the
rnglistx-encoded attribute into the split full unit DIE, since it's useless
on the skeleton by itself). If you have a rnglistx in the skeleton unit,
you must have a rnglists_base on that skeleton DIE.
  I also think it's important that a unit be able to have references from
the skeleton unit to rnglists non-dwo, and to have references from the
split full unit (less important for the unit DIE itself (but perhaps
someone has a need for some other rnglistx encoded extension attribute, for
instance, that they would like to put on the split full unit DIE) but
certainly for the children of that DIE) to rnglists.dwo - the question is
just how to support both of those. Either we assume all *x encodings refer
within their own unit (from the previous comment this, to me, is already
definitely true for the skeleton unit - so that 

Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-09-01 Thread David Anderson via Dwarf-Discuss

On 9/1/20 10:05 AM, David Blaikie wrote:


> > So the base addresses are in the skeleton and the actual section
> > (rnglists/loclists/str_offsets/str)
> > can go with Split Full (i.e., in a .dwo) if it has no addresses but
> > must go with the skeleton if it has addresses.
>
> Sorry, I missed a step/not sure I understand this ^ comment - could
> you rephrase/expound/clarify a bit?


I am assuming the compiler can *choose* whether to put rnglists into
.debug_rnglists or .debug_rnglists.dwo.
Or is there an implicit *requirement* that certain section content
(rnglists/loclists/str_offsets)
are forced into their .dwo section?

Your comments moments ago via  model 1) and model 2) address this issue
already.
I'm thinking about it.
DavidA
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-09-01 Thread David Anderson via Dwarf-Discuss

On 8/31/20 8:39 PM, David Blaikie wrote:

On Mon, Aug 31, 2020 at 8:22 PM David Anderson <dave...@linuxmail.org> wrote:

On 8/31/20 1:03 PM, David Blaikie wrote:
> I'd rather go with LLVM's existing interpretation - that strx
> encodings used in .dwo do not attempt to use str_offsets in the
> skeleton.
> But I wouldn't mind adding a str_offsets_base to the split full unit
> to make it clear - this would be consistent with rnglists, I think?
> (I think, in theory a rnglistx in a .dwo with a split full unit
> without a rnglists_base would use the rnglists_base (and
> .debug_rnglists non-dwo) in the executable, but if the split full
> unit has a rnglists_base, then the rnglistx in the split full unit
> use that base to find rnglists in debug_rnglists.dwo - arguably I'd
> say we might as well say the same thing about loclists, too, for
> consistency, though I don't have any use for skeleton location lists
> right now)

It seems to me that rnglists_base and loclists_base in Split Full
always reference the data in .debug_rnglists/.debug_loclists

3.1.3  Split Full Compilation Unit Entries
The following attributes are not part of a split full compilation unit
entry but instead are inherited (if present) from the corresponding
skeleton compilation unit:
DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges, DW_AT_stmt_list,
DW_AT_comp_dir, DW_AT_str_offsets_base, DW_AT_addr_base and
DW_AT_rnglists_base.


Hmm... yeah. I guess LLVM implements rnglistx /rnglist_base the same
as strx/str_offsets_base. Where it assumes that any *x encoding refers
to entities in the .dwo, even in the absence of a
rnglists_base/str_offsets_base in the split full unit. I had thought
we'd implemented it to emit a rnglists_base in the split full unit,
which would've been in contrast to the str_offsets_base - so my
mistake/apologies for the previous description.

Still confused.

Let's say skeleton A is in object file OB.
And OB.dwp contains the split-full CU DIE.
Let's say non-empty .debug_rnglists and .debug_rnglists.dwo exist.

The compiler could create the rnglists for A in *either* OB or OB.dwp.
And could pick and choose, for each split-able Compilation Unit,
which place to put rnglists
independently of all other CUs.

Meaning both OB and OB.dwp could have rnglists, but only
one of them has the rnglists entry for any given CU.

How do we know  which .debug_rnglists section  to look at
given Skeleton A and split-full A?
Which does the DW_AT_rnglists_base apply to?

If one violated the standard and put DW_AT_rnglists_base
into the CU die that has the rnglists (skeleton or split-full) it would then
be known where to read the rnglists.

Confused, still.
DavidA



--
Despite all appearances, your boss is a thinking, feeling, human being.



Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-09-01 Thread Michael Eager via Dwarf-Discuss

On 9/1/20 6:59 AM, David Anderson wrote:

Mike Eager: please delete the new issue 200831.1 as it is simply wrong.


Done.

--
Michael Eager


Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-09-01 Thread David Blaikie via Dwarf-Discuss
On Tue, Sep 1, 2020 at 6:59 AM David Anderson  wrote:

> On 8/31/20 8:39 PM, David Blaikie wrote:
> > Hmm... yeah. I guess LLVM implements rnglistx /rnglist_base the same
> > as strx/str_offsets_base. Where it assumes that any *x encoding refers
> > to entities in the .dwo, even in the absence of a
> > rnglists_base/str_offsets_base in the split full unit. I had thought
> > we'd implemented it to emit a rnglists_base in the split full unit,
> > which would've been in contrast to the str_offsets_base - so my
> > mistake/apologies for the previous description.
>
> So the base addresses are in the skeleton and the actual section
> (rnglists/loclists/str_offsets/str)
> can go with Split Full (i.e., in a .dwo) if it has no addresses but must
> go with the skeleton if it has addresses.
>

Sorry, I missed a step/not sure I understand this ^ comment - could you
rephrase/expound/clarify a bit?

I'm suggesting there are two possible ways we could spec this:

1) all loclistx, rnglistx, strx in .dwo are required/guaranteed/defined to
always refer to debug_loclists.dwo, debug_rnglists.dwo,
debug_str_offsets.dwo
  In this model, there's no way to

  -> this is how parts of the spec seem to be already defined, and how
strx/loclistx work in the DWARFv4 GNU extension Split DWARF implementation
(there's no loclists_base or str_offsets_base - the strx/loclistx in
debug_info.dwo is assumed to refer to the str_offsets.dwo/loc.dwo sections)

2) allow/require a *x encoding in a split full unit to refer (when combined
with a *_base attribute on the skeleton CU) to a split full unit's
rnglists/loclists/str_offsets (non-dwo) contributions if the split full
unit has no *_base attributes. The split full unit's *_base attribute could
then be optionally specified to say "resolve *x encodings relative to/in
the split unit, instead of searching back up into the skeleton unit".

(1) is what LLVM's implemented at the moment, and changing that to (2)
wouldn't be too hard (we'd just always emit a *_base attribute in the split
full unit any time we were using the corresponding *x forms in the split
full unit)
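
To make the difference between the two models concrete, here is a rough
Python sketch of how a consumer might choose the section and base for a
rnglistx in each case. It only illustrates the two proposals as described
above and is not LLVM's (or any other consumer's) code. The names Unit,
resolve_model_1, resolve_model_2 and implicit_base are made up for the
example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Unit:
    """Hypothetical stand-in for a CU; only the fields the sketch needs."""
    is_split_full: bool                  # True for the unit in the .dwo/.dwp
    rnglists_base: Optional[int] = None  # DW_AT_rnglists_base, if present


def resolve_model_1(unit: Unit, implicit_base: int) -> Tuple[str, int]:
    """Model (1): *x forms always resolve within the unit's own file."""
    if unit.is_split_full:
        # The skeleton's base is never consulted for the split full unit.
        return (".debug_rnglists.dwo", implicit_base)
    if unit.rnglists_base is None:
        raise ValueError("rnglistx in a skeleton unit needs DW_AT_rnglists_base")
    return (".debug_rnglists", unit.rnglists_base)


def resolve_model_2(unit: Unit, skeleton: Unit) -> Tuple[str, int]:
    """Model (2): a split full unit without a base falls back to the skeleton."""
    if unit.is_split_full and unit.rnglists_base is not None:
        return (".debug_rnglists.dwo", unit.rnglists_base)
    if skeleton.rnglists_base is None:
        raise ValueError("no DW_AT_rnglists_base on the skeleton unit")
    return (".debug_rnglists", skeleton.rnglists_base)


skel = Unit(is_split_full=False, rnglists_base=0x40)
full = Unit(is_split_full=True)
print(resolve_model_1(full, implicit_base=12))  # ('.debug_rnglists.dwo', 12)
print(resolve_model_2(full, skel))              # ('.debug_rnglists', 64)
```

The same shape would apply to loclistx and strx with the corresponding
*_base attributes and sections.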

>
> Ok.
>
> This way the standard is not in error as written.


I think there are still some contradictions - the two bits I quoted
previously:

"The DW_AT_addr_base and DW_AT_str_offsets_base attributes provide context
that may be necessary to interpret the contents of the corresponding split
DWARF object file."
"The following attributes are not part of a split full compilation unit
entry but instead are inherited (if present) from the corresponding
skeleton compilation unit: DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges,
DW_AT_stmt_list, DW_AT_comp_dir, DW_AT_str_offsets_base, DW_AT_addr_base
and DW_AT_rnglists_base."

If we're going with model (1), then the first of those two quotations
should remove "and DW_AT_str_offsets_base" and the second should carve out a
special case for str_offsets_base and rnglists_base to say they cannot be
specified on a skeleton unit, but also are /not/ inherited by the split
full compilation unit. (essentially the split full compilation unit has
implicit *_base attributes (if you think of them more like the (2) model
above) equal to the size of the contribution headers for those 3 sections)
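
As a small worked example of "implicit *_base equal to the size of the
contribution header": the sketch below adds up the DWARF v5 header fields
for the string offsets table and for the range/location list tables. It
assumes a single contribution whose start offset is known (0 in a plain
.dwo; in a .dwp the index would supply it). The function names are made up
for illustration.

```python
def initial_length_size(is_dwarf64: bool) -> int:
    # 32-bit DWARF: 4-byte unit_length.
    # 64-bit DWARF: 4-byte 0xffffffff escape plus 8-byte unit_length.
    return 12 if is_dwarf64 else 4


def str_offsets_header_size(is_dwarf64: bool = False) -> int:
    # .debug_str_offsets: unit_length, version (2 bytes), padding (2 bytes)
    return initial_length_size(is_dwarf64) + 2 + 2


def lists_header_size(is_dwarf64: bool = False) -> int:
    # .debug_rnglists and .debug_loclists share one header layout:
    # unit_length, version (2), address_size (1),
    # segment_selector_size (1), offset_entry_count (4)
    return initial_length_size(is_dwarf64) + 2 + 1 + 1 + 4


def implicit_base(contribution_start: int, header_size: int) -> int:
    # The implied base is the first byte after the contribution header.
    return contribution_start + header_size


print(implicit_base(0, str_offsets_header_size()))  # 8
print(implicit_base(0, lists_header_size()))        # 12
```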


> This understanding
> restricts what information can be derived from
> the Split Full CU by itself (ie, without the skeleton) a bit since the
> base addresses are not in the Split Full CU DIE.
>

That confuses me a bit. If you have rnglists.dwo, for example - you can't
use the actual rnglists_base from the skeleton CU to find the rnglists.dwo
contribution (because the rnglists_base will be relocated in the final
executable, to a value that has nothing to do with the rnglists.dwo
contribution location in the dwo or the index-relative location in the
dwp). So having the skeleton CU shouldn't make you any more or less able to
parse/dump/etc *x forms in a split full unit if we're using (1). If we're
using (2), then the wording needs to change to say that you must specify
*_base on the split full unit if you want to resolve *x forms in the split
full unit into rnglists.dwo/loclists.dwo/str_offsets.dwo references, and in
the absence of *_base on the split full unit, such *x forms would use the
*_base in the skeleton unit and the rnglists/loclists/str_offsets (non-dwo)
in the linked executable.


>
> Mike Eager: please delete the new issue 200831.1 as it is simply wrong.
>
> DavidA
>
>
>
>
>


Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-09-01 Thread David Anderson via Dwarf-Discuss

On 8/31/20 8:39 PM, David Blaikie wrote:

Hmm... yeah. I guess LLVM implements rnglistx /rnglist_base the same
as strx/str_offsets_base. Where it assumes that any *x encoding refers
to entities in the .dwo, even in the absence of a
rnglists_base/str_offsets_base in the split full unit. I had thought
we'd implemented it to emit a rnglists_base in the split full unit,
which would've been in contrast to the str_offsets_base - so my
mistake/apologies for the previous description.


So the base addresses are in the skeleton and the actual section
(rnglists/loclists/str_offsets/str)
can go with Split Full (i.e., in a .dwo) if it has no addresses but must
go with the skeleton if it has addresses.

Ok.

This way the standard is not in error as written.  This understanding
restricts what information can be derived from
the Split Full CU by itself (ie, without the skeleton) a bit since the
base addresses are not in the Split Full CU DIE.

Mike Eager: please delete the new issue 200831.1 as it is simply wrong.

DavidA






Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-08-31 Thread David Anderson via Dwarf-Discuss

On 8/31/20 1:03 PM, David Blaikie wrote:

I'd rather go with LLVM's existing interpretation - that strx
encodings used in .dwo do not attempt to use str_offsets in the skeleton.
But I wouldn't mind adding a str_offsets_base to the split full unit
to make it clear - this would be consistent with rnglists, I think? (I
think, in theory a rnglistx in a .dwo with a split full unit without a
rnglists_base would use the rnglists_base (and .debug_rnglists
non-dwo) in the executable, but if the split full unit has a
rnglists_base, then the rnglistx in the split full unit use that base
to find rnglists in debug_rnglists.dwo - arguably I'd say we might as
well say the same thing about loclists, too, for consistency, though I
don't have any use for skeleton location lists right now)


It seems to me that rnglists base and loclists_base in Split Full always
reference the data in .debug_rnglists/.debug_loclists

3.1.3  Split Full Compilation Unit Entries
The following attributes are not part of a split full compilation unit
entry but instead are
inherited (if present) from the corresponding skeleton compilation unit:
DW_AT_low_pc,
DW_AT_high_pc, DW_AT_ranges, DW_AT_stmt_list, DW_AT_comp_dir,
DW_AT_str_offsets_base, DW_AT_addr_base and DW_AT_rnglists_base.

I forgot that rnglists and loclists can use address-index (addrx-style)
entries, so they could exist in a .dwo, and so those too could potentially
need/want different tables in the .dwo vs the non-dwo.
So now I'm thinking you are correct.

This needs an ISSUE on dwarfstd.org. You could file one.

Or I could ask Michael Eager to modify what I filed today (probably not
visible on dwarfstd.org yet) to specify your approach as the
better one.
Your preference?

DavidA.



Re: [Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-08-31 Thread David Blaikie via Dwarf-Discuss
On Mon, Aug 31, 2020 at 10:33 AM David Anderson via Dwarf-Discuss <
dwarf-discuss@lists.dwarfstd.org> wrote:

> It has occurred to me that simply restricting skeleton CUs
> to use DW_FORM_string or DW_FORM_strp
> would restore the unique meaning of DW_AT_str_offsets_base
> to apply to the dwp (letting non-skeleton CUs use
> DW_FORM_strx1 etc.). With seemingly little impact on
> overall size.
>

Seems a pity for orthogonality (and for a non-standard/extension use that
LLVM has, where skeleton units carry some DIEs (essentially "gmlt"-like
data, enough to symbolize with inline stack frames) in the skeleton CU -
not being able to use indexed strings would be an object size penalty due
to potentially needing to use more relocations)

I think the DWARFv5 spec is a bit conflicted, but does have some wording
that supports LLVM's existing usage:

".debug_info.dwo to .debug_str_offsets.dwo: Attribute values of class
string may have one of the forms DW_FORM_strx, DW_FORM_strx1,
DW_FORM_strx2, DW_FORM_strx3 or DW_FORM_strx4, whose value is an index
into the .debug_str_offsets.dwo section for the corresponding string"
"The string table section in .debug_str.dwo contains all the strings
referenced from DWARF attributes using any of the forms DW_FORM_strx,
DW_FORM_strx1, DW_FORM_strx2, DW_FORM_strx3 or DW_FORM_strx4. Any attribute
in a compilation unit or a type unit using this form refers to an entry in
that unit’s contribution to the .debug_str_offsets.dwo section, which in
turn provides the offset of a string in the .debug_str.dwo section."

 (& some that contradicts it):

"The DW_AT_addr_base and DW_AT_str_offsets_base attributes provide context
that may be necessary to interpret the contents of the corresponding split
DWARF object file."
"The following attributes are not part of a split full compilation unit
entry but instead are inherited (if present) from the corresponding
skeleton compilation unit: DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges,
DW_AT_stmt_list, DW_AT_comp_dir, DW_AT_str_offsets_base, DW_AT_addr_base
and DW_AT_rnglists_base."

I'd rather go with LLVM's existing interpretation - that strx encodings
used in .dwo do not attempt to use str_offsets in the skeleton.
But I wouldn't mind adding a str_offsets_base to the split full unit to
make it clear - this would be consistent with rnglists, I think? (I think,
in theory a rnglistx in a .dwo with a split full unit without a
rnglists_base would use the rnglists_base (and .debug_rnglists non-dwo) in
the executable, but if the split full unit has a rnglists_base, then the
rnglistx in the split full unit use that base to find rnglists in
debug_rnglists.dwo - arguably I'd say we might as well say the same thing
about loclists, too, for consistency, though I don't have any use for
skeleton location lists right now)
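
For reference, the strx lookup chain described in the spec text quoted
above (index into the unit's .debug_str_offsets.dwo contribution, then
offset into .debug_str.dwo) can be sketched roughly as follows. This
assumes 32-bit DWARF v5, little-endian data, and a single contribution at
the start of the section. The section bytes are built by hand for the
example and read_strx is a made-up helper name, not an API of any existing
library.

```python
import struct


def read_strx(index: int, str_offsets: bytes, strings: bytes) -> str:
    """Resolve a DW_FORM_strx* index via one 32-bit DWARF v5 contribution."""
    unit_length, version, _padding = struct.unpack_from("<IHH", str_offsets, 0)
    if unit_length == 0xFFFFFFFF:
        raise NotImplementedError("64-bit DWARF is not handled in this sketch")
    if version != 5:
        raise ValueError(f"unexpected version {version}")
    base = 8                  # first entry follows the 8-byte header
    end = 4 + unit_length     # unit_length excludes its own 4 bytes
    entry = base + 4 * index
    if entry + 4 > end:
        raise IndexError("strx index out of range")
    (offset,) = struct.unpack_from("<I", str_offsets, entry)
    nul = strings.index(b"\x00", offset)   # strings are NUL-terminated
    return strings[offset:nul].decode("utf-8")


# Synthetic section contents, built by hand for the example.
debug_str_dwo = b"main\x00int\x00"
entries = struct.pack("<II", 0, 5)         # offsets of "main" and "int"
header = struct.pack("<IHH", 4 + len(entries), 5, 0)
debug_str_offsets_dwo = header + entries

print(read_strx(0, debug_str_offsets_dwo, debug_str_dwo))  # main
print(read_strx(1, debug_str_offsets_dwo, debug_str_dwo))  # int
```

Note that nothing from the skeleton unit is consulted here, which is the
interpretation argued for above.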


[Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

2020-08-31 Thread David Anderson via Dwarf-Discuss

It has occurred to me that simply restricting skeleton CUs
to use DW_FORM_string or DW_FORM_strp
would restore the unique meaning of DW_AT_str_offsets_base
to apply to the dwp (letting non-skeleton CUs use
DW_FORM_strx1 etc.). With seemingly little impact on
overall size.
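
A rough sketch of what checking such a restriction might look like,
assuming the skeleton CU's attributes have already been decoded into
(attribute name, form name) pairs by some other means. The function name
and the input shape are made up for illustration; no real abbreviation
decoding is done here.

```python
# The indexed string forms that the proposed restriction would disallow
# in a skeleton CU.
STRX_FORMS = {
    "DW_FORM_strx",
    "DW_FORM_strx1",
    "DW_FORM_strx2",
    "DW_FORM_strx3",
    "DW_FORM_strx4",
}


def skeleton_strx_violations(attrs):
    """Return the (attribute, form) pairs that use an indexed string form."""
    return [(attr, form) for attr, form in attrs if form in STRX_FORMS]


skeleton_attrs = [
    ("DW_AT_dwo_name", "DW_FORM_strx1"),
    ("DW_AT_comp_dir", "DW_FORM_strp"),
]
print(skeleton_strx_violations(skeleton_attrs))
# [('DW_AT_dwo_name', 'DW_FORM_strx1')]
```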

DavidA.