Re: dwarf_output overview

Roland McGrath Fri, 06 Aug 2010 15:22:01 -0700

> Yes, I saw that example. I need to figure out where/how this is handled
> in practice in the current scheme. Do you happen to have an actual
> example of when this kind of situation arises?


See dwarf_tracker: dwarf_ref_tracker::reference_matched.  

The "plain vanilla circularity" was handled by the "We have a circularity
on the left-hand side." part.  When the asymmetrical kind came up, I added
the "Our right-hand side is not in lock-step on a matching circularity."
part there.

Where this came up was in dwarfcmp-test.  That is, the comparison running
on an original dwarf reading a file vs a dwarf_output made from that.
I don't recall what file the first test case was, but it came up in some of
the files from the build itself, so 'make check' (run-dwarfcmp-test.sh)
hit it.

I can't get current GCC to do it in trivial cases off hand.  But the
situation was that it would emit <pointer_type type=sameref/> several times
in the same CU.  It might be that current GCC doesn't do that as much, but
I think it was happening in complex cases where it was sort of
"understandable".  i.e., it might make sense that for those simple DIEs GCC
didn't keep track of having emitted them before.  So what I'd say to try to
see it is play with dwarfcmp-test on src/* files (only the final links work
so far, not .o's).  If you breakpoint or just comment out the whole second
part of the tracking in reference_matched (the part maintaining _m_active),
you should see the problem hit.

> It seems to me that if we can somehow embed/hash in the structure/cycles
> better the number of comparisons we need to do to find duplicate
> subgraphs should decrease a lot. But I probably need to do some more
> testing. What would be some good representative samples to look at?

I don't know about representative.  My methodology has been to start with
the all-C binaries in the elfutils build itself first.  (Really that's just
what's easiest to put into 'make check'.)  Second is vmlinux (I used my
random build, but you can get a whole one in any kernel-debuginfo rpm too),
which is also just "simple" C code, but sometimes more complex, and also a
good test in handling a zillion CUs.  Third is the C++ binaries in the
elfutils build (C++ DWARF is *way* more complex in reference graphs
usually), first dwarflint is a smaller one, then dwarfcmp is pretty hairy.

Our testing plan (see wiki bits) is to use the entire history of Fedora and
RHEL debuginfo rpms as representative data.  But that's for big test runs
once we are correctly handling all the local test data readily on hand.

Beyond the really trivial cycles (like any linked-list struct type),
I have not found it very easy to come up with small sources that produce
more interesting DWARF graphs.  But I haven't really tried all that hard.
I always just settled into some particular corner of weirdness in the
dwarflint or dwarfcmp binaries that I found by testing, and chugged away on
debugging handling those particular subtrees.

If you have any luck coming up with hand-written C/C++ code that produces
more interesting DWARF trees for test cases, then I will certainly put some
time into making it easier to maintain those in the elfutils sources to be
used as test data for 'make check'.

> Something I was thinking about with respect to this. We want to find
> subtrees that are equal. But can two subtrees that are part of different
> larger subgraphs actually be similar? Since subtrees embedded into
> different larger graphs will have a different contexts. Or am I
> visualizing things wrongly?

Well, it's a "wrong" thought that touches on a subtle and underspecified
question that I've been forgetting for some time to bring up among DWARF
consumers (i.e. GDB folks).  My starting perspective that makes that a
wrong idea is this:

Context is not part of subtree equivalence.
Context is part of reference equivalence.

That is for the "correctness of the tree transformation" test that we
define by what dwarfcmp calls "equal".  Two subtrees are equal if they
match in structure and attributes, looking in, i.e. "local" equality.
If that's true, then we can physically share that subtree in both places
in the logical tree.  Consider:

        <formal_parameter type="#int" name="x"/>

This (childless) subtree will appear in many different places.  Assuming
they refer to the same/equivalent #int DIE, then clearly it is fine to
share this physical subtree in numerous different places in the logical
tree (aka contexts).  If you decompress back to no sharing, then you are
going to get out exactly the same whole logical tree (and that being
exactly the physical tree, is what no sharing means) you had to start with.

Likewise, consider:

        <compile_unit>
          <namespace name="A">
            <structure_type name="empty" byte_size=0/>
          </namespace>
          <namespace name="B">
            <structure_type name="empty" byte_size=0/>
          </namespace>
        </compile_unit>

This is from the C++:

        namespace A { struct empty {}; };
        namespace B { struct empty {}; };

but glossing over the DECL attributes (which would only all match with some
#include use and so forth).

Now, for "correctness of the tree transformation", if that is the whole
tree, then the <structure_type/> can be shared.  The physical tree becomes:

        <compile_unit>
          <namespace name="A">
            <imported_unit import="#pu"/>
          </namespace>
          <namespace name="B">
            <imported_unit import="#pu"/>
          </namespace>
        </compile_unit>
        <partial_unit id="pu">
          <structure_type name="empty" byte_size=0/>
        </partial_unit>

In the logical view, that's the same tree as above.  The only difference is
if you look at the debug_info_entry::identity () (i.e., dwarf_dieoffset,
for a consumer)--do you think the two "empty"'s are the same DIE or not?
(More on that later.)

When it clearly matters is when there are references.
So add to the C++:

        A::empty a;
        B::empty b;

So the tree looks like:

        <compile_unit>
          <namespace name="A">
            <structure_type id="atype" name="empty" byte_size=0/>
          </namespace>
          <namespace name="B">
            <structure_type id="btype" name="empty" byte_size=0/>
          </namespace>
        </compile_unit>
        <variable name="a" type="#atype"/>
        <variable name="b" type="#btype"/>

Now it becomes clear that type="#atype" and type="#btype" are distinct
references that should not be conflated.  If you conflated them, then
the decompression of the compressed tree would have:

        <variable name="b" type="#atype"/>

and that's wrong.

So the way I've formulated it is with these definitions:

* trees are equal if their tags, attributes, and children lists are equal
* tags are integers, equality is integer equality
* children lists are equal if the same length and respective child trees equal
* attributes are equal if the same name/value pairs and equal values
* non-reference attribute values are "simple" types, equality is
  type-appropriate.  (Later, it may have strict/non-strict variants of
  equality, for things like an exprloc vs a constant for
  data_member_location location.  The point being there are more details
  here in practice, but there is no more algorithmic complexity to be
  thinking about now.)
* reference attribute values are equal if their referent DIEs are equal
  trees in equal-enough contexts
* what goes into "context" and "equal-enough" is fungible and intended to
  be tuned by practical experience with real data later in the project.
  To begin with:
* context of a DIE means its parent DIE's tag, non-reference attributes,
  and context (hence iterate up to compile_unit level, which is not part of
  the context).  (From prior discussions here, we will probably refine the
  terminal context to include a small subset of compile_unit attributes.
  For now, the simple definition stands.)

Hence, referent context is part of reference equality, but a tree's own
context is not part of tree equality.  

Given circular references inside, a tree's own context can indirectly
become part of tree equality.  But I'm not sure that's actually the way we
want to think about it, or define it.  It's hazy to me.  (Help me on this!)
Intuitively, it seems like two comparable trees with internal circularities
might rightly be found equivalent even in different contexts.  But that's a
vague notion.  Let's try to consider a case:

        <compile_unit>
          <namespace name="A">
            <structure_type id="atype" name="list">
              <member name="next" type="#aptr" data_member_location=0/>
            </structure_type>
          </namespace>
          <pointer_type id="aptr" type="#atype"/>
          <namespace name="B">
            <structure_type id="btype" name="list">
              <member name="next" type="#bptr" data_member_location=0/>
            </structure_type>
          </namespace>
          <pointer_type id="bptr" type="#btype"/>
          <variable name="a" type="#aptr"/>
          <variable name="b" type="#bptr"/>
        </compile_unit>

Ok, that's the first case of circularity I could come up with.  Clearly
there it does matter not to conflate type="#aptr" vs type="#bptr" in the
two <variable/> DIEs' attributes.  If we didn't have those "outside"
references, would it matter?  Well, no, not to tree transformation.

Here's another case that's even simpler:

        <compile_unit>
          <namespace name="A">
            <structure_type id="atype" name="empty">
              <member external=1 declaration=1 name="x" type="#atype"/>
            </structure_type>
          </namespace>
          <namespace name="B">
            <structure_type id="btype" name="empty">
              <member external=1 declaration=1 name="x" type="#btype"/>
            </structure_type>
          </namespace>
        </compile_unit>

That's for:

        namespace A
        {
          struct empty
          {
            static empty x;
          };
        };
        namespace B
        {
          struct empty
          {
            static empty x;
          };
        };

So, are atype and btype shareable?  No, they're not.
If they're conflated, the consumer could interpret it as having been:

        namespace A
        {
          struct empty
          {
            static empty x;
          };
        };
        namespace B
        {
          struct empty
          {
            static A::empty x;
          };
        };

So context does matter for this "simple" circular reference case.  Though
it would seem not to matter (absent references) in the "more complex" (but
very common) case with {A,B}::list, above.

So, remember "More on that later"?  It's later now.  We're working on the
assumption that context only matters for reference equality, i.e. only
matters when there is actually a reference to each of the two things.
That's the essence of the formulation of equality definitions I gave above.

But real consumers don't just do tree transformations and whole-tree
comparisons.  Even if there is no reference attribute pointing to a DIE, a
consumer might treat the DIE "identity" (defined as same offset in same
file) as meaningful.  If they do, then each and every conflation of two
identical DIEs in non-identical contexts might be a problem.  But at worst,
it's only certain sorts of DIEs (types, variables?) that a consumer might
compare for identity.  And I can't really think of something very coherent
for an example of the sort of thing a debugger might do where it would come
to care.  The point being, whether it would ever make a difference to a
debugger to distinguish raw DIE identity from full "path equality",
i.e. what dwarf::children_type::const_iterator::operator== tests,
which is that your logical tree walk's conception of the context of the two
DIEs matches as well as the raw DIE identity matching (i.e. what
dwarf::raw_children_type::const_iterator::operator== tests).

If we made the contrary assumption that all context always matters to tree
equality per se, then we'd forego a lot of sharing opportunities.  At least
that's my sense, but it may well be a misguided one.  Anything that has
DECL attributes is rarely going to have those match when the context
doesn't, for example.  The concrete examples I can think of are fairly
piddly, but they probably do add up.  Say:

        <formal_parameter name="name" type="#constcharstar"/>

Now that appears in a zillion prototypes, and those (last I checked) don't
have any other attributes like DECL coordinates in the children, just in
the <subprogram declaration=1 ...> DIE itself.  So they really do appear in
their zillions, just like that, all identical.  If we ignore the abbrevs as
all coming out in the wash, then each of these costs 4 bytes for
DW_FORM_strp and 1-4 bytes for the DW_FORM_ref* (but probably 4, since the
#constcharstar DIE surely should be shared across many CUs, hence a ref_addr),
so at least 5, and probably 8, total.  An <imported_unit/> DIE always has
exactly one DW_FORM_ref_addr, so costs exactly 4 every time.  Even it's
only 1 byte saved (but it's probably 4), that 1 (or 4) times a zillion in
the end.  The overhead of adding each partial_unit to make something
individually shareable is something like 11 bytes.  So at 12 or more shared,
it's a net win.


Thanks,
Roland
_______________________________________________
elfutils-devel mailing list
[email protected]
https://fedorahosted.org/mailman/listinfo/elfutils-devel

Re: dwarf_output overview

Reply via email to