Re: dwarfcmp comparison caching

Roland McGrath Mon, 30 Aug 2010 02:02:43 -0700

> I think I see now why dwarfcmp is so slow. But I don't fully understand
> the intent/design of the comparison caching. But it seems we are not
> using the results of the dwarf_comparator tracker caching as often (at
> all?) as we could.


It is explicitly disabled since commit 4ce456a.  This was the subject of
https://fedorahosted.org/pipermail/elfutils-devel/2010-April/000967.html
which I pointed to before.

I suspect I previously posted, but maybe not, this:

diff --git a/libdw/c++/dwarf_output b/libdw/c++/dwarf_output
index 3110ea3..f6ab181 100644  
--- a/libdw/c++/dwarf_output
+++ b/libdw/c++/dwarf_output
@@ -1632,7 +1632,7 @@ namespace elfutils
        /* XXX disabled! tentative circularity matches taint this record!
           must record taint to avoid caching, or punt caching.
         */
-       //_m_pending->_m_matched = doppleganger;
+       _m_pending->_m_matched = doppleganger;
       }
 
       /* This is called by finalize_children.  In case of imported_unit
diff --git a/libdw/c++/dwarf_tracker b/libdw/c++/dwarf_tracker
index cbd253e..5baa6b1 100644  
--- a/libdw/c++/dwarf_tracker
+++ b/libdw/c++/dwarf_tracker
@@ -706,15 +706,15 @@ namespace elfutils
       return matched._m_lhs == NULL && matched._m_rhs == NULL;
     }
 
-    inline bool notice_match (reference_match &/*matched*/,
-                             const die1 &, const die2 &/*b*/, bool matches)
+    inline bool notice_match (reference_match &matched,
+                             const die1 &, const die2 &b, bool matches)
     {
       /* XXX not reliable!
         This match could be predicated on a tentative match of a
         circular ref inside.  We can't cache that!
+      */
       if (matches && matched._m_lhs != NULL)
        matched._m_lhs->second.insert (b);
-      */
       return matches;
     }
 
That would enable all the caching that ever was.

> For future reference, there are a couple of similar, but different
> comparing operators/functions constructs used. An important difference
> to look for is whether dwarf_comparator comparisons are made through
> equals () or match (). In the case of equals () the tracker can "short
> circuit" a successful comparison if it has cached the result.

Correct.  equals is the public method, overloaded for all the component
types that make up the tree.  match is the name of private methods used
at various levels of decomposition.

> - subr::container_equal calls begin() on the arguments to get iterators
>   and compares each element through a predicate function given.

Correct.

> - class dwarf_comparator extends std::binary_function and defines an 
>   bool operator (). This is implemented by calling match (a, b).

Correct.  This is main top-level entry point for comparisons at the
whole file level (dwarf object).  It's not overloaded.  There is no
special significance to this calling match rather than equals, really.
It's just that we never cache whole file objects as being equal, so
there is nothing to do there.

> - a dwarf_comparator has a template function for equals (a, b) which
>   consults the tracker set and the match (a, b) function through
>   return _m_tracker.identical (a, b) || match (a, b);

Correct.  This is a public method so that a dwarf_comparator can be
used to compare individual subtrees or finer-grained items, rather
than just whole files.  The dwarf_comparator objects created in
dwarf_tracker and dwarf_output are used only this way, never to
compare whole files.

> - the private dwarf_comparator match (a, b) functions are specialized:
>   - dwarfs (calls match on compile_units),

Yes.  This will change eventually to include the non-DIE parts of DWARF.

>   - compile_units (uses subr::container_equal with match () as
>                    predicate, while tracker.mismatch),

Note tracker.mismatch is called at every level to support uses like
dwarfcmp -l, where talker::mismatch prints out differences and then
returns true to tell the comparator to keep walking even though the
comparison has already failed.

>   - debug_info_entries (creates a tracker::visitor calls == on tags,
>                         equals () on attributes and children),

attributes container object and children container object, yes.

>   - die attributes (uses subr::container_equal, with match_rhs () as
>                     predicate, which uses the comparator equals ()
>                     method on the values of attributes, while
>                     tracker.mismatch, then tries ordered attribute
>                     lists comparing keys with == and values with equals)

It first uses match_lhs to catch the common case that the mere set of
attribute names (not considering values) doesn't match.  In ordered
implementations (i.e. all but dwarf, the others use dwarf_data), the
match_lhs mismatch tells you the set cannot match so we don't even
consider the values.  If that doesn't rule it out, then it uses
match_rhs to compare the values.  If the implementation isn't ordered,
then it builds an ordered map and uses match_sorted for the final pass
instead.

>   - die children (uses subr::container_equal with match as predicate,
>                   while tracker.mismatch)

Yes.  This then goes through a few layers of method so it calls
match_deref, which uses the tracker::prematch method.  This is almost
the same as tracker::reference_match, but it's for DIEs in the walk
rather than from chasing references.  Here we use cannot_match and
notice_match the same as with reference_match (covered below).  This
is the hook to cache a result from the main walk so that a later
reference comparison will see it, or conversely, for a previous side
trip (see below) that walked over this subtree to pre-cache a result
so that we don't bother descending this subtree again in the main walk.

>   - attribute (calls == on value and equals () on values),

== on name (in match_lhs or match_sorted), yes.

>   - attribute values (depending on the value space almost all
>                       comparisons are done through ==, except for
>                       reference which are compared through
>                       calling reference_match ())

Correct.  All the complex non-reference data types are handled by
their operator== doing something appropriate.  At some point, more
sophistication we acquire might necessitate something that would need
more state somewhere aside from references, but not yet.

>   - child iterator (calls reference_match ())
> 
> - reference_match works on two die iterators. If the comparator ignores
>   refs then it always returns true. Otherwise a die identity comparison
>   is done first, then the tags are compared and whether both have
>   children. 

Correct.  The identity check catches grafts of the same physical tree into
multiple places in a logical tree (imported_unit or collector sharing).
The others are just the checks so cheap that you do them before allocating
more state, to rule out things that simply could never match.  Conceptually
this predicate is "DIEs match && contexts match" (plus magically be able to
answer that with circularities), but since the two scalar pieces of "DIEs
match" are by far the easiest to test, we split that up to test those
first, rather than just doing an .equals (a, b) to encapsulate "DIEs match".

>   Then the tracker is queried. First the tracker
>   reference_matched () and cannot_match () are tried. 

Correct.  This is the main entry point to the tracker where it gets to
apply its magic.  The tracker::reference_match object holds the state that
indicates a comparison in progress.  This is the hook both for caching and
for circularity detection.  The reference_matched return value can indicate
a short-circuit with a positive result.  The cannot_match return value can
indicate a short-circuit with a negative result.  The tracker could make
those decisions based on caching (or negative caching), or based on
identifying a circularity.

>   Then a subtracker and a subcomparator are created (with a reference to
>   the original tracker). 

Notably, the "context" objects are created first.  context_quick_mismatch
gives the tracker a third opportunity to short-circuit the full recursive
comparison.  Two references have to have equal contexts as well as equal
subtrees.  So, again, we do what's cheap to rule out known mismatches.  In
dwarf_tracker, context_quick_mismatch fires if the two "paths from CU down"
to each DIE are not the same number of steps.  (In dwarf_output, it actually
knows enough to do a firm context check cheaply at this stage.)

>   Then the subcomparator equals () is called on
>   the attributes, the original tracker context_match () and the
>   subcomparator equals () in the children. Finally the original tracker
>   notice_match () is called with the actual result so it can possibly log
>   and/or cache that.

Correct.  The subcomparator serves two purposes.  Firstly, it's a plain
dwarf_comparator, in case the calling object is e.g. a talker--so the main
tree comparison talks, but the side trips to do subtree comparisons as part
of a reference attribute comparison don't talk.  Second, it uses the
subtracker for its reference comparisons, rather than the main tracker.

The subtracker serves only one purpose.  It's to do the tracker job
for those side trips, without perturbing the "walk state" of the main
tracker.  The main tracker maintains state for use in random-access
lookups, but it also maintains the state of the current walk and some
things need that.  So a subtracker serves as the tracker for a side
trip, where it can roll forward or jump around in the tree-walking
path from its parent tracker's current point.  But the subtracker
shares the tables with its parent tracker, so everything learned on
side trips goes into the same pool of knowledge about DIEs (keyed on
identity).

> - dwarfcmp extends dwarf_comparator and overrides the () operator to
>   call equals (a, b) in quiet mode or in the noisy variant it calls
>   equals (a, b) && tracker.result.

Noisy (-l) mode uses a tracker (talker) whose mismatch methods return
true, so the comparison keeps going (so it can talk more).  So the
result of equals is never really meaningful.  The talker has to keep
track itself that the comparison actually failed.  Hence for the final
result (exit status of dwarfcmp), we ignore the equals result and use
the record kept by the talker.

> - A dwarf_ref_tracker notice_match is ignored in both the base_tracker
>   and the dwarf_ref_tracker. dwarfcmp overrides it, but stores the
>   result in the base tracker, so it seems that is lost too.

dwarf_ref_tracker::notice_match is where the caching is disabled
(patch above), yes.

>   identical (a, b) is defined as false in the base tracker (which means
>   the dwarf_comparator equals short circuit doesn't work).

Yes, it's only really used for dwarf_output.  I didn't bother enabling
it for normal comparison, since passing the same object to both sides
of dwarf_comparator.equals is not going to come up in dwarfcmp.

> I am not really sure where we the compare caching should really go and
> be fixed because he precise idea behind the equal/match and
> identical/notice_match are not completely clear to me. Is the
> entanglement of the context path caching and the identical tracking on
> purpose?

The "identical" method is just a short-cut for the pointer equality
case in dwarf_output.  It's not involved in plain dwarfcmp at all.
Don't think about that.

Some of the methods are called equals and some are called match,
that's not a deep thing.  The only important things going on in the
comparator are about using the tracker.

A tracker::walk object lives while comparing two CUs.
A tracker::step object lives while comparing two children iterators.
A tracker::visitor object lives while comparing two (sub-CU) DIEs.

This gives the tracker those objects' constructors and destructors as
hooks for the comparison walk.  The walk and step hooks are how the
tracker collects the path-from-CU to associate with each DIE.  To
handle references in attributes, the tracker also has to support
random-access lookups.  When there are forward references, then it has
to resolve a random-access lookup by winding the walk forward past
where the main comparison has touched yet, just to reach the reference
target DIE and know what its context (path from CU) was.  So it makes
sense that keeping track of the current walk and keeping track of the
table of paths to each DIE are done together.

That basic work of tracking a walk and storing the contexts keyed by
DIE identity is what dwarf_path_finder does.  dwarf_ref_tracker has
one of these for each side of the comparison, since we have two walks
proceeding in lock-step to compare.

The only reason to track that stuff in the first place is to use it
for deciding reference comparisons.  In fact, that is the main
responsibility of the tracker.  So all that is done inside the tracker
because it's the tracker that wants to know--it's just bookkeeping to
enable the reference handling.  

The reference handling has the primary responsibility of catching
circularity in reference chains.  That wants a table keyed on DIE
identity.

The reference_match implementation plan of doing side trips to compare
referenced subtrees is what introduces redundancy to the basic tree
walk.  So that detail of the tracker is the whole reason to cache
comparisons already done.  If the tracker weren't doing side trips,
there would never be any comparisons repeated (except if both sides
were using grafts).  And, it's a second thing that wants a table keyed
on DIE identity, so using the same hash table lookup for both the
circularity detection and the match caching seems like a good fit.  
So that's why the caching of positive or negative matches goes in the
tracker.

So, yes, it's on purpose, but in that it seems to make intrinsic sense
to organize it this way.  If you have different ideas about how to
organize things, I am certainly open to them.

Also note that the dwarf_output stuff (what really matters most) is
generally modelled on the same scheme, but its tracker implementation
is substantially different (dwarf_output::copier::tracker).  Instead
of the comparison walk, we have the copy-construction walk, and so it
makes sense to meld it with the copier data structures for building
DIEs in the collector.  (Or perhaps it doesn't, but then you'll have
to tell me how exactly to do the whole thing differently.)


Thanks,
Roland
_______________________________________________
elfutils-devel mailing list
[email protected]
https://fedorahosted.org/mailman/listinfo/elfutils-devel

Re: dwarfcmp comparison caching

Reply via email to