> I think I see now why dwarfcmp is so slow. But I don't fully understand > the intent/design of the comparison caching. But it seems we are not > using the results of the dwarf_comparator tracker caching as often (at > all?) as we could.
It is explicitly disabled since commit 4ce456a. This was the subject of https://fedorahosted.org/pipermail/elfutils-devel/2010-April/000967.html which I pointed to before. I suspect I previously posted, but maybe not, this: diff --git a/libdw/c++/dwarf_output b/libdw/c++/dwarf_output index 3110ea3..f6ab181 100644 --- a/libdw/c++/dwarf_output +++ b/libdw/c++/dwarf_output @@ -1632,7 +1632,7 @@ namespace elfutils /* XXX disabled! tentative circularity matches taint this record! must record taint to avoid caching, or punt caching. */ - //_m_pending->_m_matched = doppleganger; + _m_pending->_m_matched = doppleganger; } /* This is called by finalize_children. In case of imported_unit diff --git a/libdw/c++/dwarf_tracker b/libdw/c++/dwarf_tracker index cbd253e..5baa6b1 100644 --- a/libdw/c++/dwarf_tracker +++ b/libdw/c++/dwarf_tracker @@ -706,15 +706,15 @@ namespace elfutils return matched._m_lhs == NULL && matched._m_rhs == NULL; } - inline bool notice_match (reference_match &/*matched*/, - const die1 &, const die2 &/*b*/, bool matches) + inline bool notice_match (reference_match &matched, + const die1 &, const die2 &b, bool matches) { /* XXX not reliable! This match could be predicated on a tentative match of a circular ref inside. We can't cache that! + */ if (matches && matched._m_lhs != NULL) matched._m_lhs->second.insert (b); - */ return matches; } That would enable all the caching that ever was. > For future reference, there are a couple of similar, but different > comparing operators/functions constructs used. An important difference > to look for is whether dwarf_comparator comparisons are made through > equals () or match (). In the case of equals () the tracker can "short > circuit" a successful comparison if it has cached the result. Correct. equals is the public method, overloaded for all the component types that make up the tree. match is the name of private methods used at various levels of decomposition. > - subr::container_equal calls begin() on the arguments to get iterators > and compares each element through a predicate function given. Correct. > - class dwarf_comparator extends std::binary_function and defines an > bool operator (). This is implemented by calling match (a, b). Correct. This is main top-level entry point for comparisons at the whole file level (dwarf object). It's not overloaded. There is no special significance to this calling match rather than equals, really. It's just that we never cache whole file objects as being equal, so there is nothing to do there. > - a dwarf_comparator has a template function for equals (a, b) which > consults the tracker set and the match (a, b) function through > return _m_tracker.identical (a, b) || match (a, b); Correct. This is a public method so that a dwarf_comparator can be used to compare individual subtrees or finer-grained items, rather than just whole files. The dwarf_comparator objects created in dwarf_tracker and dwarf_output are used only this way, never to compare whole files. > - the private dwarf_comparator match (a, b) functions are specialized: > - dwarfs (calls match on compile_units), Yes. This will change eventually to include the non-DIE parts of DWARF. > - compile_units (uses subr::container_equal with match () as > predicate, while tracker.mismatch), Note tracker.mismatch is called at every level to support uses like dwarfcmp -l, where talker::mismatch prints out differences and then returns true to tell the comparator to keep walking even though the comparison has already failed. > - debug_info_entries (creates a tracker::visitor calls == on tags, > equals () on attributes and children), attributes container object and children container object, yes. > - die attributes (uses subr::container_equal, with match_rhs () as > predicate, which uses the comparator equals () > method on the values of attributes, while > tracker.mismatch, then tries ordered attribute > lists comparing keys with == and values with equals) It first uses match_lhs to catch the common case that the mere set of attribute names (not considering values) doesn't match. In ordered implementations (i.e. all but dwarf, the others use dwarf_data), the match_lhs mismatch tells you the set cannot match so we don't even consider the values. If that doesn't rule it out, then it uses match_rhs to compare the values. If the implementation isn't ordered, then it builds an ordered map and uses match_sorted for the final pass instead. > - die children (uses subr::container_equal with match as predicate, > while tracker.mismatch) Yes. This then goes through a few layers of method so it calls match_deref, which uses the tracker::prematch method. This is almost the same as tracker::reference_match, but it's for DIEs in the walk rather than from chasing references. Here we use cannot_match and notice_match the same as with reference_match (covered below). This is the hook to cache a result from the main walk so that a later reference comparison will see it, or conversely, for a previous side trip (see below) that walked over this subtree to pre-cache a result so that we don't bother descending this subtree again in the main walk. > - attribute (calls == on value and equals () on values), == on name (in match_lhs or match_sorted), yes. > - attribute values (depending on the value space almost all > comparisons are done through ==, except for > reference which are compared through > calling reference_match ()) Correct. All the complex non-reference data types are handled by their operator== doing something appropriate. At some point, more sophistication we acquire might necessitate something that would need more state somewhere aside from references, but not yet. > - child iterator (calls reference_match ()) > > - reference_match works on two die iterators. If the comparator ignores > refs then it always returns true. Otherwise a die identity comparison > is done first, then the tags are compared and whether both have > children. Correct. The identity check catches grafts of the same physical tree into multiple places in a logical tree (imported_unit or collector sharing). The others are just the checks so cheap that you do them before allocating more state, to rule out things that simply could never match. Conceptually this predicate is "DIEs match && contexts match" (plus magically be able to answer that with circularities), but since the two scalar pieces of "DIEs match" are by far the easiest to test, we split that up to test those first, rather than just doing an .equals (a, b) to encapsulate "DIEs match". > Then the tracker is queried. First the tracker > reference_matched () and cannot_match () are tried. Correct. This is the main entry point to the tracker where it gets to apply its magic. The tracker::reference_match object holds the state that indicates a comparison in progress. This is the hook both for caching and for circularity detection. The reference_matched return value can indicate a short-circuit with a positive result. The cannot_match return value can indicate a short-circuit with a negative result. The tracker could make those decisions based on caching (or negative caching), or based on identifying a circularity. > Then a subtracker and a subcomparator are created (with a reference to > the original tracker). Notably, the "context" objects are created first. context_quick_mismatch gives the tracker a third opportunity to short-circuit the full recursive comparison. Two references have to have equal contexts as well as equal subtrees. So, again, we do what's cheap to rule out known mismatches. In dwarf_tracker, context_quick_mismatch fires if the two "paths from CU down" to each DIE are not the same number of steps. (In dwarf_output, it actually knows enough to do a firm context check cheaply at this stage.) > Then the subcomparator equals () is called on > the attributes, the original tracker context_match () and the > subcomparator equals () in the children. Finally the original tracker > notice_match () is called with the actual result so it can possibly log > and/or cache that. Correct. The subcomparator serves two purposes. Firstly, it's a plain dwarf_comparator, in case the calling object is e.g. a talker--so the main tree comparison talks, but the side trips to do subtree comparisons as part of a reference attribute comparison don't talk. Second, it uses the subtracker for its reference comparisons, rather than the main tracker. The subtracker serves only one purpose. It's to do the tracker job for those side trips, without perturbing the "walk state" of the main tracker. The main tracker maintains state for use in random-access lookups, but it also maintains the state of the current walk and some things need that. So a subtracker serves as the tracker for a side trip, where it can roll forward or jump around in the tree-walking path from its parent tracker's current point. But the subtracker shares the tables with its parent tracker, so everything learned on side trips goes into the same pool of knowledge about DIEs (keyed on identity). > - dwarfcmp extends dwarf_comparator and overrides the () operator to > call equals (a, b) in quiet mode or in the noisy variant it calls > equals (a, b) && tracker.result. Noisy (-l) mode uses a tracker (talker) whose mismatch methods return true, so the comparison keeps going (so it can talk more). So the result of equals is never really meaningful. The talker has to keep track itself that the comparison actually failed. Hence for the final result (exit status of dwarfcmp), we ignore the equals result and use the record kept by the talker. > - A dwarf_ref_tracker notice_match is ignored in both the base_tracker > and the dwarf_ref_tracker. dwarfcmp overrides it, but stores the > result in the base tracker, so it seems that is lost too. dwarf_ref_tracker::notice_match is where the caching is disabled (patch above), yes. > identical (a, b) is defined as false in the base tracker (which means > the dwarf_comparator equals short circuit doesn't work). Yes, it's only really used for dwarf_output. I didn't bother enabling it for normal comparison, since passing the same object to both sides of dwarf_comparator.equals is not going to come up in dwarfcmp. > I am not really sure where we the compare caching should really go and > be fixed because he precise idea behind the equal/match and > identical/notice_match are not completely clear to me. Is the > entanglement of the context path caching and the identical tracking on > purpose? The "identical" method is just a short-cut for the pointer equality case in dwarf_output. It's not involved in plain dwarfcmp at all. Don't think about that. Some of the methods are called equals and some are called match, that's not a deep thing. The only important things going on in the comparator are about using the tracker. A tracker::walk object lives while comparing two CUs. A tracker::step object lives while comparing two children iterators. A tracker::visitor object lives while comparing two (sub-CU) DIEs. This gives the tracker those objects' constructors and destructors as hooks for the comparison walk. The walk and step hooks are how the tracker collects the path-from-CU to associate with each DIE. To handle references in attributes, the tracker also has to support random-access lookups. When there are forward references, then it has to resolve a random-access lookup by winding the walk forward past where the main comparison has touched yet, just to reach the reference target DIE and know what its context (path from CU) was. So it makes sense that keeping track of the current walk and keeping track of the table of paths to each DIE are done together. That basic work of tracking a walk and storing the contexts keyed by DIE identity is what dwarf_path_finder does. dwarf_ref_tracker has one of these for each side of the comparison, since we have two walks proceeding in lock-step to compare. The only reason to track that stuff in the first place is to use it for deciding reference comparisons. In fact, that is the main responsibility of the tracker. So all that is done inside the tracker because it's the tracker that wants to know--it's just bookkeeping to enable the reference handling. The reference handling has the primary responsibility of catching circularity in reference chains. That wants a table keyed on DIE identity. The reference_match implementation plan of doing side trips to compare referenced subtrees is what introduces redundancy to the basic tree walk. So that detail of the tracker is the whole reason to cache comparisons already done. If the tracker weren't doing side trips, there would never be any comparisons repeated (except if both sides were using grafts). And, it's a second thing that wants a table keyed on DIE identity, so using the same hash table lookup for both the circularity detection and the match caching seems like a good fit. So that's why the caching of positive or negative matches goes in the tracker. So, yes, it's on purpose, but in that it seems to make intrinsic sense to organize it this way. If you have different ideas about how to organize things, I am certainly open to them. Also note that the dwarf_output stuff (what really matters most) is generally modelled on the same scheme, but its tracker implementation is substantially different (dwarf_output::copier::tracker). Instead of the comparison walk, we have the copy-construction walk, and so it makes sense to meld it with the copier data structures for building DIEs in the collector. (Or perhaps it doesn't, but then you'll have to tell me how exactly to do the whole thing differently.) Thanks, Roland _______________________________________________ elfutils-devel mailing list [email protected] https://fedorahosted.org/mailman/listinfo/elfutils-devel
