Hi,

I was looking through dwarfcmp to try to understand the libdw/c++ stuff.
In particular dwarf_comparator and dwarf_tracker. But I admit to get
lost a bit in the technicalities of the implementation (my c++ template
foo is pretty weak). So I thought about how I would solve the general
problem of finding out whether two die tree/graphs are equivalent.
Hopefully the idea is somewhat similar to how the code solves it, so I
only have to better understand the added caching and understand the
implicit stacks/paths that are kept better (the little walk, step,
visitor classes to push and pop when things come in and go out scope
with destructors is neat, but make my head hurt). But maybe my idea has
flaws that explain why the code is different/more involved than what I
expect.

- We start with two DIEs that we want to prove equivalent.

- A DIE has an unique identifier (normally the file plus index into
  the .debug_info, but it could be something else), if two DIEs have the
  same unique identifier they are equal.
  Two DIEs with different unique identifiers could still be equal.

- A DIE has a tag, for two DIEs to be equal they must have the same tag.

- A DIE has a (ordered) list of children DIEs. For two DIEs to be equal
  they must have the same number of equal children DIEs (in order).

- Till this far we just have a tree of (tagged) DIEs which we could
  compare by walking them in some (breath-first/depth-first) order and
  see if all the DIE tags match up.

- The wrinkle comes from a DIE also having a set of attributes. The
  attributes don't have an order, but we could invent some by sorting
  them before comparison. An attribute as a key and a value.

- The attribute keys are simple names. For two DIEs to be equal they
  have to have the same number of attributes, with the same names.

- The attribute values can be simple values. For two DIEs to be equal
  all attributes with the same names with simple values should have the
  same value (modulo encoding of such value).

- The attribute values can also be references to DIEs. This might be new
  DIEs, but also DIEs "up" in the DIE children tree. This is why the DIE
  tree is actually a (pointed) graph.

- For two DIEs to be equal the attribute values of the attributes with
  the same key names that have references to DIEs as value should:
    a) Be both not yet seen DIEs in the respective trees so far, and
       be equal DIEs (in this case we can just treat the attribute
       reference as we would a child DIE, compare it in order).
  Or:
    b) Be both already seen DIEs in the respective trees so far, and be
       in the "same location".

The "same location" means that the references to the DIEs in the trees
are at the same place if we would walk the tree in the order chosen
above.

To keep the order of the nodes I would keep two stacks of DIEs that
represent the path traveled so far. That way whether or not a DIE has
already been seen is just whether it is already on the stack, and the
"location" can just be expressed as the index of that DIE into the
stack.

Does any of the above make sense? Is this what the comparator/tracker
logic does? Are there wrinkles in the above logic that make it more
involved? What kind of caching should be added to make the above
easier/better/faster?

Cheers,

Mark

_______________________________________________
elfutils-devel mailing list
[email protected]
https://fedorahosted.org/mailman/listinfo/elfutils-devel

Reply via email to