Eureka!!! Unified nodes

Edward K. Ream Sun, 13 Apr 2008 05:20:59 -0700

I believe it is possible to unify vnodes and tnodes into what I'll
just call a **unified node**, or node for short.  A node will contain
headline, body text, attributes, and a (Python) list of the node's
children.


I have been searching for this Aha for more than 10 years.

The aha is that all we need are iterators (positions) and nodes.
**iterators replace vnodes**.  Iterators, positions and nodes suffice
to do anything that can presently be done.  Examples:

- Leo can draw the screen.  The screen drawing code merely traverses
the outline, putting widgets on the screen corresponding to nodes.

- Given a position p, Leo can compute p.parent(), p.firstChild(),
p.back() and p.next() using the (unified) nodes in p.stack and p.n
(analogous to p.v today).

- Leo can insert, delete and move nodes.  Nodes are inserted or
deleted at a **position**, not at a node.  Given a position, we can
determine the **direct parent** of a node **at that position**.  Given
the parent node, we can insert or delete a node from the *parent's*
child list.

- Leo can create nodes and the present position from .leo files.  The
will probably require a rewrite of the sax read code, but I suspect it
will be a pleasant job.  In particular, the scheme used to represent
the present position will work unchanged.

The unification of vnodes and tnodes into a unified node type has
important benefits:

- There will be no more confusion about tnodes and vnodes.  This is
equivalent to saying that two vnodes v1 and v2 are the **same vnode**
if v1.t == v2.t.  Indeed, in the *present* scheme of things, let us
say that two nodes v1 and v2 are **joined** if and only if v1.t ==
v2.t.  So cloned nodes are joined, and all nodes in the shared
subtrees of cloned nodes are joined.  In this respect, the unified-
node world is equivalent to yesterday's proposal to move all
attributes into tnodes.

You can't mark different jointed vnodes independently in the unified-
node world: they become *exactly* the same node.  But notice, in
today's world, joined vnodes update their headline and body text in
synch.  Imo, it is conceptually **much** clearer to treat all joined
nodes as the same node.  The unified-node world would simply build in
this properly into Leo in a fundamental way.

- Unified nodes are *precisely* the nodes contemplated in the graph
world.  Given a different set of iterators, we *could* implement a
graph world with these data structures.  I probably will not do so, at
least immediately, for the same reasons why I thought the graph world
was not a great idea, but now there are no data-structure impediments
to doing so. Recall that a key part in my rejection of the graph world
was a realization that iterators in any graph world are much more
complex than Leo's iterators.

- Speaking of iterators, the unified-node world would just about halve
the number of iterators needed.  There are just *node* iterators, not
vnode iterators plus tnode iterators.  And we can add a 'unique'
argument to iterators, telling whether the iterator should return each
node just once in the iteration. The default would be True. So the
list of iterator would be:

- c.allNodes_iter(unique=True)
- p.children_iter(unique=True)
- p.parents_iter() # returns the direct parents at position p.
- p.siblings_iter(unique=True)
- p.sub_tree_iter(unique=True)
- p.self_and_siblings_iter(unique=True)
- p.self_and_subtree_iter(unique=True)
- p.self_and_parents_iter()
etc.

The parent iterators do not need a 'unique' keyword because Leo
outlines are DAG's.  I assume, but haven't proven, that the recent
iterator optimizations could be applied just as easily in the unified-
node world as in today's world.

- All links fields (v._back, v._next, v._parent, v.t._firstChild)
disappear, greatly simplifying the implementation.

You might ask, "why didn't you see this before?"  It's mostly a matter
of history.  Leo started life as C code, and that lead to data
structures that used links.  Positions came relatively late, and
moving to a position-based world was a big enough project.  The
realization that the shared node world (i.e., today's world) was
possible *at all* was enough revolution at the time.

Perhaps the biggest reason was that, for all these years, I have been
obsessed with representing the tree/graph structure entirely in
nodes.  But iterators and positions aren't nodes!!!

Another reason is Leo's file format.  It has naturally lead me in the
direction of vnodes and tnodes.  In the unified node world, we shall
want to revise and/or reinterpret Leo's file format.  Now that we know
that Leo's nodes could indeed form an arbitrary graph, it's becoming
clear that a tree-based file format is both misleading and too
restrictive.  In the past, I have shown that Leo outlines could be
represented in a strictly hierarchical format such as OPML, but that
obscures that fact that Leo data are DAG's, not trees.  So it may be
time to confront head on the fact that Leo data are (potentially)
general graphs.  This is actually an important step forward.  For way
too long I have been reluctant to "tell the truth" in .leo files.  In
particular, hierarchy does *not* suffice to represent Leo outlines.
That's why we have t attributes in <v> elements and tx attributes in
<t> elements.  **These attributes form links**: the hierarchy of <v>
elements is just sugar coating.  You might even call it an untruth :-)

There have been several proposals lately for allowing variations in
Leo's file format.  These proposals amount to requiring that the sax
read code be flexible.  It should not be a big deal.

In conclusion, today is one of the most important days in Leo's
history.  It appears that Leo will, relatively soon, move to the
unified-node world. Naturally, all of today's conclusions are
preliminary.  The next step will be to create a unified-node bzr
branch.  I'll use that branch to look for problems that I have not yet
foreseen.  Perhaps the ease with which huge reorgs can be contemplated
with bzr was another reason why this Aha came when it did.

Edward

P.S. As a matter of symmetry, a unified node could contain a list of
the node's parents.  This is not necessary: it is analogous to double-
linking a linked list.  Similarly, a node could contain a list of the
node's siblings.  Nodes won't actually contain such parent and sibling
lists, but realizing that the symmetry could be preserved was an
important part of the Aha.  In other words, the path to the Aha wasn't
as I have presented it.

P.P.S. Even this morning, I probably can not reproduce last night's
thinking: the new world drives away the old.  However, I do recall
that the realization that cloned nodes could be considered the *same*
node (in spite of their different positions in the outline!) was a key
step leading to the Aha.

I better put this in now, or it will be gone forever.  I had a picture
of nodes as 'beans', and the outline as fingers holding the beans.
Ah, I see it now.  The outline (structure) *wasn't* nodes.  The
picture was probably my subconscious telling me that iterators aren't
nodes.  So, just at this very minute as I write this, I see how the
Aha happened.

EKR
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/leo-editor?hl=en
-~----------~----~----~----~------~----~------~--~---

Eureka!!! Unified nodes

Reply via email to