I do have sympathy for those who would like to leave their code alone,
but...
Vitalije has inspired me to create the FastRead class in
leoFileCommands.py. It's in the "fast-read" branch. You can test it with
this script:

    import leo.core.leoFileCommands as fc
    fc.FastRead(c).read(<path to some .leo file>)

When run on my local copy of leoPy.leo, the output is:

    makeVnodes 792 nodes, 0.0076 sec

This result suggests that no changes to Leo's VNode class are required, but
see the Caveats section below.
*Acknowledgements*
This would never have happened without Vitalije's recent work, for several
reasons:
1. I didn't know that xml.etree.ElementTree is now part of Python's
standard lib. It wasn't in 2001 when I wrote the first version of the .leo
file read code.
2. Vitalije has challenged me to get much better performance. I seriously
underestimated how important this is.
3. The lisp-like pattern used in FastRead.makeVnodes was foreign to me
until a few days ago. It's beautiful and creates the fastest possible
code. I'll show the code later.
*Caveats*
The code isn't woven into Leo's read code. In particular, the code uses an
internal VNode class for testing, to ensure that the actual outline doesn't
get bombed. It's possible that using the actual vnode class might slow the
code. If so, a new kwarg will eliminate the useless slow code.
*FastRead.makeVnodes*
Here is the heart of the new code:

    def makeVnodes(self, gnx2body, gnx2vnode, v_elements):

        context = None

        def v_element_visitor(parent_e, parent_v):
            for e in parent_e:
                assert e.tag in ('v','vh'), e.tag
                if e.tag == 'vh':
                    head = e.text or g.u('')
                    assert g.isUnicode(head), head.__class__.__name__
                    parent_v._headString = head
                    continue
                gnx = e.attrib['t']
                if gnx in gnx2vnode:
                    # A clone
                    v = gnx2vnode.get(gnx)
                    parent_v.children.append(v)
                    v.parents.append(parent_v)
                else:
                    # Make a new vnode, linked to the parent.
                    v = self.VNode(context=context, gnx=gnx)
                    gnx2vnode[gnx] = v
                    parent_v.children.append(v)
                    v.parents.append(parent_v)
                    body = gnx2body.get(gnx) or ''
                    assert g.isUnicode(body), body.__class__.__name__
                    v._bodyString = body
                    v._headString = 'PLACE HOLDER'
                    # Handle all inner elements with a single recursive call.
                    # The visitor itself iterates over e's children, so
                    # calling it once per child would link each child
                    # several times.
                    v_element_visitor(e, v)
        #
        # Create the hidden root vnode.
        gnx = 'hidden-root-vnode-gnx'
        v = self.VNode(context=context, gnx=gnx)
        v._headString = '<hidden root vnode>'
        gnx2vnode[gnx] = v
        #
        # Traverse the tree of v elements.
        v_element_visitor(v_elements, v)
        return v

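
For anyone who wants to try the pattern without Leo installed, here is a
minimal standalone sketch. It is not the code in the fast-read branch:
StubVNode is a hypothetical stand-in for the internal test VNode class,
make_vnodes is a plain function rather than a method, and the sample XML
assumes the usual <v t="gnx"><vh>headline</vh>...</v> layout.

```python
import xml.etree.ElementTree as ElementTree

class StubVNode:
    """Hypothetical stand-in for the internal test VNode class."""
    def __init__(self, gnx):
        self.gnx = gnx
        self.children = []
        self.parents = []
        self._headString = ''
        self._bodyString = ''

def make_vnodes(gnx2body, gnx2vnode, v_elements):
    """Build the vnode tree; return the hidden root."""

    def visit(parent_e, parent_v):
        # gnx2body and gnx2vnode are bound by the closure,
        # so only the element and its vnode are passed down.
        for e in parent_e:
            if e.tag == 'vh':
                parent_v._headString = e.text or ''
                continue
            gnx = e.attrib['t']
            v = gnx2vnode.get(gnx)
            if v is not None:
                # A clone: link the existing vnode again.
                parent_v.children.append(v)
                v.parents.append(parent_v)
            else:
                # A new vnode: create, link, then visit its children once.
                v = StubVNode(gnx)
                gnx2vnode[gnx] = v
                parent_v.children.append(v)
                v.parents.append(parent_v)
                v._bodyString = gnx2body.get(gnx) or ''
                visit(e, v)

    root = StubVNode('hidden-root-vnode-gnx')
    root._headString = '<hidden root vnode>'
    gnx2vnode[root.gnx] = root
    visit(v_elements, root)
    return root

# A tiny outline: 'Child' appears under 'Parent' and again at the top level.
SAMPLE = """\
<vnodes>
<v t="a.1"><vh>Parent</vh>
<v t="a.2"><vh>Child</vh></v>
</v>
<v t="a.2"></v>
</vnodes>
"""

demo_root = make_vnodes({'a.1': 'body one'}, {}, ElementTree.fromstring(SAMPLE))
for child in demo_root.children:
    print(child.gnx, child._headString, len(child.parents))
```

The second <v t="a.2"> element is treated as a clone, so the same StubVNode
object ends up with two parents and is never created twice.
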
*Classes vs functions*
Vitalije and I have been discussing the merits of a lisp-like style. Imo,
the FastRead class is a full resolution of the debate. The top-level code
is:

    def readWithElementTree(self, s):
        xroot = ElementTree.fromstring(s)
        v_elements = xroot.find('vnodes')
        t_elements = xroot.find('tnodes')
        gnx2body = self.makeBodyDict(t_elements)
        gnx2vnode = {}
        hidden_v = self.makeVnodes(gnx2body, gnx2vnode, v_elements)
        return hidden_v

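
makeBodyDict is not shown in this post. As a rough sketch only, assuming
that each body lives in a <t tx="gnx">...</t> element under <tnodes>, the
first half of the pipeline might look like the following. The names
make_body_dict and read_outline_parts are hypothetical, as is the error
handling.

```python
import xml.etree.ElementTree as ElementTree

def make_body_dict(t_elements):
    """Map each gnx to its body text. Assumes <t tx="gnx">body</t> elements."""
    return {
        e.attrib['tx']: e.text or ''
        for e in t_elements
        if e.tag == 't'
    }

def read_outline_parts(s):
    """Parse s and return (v_elements, gnx2body)."""
    xroot = ElementTree.fromstring(s)
    v_elements = xroot.find('vnodes')
    t_elements = xroot.find('tnodes')
    # find() returns None when the section is missing.
    if v_elements is None or t_elements is None:
        raise ValueError('missing <vnodes> or <tnodes> section')
    return v_elements, make_body_dict(t_elements)

SAMPLE = """\
<leo_file>
<vnodes>
<v t="a.1"><vh>Node one</vh></v>
<v t="a.2"><vh>Node two</vh></v>
</vnodes>
<tnodes>
<t tx="a.1">Body of node one</t>
<t tx="a.2"></t>
</tnodes>
</leo_file>
"""

v_elements, gnx2body = read_outline_parts(SAMPLE)
print(sorted(gnx2body))
```

An empty <t> element has e.text equal to None, hence the "or ''" fallback,
which matches the "gnx2body.get(gnx) or ''" guard in makeVnodes.
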
Clearly, there is no performance penalty in writing this top-level code as
an ordinary method: it runs only once per file.
But the inner v_element_visitor is a superb pattern. The bindings are
clear, there is no need to use "self", and the fewest possible arguments
are passed to this function. Perhaps most importantly, starting the
recursion seemed much simpler than usual.
*Summary*
The FastRead class is the way Leo's .leo file read code is written in "The
Book". It is simple and spectacularly fast.
readWithElementTree and makeVnodes resolve the tension between class-based
and function-based code.
No major changes are needed to the VNode class. A small change might
support the new code.
None of this would have happened without Vitalije's challenges. I make no
apologies for stealing his ideas ;-)
Edward