Rev 6432 contains the prototype of a simpler, faster, more flexible
importer for Python files. This is a great day for Leo.
You will find the new code in leoToDo.txt (accessible via leoPy.leo) in the
node:
@file ../doc/leoToDo.txt-->First-->Experiment with new @auto-->@button
@auto-read
I am proud of this work, for several reasons:
1. More flexible.
First, and most importantly, the new importer allows the programmer (you)
to add code to the beginning *or* end of each node *without* using sentinel
comments. Instead, *plain* comments function as "stealth" sentinels in a
way that is completely undetectable by non-Leo users!
This is a truly brilliant invention, if I do say so myself, and it "just
works" without any difficult hacks in the import code. For example,
suppose the file contains::
    # This is the foo method.
    # Another comment.
    bar = 0
    def foo():
        bar += 1
    foo2 = foo
All this code will be imported into the outline node for foo, even if other
defs or classes appear before and after foo at the same indentation
level(!!!). This seems like magic, but in fact the rules for assigning lines
to nodes are simple. Here they are:
A. Initially the node for def x starts with the def x(): line and
continues until the next class or def at x's indentation or less.
See the Post Script for a discussion of embedded def's (def's at higher
indentation than x contained *within* x).
B. A simple, fast post-pass adjusts the contents of nodes as follows:
- The first "underindented" comment (a comment with the same indentation as
the def x line) becomes part of the next node.
- All lines (in a node) that appear after the first underindented comment
line are also copied to the next node.
- As an additional special case, the first underindented comment and
following lines of the *very last node of the outline* are copied to the
root node. This convention places the following code correctly in the root:
    # Outer
    if __name__ == '__main__':
        main()
The "# Outer" comment is, in effect, a sentinel comment! Those who don't
use Leo aren't likely to object to such comments. Heh, heh. Even if
non-Leo users remove the comments, the import will still succeed. In that
case, some lines will be placed in the "wrong" nodes.
In short, for the first time ever, it is possible to place code *exactly*
where you want it *without* using (obvious) sentinels.
I've designed many markup conventions over the years, but this must surely
be the best and most important.
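Rules A and B above can be sketched in a few lines of Python. This is only
an illustration under simplifying assumptions, not the code in leoToDo.txt:
it handles top-level defs and classes only, treats any line that starts
with "#" in column 0 as an underindented comment, and knows nothing about
strings. The names split_nodes and post_pass are made up for this sketch.

```python
import re

# Hypothetical helper names; this is not Leo's importer code.
DEF_RE = re.compile(r'^(?:def|class)\s+(\w+)')  # top-level defs/classes only


def split_nodes(lines):
    """Rule A: a node starts at a top-level def/class line and runs
    until the next top-level def/class line."""
    root_head, nodes = [], []
    for line in lines:
        m = DEF_RE.match(line)
        if m:
            nodes.append((m.group(1), [line]))
        elif nodes:
            nodes[-1][1].append(line)
        else:
            root_head.append(line)  # lines before the first def stay in the root
    return root_head, nodes


def post_pass(nodes):
    """Rule B: move each node's first underindented comment, and every
    line after it, to the next node. The last node's tail goes to the root."""
    bodies, tails = [], []
    for name, body in nodes:
        # Skip index 0 (the def line itself) and look for a column-0 comment.
        cut = next(
            (i for i in range(1, len(body)) if body[i].startswith('#')),
            len(body),
        )
        bodies.append((name, body[:cut]))
        tails.append(body[cut:])
    result = []
    for i, (name, body) in enumerate(bodies):
        prefix = tails[i - 1] if i > 0 else []
        result.append((name, prefix + body))
    root_tail = tails[-1] if tails else []
    return result, root_tail
```

Running split_nodes and then post_pass on a file that holds the foo example
between two other defs, followed by the "# Outer" block, puts the two
comments, "bar = 0", the def, and "foo2 = foo" in the node for foo, and
returns the "# Outer" lines as the tail for the root.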
2. More reliable.
The old importers work on a character-by-character basis. The new importer
works on a line-by-line basis. In essence, the new importer simply copies
lines from the file to newly-created outline nodes.
The new code completely bypasses the extremely complex "code generators"
used by all the present importers. The code generators tweak the contents
of nodes on a character-by-character basis.
The line orientation means that all kinds of errors that may happen in the
old importers become impossible. Even so, running the post-import checks
still seems prudent ;-) They were certainly useful while debugging.
3. Faster.
Last, and least importantly, the new code is about twice as fast as the
previous Python importer.
An important innovation was required. The parse method scans for the
beginning and end of strings using s.count and s.find *before* handling
each line. This trick makes the line-based approach possible. Similar code
could handle comments that span lines for languages like C, HTML, etc.
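To make the string trick concrete, here is a rough sketch of per-line
string tracking with str.count and str.find. It is an assumption about how
such a scan could look, not the actual parse method: it tracks only
triple-quoted strings, and it ignores comments, escapes, and single-quoted
strings that happen to contain triple quotes. The names scan_line and
starts_in_string are hypothetical.

```python
TRIPLE_QUOTES = ('"""', "'''")


def scan_line(line, delim):
    """Return the triple-quote delimiter still open at the end of line,
    or None. delim is the delimiter open at the start of line (or None)."""
    # Fast path via str.count: no triple quotes here, so the state is unchanged.
    if line.count('"""') == 0 and line.count("'''") == 0:
        return delim
    i = 0
    while i < len(line):
        if delim:
            # Inside a string: look for its closing delimiter.
            j = line.find(delim, i)
            if j == -1:
                return delim  # the string continues past this line
            delim, i = None, j + 3
        else:
            # Outside a string: find whichever delimiter opens first.
            hits = [(line.find(d, i), d) for d in TRIPLE_QUOTES]
            hits = [(j, d) for j, d in hits if j != -1]
            if not hits:
                return None
            j, delim = min(hits)
            i = j + 3
    return delim


def starts_in_string(lines):
    """For each line, report whether it begins inside a triple-quoted string."""
    flags, delim = [], None
    for line in lines:
        flags.append(delim is not None)
        delim = scan_line(line, delim)
    return flags
```

A line-based importer could consult such flags before deciding that a line
starting with "def" or "class" really begins a new node.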
Edward
P.S. This is the first of two "risky" projects needed to make @auto pretty
much a drop-in replacement for @file. The new code, while apparently simple
now that the dust has cleared, was very hard to get right. I've had my head
down for the last three days while I wrestled with the code. The project
became less risky after I saw how to handle strings, but it remained
speculative until yesterday.
The other risky part of the puzzle is the code that will reconstitute clone
links from nodes in the .leo file to nodes in @auto trees. The key idea of
this "relinking" project is that node headlines are good (not perfect)
proxies for the true node identities given by gnx's in sentinels.
P.P.S. A brag: def's or classes can appear *inside* any def, and the
importer will create the proper @others directive *inside the def*. For
example, given:
    def a():
        # A comment.
        def b():
            pass
        print('hi')
The importer generates:
    + a
      def a():
          # A comment.
          @others
          print('hi')
      + b
        def b():
            pass
Still to do, but feasible: generate section references if more than one
@others directive would be generated in a single node. This will only
happen if two "nested" defs are not adjacent within the containing node.
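The nested-def handling described in this P.P.S. might look roughly like
the following. It is a sketch under assumptions, not the importer's code:
split_nested is a made-up name, child bodies are stored dedented (the
indentation of the @others line is assumed to carry their indentation),
underindented comments are ignored, and the "still to do" case of
non-adjacent nested defs simply raises an error.

```python
import re

# Hypothetical sketch; the name split_nested is not from Leo.
NESTED_RE = re.compile(r'^(\s*)(?:def|class)\s+(\w+)')


def indent_of(line):
    return len(line) - len(line.lstrip())


def split_nested(lines):
    """Split the lines of one def into (parent_lines, children).
    Nested defs become child nodes; an @others line marks their place."""
    outer = indent_of(lines[0])
    parent, children = [lines[0]], []
    others_emitted = False
    i = 1
    while i < len(lines):
        m = NESTED_RE.match(lines[i])
        if m and len(m.group(1)) > outer:
            inner = len(m.group(1))
            # The nested def runs while lines are blank or more deeply indented.
            j = i + 1
            while j < len(lines) and (
                not lines[j].strip() or indent_of(lines[j]) > inner
            ):
                j += 1
            children.append((m.group(2), [s[inner:] for s in lines[i:j]]))
            others = ' ' * inner + '@others'
            if parent[-1] != others:
                if others_emitted:
                    # Non-adjacent nested defs would need section references.
                    raise NotImplementedError('second @others needed')
                parent.append(others)
                others_emitted = True
            i = j
        else:
            parent.append(lines[i])
            i += 1
    return parent, children
```

Applied to the def a() example above, this yields a parent body with a
single indented @others line and one child node for b.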
P.P.P.S. The conventions for placing text within nodes do what the @shadow
algorithm can never ever do, that is, *reliably* place changed lines in the
proper node. Sure, one can imagine situations in which non-Leo users can
muck things up a bit, but such situations are likely to be rare, and a
"stealthy" reintroduction of the required comments will keep Leo users
happy without upsetting non-Leo users. One can even imagine letting Nancy
in on the secret. Hehe.
EKR