Rev 6432: A completed prototype for a better @auto importer for Python

Edward K. Ream Wed, 18 Dec 2013 06:35:09 -0800

Rev 6432 contains the prototype of a simpler, faster, more flexible 
importer for Python files.  This is a great day for Leo.


You will find the new code in leoToDo.txt (accessible via leoPy.leo) in the 
node:
@file ../doc/leoToDo.txt-->First-->Experiment with new @auto-->@button 
@auto-read

I am proud of this work, for several reasons:

1. More flexible.

First, and most importantly, the new importer allows the programmer (you) 
to add code to the beginning *or* end of each node *without* using sentinel 
comments.  Instead, *plain* comments function as "stealth" sentinels in a 
way that is completely undetectable by non-Leo users!

This is a truly brilliant invention, if I do say so myself, and it "just 
works" without any difficult hacks in the import code.  For example, 
suppose the file contains::

    # This is the foo method.
        # Another comment.
    bar = 0
    def foo():
        bar += 1
    foo2 = foo

All this code will be imported into the outline node for foo, even if other 
defs or classes appear before and after foo at the same indentation 
level(!!!).  This seems like magic, but in fact the rules for assigning lines 
to nodes are simple. Here they are:

A.  Initially the node for def x starts with the def x(): line and 
continues until the next class or def at x's indentation or less.

See the P.P.S. below for a discussion of embedded def's (def's at higher 
indentation than x contained *within* x).

B. A simple, fast post-pass adjusts the contents of nodes as follows:

- The first "underindented" comment (a comment with the same indentation as 
the def x line) becomes part of the next node.

- All lines (in a node) that appear after the first underindented comment 
line are also copied to the next node.

- As an additional special case, the first underindented comment and 
following lines of the *very last node of the outline* are copied to the 
root node.  This convention places the following code correctly in the root:

   # Outer
   if __name__ == '__main__':
       main()

The "# Outer" comment is, in effect, a sentinel comment! Those who don't 
use Leo aren't likely to object to such comments.  Heh, heh.  Even if 
non-Leo users remove the comments, the import will still succeed. In that 
case, some lines will be placed in the "wrong" nodes.
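
Rules A and B can be sketched in a few lines of Python. This is only an illustration, not the code in Rev 6432. The names assign_lines, split_tail and TOP_LEVEL_DEF are hypothetical. The sketch assumes that only defs and classes in column 0 start nodes, that the post-pass leaves the root node's own lines alone, and that no '#' line sits inside a multi-line string:

```python
import re

# Hypothetical pattern: a def or class that starts in column 0.
TOP_LEVEL_DEF = re.compile(r'^(?:def|class)\s+(\w+)')

def split_tail(body):
    """Rule B: split a node's lines at its first underindented comment.

    body[0] is the def/class line, so the search starts after it.
    """
    base = len(body[0]) - len(body[0].lstrip())
    for i, line in enumerate(body[1:], start=1):
        indent = len(line) - len(line.lstrip())
        if line.lstrip().startswith('#') and indent == base:
            return body[:i], body[i:]
    return body, []

def assign_lines(lines):
    """Return [(headline, lines), ...]; the first entry is the root node."""
    root, nodes = [], []
    # Rule A: each node runs from its def/class line to the next one.
    for line in lines:
        m = TOP_LEVEL_DEF.match(line)
        if m:
            nodes.append((m.group(1), [line]))
        elif nodes:
            nodes[-1][1].append(line)
        else:
            root.append(line)
    # Rule B: hand each node's tail to the node that follows it.
    result, carried = [], []
    for name, body in nodes:
        kept, tail = split_tail(body)
        result.append((name, carried + kept))
        carried = tail
    # Special case: the tail of the very last node goes to the root.
    return [('root', root + carried)] + result

source = [
    'import sys',
    'def spam():',
    '    pass',
    '# This is the foo method.',
    '    # Another comment.',
    'bar = 0',
    'def foo():',
    '    return bar + 1',
    'foo2 = foo',
    '# Outer',
    "if __name__ == '__main__':",
    '    print(foo2())',
]
for headline, body in assign_lines(source):
    print(headline)
    for line in body:
        print('    ' + line)
```

In this run the node for foo receives everything from "# This is the foo method." through "foo2 = foo", and the "# Outer" block ends up in the root.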

In short, for the first time ever, it is possible to place code *exactly* 
where you want it *without* using (obvious) sentinels.

I've designed many markup conventions over the years, but this must surely 
be the best and most important.

2. More reliable.

The old importers work on a character-by-character basis.  The new importer 
works on a line-by-line basis.  In essence, the new importer simply copies 
lines from the file to newly-created outline nodes.

The new code completely bypasses the extremely complex "code generators" 
used by all the present importers.  The code generators tweak the contents 
of nodes on a character-by-character basis.

The line orientation means that all kinds of errors that may happen in the 
old importers become impossible. Even so, running the post-import checks 
still seems prudent ;-)  They were certainly useful while debugging.

3. Faster.

Last, and least importantly, the new code is about twice as fast as the 
previous Python importer.

An important innovation was required. The parse method scans for the 
beginning and end of strings using s.count and s.find *before* handling 
each line. This trick makes the line-based approach possible.  Similar code 
could handle comments that span lines for languages like C, html, etc.
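
Here is a rough sketch of that kind of pre-scan, not the actual parse method. The names next_open and starts_in_string are hypothetical. It assumes that only triple-quoted strings span lines, and it ignores escapes, triple quotes inside comments, and triple quotes inside ordinary one-line strings:

```python
def next_open(s, start):
    """Return (index, delim) of the first triple quote at or after start, or None."""
    best = None
    for delim in ('"""', "'''"):
        j = s.find(delim, start)
        if j >= 0 and (best is None or j < best[0]):
            best = (j, delim)
    return best

def starts_in_string(lines):
    """Return one flag per line: True if the line begins inside a string."""
    flags = []
    delim = None  # The delimiter of the open string, or None.
    for s in lines:
        flags.append(delim is not None)
        # Fast path: a line with no triple quotes cannot change the state.
        if s.count('"""') == 0 and s.count("'''") == 0:
            continue
        i = 0
        while True:
            if delim is None:
                hit = next_open(s, i)
                if hit is None:
                    break
                i, delim = hit[0] + 3, hit[1]
            else:
                j = s.find(delim, i)
                if j < 0:
                    break
                i, delim = j + 3, None
    return flags

lines = [
    'def foo():',
    '    """Docstring.',
    '    def not_a_def(): pass',
    '    """',
    '    return 1',
]
print(starts_in_string(lines))
```

A line-based importer can then skip any line whose flag is True when it looks for def, class, or comment lines, so the "def" inside the docstring above never starts a node.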

Edward

P.S. This is the first of two "risky" projects needed to make @auto pretty 
much a drop-in replacement for @file. The new code, while apparently simple 
now that the dust has cleared, was very hard to get right. I've had my head 
down for the last three days while I wrestled with the code.  The project 
became less risky after I saw how to handle strings, but it remained 
speculative until yesterday.

The other risky part of the puzzle is the code that will reconstitute clone 
links from nodes in the .leo file to nodes in @auto trees.  The key idea of 
this "relinking" project is that node headlines are good (not perfect) 
proxies for the true node identities given by gnx's in sentinels.

P.P.S. A brag: def's or classes can appear *inside* any def, and the 
importer will create the proper @others directive *inside the def*.  For 
example, given:

    def a():
        # A comment.
        def b():
            pass
        print('hi')

The importer generates:

    + a
        def a():
            # A comment.
            @others
            print('hi')
        + b
            def b():
                pass

Still to do, but feasible: generate section references if more than one 
@others directive would be generated in a single node.  This will only 
happen if two "nested" defs are not adjacent within the containing node.
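
The nesting shown above can be sketched recursively. Again, this is an illustration under stated assumptions rather than the prototype's code. build_node and DEF_RE are hypothetical names, child bodies are stored relative to the @others line, and def-like lines inside strings are not handled. Where section references would be needed, the sketch simply raises an error:

```python
import re

# Hypothetical pattern: leading whitespace, then a def or class name.
DEF_RE = re.compile(r'^(\s*)(?:def|class)\s+(\w+)')

def indent_of(line):
    return len(line) - len(line.lstrip())

def build_node(lines):
    """lines[0] is a def/class line; return (headline, body, children)."""
    head = DEF_RE.match(lines[0])
    base = len(head.group(1))
    body, children = [lines[0]], []
    i = 1
    while i < len(lines):
        m = DEF_RE.match(lines[i])
        if m and len(m.group(1)) > base:
            # A nested def: it owns the following blank or deeper-indented lines.
            child_indent = len(m.group(1))
            j = i + 1
            while j < len(lines) and (
                not lines[j].strip() or indent_of(lines[j]) > child_indent
            ):
                j += 1
            # Strip the child's indentation; @others restores it on output.
            children.append(build_node([s[child_indent:] for s in lines[i:j]]))
            if body[-1].strip() != '@others':
                if any(s.strip() == '@others' for s in body):
                    # Non-adjacent nested defs: the "still to do" case.
                    raise NotImplementedError('section references needed')
                body.append(' ' * child_indent + '@others')
            i = j
        else:
            body.append(lines[i])
            i += 1
    return head.group(2), body, children

source = [
    'def a():',
    '    # A comment.',
    '    def b():',
    '        pass',
    "    print('hi')",
]
headline, body, children = build_node(source)
print(headline, body)
print(children)
```

Adjacent nested defs share a single @others line, which matches the remark that a second directive is only needed when nested defs are separated by other code.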

P.P.P.S.  The conventions for placing text within nodes do what the @shadow 
algorithm can never ever do, that is, *reliably* place changed lines in the 
proper node.  Sure, one can imagine situations in which non-Leo users can 
muck things up a bit, but such situations are likely to be rare, and a 
"stealthy" reintroduction of the required comments will keep Leo users 
happy without upsetting non-Leo users.  One can even imagine letting Nancy 
in on the secret.  Hehe.

EKR

