This Engineering Notebook post reviews the workings of Leo's importers so
that I will be clear about the details as I revise the python importers.
The architecture of the importers is surprisingly clever, as I'll now
explain.
*Overview*
gen_lines, the main importer loop, splits the incoming lines into nodes,
allocating nodes as necessary. gen_lines calls *add_line* to add a line to
a node. The post-pass calls undent on each node to adjust leading
whitespace.
*Adding lines to nodes*
*Importers only ever add entire lines to nodes.* In other words, add_line
never removes leading whitespace! This clever policy ensures that gen_lines
only needs to detect where nodes begin and end, a major simplification!
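Here is a minimal sketch of the whole-line policy, not Leo's actual code. It assumes a node is just a list of lines and that a node begins at every unindented class or def line. The names add_line and gen_lines mirror the methods above, but their signatures here are hypothetical.

```python
import re

def add_line(node_lines, line):
    """Append one whole line to a node, leading whitespace included."""
    node_lines.append(line)

def gen_lines(lines):
    """Toy main loop: start a new node at each unindented class or def line."""
    nodes = [[]]  # nodes[0] collects lines seen before any class or def.
    for line in lines:
        if re.match(r'(class|def)\b', line):
            nodes.append([])  # Allocate a node as necessary.
        add_line(nodes[-1], line)  # Never strips anything.
    return nodes

source = "import os\n\ndef f():\n    return 1\n\nclass A:\n    x = 2\n"
lines = source.splitlines(keepends=True)
nodes = gen_lines(lines)
```

Because add_line never alters a line, joining all nodes in order reproduces the original source exactly. Whitespace adjustment is left entirely to the post-pass.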
*Removing leading whitespace*
The *undent* method adjusts a node's lws (leading whitespace) independently
of gen_lines. The python importer overrides the base *Importer.undent*
method. i.undent, the base method, is complex, possibly buggy, and clearly
unsuitable for python.
*Py_Importer.undent* removes the lws of the *first* non-blank line of the
node. I shall soon change py_i.undent so that it never generates Leo's
escape convention for underindented lines:
- It will "promote" underindented *comment* lines.
- It will cause unit tests to fail for any underindented non-comment line.
*Note*: neither i.undent nor py_i.undent is the same as textwrap.dedent!
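The following sketch shows how such an undent differs from textwrap.dedent. It is not Leo's code: undent_sketch and lws_of are hypothetical names. Treating "promote" as "strip the comment's own lws" is only one possible reading, and raising ValueError stands in for the failing unit test.

```python
import textwrap

def lws_of(line):
    """Return the leading whitespace of a line."""
    return line[:len(line) - len(line.lstrip())]

def undent_sketch(lines):
    """Remove the lws of the first non-blank line from every line."""
    first = next((z for z in lines if z.strip()), '')
    lws = lws_of(first)
    result = []
    for line in lines:
        if not line.strip():
            result.append(line.lstrip(' \t'))  # Blank line: keep only the newline.
        elif line.startswith(lws):
            result.append(line[len(lws):])
        elif line.lstrip().startswith('#'):
            result.append(line.lstrip())  # Assumed meaning of "promote".
        else:
            # Underindented non-comment line: fail loudly.
            raise ValueError(f"underindented line: {line!r}")
    return result

body = ["        a = 1\n", "    # note\n", "        b = 2\n"]
undented = ''.join(undent_sketch(body))
dedented = textwrap.dedent(''.join(body))
```

Here textwrap.dedent removes only the four spaces common to all three lines, so the code lines stay indented. The sketch removes the eight spaces of the first non-blank line and moves the comment to column zero.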
*Splitting lines into nodes*
gen_lines splits lines into nodes, generating nodes as necessary. Unlike
other importers, indentation drives py_i.gen_lines. Here, a node's
*indentation* means vnode_info[p.v]['indent']. Similarly, a node's *kind*
means vnode_info[p.v]['kind'].
*Case 1: Organizer nodes: kind == 'org'*
Organizer nodes contain lines outside of classes and defs. Organizer nodes
also handle unusual indentation, *including* unusually indented class and
def lines.
*Rule 1*: *Organizer nodes never contain @others*. Naturally, their
ancestor nodes could contain @others. So add_line and py_i.undent should
just work for org nodes!
*Rule 2*: *Org nodes never contain children.*
gen_lines sets the indentation of an org node to the indentation of its
first non-blank line.
- A class or def line whose lws is less than or equal to the org node's
indentation will end the org node.
- A class or def whose lws is greater than the indentation of the org node
must reside completely within the org node. This rule is likely the only
way of handling unusual indentation!
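The two bullets above amount to a single comparison. This sketch assumes that indentation is measured as a character count of leading spaces and that a plain regex is enough to spot class and def lines. ends_org_node is a hypothetical helper, not a method of the importer.

```python
import re

def ends_org_node(line, org_indent):
    """True if line is a class or def line whose lws width is <= org_indent."""
    stripped = line.lstrip()
    if not re.match(r'(class|def)\b', stripped):
        return False  # Only class and def lines can end an org node.
    lws_width = len(line) - len(stripped)
    return lws_width <= org_indent

# An org node whose first non-blank line has no lws:
top_level_def_ends = ends_org_node("def f():\n", 0)
indented_def_ends = ends_org_node("    def f():\n", 0)
```

With org_indent == 0, the unindented def ends the org node, while the indented def stays inside it.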
*Case 2: class nodes: kind == 'class'*
Most class definitions will occur outside of org nodes. All class nodes
*will* contain an @others directive. The first non-blank line within the
class determines:
- the lws of the @others directive and
- the indentation of the class node!
*Rule 3*: An org node must contain the entire range of an indented class or
def that appears outside the range of any class or def node.
As a consequence of python's syntax rules, *a parent org node must already
exist*. For example, an indented def or class line would be invalid syntax
unless it were already contained in a (top-level) complex statement such as
'if', 'while', etc.
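The syntax argument can be checked directly with python's own parser. This sketch uses the standard ast module. The sample strings and the parses helper are illustrative only.

```python
import ast

def parses(source):
    """Return True if source is syntactically valid python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError.
        return False

# An indented def inside a top-level 'if': valid.
inside_if = "if True:\n    def f():\n        pass\n"

# The same indented def with no enclosing statement: invalid.
bare_indented = "    def f():\n        pass\n"

results = (parses(inside_if), parses(bare_indented))
```

In the valid case, the 'if' line lies outside any class or def, so it would already have started an org node by the time gen_lines sees the indented def.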
*Case 3: def nodes: kind in ('function', 'method')*
*Rule 4*: def lines without lws will generate function nodes.
*Rule 5*: Indented def lines appearing with the same indentation as a
parent 'class' node will generate method nodes.
*Rule 6*: Indented def lines appearing at a greater indentation than a
parent class node will be included within the containing method node. Imo,
this is the only reasonable way of handling inner function definitions.
*Rule 7*: def lines appearing at a lesser indentation than a parent class
node will terminate the class node. In most cases, the def line will then
become a method of another parent class node. Per rule 3, if there is no
such class node, the def line *must* be allocated to an *already existing*
parent org node.
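Rules 4 through 7 can be sketched as one decision function. This is an illustration under stated assumptions, not the importer's logic. classify_def is a hypothetical name, indentation is an integer count of leading spaces, and class_indents is an assumed stack of the indentations of the enclosing class nodes, outermost first.

```python
def classify_def(def_lws, class_indents):
    """
    Classify a def line by its lws width.

    Returns (kind, remaining_stack), where kind is one of:
    'function', 'method', 'inner' (stays in the containing method node),
    or 'org' (must belong to an already existing org node).
    """
    stack = list(class_indents)
    # Rule 7: a def at a lesser indentation terminates the class node.
    while stack and def_lws < stack[-1]:
        stack.pop()
    if not stack:
        # Rule 4: no lws means a function node.
        # Rule 3: otherwise an org node must already contain the def.
        return ('function' if def_lws == 0 else 'org'), stack
    if def_lws == stack[-1]:
        return 'method', stack  # Rule 5.
    return 'inner', stack  # Rule 6: greater indentation.

examples = [
    classify_def(0, []),      # Top-level def.
    classify_def(4, [4]),     # def at the class node's indentation.
    classify_def(8, [4]),     # def nested inside a method.
    classify_def(4, [4, 8]),  # def that ends a nested class.
]
```

The last example shows rule 7 in action: the def at indentation 4 terminates the inner class (indentation 8) and becomes a method of the outer class.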
*Summary*
add_line and undent_lines should work for all nodes, regardless of the
node's kind. Happily, the vnode_info dict is available to all methods of
the post-pass, if special cases are necessary.
gen_lines assigns an indentation to all generated nodes:
- For org nodes, the indentation is the lws of the first non-blank line.
- For class and def nodes, the indentation is the lws of the first
non-blank line *following* the class or def line.
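The two bullets above can be sketched as follows. The sketch assumes that a class or def node's first line is its (single-line) class or def line and that indentation is a count of leading whitespace characters. node_indentation is a hypothetical helper, and returning 0 for a node without non-blank lines is my own choice.

```python
def node_indentation(lines, kind):
    """Return the indentation that would be assigned to a node."""
    # For class and def nodes, skip the class or def line itself.
    body = lines if kind == 'org' else lines[1:]
    for line in body:
        if line.strip():
            return len(line) - len(line.lstrip())
    return 0  # Assumption: no non-blank line means no indentation.

org_indent = node_indentation(["\n", "  x = 1\n", "y = 2\n"], 'org')
class_indent = node_indentation(["class A:\n", "\n", "    x = 1\n"], 'class')
```

In the real importer, these values would be stored as vnode_info[p.v]['indent'].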
Unindented class or def lines always generate top-level class or function
nodes.
Indented class lines generate class nodes if their lws matches the
indentation of a parent class node. Otherwise, the class must appear within
an already existing org node.
Indented def lines generate method nodes if their lws matches the indentation
of a parent class node. Otherwise, the def line will appear in an enclosing
function or method node. As a last resort, the def line must appear within
an already existing org node.
These complex rules are likely buggy. I'll revise them as needed.
Edward