This Engineering Notebook post reviews the workings of Leo's importers so 
that I will be clear about the details as I revise the python importers.  
The architecture of the importers is surprisingly clever, as I'll now 
explain.

*Overview*

gen_lines, the main importer loop, splits the incoming lines into nodes, 
allocating nodes as necessary.  gen_lines calls *add_line* to add a line to 
a node. The post-pass calls undent on each node to adjust leading 
whitespace.


*Adding lines to nodes*

*Importers only ever add entire lines to nodes.*  In other words, add_line 
never removes leading whitespace! This clever policy ensures that gen_lines 
only needs to detect where nodes begin and end, a major simplification!
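
A minimal sketch of this policy follows. The SketchImporter class, its 
bodies dict, and the toy splitting rule are hypothetical illustrations, not 
Leo's actual code; only the names gen_lines and add_line come from the 
post.

```python
from collections import defaultdict

class SketchImporter:
    """Hypothetical stand-in for an importer; not Leo's actual class."""

    def __init__(self):
        # Maps a node name to the list of lines in that node's body.
        self.bodies = defaultdict(list)

    def add_line(self, node, line):
        # Store the line unchanged, including its leading whitespace.
        self.bodies[node].append(line)

    def gen_lines(self, lines):
        # Toy rule (an assumption): an unindented 'def' or 'class' line
        # starts a new node. Everything before the first one goes to 'org'.
        node = 'org'
        for line in lines:
            if line.startswith(('def ', 'class ')):
                node = line.strip().rstrip(':')
            self.add_line(node, line)
```

Because add_line never alters a line, concatenating the node bodies in 
order reproduces the input exactly. Whitespace adjustment is left entirely 
to the post-pass.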

*Removing leading whitespace*

The *undent* method adjusts a node's leading whitespace (lws) independently 
of gen_lines. The python importer overrides the base *Importer.undent* 
method (i.undent), which is complex, possibly buggy, and clearly unsuitable 
for python. 

*Py_Importer.undent* (py_i.undent) removes the lws of the *first* non-blank 
line of the node. I shall soon change py_i.undent so that it never 
generates Leo's escape convention:

- It will "promote" underindented *comment *lines.
- It will cause unit tests to fail for any underindented non-comment line.

*Note*: neither i.undent nor py_i.undent is the same as textwrap.dedent! 
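
The difference from textwrap.dedent can be sketched as follows. The 
function undent_by_first_line is hypothetical, not Leo's code. It assumes 
that "promoting" an underindented comment means moving it to the node's 
left margin, and it raises ValueError where the post says unit tests would 
fail.

```python
import textwrap

def undent_by_first_line(lines):
    """Remove the lws of the first non-blank line from every line."""
    first = next((s for s in lines if s.strip()), '')
    n = len(first) - len(first.lstrip())
    result = []
    for s in lines:
        if not s.strip():
            result.append(s)           # Leave blank lines alone.
        elif len(s) - len(s.lstrip()) >= n:
            result.append(s[n:])       # Normal case: strip n characters.
        elif s.lstrip().startswith('#'):
            result.append(s.lstrip())  # "Promote" an underindented comment.
        else:
            raise ValueError(f"underindented line: {s!r}")
    return result

lines = [
    "    def f(self):\n",
    "  # an underindented comment\n",
    "        return 1\n",
]
undented = ''.join(undent_by_first_line(lines))
dedented = textwrap.dedent(''.join(lines))
```

Here textwrap.dedent removes only the two spaces common to all three lines, 
so the def line keeps two spaces of indentation. The sketch removes the 
four spaces of the first non-blank line, so the def line starts in column 
zero.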

*Splitting lines into nodes*

gen_lines splits lines into nodes, generating nodes as necessary. Unlike 
gen_lines in other importers, py_i.gen_lines is driven by indentation. 
Here, a node's *indentation* means vnode_info[p.v]['indent'].  Similarly 
for a node's *kind*.

*Case 1: Organizer nodes: kind == 'org'*

Organizer nodes contain lines outside of classes and defs. Organizer nodes 
also handle unusual indentation, *including* unusually indented class and 
def lines.

*Rule 1*: *Organizer nodes never contain @others*.  Naturally, their 
ancestor nodes could contain @others. So add_line and py_i.undent should 
just work for org nodes!

*Rule 2*: *Org nodes never contain children.* 

gen_lines sets the indentation of an org node to the indentation of its 
first non-blank line.

- A class or def line whose lws is less than or equal to the org node's 
indentation will end the org node.

- A class or def whose lws is greater than the indentation of the org node 
must reside completely within the org node. This rule is likely the only 
way of handling unusual indentation!
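
The two bullets above amount to one comparison. The following sketch uses 
hypothetical helper names and a deliberately simple regex for class and def 
lines.

```python
import re

# Assumption: a class/def line is any line whose first token is
# 'class' or 'def'. Decorators and other forms are ignored here.
DEF_OR_CLASS = re.compile(r'^\s*(def|class)\b')

def lws_width(line):
    """Return the number of leading whitespace characters in line."""
    return len(line) - len(line.lstrip())

def ends_org_node(line, org_indent):
    """True if line is a class or def line that ends the current org node.

    org_indent is the lws of the org node's first non-blank line.
    """
    return bool(DEF_OR_CLASS.match(line)) and lws_width(line) <= org_indent
```

For example, with an org node at indentation 0, the line "def f():" ends 
the org node, while "    def f():" stays inside it.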

*Case 2: class nodes: kind == 'class'*

Most class definitions will occur outside of org nodes. All class nodes 
*will* contain an @others directive. The first non-blank line within the 
class determines:

- the lws of the @others directive and
- the indentation of the class node!

*Rule 3*: An org node must contain the entire range of an indented class or 
def that appears outside the range of any class or def node.

As a consequence of python's syntax rules, *a parent org node must already 
exist*. For example, an indented def or class line would be invalid syntax 
unless it were already contained in a (top-level) complex statement such as 
'if', 'while', etc.

*Case 3: def nodes: kind in ('function', 'method')*

*Rule 4*: def lines without lws will generate function nodes.

*Rule 5*: Indented def lines appearing with the same indentation as a 
parent 'class' node will generate method nodes.

*Rule 6*: Indented def lines appearing at a greater indentation than a 
parent class node will be included within the containing method node. Imo, 
this is the only reasonable way of handling inner function definitions.

*Rule 7*: def lines appearing at a lesser indentation than a parent class 
node will terminate the class node. In most cases, the def line will then 
become a method of another parent class node. Per rule 3, if there is no 
such class node, the def line *must* be allocated to an *already existing* 
parent org node.
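
Rules 4 through 7 can be sketched as a single decision function. The name 
classify_def, the stack of class indentations, and the three result strings 
are hypothetical. The sketch tracks only enclosing class nodes, so 
'contained' covers every case in which the def line stays within an already 
existing function, method, or org node.

```python
def classify_def(def_lws, class_indents):
    """Classify a def line.

    def_lws: the width of the def line's leading whitespace.
    class_indents: indentations of the enclosing class nodes, innermost last.
    Returns (result, remaining_class_indents).
    """
    stack = list(class_indents)
    if def_lws == 0:
        # Rule 4: an unindented def always generates a function node.
        return 'function', []
    # Rule 7: a def with less lws than a class node terminates that class.
    while stack and def_lws < stack[-1]:
        stack.pop()
    if stack and def_lws == stack[-1]:
        # Rule 5: same indentation as the parent class node.
        return 'method', stack
    # Rule 6 (inner def) or rule 3 (no class remains): the line stays
    # within an already existing node.
    return 'contained', stack
```

For example, classify_def(4, [4, 8]) first terminates the inner class 
(indentation 8) and then reports a method of the outer class.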

*Summary*

add_line and undent should work for all nodes, regardless of the 
node's kind. Happily, the vnode_info dict is available to all methods of 
the post-pass, if special cases are necessary.

gen_lines assigns an indentation to all generated nodes:

- For org nodes, the indentation is the lws of the first non-blank line.

- For class and def nodes, the indentation is the lws of the first 
non-blank line *following* the class or def line.
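
A sketch of these two cases follows. The function node_indent is 
hypothetical, and it assumes that the class or def line is the first line 
of the node and occupies a single line.

```python
def node_indent(lines, kind):
    """Return the indentation that would be assigned to a node.

    lines: the node's lines, unmodified.
    kind: 'org', 'class', 'function' or 'method'.
    """
    # For class and def nodes, skip the class/def line itself.
    body = lines if kind == 'org' else lines[1:]
    for s in body:
        if s.strip():
            return len(s) - len(s.lstrip())
    return 0  # Assumption: a node with no non-blank lines gets 0.
```

For a class node, the result is also the lws that the @others directive 
would receive.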

Unindented class or def lines always generate top-level class or function 
nodes.

Indented class lines generate class nodes if their lws matches the 
indentation of a parent class node. Otherwise, the class must appear within 
an already existing org node.

Indented def lines generate method nodes if their lws matches the 
indentation of a parent class node. Otherwise, the def line will appear in 
an enclosing function or method node. As a last resort, the def line must 
appear within an already existing org node.

These complex rules are likely buggy. I'll revise them as needed.

Edward
