Yesterday's ENB provided background for the new python importer. In this
ENB post I'll discuss rules (criteria) for allocating lines to nodes.
*Initial rules*
Imo, there is no right way to allocate incoming lines to nodes. There are
judgment calls involved. The spectrum of choices ranges from putting all
lines into a single node, to putting each line in its own node :-)
Let's start with the following relatively straightforward rules:
*Rule 1*. Put every *class line* in its own node, *unindented*, like this:
class TheClass:
<lws>@others
*Note*: The leading whitespace (lws) of the @others line will be the lws
of the first inner class or def contained in TheClass, measured relative to
the class line.
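Here is a minimal sketch of Rule 1, not the importer's actual code. The names `lws`, `CLASS_OR_DEF` and `class_node_body` are hypothetical. It assumes space-only indentation, ignores any lines between the class line and the first inner class or def, and assumes that a class with no inner class or def gets no @others line at all:

```python
import re

# Matches class lines and def lines (including "async def").
CLASS_OR_DEF = re.compile(r"\s*(class|(async\s+)?def)\b")

def lws(line):
    """Return the number of leading whitespace characters in line."""
    return len(line) - len(line.lstrip())

def class_node_body(lines, i):
    """Return the body of the node for the class line at lines[i]."""
    base = lws(lines[i])
    for s in lines[i + 1:]:
        if s.strip() and lws(s) <= base:
            break  # We left the class without finding an inner class or def.
        if CLASS_OR_DEF.match(s):
            # The @others lws is relative to the (now unindented) class line.
            inner = lws(s) - base
            return [lines[i][base:], " " * inner + "@others"]
    # Assumption: no inner class or def means no @others line.
    return [lines[i][base:]]

lines = [
    "  class C1: # 2 space lws.",
    "     def method1(): # 2+3 space lws.",
    "         pass # 2+3+4 space lws.",
]
body = class_node_body(lines, 0)
```

Here `body` holds the unindented class line followed by an @others line with 3 spaces of lws.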
*Rule 2*. Similarly, put every *def line* in its own node, *unindented*,
like this:
def TheMethodOrFunction(...):
<The rest of the Method or function>
Unlike class nodes, def nodes do *not* contain an @others directive. This
rule implies that *inner (nested) defs and classes will be allocated to the
def node*. This is a debatable choice, but in practice it should cause few
problems. The user can always create more nodes later.
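A rough sketch of Rule 2 follows. It is an illustration only: `lws` and `def_node_body` are hypothetical names, indentation is assumed to use spaces only, and complications such as multi-line strings or continuation lines that dip below the def's indentation are ignored. The def node takes the def line plus every following line that is blank or more deeply indented, so nested defs and classes stay in the same node:

```python
def lws(line):
    """Return the number of leading whitespace characters in line."""
    return len(line) - len(line.lstrip())

def def_node_body(lines, i):
    """Return (body, next_i) for the def line at lines[i]."""
    base = lws(lines[i])
    j = i + 1
    # The def owns every following line that is blank or indented more deeply.
    while j < len(lines) and (not lines[j].strip() or lws(lines[j]) > base):
        j += 1
    # Give trailing blank lines back to whatever follows the def.
    while j > i + 1 and not lines[j - 1].strip():
        j -= 1
    # Unindent the whole block by the lws of the def line.
    body = [s[base:] if s.strip() else "" for s in lines[i:j]]
    return body, j

lines = [
    "  def outer(a):",
    "      def inner():  # A nested def stays in outer's node.",
    "          return a",
    "      return inner",
    "  x = 1",
]
body, next_i = def_node_body(lines, 0)
```

Here `body` contains the four lines of `outer`, unindented by two spaces, and `next_i` is the index of the `x = 1` line.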
*Example*
These rules will "just work" for typical python programs. They also work
for strangely-indented code. Consider the following (valid!) python fragment:
if 2:
  class C1: # 2 space lws.
     def method1(): # 2+3 space lws.
         pass # 2+3+4 space lws.
if 4: # 4-space indentation everywhere
    def d1():
        pass
Rules 1 and 2 *require* the following node structure (I'll use the MORE
format for representing nodes, in which lines starting with '-' denote
headlines):
- if 2:
  if 2:
    @others (2-space lws comes from `class C1`.)
  - class C1
    class C1: (No lws for this line!)
       @others (3-space lws comes from the `def method1` line.)
    - def method1():
      def method1(): (No lws for this line!)
          pass (4-space lws in this node)
- if 4: # 4-space indentation everywhere
  if 4: # 4-space indentation everywhere
      @others (4-space lws comes from the `def d1():` line.)
  - d1
    def d1(): (No lws for this line!)
        pass (4-space lws in this node)
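The outline above can be checked with a small round-trip sketch. `Node` and `expand` are hypothetical stand-ins, not Leo's API. The sketch assumes that @others expands to the text of the child nodes, with each line prefixed by the lws of the @others line, and that the body of the "if 4" organizer node starts with the `if 4:` line itself:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    headline: str
    body: list  # The body lines of the node.
    children: list = field(default_factory=list)

def expand(node, indent=""):
    """Return the file lines for node, expanding @others recursively."""
    out = []
    for line in node.body:
        text = line.lstrip()
        if text.startswith("@others"):
            # Each child line gets the lws of the @others line.
            others_lws = line[: len(line) - len(text)]
            for child in node.children:
                out.extend(expand(child, indent + others_lws))
        else:
            out.append(indent + line)
    return out

method1 = Node("def method1():", [
    "def method1(): # 2+3 space lws.",
    "    pass # 2+3+4 space lws.",
])
c1 = Node("class C1", ["class C1: # 2 space lws.", "   @others"], [method1])
if2 = Node("if 2:", ["if 2:", "  @others"], [c1])
d1 = Node("d1", ["def d1():", "    pass"])
# Assumption: the organizer node's body starts with the `if 4:` line.
if4 = Node("if 4:", ["if 4: # 4-space indentation everywhere", "    @others"], [d1])
root = Node("root", ["@others"], [if2, if4])

source = [
    "if 2:",
    "  class C1: # 2 space lws.",
    "     def method1(): # 2+3 space lws.",
    "         pass # 2+3+4 space lws.",
    "if 4: # 4-space indentation everywhere",
    "    def d1():",
    "        pass",
]
result = expand(root)
```

With these node bodies, `result` equals `source`. Changing the lws of any @others line changes the expanded text, so that lws is part of the data, not formatting.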
*Discussion*
The rules constrain both the code generator and the post-pass when handling
strangely-indented code, as follows:
1. The "if 2" and "if 4" organizer nodes *must exist* if class and def
nodes are to start with *unindented* class or def lines.
2. The post-pass *must not* "dedent away" the lws before @others lines in
organizer nodes. Happily, the post-pass knows that 'org' nodes are
organizer nodes.
I have glossed over some complications, especially:
- *Prefix lines*: lines that appear before class or def lines at the same
indentation level as the class or def lines.
- *Tail lines*: lines that appear after class or def lines at the same
indentation level as the class or def lines.
The code generator *might* generate organizer nodes for prefix and/or tail
lines. The post-pass may then optimize some of those organizer nodes away.
What I'll actually do remains to be seen. It's a sticky problem that can't
be designed away here.
*Summary*
Two seemingly simple rules (criteria) for allocating lines to nodes
constrain what nodes the code generator must produce. I am happy, for now,
with these rules. We shall see how well the rules work.
Edward