On Sunday, November 14, 2021 at 7:17:02 AM UTC-6 Edward K. Ream wrote:

> #2327 <https://github.com/leo-editor/leo-editor/issues/2327> suggests
> improving the python importer.

I have spent the last several days revising the python importer. See PR 
#2331 <https://github.com/leo-editor/leo-editor/pull/2331>.

Here I'll summarize what I have done and the experiences that have shaped 
the new code.

*Seeing with new eyes*

We often say that it's valuable to look at a project "with fresh eyes". But 
what does that mean?  At least two factors are involved:

1. We have forgotten details about the original code. This gives us a 
clearer sense of the big picture, and may suggest substantial revisions.

2. We have had new experiences, seemingly unrelated to the project at hand, 
that suggest improvements.  For this project, the "new experiences" also 
include an appreciation of how inadequate the python importer is :-)

*Applying new techniques*

The following "recent" principles have helped improve the python importer, 
especially *python_i.gen_lines*, the main line of the importer.

- Eliminate faux helpers, even if it means duplicating code.  This is an 
amazingly potent guide. In particular, the evil *cut_stack* helper obscures 
and complicates the logic.

- Make "if" statements explicit in the main loop, even if it complicates 
the visual appearance slightly.  The value of this became apparent in 
revising FastAtRead.scan_lines.

- Use test-driven development.  This is possible now for the first time!

*Aha: Use two or more passes in gen_lines*

The original version tried to do everything in one go.  Now, the first pass 
is concerned primarily with splitting lines into nodes so as to pass 
perfect import checks. An as-yet-unwritten second pass will split or merge 
nodes and reassign lines.

Yes, the base Importer class supports a so-called post-pass, but doing the 
second pass directly in gen_lines is clearer and more flexible.
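
A minimal sketch of the two-pass idea follows. It is not the code in the PR: the names gen_lines_sketch, first_pass and second_pass, the "node is a list of lines" model, and the merge rule are all assumptions made for illustration. The point is only that each pass can be checked separately against the perfect-import requirement, which here means that joining the nodes reproduces the input exactly.

```python
from typing import List

Node = List[str]  # Hypothetical model: a node is just the list of lines it owns.

def first_pass(lines: List[str]) -> List[Node]:
    """Pass 1: start a new node at each top-level 'class' or 'def' line."""
    nodes: List[Node] = [[]]
    for line in lines:
        if line.startswith(('class ', 'def ')) and nodes[-1]:
            nodes.append([])
        nodes[-1].append(line)
    return nodes

def second_pass(nodes: List[Node], min_lines: int = 2) -> List[Node]:
    """Pass 2 (assumed rule): merge very short nodes into their predecessor."""
    result: List[Node] = []
    for node in nodes:
        if result and len(node) < min_lines:
            result[-1].extend(node)
        else:
            result.append(list(node))
    return result

def gen_lines_sketch(lines: List[str]) -> List[Node]:
    """Run both passes, checking the round trip after each one."""
    nodes = first_pass(lines)
    assert [z for node in nodes for z in node] == lines
    nodes = second_pass(nodes)
    assert [z for node in nodes for z in node] == lines
    return nodes
```

Because the second pass only moves whole lines between nodes, it cannot break a round trip that the first pass already guarantees.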

*Other improvements*

The existing unit tests cover unusual (valid!) python code. The first pass 
no longer creates separate nodes for nested defs or for functions 
(non-methods).  Instead, the lines within such defs are "allocated" to the 
(existing) node presently being accumulated. This removes several sticky 
special cases.
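
Here is a rough sketch of that allocation rule, under loud assumptions: classes appear only at column 0, indentation is four spaces, and decorators, async defs and multi-line strings are ignored. The function name allocate_lines is hypothetical, and the real importer is more careful. The "if" statements are written out explicitly in the loop, in the spirit of the principles above.

```python
from typing import List

def allocate_lines(lines: List[str]) -> List[List[str]]:
    """Split lines into nodes for classes and methods only.

    Top-level functions and nested defs get no node of their own:
    their lines are appended to the node currently being accumulated.
    """
    nodes: List[List[str]] = [[]]
    in_class = False
    for line in lines:
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        if not stripped:
            new_node = False  # Blank line: stays in the current node.
        elif indent == 0 and stripped.startswith('class '):
            in_class = True
            new_node = True  # A class line starts a node.
        elif indent == 0:
            in_class = False
            new_node = False  # Top-level function or statement: no new node.
        elif in_class and indent == 4 and stripped.startswith('def '):
            new_node = True  # A method starts a node.
        else:
            new_node = False  # Nested def or ordinary body line.
        if new_node and nodes[-1]:
            nodes.append([])
        nodes[-1].append(line)
    return nodes
```

With this rule a nested def never needs special handling: it is simply another line of the method that contains it.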

Similarly, the first pass no longer puts "prefix lines" (lines appearing 
before the first class line) in a separate node.  The plan is to create a 
section for prefix lines if their length exceeds a threshold of 10 lines or 
so.
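
The planned rule might look something like the sketch below. The name split_prefix, the exact threshold, and the strict "greater than" comparison are assumptions, since this part is not written yet.

```python
from typing import List, Tuple

PREFIX_THRESHOLD = 10  # Assumed value, from "10 lines or so".

def split_prefix(
    lines: List[str],
    threshold: int = PREFIX_THRESHOLD,
) -> Tuple[List[str], List[str]]:
    """Return (prefix_section, remaining_lines).

    The prefix is every line before the first top-level class line.
    It becomes a separate section only if it is longer than the
    threshold; otherwise prefix_section is empty and all lines remain.
    """
    n = len(lines)
    for i, line in enumerate(lines):
        if line.startswith('class '):
            n = i
            break
    prefix = lines[:n]
    if len(prefix) > threshold:
        return prefix, lines[n:]
    return [], lines
```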

I've improved the three relevant unit testing classes and the tools around them:

- Improved LeoUnitTest.dump_tree.
- Improved BaseTestImporter.run_test.
- Added the TestPython.check_tree switch.
- Added @command test-one, for running just one unit test.
  This command makes it convenient to single-step through failing code.
- Coming soon: easier tests that the generated outline has the expected 
structure and contents.
- Maybe: separate perfect-import tests following pass 1 and pass 2.

*Summary*

The rewritten python importer will likely be part of Leo 6.6. This should 
be safe: the existing unit tests cover all the likely failures.

I might remove the evil cut_stack helper from *all* importers, but this 
might not happen for 6.6.

Edward
