On Sunday, November 14, 2021 at 7:17:02 AM UTC-6 Edward K. Ream wrote:

> #2327 <https://github.com/leo-editor/leo-editor/issues/2327> suggests improving the python importer.

I have spent the last several days revising the python importer. See PR #2331 <https://github.com/leo-editor/leo-editor/pull/2331>. Here I'll summarize what I have done and the experiences that have shaped the new code.

*Seeing with new eyes*

We often say that it's valuable to look at a project "with fresh eyes". But what does that mean? At least two factors are involved:

1. We have forgotten details about the original code. This gives us a clearer sense of the big picture, and may suggest substantial revisions.
2. We have had new experiences, seemingly unrelated to the project at hand, that suggest improvements.

For this project, the "new experiences" also include an appreciation of how inadequate the python importer is :-)

*Applying new techniques*

The following "recent" principles have helped improve the python importer, especially *python_i.gen_lines*, the main line of the importer.

- Eliminate faux helpers, even if it means duplicating code. This is an amazingly potent guide. In particular, the evil *cut_stack* helper obscures and complicates the logic.
- Make "if" statements explicit in the main loop, even if it complicates the visual appearance slightly. The value of this became apparent in revising FastAtRead.scan_lines.
- Use test-driven development. This is possible now for the first time!

*Aha: Use two or more passes in gen_lines*

The original version tried to do everything in one go. Now, the first pass is concerned primarily with splitting lines into nodes so as to pass perfect import checks. An as-yet-unwritten second pass will split/merge nodes and reassign lines.

Yes, the base Importer class supports a so-called post-pass, but doing the second pass directly in gen_lines is clearer and more flexible.

*Other improvements*

The existing unit tests cover unusual (valid!) python code.

The first pass no longer creates separate nodes for nested defs or for functions (non-methods).
Instead, the lines within such defs are "allocated" to the (existing) node presently being accumulated. This removes several sticky special cases.

Similarly, the first pass no longer puts "prefix lines" (lines appearing before the first class line) in a separate node. The plan is to create a section for prefix lines if their length exceeds a threshold of 10 lines or so.

I've improved the three relevant unit testing classes:

- Improved LeoUnitTest.dump_tree.
- Improved BaseTestImporter.run_test.
- Added the TestPython.check_tree switch.
- Added @command test-one, for running just one unit test. This command makes it convenient to single-step through failing code.
- Coming soon: easier tests that the generated outline has the expected structure and contents.
- Maybe: separate perfect-import tests following pass 1 and pass 2.

*Summary*

The rewritten python importer will likely be part of Leo 6.6. This should be safe: the existing unit tests cover all the likely failures.

I might remove the evil cut_stack helper from *all* importers, but this might not happen for 6.6.

Edward
