ENB: Modest yet important @auto improvements

Edward K. Ream Mon, 17 Feb 2014 16:33:07 -0800

This is an Engineering Note Book post.  It consists of notes to myself.  
Feel free to ignore.


If nothing else, the recent (defunct) @auto project reminded me of some 
deficiencies in the @auto code.   The well-reported bugs in the JavaScript 
importer are probably the most grievous examples.

I have spent several days wondering how to improve (and hopefully simplify) 
the @auto code.  The ultimate prize would be to be able to eliminate the 
perfect import checks during development.  That may never happen, but it's 
worth attempting.

Some notes:

1. My first thought was that a line-oriented *parse* of languages would be 
more robust than the present character-by-character approach.  After 
reviewing the code, I doubt that it would be significantly better.  What 
*is* important is that nodes be split on line boundaries.  This can be 
assured (with various hacks) without wholesale changes to the present 
parsing code.
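
A minimal sketch of one such hack, assuming the parser reports split points as character offsets into the file text: move each offset back to the start of its line before cutting.  The function names here are hypothetical; they are not part of the present importer code.

```python
from typing import Iterable, List


def snap_to_line_start(text: str, offset: int) -> int:
    """Return the index of the first character of the line containing offset."""
    # rfind returns -1 when no newline precedes offset, so adding 1 yields 0.
    return text.rfind("\n", 0, offset) + 1


def split_on_line_boundaries(text: str, offsets: Iterable[int]) -> List[str]:
    """Split text at the given offsets, after snapping each to a line start."""
    cuts = {snap_to_line_start(text, i) for i in offsets}
    cuts |= {0, len(text)}
    ordered = sorted(cuts)
    # Skip empty pieces, which arise when two offsets snap to the same line.
    return [text[a:b] for a, b in zip(ordered, ordered[1:]) if a < b]


if __name__ == "__main__":
    source = "a = 1\nfunction f() {\n}\n"
    # Offset 10 falls in the middle of the second line.
    for piece in split_on_line_boundaries(source, [10]):
        print(repr(piece))
```

Whatever the character-level parser decides, every piece then begins at a line start, and joining the pieces reproduces the original text.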

2. The present code generates nodes one def/class at a time.  That's wrong, 
for an interesting reason.

The present code *assumes* that it knows how to generate the indentation of 
@others directives, thereby baking in a minimum indentation for all methods 
of a class.  This may be reasonable for Python, but it is simply wrong for 
most other languages.  For example, there is no required indentation at all 
in C++ or JavaScript.

Instead, what should happen is that all method nodes are "undented" by the 
indentation of the @others directive.  To make this work, the indentation 
of the @others directive should be the minimum indentation of *all* the 
methods of the class.
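
A rough sketch of that computation, assuming the body lines of all methods of a class are available as a list of strings and that tabs count as four columns.  The helper names (split_indent, min_indentation, undent) are made up for illustration.

```python
from typing import List, Sequence, Tuple


def split_indent(line: str, tab_width: int = 4) -> Tuple[int, str]:
    """Return (width of the leading whitespace, rest of the line)."""
    rest = line.lstrip(" \t")
    lead = line[: len(line) - len(rest)].expandtabs(tab_width)
    return len(lead), rest


def min_indentation(lines: Sequence[str], tab_width: int = 4) -> int:
    """Minimum indentation over all non-blank lines (0 if there are none)."""
    widths = [split_indent(s, tab_width)[0] for s in lines if s.strip()]
    return min(widths, default=0)


def undent(lines: Sequence[str], n: int, tab_width: int = 4) -> List[str]:
    """Remove n columns of leading whitespace from every non-blank line."""
    result = []
    for s in lines:
        if not s.strip():
            result.append(s)  # Leave blank lines untouched.
        else:
            width, rest = split_indent(s, tab_width)
            result.append(" " * max(0, width - n) + rest)
    return result


if __name__ == "__main__":
    method_lines = [
        "    def a(self):\n",
        "  # underindented comment\n",
        "        pass\n",
    ]
    n = min_indentation(method_lines)  # Indentation to give the @others line.
    print(" " * n + "@others")
    print("".join(undent(method_lines, n)), end="")
```

In this example the underindented comment lowers the minimum, so the other lines of the node keep some leading whitespace after the undent.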

As a happy side effect, there will no longer be any need for a convention 
to handle "underindented" lines.  Such lines would simply reduce the 
minimum indentation.  Of course, the side effect would be "extra" 
indentation in some/all of the nodes, but that's not a big deal.  If 
desired, the user could "correct" that situation by increasing the 
indentation of underindented lines.

As I write this, I see that it might be possible to generate the @others 
directive after handling all methods of a class.  That may be simplest, but 
I suspect not.  Let's ignore this possibility for now, and consider another 
(possibly equivalent) way.

3. The only real task of the parser code is to split lines into nodes, such 
that each line is inserted (in proper order) in exactly one node.  Sounds 
simple ;-)

However, nodes themselves have structure: class nodes contain declaration 
nodes and nodes for methods.  Function nodes may (or may not) contain nodes 
for nested defs.

Let us imagine an *intermediate tree* structure consisting of nodes 
describing the nodes to be created.  The nodes of this intermediate 
structure will have the following fields:

- parent/child links
- the headline of the node.
- a list of lines to become the body.  (Maybe represented by indices into a 
global list of all lines of the imported file.)
- maybe other data.
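
The fields above might look something like the following.  This is only a sketch: the class name ImportNode, its methods, and the covers_every_line_once check are all hypothetical, and the body is stored as indices into the global list of lines.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Sequence


@dataclass(eq=False)
class ImportNode:
    """Describes one outline node that is to be created later."""

    headline: str
    line_indices: List[int] = field(default_factory=list)
    parent: Optional["ImportNode"] = field(default=None, repr=False)
    children: List["ImportNode"] = field(default_factory=list)

    def add_child(self, headline: str, line_indices: Sequence[int] = ()) -> "ImportNode":
        child = ImportNode(headline, list(line_indices), parent=self)
        self.children.append(child)
        return child

    def body(self, all_lines: Sequence[str]) -> str:
        """Join this node's lines, in order, into its body text."""
        return "".join(all_lines[i] for i in self.line_indices)

    def walk(self) -> Iterator["ImportNode"]:
        """Yield this node and all of its descendants, parents first."""
        yield self
        for child in self.children:
            yield from child.walk()


def covers_every_line_once(root: ImportNode, n_lines: int) -> bool:
    """True if each line index 0..n_lines-1 appears in exactly one node."""
    seen = sorted(i for node in root.walk() for i in node.line_indices)
    return seen == list(range(n_lines))


if __name__ == "__main__":
    all_lines = ["class A:\n", "    x = 1\n", "    def f(self):\n", "        pass\n"]
    root = ImportNode("example file")
    cls = root.add_child("class A", [0])
    cls.add_child("declarations", [1])
    cls.add_child("f", [2, 3])
    print(covers_every_line_once(root, len(all_lines)))
```

Storing indices rather than copies of the lines makes it cheap to check that every line of the imported file lands in exactly one node.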

The advantage of this scheme is that it greatly simplifies the code 
generators.  In particular, it becomes trivial to generate code for a class 
because all the intermediate nodes for the class's methods have already 
been generated.  The disadvantage is the complexity of creating the nodes of 
the intermediate tree.  I suspect the advantages will outweigh the 
disadvantages.

As I write this, I see what might be a simpler way.  The *parser* (in 
collaboration with the present code generators) will create the actual 
outline nodes (much as is done at present) but **without any 
adjustments**.  That is, the body text of each *outline* node would be 
simply what would have been put in the intermediate node.  Again, this 
scheme should simplify the code generators.

Then, after *all* outline nodes have been generated, a post-pass would 
adjust indentation of @others directives and body text.  This delays the 
adjustment of whitespace until all required data are known.  With luck, the 
code will be simple enough so that a separate checking pass will not be 
needed.
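
Such a post-pass might be sketched as follows.  The Node class is a hypothetical stand-in for an outline node holding unadjusted body lines, others_index is an assumed field recording where the @others line belongs, and tabs are assumed to have been expanded to spaces already.

```python
from typing import Iterator, List, Optional


class Node:
    """Hypothetical stand-in for an outline node with unadjusted body lines."""

    def __init__(self, headline: str, lines: List[str],
                 children: Optional[List["Node"]] = None,
                 others_index: int = 0) -> None:
        self.headline = headline
        self.lines = list(lines)
        self.children = list(children or [])
        self.others_index = others_index  # Where to insert the @others line.


def indent_of(line: str) -> int:
    return len(line) - len(line.lstrip(" "))


def subtree_lines(node: Node) -> Iterator[str]:
    yield from node.lines
    for child in node.children:
        yield from subtree_lines(child)


def undent_subtree(node: Node, n: int) -> None:
    # n never exceeds the indentation of any non-blank line in the subtree.
    node.lines = [s[n:] if s.strip() else s for s in node.lines]
    for child in node.children:
        undent_subtree(child, n)


def post_pass(node: Node) -> None:
    """Top-down: insert @others at the minimum indentation of all descendants."""
    if node.children:
        widths = [indent_of(s) for c in node.children
                  for s in subtree_lines(c) if s.strip()]
        n = min(widths, default=0)
        for c in node.children:
            undent_subtree(c, n)
        node.lines.insert(node.others_index, " " * n + "@others\n")
    for c in node.children:
        post_pass(c)


if __name__ == "__main__":
    root = Node("class A", ["class A {\n", "}\n"], others_index=1, children=[
        Node("a", ["    a() {\n", "    }\n"]),
        Node("b", ["  b() {}\n"]),
    ])
    post_pass(root)
    print("".join(root.lines), end="")
```

Working top-down means each node's own lines have already been undented by its ancestors before its @others line is inserted, so nested defs are handled by the same code.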

===== Summary

There are some interesting ideas/tricks here, but actually making these 
ideas work will surely not be straightforward.  The present code is 
mind-bogglingly complex, and simplifying it will take some real work.

Edward
