On Wednesday, October 5, 2016 at 6:21:16 AM UTC-5, Edward K. Ream wrote:

> the new base is a more robust and flexible way of handling the myriad 
complexities of javascript.

The code continues to simplify.  The task itself is becoming more 
"interesting". 

Rev a8087c5 is a rewrite of the *new* javascript scanner, allowing 
sophisticated rescanning of blocks. At present, no such rescanning takes 
place.

This post will be prewriting for the code-level documentation. Feel free to 
ignore it. It explains what the present code does and how it can be improved. 
The new code will be more complicated, but it should be worth the effort.

*The present code*

The new scanner seems solid.  It's not clear how it could possibly fail a 
perfect import test. Every line of the imported code ends up (unchanged) in 
the output.

The code inserts *only* @others (unindented) in the root node, and creates 
a list of zero or more child nodes. Each child node contains the lines 
associated with a *block* of lines.  A line that increases the number of 
parens and/or curly brackets starts a block that continues up to and 
including the line that restores the number of parens and curly brackets to 
what they were just before the start of the block.
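
As a rough illustration of the block rule above (not the actual importer 
code), here is a minimal Python sketch. The names net_depth_change and 
split_into_blocks are made up for this post. The sketch assumes that 
consecutive lines outside any block are grouped together, and it naively 
counts brackets inside strings, comments and regexes, which a real scanner 
must not do.

```python
def net_depth_change(line):
    """Return opens minus closes for parens and curly brackets on one line."""
    opens = line.count("(") + line.count("{")
    closes = line.count(")") + line.count("}")
    return opens - closes


def split_into_blocks(lines):
    """Split lines into groups; a block runs until the depth is restored."""
    blocks, current, depth = [], [], 0
    for line in lines:
        delta = net_depth_change(line)
        if depth == 0:
            if delta > 0:
                # This line starts a block: flush any pending plain lines.
                if current:
                    blocks.append(current)
                current = [line]
                depth = delta
            else:
                # A plain line outside any block (assumption: group these).
                current.append(line)
        else:
            current.append(line)
            depth += delta
            if depth <= 0:
                # The bracket count is back to where it was: end the block.
                blocks.append(current)
                current, depth = [], 0
    if current:
        blocks.append(current)
    return blocks
```

Because every line is appended to exactly one group, concatenating the 
groups reproduces the input unchanged, which is the property the perfect 
import test checks.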

*Remaining problems*

1. All blocks become direct children of the parent node, which (to repeat) 
contains only an unindented @others.  This works, but puts too much into 
child nodes. Instead, we would like to put some text into the root node, if 
that is "reasonable", as discussed below.  If we don't do this the result 
will seem strange to human eyes, and we will forgo the opportunity to 
split nodes in a more Leonine manner.

2. Once we allow text to appear in the root, we must use section references 
if a node will end up with several child nodes. 

3. All blocks should be *rescanned* to see if child nodes should be 
created.  However, there is a bit of AI involved.  Here are some cases:

A. If a block contains fewer than, say, 20 lines, it would be best to leave 
it as it is.  This is the only easy case.

B. If a node would have only one child, and the body of the parent is 
"small enough", we would prefer not to create that child node, promoting 
the would-be child's own children instead.

C. If a block is an "if" or "else" clause, we might want to leave the block 
at the next higher level, depending on the size of the "if" or "else" 
clauses. These are especially hard cases.

In short, whether and how to rescan a block may depend on preceding, 
following or parent blocks.  For that reason, blocks of lines are 
represented as *Block* objects, so that context is available during 
rescanning.
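
To make the idea of context concrete, here is a small Python sketch. It is 
not the importer's real Block class: the fields (lines, parent, children), 
the helper previous_sibling and the function should_rescan are all 
hypothetical, and only the easy case A (the 20-line threshold, itself just 
a guess) is implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare blocks by identity, not by field values
class Block:
    """Hypothetical block of lines that knows its parent and children."""
    lines: List[str]
    parent: Optional["Block"] = None
    children: List["Block"] = field(default_factory=list)

    def add_child(self, child):
        child.parent = self
        self.children.append(child)


def previous_sibling(block):
    """Return the block just before this one under the same parent, if any."""
    if block.parent is None:
        return None
    siblings = block.parent.children
    i = siblings.index(block)
    return siblings[i - 1] if i > 0 else None


def should_rescan(block, min_lines=20):
    """Case A only: leave blocks shorter than min_lines as they are."""
    return len(block.lines) >= min_lines
```

Cases B and C would need to look at block.parent and at neighboring 
blocks, which is why the links are kept on each object.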

4. Depending on indentation levels, we want to *indent* the *reference line* 
(@others or section reference) and *unindent* the corresponding lines in 
child nodes.  To keep the resulting file unchanged, we must look ahead to 
see how much indentation the child nodes contain, and indent the reference 
line only that much.  "Factoring out" the maximal indentation for each 
reference line is good Leonine style.  At present, that is not done, but the 
problem is masked because the code never produces grandchild nodes.
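
Here is a minimal Python sketch of that lookahead. The function names are 
invented for this post, and the sketch assumes indentation uses spaces only 
and that blank lines are left untouched. The expand function is only a 
stand-in for what writing the file would do, used to check that the round 
trip leaves the text unchanged.

```python
def common_indent(lines):
    """Return the smallest leading-space count over the non-blank lines."""
    indents = [len(s) - len(s.lstrip(" ")) for s in lines if s.strip()]
    return min(indents) if indents else 0


def factor_indentation(child_lines, reference="@others"):
    """Indent the reference line and unindent the child lines to match."""
    n = common_indent(child_lines)
    ref_line = " " * n + reference
    body = [s[n:] if s.strip() else s for s in child_lines]
    return ref_line, body


def expand(ref_line, body):
    """Re-apply the reference line's indentation to the child lines."""
    n = len(ref_line) - len(ref_line.lstrip(" "))
    return [(" " * n + s) if s.strip() else s for s in body]
```

For example, child lines that are all indented by at least four spaces 
yield a reference line indented by four spaces, and expanding the result 
gives back the original lines.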

5. When using section references instead of @others, the code must ensure 
that section references within a node are unique.  We can do this by adding 
level numbers to all section names.  For example,

    1.3.2: <headline>

where <headline> is computed by jss.get_headline by applying patterns 
against the first lines of the block.
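
A short Python sketch of the numbering scheme follows. The function names 
are hypothetical, the headlines are passed in as plain strings rather than 
computed from the block, and numbering siblings from 1 is an assumption.

```python
def numbered_section_names(parent_path, headlines):
    """Prefix each sibling headline with its dotted level number."""
    names = []
    for i, headline in enumerate(headlines, start=1):
        path = tuple(parent_path) + (i,)
        number = ".".join(str(n) for n in path)
        names.append(f"{number}: {headline}")
    return names


def section_reference(name):
    """Wrap a section name in Leo's << ... >> section-reference brackets."""
    return f"<< {name} >>"
```

Two sibling blocks with the same headline under node 1.3 then get the 
distinct names "1.3.1: ..." and "1.3.2: ...", so their section references 
cannot collide.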

*Summary*

Rescanning blocks so that the result looks reasonable to humans is tricky. 
The present code provides a framework for doing so.  I'll work on the 
javascript importer until it reliably creates a Leonine tree.

The present work already suggests improvements that can be made to other 
scanners and the base scanner code.  This project is worth substantial 
effort.

Edward
