On Wednesday, October 5, 2016 at 6:21:16 AM UTC-5, Edward K. Ream wrote:
> the new base is a more robust and flexible way of handling the myriad
> complexities of javascript.
The code continues to simplify. The task itself is becoming more
"interesting".
Rev a8087c5 is a rewrite of the *new* javascript scanner, allowing
sophisticated rescanning of blocks. At present, such scanning does not take
place.
This post will be prewriting for the code-level documentation. Feel free to
ignore it. It explains what the present code does and how it can be improved.
The new code will be more complicated, but it should be worth the effort.
*The present code*
The new scanner seems solid. It's not clear how it could possibly fail a
perfect import test. Every line of the imported code ends up (unchanged) in
the output.
The code inserts *only* @others (unindented) in the root node, and creates
a list of zero or more child nodes. Each child node contains the lines
associated with a *block* of lines. A line that increases the number of
parens and/or curly brackets starts a block that continues up to and
including the line that restores the number of parens and curly brackets to
what they were just before the start of the block.
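The block rule above can be sketched in a few lines of Python. This is only an illustration, not the code in rev a8087c5: the names counts_delta and split_into_blocks are hypothetical, the counting naively ignores strings, comments and regex literals, and grouping the lines that fall outside any block into runs of their own is an assumption.

```python
def counts_delta(line):
    """Net change in open parens and open curly brackets on one line.

    Naive on purpose: strings, comments and regex literals are not skipped.
    """
    return (line.count("(") - line.count(")"),
            line.count("{") - line.count("}"))


def split_into_blocks(lines):
    """Split lines into blocks (lists of lines), preserving every line.

    A line that increases either count starts a block. The block runs up to
    and including the line that restores both counts to their starting
    values. Lines outside any block are collected into runs (an assumption).
    """
    blocks = []
    plain = []      # run of lines outside any block
    current = []    # lines of the block that is currently open
    parens = curlies = 0
    for line in lines:
        dp, dc = counts_delta(line)
        if current:
            current.append(line)
            parens += dp
            curlies += dc
            if parens <= 0 and curlies <= 0:
                blocks.append(current)
                current, parens, curlies = [], 0, 0
        elif dp > 0 or dc > 0:
            if plain:
                blocks.append(plain)
                plain = []
            current = [line]
            parens, curlies = dp, dc
        else:
            plain.append(line)
    # Flush whatever is left: an unterminated block or a trailing run.
    for leftover in (current, plain):
        if leftover:
            blocks.append(leftover)
    return blocks
```

Concatenating the blocks in order always reproduces the input, which is the property the perfect import test relies on.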
*Remaining problems*
1. All blocks become direct children of the parent node, which (to repeat)
contains only an unindented @others. This works, but puts too much into
child nodes. Instead, we would like to put some text into the root node, if
that is "reasonable", as discussed below. If we don't do this the result
will seem strange to human eyes, and we will forgo the opportunity to
split nodes in a more Leonine manner.
2. Once we allow text to appear in the root, we must use section references
if a node will end up with several child nodes.
3. All blocks should be *rescanned* to see if child nodes should be
created. However, there is a bit of AI involved. Here are some cases:
A. If a block contains fewer than, say, 20 lines, it would be best to leave
it as it is. This is the only easy case.
B. If a node would have only one child, and the body of the parent is
"small enough", we would prefer not to create that child node, promoting
what would have been its children instead.
C. If a block is an "if" or "else" clause, we might want to leave the block
at the next higher level, depending on the size of the "if" or "else"
clauses. These are especially hard cases.
In short, whether and how to rescan a block may depend on preceding,
following or parent blocks. For that reason, blocks of lines are
represented as *Block* objects, so that context is available during
rescanning.
4. Depending on indentation levels, we want to *indent* the *reference line*
(@others or section reference) and *unindent* the corresponding lines in
child nodes. To keep the resulting file unchanged, we must look ahead to
see how much indentation the child nodes contain, and indent the reference
line only that much. "Factoring out" the maximal indentation for each
reference line is good Leonine style. That is not done yet. The problem is
currently masked because the code never produces grandchild nodes.
5. When using section references instead of @others, the code must ensure
that section references within a node are unique. We can do this by adding
level numbers to all section names. For example,
1.3.2: <headline>
where <headline> is computed by jss.get_headline by applying patterns
against the first lines of the block.
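Points 4 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the importer's code: the helper names common_indent, factor_indent and section_name are hypothetical, indentation is assumed to use spaces only, and the headline is passed in as a plain string rather than computed by jss.get_headline.

```python
def common_indent(lines):
    """Width of the leading spaces shared by all non-blank lines."""
    widths = [len(s) - len(s.lstrip(" ")) for s in lines if s.strip()]
    return min(widths) if widths else 0


def factor_indent(child_lines, reference="@others"):
    """Move the shared indentation of a child's lines onto the reference line.

    Returns (reference_line, unindented_child_lines). Blank lines are left
    untouched so that the original text can be rebuilt exactly.
    """
    n = common_indent(child_lines)
    reference_line = " " * n + reference
    body = [s[n:] if s.strip() else s for s in child_lines]
    return reference_line, body


def restore_indent(reference_line, body):
    """Inverse of factor_indent: re-apply the reference line's indentation."""
    n = len(reference_line) - len(reference_line.lstrip(" "))
    return [" " * n + s if s.strip() else s for s in body]


def section_name(level_numbers, headline):
    """Build a name such as '1.3.2: function foo' from outline level numbers.

    Sibling blocks differ in their last level number, so the names stay
    unique within a node even when two headlines are identical.
    """
    prefix = ".".join(str(n) for n in level_numbers)
    return f"{prefix}: {headline}"
```

For example, factor_indent(["    a();", "", "      b();"]) yields the reference line "    @others" and the body ["a();", "", "  b();"], and restore_indent rebuilds the original three lines. In a real node the name from section_name would be wrapped in Leo's << and >> section delimiters.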
*Summary*
Rescanning blocks so that the result looks reasonable to humans is tricky.
The present code provides a framework for doing so. I'll work on the
javascript importer until it reliably creates a Leonine tree.
The present work already suggests improvements that can be made to other
scanners and the base scanner code. This project is worth substantial
effort.
Edward