On Wed, Feb 07, 2007 at 01:52:48AM -0800, Bob Foster wrote:
> The repair algorithm is to re(parse)scan from the start of the damaged 
> region/partition until the output of the parse is the same as the 
> previous output. Your example is just like the comment example. The 
> (parse)scan may need to run to the end of the text; it may not.

Has anyone proved that the terminating condition never gives wrong results?
After all, you said that the scanner knows very little about regions when
it compares them.

For some languages and some changes, I can see that you could keep lists of
potential region starts, that is, places in the text that you know the new
tree will relate to (even though you might not know which place).  Take the
example of adding and removing semicolons in LISP: a comment always ends at
the end of a line, so a list of semicolons and line ends lets you handle
each case:

- deleting a semicolon: the next semicolon, if any, becomes the new
  comment start;
- inserting one: add the new semicolon to the list and move the comment
  start back to the earliest semicolon;
- deleting a line end: the next line becomes like whatever preceded the
  line end;
- inserting one: the new second line is never a comment.
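That bookkeeping can be sketched as follows.  This is a minimal sketch, not
a real editor API: `CommentTracker` and its methods are made-up names, and
it assumes single-character edits on a flat buffer of known contents.

```python
import bisect

class CommentTracker:
    """Track ';' and '\\n' positions so Lisp line-comment regions can be
    recomputed after an edit without rescanning the whole buffer.
    (Hypothetical sketch, not any editor's actual data structure.)"""

    def __init__(self, text):
        self.semis = [i for i, c in enumerate(text) if c == ';']
        self.eols = [i for i, c in enumerate(text) if c == '\n']

    def _shift(self, positions, at, delta):
        # Shift every recorded position at or after the edit point.
        for j in range(bisect.bisect_left(positions, at), len(positions)):
            positions[j] += delta

    def insert_char(self, at, ch):
        self._shift(self.semis, at, 1)
        self._shift(self.eols, at, 1)
        if ch == ';':
            bisect.insort(self.semis, at)
        elif ch == '\n':
            bisect.insort(self.eols, at)

    def delete_char(self, at, ch):
        if ch == ';':
            self.semis.remove(at)
        elif ch == '\n':
            self.eols.remove(at)
        self._shift(self.semis, at + 1, -1)
        self._shift(self.eols, at + 1, -1)

    def comment_region(self, line_start, line_end):
        # A line comment runs from the line's first ';' to the line end.
        i = bisect.bisect_left(self.semis, line_start)
        if i < len(self.semis) and self.semis[i] < line_end:
            return (self.semis[i], line_end)
        return None
```

Inserting or deleting a semicolon then changes the answer of
`comment_region` for that line alone; the position lists absorb the shift.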

For parentheses, you can attach nesting counts to the characters; a single
parenthesis insertion or deletion then costs just a count update and one
list insertion or deletion.  You may still have to propagate changes to the
characters between the parentheses.
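One way to make those counts cheap to update is a Fenwick (binary indexed)
tree over per-character deltas of +1 for '(' and -1 for ')': the nesting
depth at any character is a prefix sum, and a paren edit is one logarithmic
update rather than a rewrite of every downstream count.  A minimal sketch,
assuming character positions stay fixed (e.g. overwriting a space with a
parenthesis); the class name is made up.

```python
class ParenDepths:
    """Fenwick tree over paren deltas: prefix(i) gives the nesting
    depth just after character i.  (Illustrative sketch only.)"""

    def __init__(self, n):
        self.t = [0] * (n + 1)  # 1-based internal array

    def add(self, i, delta):
        # Record a +1 ('(') or -1 (')') delta at character i.
        i += 1
        while i < len(self.t):
            self.t[i] += delta
            i += i & -i

    def prefix(self, i):
        # Sum of all deltas at positions 0..i, i.e. the depth after i.
        i += 1
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s
```

The "propagation to the characters in between" disappears: their depths are
never stored, only queried, so they are correct as soon as the tree is.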

Word wrap is apparently the same kind of structure-updating problem, since
words can only break in certain places.  I don't know anything about the
performance of word wrap algorithms, though.
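For greedy wrapping the analogy to the repair algorithm above is direct:
rewrap from the last break before the damage and stop as soon as a freshly
computed break coincides with an old one, because the greedy state is just
the current line start, so everything downstream reproduces verbatim.  A
sketch under stated assumptions (a same-length edit, so old and new
positions are comparable; all names hypothetical):

```python
import bisect

def wrap_breaks(text, width, start=0):
    """Greedy break positions at spaces, scanning from offset start."""
    breaks, line_start, last_space = [], start, None
    for i in range(start, len(text)):
        if text[i] == ' ':
            last_space = i
        if i - line_start >= width and last_space is not None:
            breaks.append(last_space)
            line_start, last_space = last_space + 1, None
    return breaks

def incremental_rewrap(text, width, old_breaks, damage_start, damage_end):
    """Reuse old_breaks outside the damaged region [damage_start, damage_end).
    Assumes a same-length edit, so positions after the edit are unchanged."""
    k = bisect.bisect_left(old_breaks, damage_start)
    prefix = old_breaks[:k]            # breaks entirely before the damage
    start = prefix[-1] + 1 if prefix else 0
    new = []
    for b in wrap_breaks(text, width, start):
        if b >= damage_end and b in old_breaks[k:]:
            # Resynchronized: splice the unchanged old tail back on.
            return prefix + new + old_breaks[old_breaks.index(b):]
        new.append(b)
    return prefix + new                # never resynchronized; ran to the end
```

As with the comment example, the rescan may resynchronize after one line or
may run to the end of the text.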

I have no idea how useful this approach is for languages like C and Java, or
even for LISP.  But I've thought about it enough to conclude that changes in
the text always have _some_ structure.  Finding and updating that structure
(statically analyzing the grammar and possible text changes to create an
algorithm that dynamically updates the parse tree) is obviously the
difficult part.

> It's probabilistic whether any incremental scheme will improve on the 
> responsiveness of a full parse, except the division of labor between 
> simple syntax coloring and full semantics coloring/error reporting.

Is "probabilistic" really the best word?  For the same original text and
the same change, the algorithm always goes through the same computation.
Maybe you mean that incremental schemes can't cope well with all grammars
or all source texts or all changes to a given source text, and that no one
knows the conditions that make them cope well or badly.

> I guess the real question is are they worth writing?  Users aren't 
> really suffering because no incremental parsers are available. They're 
> pretty much satisfied by a combination of fast colorers and slower 
> parsers...as long as the parsers aren't annoyingly slow.

The snag is that I don't really want colored text; I want an accurate
syntax tree so I can do other things besides coloring text, e.g., jumping
to related places in the text, or finding regions of text to give to later
editing commands.  I would hate to have the editor jump to one place, accept
typing, and then decide that it should have jumped to another place!  I am
guessing that with structured data and accurate updating of the structure,
text editors could do interesting things, but until I find the right editor
I can't try out my ideas.

-- Derek

_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg
