On Wed, Feb 07, 2007 at 01:52:48AM -0800, Bob Foster wrote:
> The repair algorithm is to re(parse)scan from the start of the damaged
> region/partition until the output of the parse is the same as the
> previous output. Your example is just like the comment example. The
> (parse)scan may need to run to the end of the text; it may not.
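For concreteness, here is a minimal sketch of that repair loop — my own toy code, not anything from Bob's implementation. It assumes a scanner whose only per-line output is whether the line ends inside a block comment, and an edit that replaces lines in place (line count unchanged). The termination test is exactly the one described: stop rescanning as soon as a line's output matches the cached output.

```python
def scan_line(line, in_comment):
    """Toy scanner: returns whether we end the line inside a
    C-style /* ... */ block comment."""
    i = 0
    while i < len(line):
        if not in_comment and line.startswith("/*", i):
            in_comment = True
            i += 2
        elif in_comment and line.startswith("*/", i):
            in_comment = False
            i += 2
        else:
            i += 1
    return in_comment

def full_scan(lines):
    """Scan everything, caching the end-of-line state per line."""
    states, state = [], False
    for line in lines:
        state = scan_line(line, state)
        states.append(state)
    return states

def repair(lines, old_states, first_damaged):
    """Rescan from the first damaged line, stopping as soon as a
    line's end state matches the cached one. Assumes the edit kept
    the line count unchanged."""
    states = list(old_states)
    state = old_states[first_damaged - 1] if first_damaged > 0 else False
    for i in range(first_damaged, len(lines)):
        state = scan_line(lines[i], state)
        if state == old_states[i]:
            return states        # converged; rest of the cache is valid
        states[i] = state
    return states                # ran to the end of the text
```

The early return is safe because the scanner's output is a function of (line, incoming state) alone: once one line produces the same end state as before, every later line must as well.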
Has anyone proved that the terminating condition never gives wrong results? After all, you said that the scanner knows very little about regions when it compares them.

For some languages and some changes, I can see that you could keep lists of potential region starts, that is, places in the text that you know the new tree will relate to (even though you might not know which place). In the example of adding and removing semicolons in LISP, a comment always ends at the end of a line, so a list of semicolon positions and line ends covers every case. Deleting a semicolon: the next semicolon, if any, becomes the new comment start. Inserting one: add it to the list and move the comment start back to the earliest semicolon. Deleting a line end: the next line takes on whatever state preceded the line end. Inserting one: the new second line is never a comment.

For parentheses, you have the counts attached to the characters, and you can update just the counts and do one list insertion or deletion per parenthesis inserted or deleted. You may still have to propagate changes to the characters between the parentheses.

Word wrap is apparently the same kind of structure-updating problem, since words can only break in certain places. I don't know anything about the performance of word-wrap algorithms, though.

I have no idea how useful this approach is for languages like C and Java, or even for LISP. But I've thought about it enough to conclude that changes in the text always have _some_ structure. Finding and updating that structure (statically analyzing the grammar and the possible text changes to produce an algorithm that dynamically updates the parse tree) is obviously the difficult part.

> It's probabilistic whether any incremental scheme will improve on the
> responsiveness of a full parse, except the division of labor between
> simple syntax coloring and full semantics coloring/error reporting.

Is "probabilistic" really the best word?
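Going back to the LISP example for a moment, here is a toy sketch of the bookkeeping (again my own code, with hypothetical helper names; it ignores strings, as the real thing could not): keep sorted position lists for ';' and line ends, so inserting or deleting a semicolon touches only the list, and the comment region of the affected line can be recomputed locally with binary search.

```python
import bisect

def positions(text, ch):
    """Sorted offsets of every occurrence of ch in text."""
    return [i for i, c in enumerate(text) if c == ch]

def line_bounds(newlines, pos, text_len):
    """Half-open [start, end) bounds of the line containing offset pos."""
    k = bisect.bisect_left(newlines, pos)
    start = newlines[k - 1] + 1 if k > 0 else 0
    end = newlines[k] if k < len(newlines) else text_len
    return start, end

def comment_start(semis, newlines, pos, text_len):
    """Offset of the first ';' on pos's line (the comment start),
    or None if that line has no comment."""
    start, end = line_bounds(newlines, pos, text_len)
    s = bisect.bisect_left(semis, start)
    return semis[s] if s < len(semis) and semis[s] < end else None
```

Deleting the semicolon at offset p is then `semis.pop(bisect.bisect_left(semis, p))` and inserting one is `bisect.insort(semis, p)`; afterwards only the affected line's comment region needs recomputing.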
For the same original text and the same change, the algorithm always goes through the same computation. Maybe you mean that incremental schemes can't cope well with all grammars, all source texts, or all changes to a given source text, and that no one knows the conditions under which they cope well or badly.

> I guess the real question is are they worth writing? Users aren't
> really suffering because no incremental parsers are available. They're
> pretty much satisfied by a combination of fast colorers and slower
> parsers...as long as the parsers aren't annoyingly slow.

The snag is that I don't really want colored text; I want an accurate syntax tree so that I can do other things besides coloring, e.g., jumping to related places in the text or finding regions of text to hand to later editing commands. I would hate to have the editor jump to one place, accept typing, and then decide that it should have jumped to another place! I am guessing that with structured data and accurate updating of that structure, text editors could do interesting things, but until I find the right editor I can't try out my ideas.

-- 
Derek
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg