First, in response to Chris: your method does indeed sound similar. Also, your project overall is very interesting; I will probably download the code and see what I can learn. As to Bryan's points:
On 9/2/07, Bryan Ford <[EMAIL PROTECTED]> wrote:
> 1. There may be some immediate performance cost to trying to parse
> everything at least twice, to see if it changes after the first
> iteration, before entering it into the memo table. But presumably
> that cost could easily be minimized in a smart parser generator by
> performing static left recursion detection and only emitting loops in
> parse functions that are potentially left-recursive. (I expect
> Chris's Katahdin already does this...?)

True, and for my implementation I planned to support three types of rules: transient (a la Rats!), memoized, and left-recursive. Whether to let the user select the type and raise an error when an unmarked rule turns out to be left-recursive, or to try to infer the proper categorization automatically, I'm not certain. Probably heuristic inference with user override.

> 2. As currently formulated, it's not obvious to me that with this
> modification the parser is guaranteed to terminate: although the
> "expected case" is obviously that the amount of text "consumed" by
> the looping construction should increase at each iteration, it should
> be easy to create a grammar that yields success if the prior
> iteration failed and vice versa, resulting in an infinite loop.
> Presumably this could be fixed trivially by adding a hard-coded
> restriction that the amount of text consumed actually does increase
> with each iteration - e.g., use 'cur != -1 && len(cur) > len(prev)'
> as the predicate of the 'while' loop.

Very interesting point, and I believe you are correct. I forgot to consider the ! operator in PEGs and its impact. I think you are also correct about the fix (I've sketched it in a P.S. below), but this brings me to:

> 3. Even after the above modification, it's not obvious to me that
> this modification would preserve the linear parse time guarantee of
> the basic PEG model... which has interesting implications that could be
> either positive or negative. There now appears to be an additional
> "dimension" of looping - each parse rule at each of the n text
> positions can potentially be evaluated O(n) times, so it looks like
> we're up to O(n^2) complexity in theory, the same as an Earley parse
> of an unambiguous CFG for example. Which suggests the question: does
> this modification bring PEGs closer to unambiguous CFGs in some
> formal way? For example, with this modification can we come up with
> a "middle-finding" PEG corresponding to the CFG "S -> [ab] S [ab] | a"?

At first I thought there was a way to modify the accounting by 'reassigning' the work of the inner loops to other parts of the parse and so recover O(n) complexity, but as I tried to formulate the argument I realized this was not true. As far as the middle-finding goes, how about:

S = S a [ab] \ S S [ab] \ [ab]

I think that works, or at least it seems to when I hand-simulated it on a test case. I would have missed the increase in expressiveness, as I was simply looking for ease of coding (who wants to rewrite left recursion?) and an easy way to get left associativity for binary operations, but it looks like you're right. I wonder if this would make PEGs an actual superset of unambiguous CFGs?

-Jeremy
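P.S. To make the fix in point 2 concrete, here is a rough Python sketch of the growing loop I have in mind, with Bryan's strictly-increasing-consumption predicate. The grammar, the rule names, and the result representation (an (end_position, tree) pair, with None for failure) are all invented for illustration; this is my own sketch, not Katahdin's or Rats!'s actual machinery. I used a left-recursive expression rule since it also shows the left-associativity payoff I mentioned.

# Hypothetical grammar for the sketch:
#
#   expr <- expr '-' num / num
#   num  <- [0-9]
#
# A result is (end_position, tree); None means the parse failed.

def parse_num(text, pos):
    # num <- [0-9]
    if pos < len(text) and text[pos].isdigit():
        return pos + 1, text[pos]
    return None

def expr_body(text, pos, memo):
    # One pass over the ordered choice: expr '-' num / num.
    # The left-recursive call is answered out of the memo table.
    left = memo.get(('expr', pos))
    if left is not None:
        end, tree = left
        if end < len(text) and text[end] == '-':
            tail = parse_num(text, end + 1)
            if tail is not None:
                return tail[0], (tree, '-', tail[1])
    return parse_num(text, pos)

def parse_expr(text, pos, memo):
    key = ('expr', pos)
    if key in memo:
        return memo[key]
    memo[key] = None        # seed the memo entry with failure
    prev = None
    while True:
        cur = expr_body(text, pos, memo)
        # Bryan's predicate: keep the new result only if it consumed
        # strictly more text than the previous iteration; otherwise a
        # grammar using ! could make success and failure alternate
        # forever.
        if cur is None or (prev is not None and cur[0] <= prev[0]):
            break
        memo[key] = prev = cur
    return memo[key]

print(parse_expr("1-2-3", 0, {}))
# -> (5, (('1', '-', '2'), '-', '3')), i.e. left-associative for free.

Per Bryan's point 1, a real generator would of course emit this loop only for rules that static analysis (or the user) flags as potentially left-recursive, and emit the plain memoized or transient form for everything else.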