Not a bad idea - quite simple and elegant. A few potential issues:
1. There may be some immediate performance cost to trying to parse everything at least twice, to see if it changes after the first iteration, before entering it into the memo table. But presumably that cost could easily be minimized in a smart parser generator by performing static left recursion detection and only emitting loops in parse functions that are potentially left-recursive. (I expect Chris's Katahdin already does this...?)
2. As currently formulated, it's not obvious to me that with this modification the parser is guaranteed to terminate: although the "expected case" is obviously that the amount of text "consumed" by the looping construction should increase at each iteration, it should be easy to create a grammar that yields success if the prior iteration failed and vice versa, resulting in an infinite loop. Presumably this could be fixed trivially by adding a hard-coded restriction that the amount of text consumed actually does increase with each iteration - e.g., use 'cur != -1 && len(cur) > len(prev)' as the predicate of the 'while' loop.
3. Even after the above modification, it's not obvious to me that this modification would preserve the linear parse time guarantee of the basic PEG model, which has interesting implications that could be either positive or negative. There now appears to be an additional "dimension" of looping - each parse rule at each of the n text positions can potentially be evaluated O(n) times, so it looks like we're up to O(n^2) complexity in theory, the same as an Early parse of an unambiguous CFG for example. Which suggests the question: does this modification bring PEGs closer to unambiguous CFGs in some formal way? For example, with this modification can we come up with a "middle-finding" PEG corresponding to the CFG "S -> [ab] S [ab] | a"?
Cheers, Bryan On Sep 3, 2007, at 6:48 AM, Jeremy Bruestle wrote:
I seem to have run across a very simple way to add support for left recursion in PEGS. It looks like it works to me in all cases, but perhaps I'm missing somthing. Without further ado... Presume we have your standard memorizing peg parser. In psudo-code it might look like: // Lookup memorized result, or do actual parsing // return -1 on failure, length of match on success int check_parse(rule, position) key = tuple(rule, postion); if find(memory[key]) return memory[key]; else memory[key] = do_pase(rule, position); return memory[key]; end end // Do actual work, to parse sub-components, call check_parse // Return result in same format as above int do_parse(rule, position); // Actual code would be here Now if we have any left recursion in our rules, we go into an infinite recursion. We can fix the infinite recursion, although not cause our left hand rules to actually work by just setting the memory to failure when we realize this is our first time trying a rule, but before we call do_parse. that is: ... else memory[key] = -1; memory[key] = do_pase(rule, position); return memory[key]; end ... Of course this just effectively makes the left recursive rules worthless. However if change check_parse to be: int check_parse(rule, position) key = tuple(rule, postion); if find(memory[key]) return memory[key]; else cur = -1; do memory[key] = cur; prev = cur; cur = do_pase(rule, position); while(cur != -1 && cur != prev) end end We now cause the first parse to pick out only non-recursive rules, then we try again with out current 'match' for the recursive nonterminal at the position in question. So long as it keeps working we keep expanding at the same location. Once we have expanded as much as we can, we stop and return. If we ever try to parse that same location for the same terminal type, we just return the memorized version as usual. This of course presumes that our recursive rules come before our non-recursive one for a given type. Anyway, it seems to work, just curious on people's thoughts. -Jeremy _______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg
_______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg