First, in response to Chris: your method does indeed sound similar.
Your project overall is also very interesting; I will probably
download the code and see what I can learn.  As to Bryan's points:

On 9/2/07, Bryan Ford <[EMAIL PROTECTED]> wrote:

> 1. There may be some immediate performance cost to trying to parse
> everything at least twice, to see if it changes after the first
> iteration, before entering it into the memo table.  But presumably
> that cost could easily be minimized in a smart parser generator by
> performing static left recursion detection and only emitting loops in
> parse functions that are potentially left-recursive.  (I expect
> Chris's Katahdin already does this...?)

True, and for my implementation I planned to support three 'types' of
rules: transient (a la Rats!), memoized, and left-recursive.  As to
whether to let the user select the type (erroring when left recursion
appears in a rule not marked left-recursive) or to try to infer the
proper categorization, I'm not certain.  Probably heuristic inference
with user override.
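
To make the dispatch concrete, here is a minimal Python sketch of how
a generated apply function might treat the three categories.
Everything here (apply_rule, parse_body, FAIL, results represented as
end positions) is my own invention for illustration, not code from any
real implementation:

FAIL = -1  # sentinel: no match at this position; otherwise the end position

def apply_rule(rule, pos, memo, parse_body):
    if rule.kind == 'transient':
        return parse_body(rule, pos)       # cheap rule (a la Rats!): no memo entry
    key = (rule, pos)
    if key in memo:
        return memo[key]
    if rule.kind == 'memoized':
        memo[key] = parse_body(rule, pos)  # ordinary packrat memoization
        return memo[key]
    # 'left-recursive': seed the memo with failure, then re-parse until the
    # result stops changing.  (As Bryan's point 2 below observes, this naive
    # fixed-point test is not guaranteed to terminate.)
    memo[key] = FAIL
    while True:
        res = parse_body(rule, pos)
        if res == memo[key]:               # unchanged this pass: fixed point
            return res
        memo[key] = res                    # changed: grow the seed and retry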

> 2. As currently formulated, it's not obvious to me that with this
> modification the parser is guaranteed to terminate: although the
> "expected case" is obviously that the amount of text "consumed" by
> the looping construction should increase at each iteration, it should
> be easy to create a grammar that yields success if the prior
> iteration failed and vice versa, resulting in an infinite loop.
> Presumably this could be fixed trivially by adding a hard-coded
> restriction that the amount of text consumed actually does increase
> with each iteration - e.g., use 'cur != -1 && len(cur) > len(prev)'
> as the predicate of the 'while' loop.

Very interesting point, and I believe you are correct.  I forgot to
consider the use of the ! operator in PEGs and its impact.  I think
you are also correct about the fix, but this brings me to:
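
To spell that fix out against the hypothetical sketch above: with
results represented as end positions, "consumed more text" is simply a
strictly larger position, so the grow loop for a left-recursive rule
becomes:

def grow_seed(rule, pos, memo, parse_body):
    # hypothetical replacement for the left-recursive branch sketched earlier
    key = (rule, pos)
    memo[key] = FAIL                   # seed with failure (FAIL = -1)
    while True:
        res = parse_body(rule, pos)
        # Bryan's predicate 'cur != -1 && len(cur) > len(prev)': keep going
        # only while this pass consumed strictly more input than the seed
        if res == FAIL or res <= memo[key]:
            return memo[key]
        memo[key] = res

Since the seed strictly increases and is bounded by the input length,
the loop runs at most n+1 times at any position, which feeds directly
into point 3.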

> 3. Even after the above modification, it's not obvious to me that
> this modification would preserve the linear parse time guarantee of
> the basic PEG model... which has interesting implications that could be
> either positive or negative.  There now appears to be an additional
> "dimension" of looping - each parse rule at each of the n text
> positions can potentially be evaluated O(n) times, so it looks like
> we're up to O(n^2) complexity in theory, the same as an Earley parse
> of an unambiguous CFG for example.  Which suggests the question: does
> this modification bring PEGs closer to unambiguous CFGs in some
> formal way?  For example, with this modification can we come up with
> a "middle-finding" PEG corresponding to the CFG "S -> [ab] S [ab] | a"?

At first, I thought there was a way to modify the accounting by
'reassigning' the work for the inner loops to other parts of the
parse to recover O(n) complexity, but as I tried to formulate an
argument, I realized this was not true.  However, as far as
middle-finding goes, how about:

S = S a [ab] / S S [ab] / [ab]

I think that works, or at least, when I hand-simulated it on a test
case, it seemed to.  I would have missed this increase in
expressiveness, as I was simply looking for ease of coding (who wants
to rewrite left recursion?) and for an easy way to get left
associativity for binary operations, but it looks like you're right.
I wonder if this would make PEGs an actual superset of unambiguous
CFGs?
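
For anyone who wants to check such hand simulations mechanically,
below is a rough stand-alone interpreter sketch of the scheme (seed
growing plus the progress guard from point 2).  The tuple encoding of
expressions and all the names are invented for illustration, and note
that, like the scheme as described so far, it makes no attempt to
re-derive memo entries that were computed against a stale seed:

FAIL = -1  # a result is the end position of the match, or FAIL

def parse(expr, pos, text, rules, memo):
    kind, args = expr[0], expr[1:]
    if kind == 'class':                # ('class', 'ab') matches one of [ab]
        if pos < len(text) and text[pos] in args[0]:
            return pos + 1
        return FAIL
    if kind == 'seq':                  # ('seq', e1, e2, ...): each in turn
        for e in args:
            pos = parse(e, pos, text, rules, memo)
            if pos == FAIL:
                return FAIL
        return pos
    if kind == 'choice':               # ordered choice: first success wins
        for e in args:
            res = parse(e, pos, text, rules, memo)
            if res != FAIL:
                return res
        return FAIL
    if kind == 'rule':                 # ('rule', name), possibly left-recursive
        key = (args[0], pos)
        if key in memo:
            return memo[key]           # current seed (or finished result)
        memo[key] = FAIL               # seed with failure
        while True:
            res = parse(rules[args[0]], pos, text, rules, memo)
            if res == FAIL or res <= memo[key]:
                return memo[key]       # no strictly longer match: done growing
            memo[key] = res
    raise ValueError('unknown expression: %r' % (expr,))

# The candidate grammar, S = S a [ab] / S S [ab] / [ab]:
rules = {'S': ('choice',
               ('seq', ('rule', 'S'), ('class', 'a'), ('class', 'ab')),
               ('seq', ('rule', 'S'), ('rule', 'S'), ('class', 'ab')),
               ('class', 'ab'))}

for s in ['a', 'bab', 'ababa', 'bb']:
    end = parse(('rule', 'S'), 0, s, rules, {})
    print(s, '->', 'full match' if end == len(s) else 'no full match')

Comparing the full matches against the language of S -> [ab] S [ab] | a
should settle whether the correspondence actually holds.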

-Jeremy
