As is well known, PEGs have problems with expressing some grammars like
Expr <- Sum / Diff / Num
Sum <- Expr, '+', Num
Diff <- Expr, '-', Num
This uses left recursion which is not a straightforward concept and is hard to 
formalize and hard to implement in a parser. Of course, it is possible to write 
Expr <- Num, ('+'/'-', Num)*
but this will not give (Expr (Sum (Diff '1' '2') '3') as a parse tree for 
'1-2+3'.
I have an idea how to replace left recursion by something more straightforward. 

Sometimes e_0 followed by e_1 can get some additional semantic meaning. For 
example, in PEG's own grammar Primary followed by '?' becomes Optional, Primary 
followed by '*' becomes ZeroOrMore, Primary followed by '+' becomes OneOrMore. 
I will call such a rule a "gulper rule" because it "gulps" an expression to 
which it is applied.

We can write a series of gulper rules as follows:
e_0 (?) id_1: e_1 / id_2: e_2 / ... / id_n: e_n
When e_0 is followed by e_1 then the sequence is interpreted as id_1. When e_0 
is not followed by e_1 and is followed by e_2 then the sequence is interpreted 
as id_2 and so on. When e_0 is not followed by any of e_1,...,e_n then e_0 is 
just e_0.

Of course, these one-shot gulper rules are no more than syntactic sugar for 
ordinary PEG rules.
id_0 <- e_0 (?) id_1: e_1 / id_2: e_2 / ... / id_n: e_n
is essentially equivalent to
id_0 <- id_1 / id_2 / ... / id_n / e_0
id_1 <- e_1, e_0
...
id_n <- e_n, e_0

More important is that some cases of left recursion can be replaced by 
repetitive gulper rules. We can write such rules as follows:
e_0 (*) id_1: e_1 / id_2: e_2 / ... / id_n: e_n
For example, for sums and differences the grammar can be
Expr <- Num (*) Sum: '+', Num / Diff: '-', Num.

Such extended PEGs should not produce any serious problems with top-down 
parsing.
    -----------------------------
    Alexander Tsyplakov



_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Reply via email to