As is well known, PEGs have problems with expressing some grammars like Expr <- Sum / Diff / Num Sum <- Expr, '+', Num Diff <- Expr, '-', Num This uses left recursion which is not a straightforward concept and is hard to formalize and hard to implement in a parser. Of course, it is possible to write Expr <- Num, ('+'/'-', Num)* but this will not give (Expr (Sum (Diff '1' '2') '3') as a parse tree for '1-2+3'. I have an idea how to replace left recursion by something more straightforward.
Sometimes e_0 followed by e_1 can get some additional semantic meaning. For example, in PEG's own grammar Primary followed by '?' becomes Optional, Primary followed by '*' becomes ZeroOrMore, Primary followed by '+' becomes OneOrMore. I will call such a rule a "gulper rule" because it "gulps" an expression to which it is applied. We can write a series of gulper rules as follows: e_0 (?) id_1: e_1 / id_2: e_2 / ... / id_n: e_n When e_0 is followed by e_1 then the sequence is interpreted as id_1. When e_0 is not followed by e_1 and is followed by e_2 then the sequence is interpreted as id_2 and so on. When e_0 is not followed by any of e_1,...,e_n then e_0 is just e_0. Of course, these one-shot gulper rules are no more than syntactic sugar for ordinary PEG rules. id_0 <- e_0 (?) id_1: e_1 / id_2: e_2 / ... / id_n: e_n is essentially equivalent to id_0 <- id_1 / id_2 / ... / id_n / e_0 id_1 <- e_1, e_0 ... id_n <- e_n, e_0 More important is that some cases of left recursion can be replaced by repetitive gulper rules. We can write such rules as follows: e_0 (*) id_1: e_1 / id_2: e_2 / ... / id_n: e_n For example, for sums and differences the grammar can be Expr <- Num (*) Sum: '+', Num / Diff: '-', Num. Such extended PEGs should not produce any serious problems with top-down parsing. ----------------------------- Alexander Tsyplakov _______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg