Hello everyone,

As you know, one of the greatest benefits of PEGs over CFGs is that the
former is deterministic. This means that even if two alternatives could
recognize the same string (or even the same language), one can rely on that
the parser will always choose the alternative you have specified first.

start <- A B / C
A     <- a
B     <- b
C     <- ab

will always return an AST of a sequence of an A and a B node, never a C
node with the value "ab".

There are a number of algorithms that extends PEGs to support left
recursion, but do they agree on how ordered choice should be interpreted if
one of the alternatives is left recursive? The situation I am thinking of:

A <- A a / a A / empty

While the first and second alternative describe the same language, a
sequence of zero or more a:s, we would expect it to return a parse tree
with left associativity, that is

(((...((a)a)...)a)a)

and not

(a(a(...(a(a))...)))

as suggested by the second alternative, since intiutively, a PEG should
prefer the first alternative of an ordered choice, even if the first choice
is defined in a left recursive way.

However, packrat parsers implementing left recursion through "growing a
seed parse", in the style of Warth et. al.:s algorithm, would skip the left
recursive first choice in wait for a seed parse. Then the right recursive
second alternative would consume all the a:s, skipping the left recursive
case at each position. On the way back up the stack, each left
recursive case would be tried, parsing all the a:s as a seed parse and
failing to parse one more, getting a memoized value of equal length for the
second alternative. It won't be reparsed since it is only part of left
recursive definitions at later positions. And then empty will be ignored
since it consumes less positions.

So all left recursions will fail, since their seed parse, the right
recursion, have already consumed all the a:s, leaving no trailing a for the
left recursive alternative to succeed on. That way we end up with an
unintuitive right-associative AST, although the left recursive alternative
was clearly the prioritized option.

Clearly, this example would not cause much trouble in practice, but are
there any languages that would?
Is there any algorithm that "solves" this situation?
Is there any research regarding the interpretation of ordered choice in
PEGs with left recursive definitions?

Sincerely, Johan Thiborg-Ericson

P. S.
I'm new in this mailing list and to mailing lists as a concept. If I am
doing anything wrong, please tell me :)
D. S.
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Reply via email to