Re: [PEG] Left Recursion in PEG

Nicolas Laurent Thu, 27 Oct 2016 07:23:36 -0700

Hello, Juancarlo.

I have a paper (http://norswap.com/pubs/sle2015.pdf) that discusses
left-recursion in PEG, as well as a matching implementation (
https://github.com/norswap/autumn) (which is now however completely
different from what the paper describes).

So maybe I can shed light on some aspects of left-recursion within PEG for
you.

The solution follows, in spirit, Warth's idea of "growing a seed".

The algorithm can actually be described quite succinctly. It requires all
left-recursive rules to be marked as such. I suspect it may sound familiar,
but I'll include the description anyway to make sure we are on the same
page.

The idea is to iteratively invoke the parser for the left-recursive rule
(let's call it R) until as much input as possible has been matched. To do
so, you start by forbidding left-recursion. This is how you obtain your
"seed" (let's call it R0). You then try R again but this time, upon
left-recursion, the parser will pretend as though the seed (R0) has just
been matched. You call the new result R1. Then you iterate R again, but
this time you use R1 upon left-recursion. Etc, until the result (the amount
of matched input) stops growing.

This works well, but there is a well-noted pitfall: rules that are both
left- and right-recursive will be parsed right-recursively.

This is not by itself a problem. A rule like "E = E + E" is naturally
ambiguous. I strongly object to people who say that one associativity is
more correct than the other.

However, if your grammar is annotated with the required associativity, then
you can use a modified version of the algorithm to support
right-associativity.

The trick is to forbid all recursion (instead of only left-recursion) in
the algorithm described above. This forces the parser to fall back on the
non-recursive case for the right side, and hence to parse left-recursively.

There is however a problem (note that this is the *only* pitfall in the
final scheme): it precludes well-defined rules with "inner" recursion to
work, e.g. "E = E (E) E". The middle "E" is not ambiguous and should be
allowed to recurse. The usual fix is to add an "escape hatch operator" that
allows recursion to resume within a recursive rule. This is only required
for rules where (1) right-associativity is required, (2) there is "inner"
recursion.

This whole scheme is due to Seaton, who used it in his Katahdin language (
http://chrisseaton.com/katahdin/), although my own paper is probably a
better starting point on this particular issue.

After contorting this algorithm in many directions, a have yet to find a
case where it does not work reliably. It supports indirect  and hidden
left-recursion just fine, as well as nested left-recursive rules. In my
recent work (http://norswap.com/pubs/sle2016.pdf), I even use it with
context-sensitive parsing.

I hope this helps.

Cheers,

Nicolas LAURENT

On 27 October 2016 at 03:09, Juancarlo Añez <apal...@gmail.com> wrote:

>
> On Mon, Oct 10, 2016 at 8:54 AM, Sérgio Medeiros <sqmedei...@gmail.com>
> wrote:
>
>> We need to take some care here. It is difficult to establish a general
>> relation between CFGs and PEGs in order to guarantee that a grammar
>> G will give us the same result when interpreted as a CFG and when
>> interpreted as a PEG.
>>
>
> Indeed.
>
> I've mean to refer to the "set of languages expressible with grammars of
> type X". It's likely I wasn't precise in what I said.
>
>
>> I have worked on this subject (https://arxiv.org/abs/1304.3177), but we
>> did not establish a relation between an LR grammar interpreted as a CFG
>> and what would be an equivalent PEG.
>>
>
> Thanks for the reference. I'll study it.
>
> My work is in parsing applied to large collections of text in legacy
> programming languages. Finding an algorithm that finds the equivalent PEG
> grammar for an LR one, or the likes, is more interesting than PEG+Left.
>
> I've kept left recursion in Grako because I reckon the tool is used in
> teaching, where allowing some simple forms of left recursion may be useful,
> but just mentioning the experimental support for left recursive grammars
> sets the wrong expectations. For example, when limits are found in the
> current implementation of Warth's algorithm are found, they are reported as
> bugs against the parser generator.
>
>
>> Well, I think you can say the same about parsers based on CFGs, you can
>> build an AST with left-associativity operators without a left-recursive
>> CFG.
>>
>> Usually, it is easier to build a left-associative with a left-recursive
>> grammar
>> than with a right-recursive one, but you can choose which strategy to use.
>>
>
> "The 'correct' associativity" is just one of the expectations of users new
> to parsing.
>
>
>> In case you have examples of left-recursive grammars which do not give
>> the expected result when interpreted as a PEG, you can post them here.
>>
>
> The implementation of Warth's algorithm in Grako drops information about
> associativity when it enters the iterative step. I hadn't noticed until I
> implemented your algorithm, which does recover associativity, because I
> don't use left recursion in my work.
>
> Left or right associativity doesn't matter because it is easy to convert
> one to the other when the parsing semantics are none. If there's no
> associativity (the AST is flat), then another parsing step would be
> required to impose the associativity.
>
> If you care about the examples, I can recover some from the unit tests for
> left-associativity in Grako. Your algorithm passes the cases included in
> bug reports against Grako, but fails to parse cases that the current Warth
> implementation passes.
>
> Personally, because I publish a tool that can be used for
> professional/commercial work, I'm inclined to remove support for left
> recursive grammars until I can provide rules to construct "valid"
> left-recursive grammars.
>
> Cheers,
>
>
> --
> Juancarlo *Añez*
>
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg
>
>

_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Re: [PEG] Left Recursion in PEG

Reply via email to