Hi Bryan, I'm not sure if this is quite the answer you were looking for.
In my latest version of a PEG parsing library, I tried a number of hacks (such as tree-decomposition syntactic actions and special rules for expressing binary recursion), but in the end I abandoned them all as being too complicated to explain to outsiders. I decided to go back to basics, where each rule literally corresponds to a matching algorithm. This seems more in the spirit of a PEG. As a result, all of the left-recursion examples simply lead to a stack overflow.

I find that expressing left-recursive rules iteratively rather than recursively leads to grammars that are easier to understand, even though the resulting parse tree is a pain to work with. To deal with the ugly parse trees, I simply apply a tree transformation that converts the tree into the desired form.

For a concrete example, here is a snippet from a JavaScript grammar in my latest parsing library (Jigsaw):

    public static Rule LeafExpr = Node(ParenExpr | NewExpr | Function | Literal | Identifier);
    public static Rule PrefixExpr = Node(PrefixOp + Recursive(() => UnaryExpr));
    public static Rule UnaryExpr = (PrefixExpr | LeafExpr) + WS;
    public static Rule PostfixOp = Field | Index | ArgList;
    public static Rule PostfixExpr = Node(UnaryExpr + WS + ZeroOrMore(PostfixOp));

And here is a snippet from the "Node" rewriting algorithm, which transforms a "PostfixExpr" node into something resembling the parse tree that a left-recursive rule would have produced:

    case "PostfixExpr":
    {
        Debug.Assert(n.Count != 0);
        // Is it really a postfix expression? If not, return the sub-expression.
        if (n.Count == 1)
            return n.Nodes[0];
        var last = n.Nodes.Last();
        switch (last.Label)
        {
            case "Field": return LeftGroup(n, "FieldExpr");
            case "Index": return LeftGroup(n, "IndexExpr");
            case "ArgList": return LeftGroup(n, "CallExpr");
        }
    }

Cheers,
Christopher Diggins
http://code.google.com/p/jigsaw-library

On Sun, Oct 9, 2011 at 4:39 PM, Bryan Ford <bryan.f...@yale.edu> wrote:
> Left recursion in PEGs indeed seems like an interesting can of worms.
> For those interested, I'm wondering how a few example grammars behave under
> your preferred left-recursive parsing technique, and how you think they
> should behave.
>
> First, a trivially evil left-recursive grammar:
>
> S <- S
>
> For example, does your parser detect and reject this somehow, or does it
> behave the same as 'S <- f'? (I hope it doesn't result in an infinite loop
> at runtime anyway. :) )
>
> Now a grammar that's weird, not necessarily evil, in a slightly more subtle
> way:
>
> S <- S / a
>
> Does this behave the same as 'S <- a', or do something else? How should it
> behave?
>
> Cranking up the evilness factor one notch with a mutually left-recursive
> grammar…
>
> S <- T / a
> T <- S / &a
>
> Given the input string "a", does this behave the same as 'S <- a'
> (succeeding and consuming) or the same as 'S <- &a' (succeeding but
> consuming no input)? Do S and T behave the same way or differently? Should
> they?
>
> Now, another grammar that's not necessarily evil but strange in a slightly
> different way:
>
> S <- Saa / a /
>
> Given the input string 'aaaa', for example, does/should this grammar
> consume just 3 or all 4 a's, or does it do something else? What should it
> do?
>
> Cheers,
> Bryan
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg
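P.S. To make the left-grouping transformation I described above concrete outside of C#, here is a tiny sketch in Python (the name left_group and the tuple representation of nodes are just for illustration, not Jigsaw's actual API). It folds the flat child list of a node like PostfixExpr left-to-right into nested binary nodes, which is the tree shape a left-recursive rule would have produced directly:

```python
def left_group(children, label):
    """Fold a flat child list left-to-right into left-nested binary nodes."""
    acc = children[0]          # the base expression
    for op in children[1:]:    # each postfix operator wraps everything so far
        acc = (label, acc, op)
    return acc

# "a.b.c" parses flat as roughly [a, Field(b), Field(c)]; grouping yields
# the left-nested tree ((a.b).c):
flat = ["a", ("Field", "b"), ("Field", "c")]
print(left_group(flat, "FieldExpr"))
# -> ('FieldExpr', ('FieldExpr', 'a', ('Field', 'b')), ('Field', 'c'))
```

With a single child and no postfix operators, the fold degenerates to returning the child unchanged, which matches the n.Count == 1 early return in the rewriting snippet.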