I have been following for a while the discussion of left recursion in
PEG, and would like to express my humble opinion on the subject.
I see PEG as a language for writing a set of recursive procedures,
disguised as "rules" of a BNF-like grammar.
The language is simple and transparent: one can easily see what each
procedure does.
The problem is that, due to the limited backtracking, the emergent
behavior of the set of these procedures is not at all obvious. Just
consider the classical example "A <- aAa / aa", which, applied to a
string of eight a's, consumes all of them, but applied to a string of
nine consumes only two. I have now spent a couple of years, and written a
couple of papers, in a search for methods "to find out what this damned
thing is doing". (I am quoting someone else's statement of the problem.)
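To make the example concrete, here is a minimal sketch of the rule as a
recursive procedure (my own illustration in Python, not Mouse). The key
point is PEG's limited backtracking: once the inner A succeeds inside the
first alternative, it is never re-tried, even if the trailing "a" then fails.

```python
def match_A(s, i):
    """PEG rule A <- "a" A "a" / "a" "a".
    Return the position after the match, or None on failure."""
    # First alternative: "a" A "a"
    if i < len(s) and s[i] == 'a':
        j = match_A(s, i + 1)
        if j is not None and j < len(s) and s[j] == 'a':
            return j + 1
        # Limited backtracking: the successful inner A is not re-tried;
        # the whole alternative fails and we fall through to "a" "a".
    # Second alternative: "a" "a"
    if i + 1 < len(s) and s[i] == 'a' and s[i + 1] == 'a':
        return i + 2
    return None

print(match_A('a' * 8, 0))  # 8: all eight a's consumed
print(match_A('a' * 9, 0))  # 2: only the first two a's consumed
```

Running this on strings of a's shows exactly the jumpy behavior described
above, which is hard to predict from the rule alone.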
The mechanisms for handling left recursion proposed by Warth et al. and
Tratt add invisible, complex behavior to the procedures (I mean, the PEG
rules). I believe this makes the emergent behavior of the whole set even
more complex and difficult to predict.
The question is: why would one want to spend so much effort to imitate
in PEG this specific feature of BNF?
One answer is: just for the fun of it. But, getting more serious, I see
two reasons:
(1) To construct PEG for a language defined in BNF. For some reason,
such definitions tend to be full of left recursion.
(2) To get the desired shape of syntax tree generated by the parser.
Concerning (1), I have serious doubts whether rewriting left-recursive
BNF as PEG (replacing | by /) and applying the suggested mechanisms will
produce the desired language. I once made an effort to convert to PEG
the definition of Java Primary. The definition in the Java Language
Specification is heavily left-recursive, with several levels of
indirection. I used X = BA* as the least fixpoint of X = XA | B and
obtained a rather complex set of PEG rules - just to find that they DO
NOT define the required language. The reason was greedy alternatives
and repetitions hiding parts of the language. I am pretty sure the same
would happen with "left-recursive" PEG.
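A toy example (my own, much smaller than Java Primary) shows the effect.
The left-recursive BNF X = X "a" | "a" defines one or more a's, so
S = X "a" defines two or more a's. The standard rewrite X = "a" "a"*
preserves the language of X, but under PEG's greedy repetition X never
leaves an "a" behind for S:

```python
def match_X(s, i):
    """PEG X <- "a" "a"* : greedily consume a run of a's; None on failure."""
    if i >= len(s) or s[i] != 'a':
        return None
    while i < len(s) and s[i] == 'a':
        i += 1
    return i

def match_S(s, i):
    """PEG S <- X "a" : fails on every input, because the greedy X
    has already consumed the "a" that S still needs."""
    j = match_X(s, i)
    if j is not None and j < len(s) and s[j] == 'a':
        return j + 1
    return None

print(match_S('aaa', 0))  # None, although the BNF S accepts "aaa"
```

The same kind of hiding, spread over several levels of indirection, is
what broke the converted Java Primary rules.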
I see the discussion of (2) going on at full volume. It seems to be
based on a strong conviction that parsers MUST automatically produce
parse trees, and these trees MUST reflect the order of operations. For a
left-to-right execution, this requires a left-recursive rule.
My opinion about these MUSTs is "not at all". My "Mouse" does not
automatically generate parse trees. Its semantic actions have access to
the result of a rule such as "sum <- number (+ number)*" and can access
the "number"s in any desired order. In one of my applications of
"Mouse", the semantic action directly performs the computation. In
another, the semantic action produces a tree.
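A sketch of the first kind of application (in Python rather than Mouse's
Java, and with a deliberately crude tokenizer): the semantic action for
the iterative rule sees all the "number"s at once and combines them
left-to-right, with no parse tree at all.

```python
import re

def parse_sum(text):
    """Recognize number ("+" number)* and directly compute the value."""
    if re.fullmatch(r'\d+(?:\+\d+)*', text) is None:
        raise ValueError('not a sum')
    numbers = [int(n) for n in text.split('+')]
    # Semantic action: fold left-to-right - the same order of operations
    # that a left-recursive rule would have encoded in the tree shape.
    total = numbers[0]
    for n in numbers[1:]:
        total += n
    return total

print(parse_sum('1+2+3'))  # 6
```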
Letting semantic actions construct the tree has a number of advantages.
You do not pollute the grammar with extra instructions of the kind "do
not construct a tree showing how identifier is composed of letters".
More important, you can tailor the tree's nodes to suit your specific
application. (By the way, in "Mouse" you do not pollute the grammar
with semantic actions either; they are placed in a separate Java class.)
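And when a left-leaning tree IS the shape a particular application wants,
the semantic action for the same iterative rule can simply fold the
matched "number"s into nested nodes. A sketch, with the node represented
as a tuple purely for illustration:

```python
def sum_to_tree(text):
    """For number ("+" number)*, build the left-associative tree
    corresponding to ((n1 + n2) + n3) ..."""
    parts = text.split('+')
    tree = parts[0]
    for p in parts[1:]:
        tree = ('+', tree, p)  # nest to the left at each step
    return tree

print(sum_to_tree('1+2+3'))  # ('+', ('+', '1', '2'), '3')
```

So the left-recursive rule is not needed to obtain the left-recursive tree;
the action chooses the shape.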
A philosophical remark: aren't we sometimes unnecessarily impressed by
the elegance of recursion (as compared to iteration)?
Compare "sum = sum + number | number" with "sum = number (+ number)*".
Which one more clearly conveys the message that "sum" is a sequence of
one or more "number"s separated by "+"?
Compare "type = type [ ] | name" with "type = name ([ ])*". Which one
more clearly conveys the message that "type" is a "name" followed by
zero or more pairs of brackets?
(The answer is probably "depends on your training"...)
Regards
Roman
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg