I have been working on my own TPDL parser (I make the distinction from PEG because while it works in approximately the same way, it does not support the fancy Regular-Expression-like syntax, and right now I am working on a token level rather than a character level) in C (imagine that!) as part of my summer goal of building a compiler. The parser is working very well, I have it support left recursion directly (per a paper listen on Bryan's website), and I am all-around very happy with it.
I wanted to mention one of the way in which I have been building an AST from my parse tree while the tree itself is being constructed (a more appropriate term might be: reduced parse tree). I use one rule within the parser and also two operators within the grammar. One of these operators is already well- understood and has been mentioned on this list several times. My two operators are: i) A non-excludable operator (-), which requires that a non/terminal appear in the reduced parse tree. If no such operator is used, then terminals are *not* added to the tree as leafs, and non-terminals, where only one child exists, are replaced with their child (my automatic reduction rule). ii) A raise children operator (^), which puts the derivations of a particular non-terminal in a production in place of the non-terminal. This allows me to collect lists of nodes. Note: an identifier surrounded with '<' and '>' signals a token. The empty '<>' represents the empty token (epsilon transition). The notation I use for my grammars is fairly straightforward. Here are some simple things I am able to do with my one rule and two operators: Things : Thing ^Things : <> ; ListOfThings : -Things ; When applied with 'ListOfThings' as the starting production, this grammar will result in a tree of 'Thing' nodes, rooted at a 'Things' node. Note: 'ListOfThings' will not appear in the tree. One normally expects terminals / tokens to appear as the leaves of the tree; however, with these rules we can force non-terminals to appear as leaves. In my case, this has helpful as a flag. For example, if a certain token appears in the terminal stream then I can have a non-terminal appear as a leaf node in the parse tree; however, if the token does not appear then no such non- terminal will appear. Rule : RuleFlag -<non_terminal> : RuleFlag -<terminal> : RuleFlag -<epsilon> ; RuleFlag : -NonExcludable : -Subsumable : <> ; NonExcludable : <dash> ; Subsumable : <up_arrow> ; The above grammar is actually describing the syntax of the grammar itself, in this case, how a non/terminal looks in the right-hand-side of a production. Subsumable might not be the most appropriate word; however, it is what I thought of at the time. In the above example, if a '-' appears before a non_terminal token then a 'NonExcludable' non-terminal node will appear as a leaf in the parse tree. I was wondering if anyone has done similar things, other than to have a similar operator to my non-excludable (-) one. Also, does anyone have any thoughts on my approach; am I possibly missing an obvious pitfalls? _______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg