Will you explain in this example what you do with the tree object afterwards?
In the example I gave, the production is converted to a number. I think in your example, you're showing how to convert a parse tree to a syntax tree, and I assume you later operator over the syntax tree? Thank you, -Alan On Sat, Dec 11, 2010 at 10:46:40AM +0100, Ondřej Bílka wrote: > exp <- digit '*' digit EOF -> {(lambda (x y) (* x y))} > This looks as step backwards > How it is different from old yacc > exp <- digit '*' digit EOF -> {$$ = (* $1 $3)} > > My approach is different. I made tree structure explicit in rule by binding as > in example > > tree = '(' number:value tree?:left tree?:rigth ')' result(Tree) > > result(Tree) creates Tree object and sets value,left,rigth fields to > corresponding values > > On Fri, Dec 10, 2010 at 01:47:59PM +1300, Peter Cashin wrote: > > Mathias: > > A good plan to separate out the: > > - grammar operators > > - parse tree building > > - parser action code > > I agree with you, but in my case I want to have the grammar language > > totally independent of the implementation programming language. > > So the grammar has no parser action code, although this can be added > > later > > in a particular programming language implementation. Parboiled has much > > closer integration with the programming language, which has advantages, > > but I really want grammar specifications that are totally independent. > > So now for the grammar operators, and parse tree building annotations. > > For > > many years I kept these separate: tree building annotations in the rule > > head (ie they annotate the rule name) or definition syntax (= for > > interior > > nodes, : for leaf nodes or terminals). The right hand side of the rule > > had > > the grammar expression body withs grammar operations only. One more head > > notation allows pruning the tree. > > This works fine, and maybe I should have stuck to that separation, but I > > introduced the `x prefix notation because I found that it was an > > advantage > > to be able to see the children nodes that would be generated by looking > > at > > the parent rule alone, and not having to refer to the children rule > > definitions to see how they are annotated. > > I have found that about half of many grammars turn out to be leaf nodes, > > so the ":" rule definition notation has a huge payoff in tree size and > > performance. > > Because of this the `x rule is not vital, but over time I found it was > > convenient, and a good trade-off for me. > > Cheers, > > Peter. > > > > On Fri, Dec 10, 2010 at 11:44 AM, Mathias <[1]math...@parboiled.org> > > wrote: > > > > Over the course of developing parboiled (PEG parsing engine for Java > > and > > Scala using internals DSLs for grammar specification) I found that it's > > best to clearly separate the following three things: > > - grammar operators > > - parse tree building > > - parser action code > > > > The operators, together with the rules they are applied to, define the > > language your grammar recognizes. Nothing more, nothing less. > > Whether your engine builds a parse tree for input of that language or > > whether it doesn't should likely not be specified at the operator > > level, > > but it some other way. > > In "parboiled for Java" operators and rules are modeled as method calls > > whereas parse tree building is controlled via annotations on those > > methods. You can selectively enable or disable the creation of parse > > tree nodes per grammar rule, which allows you to tweak parse tree > > building exactly to your needs. > > > > The interface between the parsing engine and custom parser action code > > is the thing that has changed the most over the course of parboileds > > life cycle so far. > > Initially parser action code had to access the parse tree nodes in > > order > > to get to matched input. Creating a custom object structure during the > > parsing run (e.g. an AST) was done by decorating the parse tree. So the > > parse tree was the "work bench" of the parsing process. > > Over time this heavy centering around the parse tree turned out to have > > two problems: > > > > 1. Low Performance. > > Always having to create a parse tree even when all you care about is > > the > > AST is just wasteful. Especially since the parse tree can contain a > > huge > > number of nodes for larger inputs (sometimes more nodes than input > > characters). > > > > 2. Less room for automatic optimizations. > > The structure of the parse tree is dictated by the structure of the > > grammar rules. If your parser action code is built under the assumption > > that the parse tree has a given structure there is little leeway for > > the > > parsing engine to apply automatic rule optimizations. Decoupling the > > parser action code from the parse tree opens up the possibility to > > apply > > all kinds of automatic grammar tweaking before running the parser the > > first time. The engine might decide to completely change the rule > > structure, as long as all changes do not change the recognized language > > and are transparent to the parser action code. > > > > Currently parboiled implements the interface between the parsing engine > > and the action code in the following way: > > 1. Actions can appear anywhere in a rule. > > 2. Actions can access the matched input text of the sub rule > > immediately > > preceding the action but not of any other rule (so there is no sub rule > > labeling required). > > 3. For working with custom objects (e.g. AST nodes) the engine provides > > a "Value Stack", which is a simple stack structure that serves as a > > fast > > work bench for a parsing run. Actions can push objects onto this stack, > > pop them off, swap them around, and so on. > > > > This solution completely decouples the parse tree from everything else. > > You can enable or disable parse tree building without any effect on the > > rest of the parser. There is no need for addressing sub rules in action > > expressions and given a somewhat efficient value stack implementation > > the whole thing is quite fast. Additionally, in "parboiled for Scala", > > the action code with its manipulations of the value stack can be > > statically type-checked at compile time, which is a huge plus. > > > > In case you are interested in more details or broader explanations, the > > parboiled documentation is quite complete. > > > > Cheers, > > Mathias > > > > --- > > [2]math...@parboiled.org > > [3]http://www.parboiled.org > > On 09.12.2010, at 21:01, Alan Post wrote: > > > > > I'm working on my PEG parser, in particular the interface between > > > the parse tree and the code one can attach to productions that > > > are executed on a successful parse. > > > > > > I've arranged for the two predicate operations, & and !, to not add > > > any output to the parse tree. �That means that the following > > > production: > > > > > > �rule <- &a !b "c" > > > > > > Produces the same parse tree as: > > > > > > �rule <- "c" > > > > > > Internally, this means that I recognize that the sequence operator > > > (which contains the productions '&a', '!b', and '"c"' in this > > > example) is being called with predicates in every position but one, > > > and rather than returning a list containing that single element, > > > I return just the single element. > > > > > > As I've been doing this, I've found that I want a new operator > > similar > > > to '&'. �'&' matches the production it is attached to, but it does > > not > > > advance the position of the input buffer. > > > > > > I'd like an operator that matches the production it is attached to, > > > advances the input buffer, but doesn't add anything to the parse > > > tree. > > > > > > Here's an example: > > > > > > �mulexp <- digit '*' digit EOF -> {(lambda (x y) (* x y))} > > > > > > the mulexp production is a sequence of four other rules, but only > > > two of them are needed by the associated code. �It would be nice > > > if I could write the code rule like it is above, rather than say > > > this: > > > > > > �(lambda (x op y EOF) (* x y)) > > > > > > Having to account for all the rules in the sequence, but really > > > only caring about two of them. �Here is the example rewritten > > > with '^' expressing "match the rule, advance the input, but don't > > > modify the parse tree": > > > > > > �mulexp <- digit ^'*' digit ^EOF -> {(lambda (x y) (* x y))} > > > > > > Before I go inventing syntax for this use case, will you tell me if > > > this is already being done with other parsers? �Have any of you had > > > this problem and already solved it, and if so, what approach did you > > > take? > > > > > > -Alan > > > -- > > > .i ko djuno fi le do sevzi > > > > > > _______________________________________________ > > > PEG mailing list > > > [4]...@lists.csail.mit.edu > > > [5]https://lists.csail.mit.edu/mailman/listinfo/peg > > > > _______________________________________________ > > PEG mailing list > > [6]...@lists.csail.mit.edu > > [7]https://lists.csail.mit.edu/mailman/listinfo/peg > > > > References > > > > Visible links > > 1. mailto:math...@parboiled.org > > 2. mailto:math...@parboiled.org > > 3. http://www.parboiled.org/ > > 4. mailto:PEG@lists.csail.mit.edu > > 5. https://lists.csail.mit.edu/mailman/listinfo/peg > > 6. mailto:PEG@lists.csail.mit.edu > > 7. https://lists.csail.mit.edu/mailman/listinfo/peg > > > _______________________________________________ > > PEG mailing list > > PEG@lists.csail.mit.edu > > https://lists.csail.mit.edu/mailman/listinfo/peg > > > -- > > Plasma conduit breach > > _______________________________________________ > PEG mailing list > PEG@lists.csail.mit.edu > https://lists.csail.mit.edu/mailman/listinfo/peg -- .i ko djuno fi le do sevzi _______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg