Hi Alan

I have been working on grammar rules I'm calling PBNF, for Parser-BNF, that
can be automatically executed as a parser. The PEG operators are a subset of
the PBNF operators, but to fully automate a grammar I need to define the
implicit syntax tree that the grammar rules specify.

Your issue comes up all the time in that context. My approach is to have a
literal 'x' match without producing a syntax tree node (you can always add a
rule if you do want it in the syntax tree). Rules that generate leaf nodes
are designated in the grammar (they are terminal rules, if you like), so they
generate a literal match but no internal syntax sub-tree. Sometimes, though,
you want to reference a rule without generating a syntax tree node, and for
that I use the `x operator: the ` prefix is a sort of quote, and it is
unobtrusive in the grammar.
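
For illustration only, here is a minimal Python sketch of the general idea
discussed in this thread: a match can advance the input while contributing
nothing to the parse tree. It is not how PBNF or Alan's parser is implemented.
Every name in it (lit, digit, eof, seq, and_pred, not_pred, drop, action) is
hypothetical, and drop is only loosely analogous to the ` prefix above and to
the '^' operator proposed in the quoted message below.

```python
from typing import Any, Callable, List, Optional, Tuple

# A parser maps (text, pos) to (new_pos, values), or None on failure.
# `values` is what the match contributes to the parse tree.
Result = Optional[Tuple[int, List[Any]]]
Parser = Callable[[str, int], Result]


def lit(s: str) -> Parser:
    """Match the literal s and contribute it to the tree."""
    def parse(text: str, pos: int) -> Result:
        return (pos + len(s), [s]) if text.startswith(s, pos) else None
    return parse


def digit(text: str, pos: int) -> Result:
    """Match one decimal digit and contribute it as an int."""
    if pos < len(text) and text[pos] in "0123456789":
        return pos + 1, [int(text[pos])]
    return None


def eof(text: str, pos: int) -> Result:
    """Match only at end of input; contributes an 'EOF' marker."""
    return (pos, ["EOF"]) if pos == len(text) else None


def seq(*parsers: Parser) -> Parser:
    """Match the parsers in order, concatenating their contributions."""
    def parse(text: str, pos: int) -> Result:
        values: List[Any] = []
        for p in parsers:
            r = p(text, pos)
            if r is None:
                return None
            pos, vs = r
            values.extend(vs)
        return pos, values
    return parse


def and_pred(p: Parser) -> Parser:
    """&p: succeed if p matches; consume nothing, contribute nothing."""
    def parse(text: str, pos: int) -> Result:
        return (pos, []) if p(text, pos) is not None else None
    return parse


def not_pred(p: Parser) -> Parser:
    """!p: succeed if p fails; consume nothing, contribute nothing."""
    def parse(text: str, pos: int) -> Result:
        return (pos, []) if p(text, pos) is None else None
    return parse


def drop(p: Parser) -> Parser:
    """Match p and advance the input, but contribute nothing to the tree."""
    def parse(text: str, pos: int) -> Result:
        r = p(text, pos)
        return None if r is None else (r[0], [])
    return parse


def action(p: Parser, fn: Callable[..., Any]) -> Parser:
    """Apply fn to the values p contributes; contribute fn's result."""
    def parse(text: str, pos: int) -> Result:
        r = p(text, pos)
        return None if r is None else (r[0], [fn(*r[1])])
    return parse


# mulexp <- digit ^'*' digit ^EOF -> multiply the two digits
mulexp = action(seq(digit, drop(lit("*")), digit, drop(eof)),
                lambda x, y: x * y)
print(mulexp("6*7", 0))
```

Because drop returns an empty contribution, the action only has to name the
two values it actually uses. Without drop, the same sequence would hand the
action four values (both digits, the '*', and the EOF marker).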

If you want to take a look you will find it all at:
http://github.com/spinachtree/gist

Maybe other people have different solutions; I'd like to know.

Cheers,
Peter.


On Fri, Dec 10, 2010 at 9:01 AM, Alan Post <alanp...@sunflowerriver.org> wrote:

> I'm working on my PEG parser, in particular the interface between
> the parse tree and the code one can attach to productions that
> are executed on a successful parse.
>
> I've arranged for the two predicate operations, & and !, to not add
> any output to the parse tree.  That means that the following
> production:
>
>  rule <- &a !b "c"
>
> Produces the same parse tree as:
>
>  rule <- "c"
>
> Internally, this means that I recognize that the sequence operator
> (which contains the productions '&a', '!b', and '"c"' in this
> example) is being called with predicates in every position but one,
> and rather than returning a list containing that single element,
> I return just the single element.
>
> As I've been doing this, I've found that I want a new operator similar
> to '&'.  '&' matches the production it is attached to, but it does not
> advance the position of the input buffer.
>
> I'd like an operator that matches the production it is attached to,
> advances the input buffer, but doesn't add anything to the parse
> tree.
>
> Here's an example:
>
>  mulexp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
>
> The mulexp production is a sequence of four other rules, but only
> two of them are needed by the associated code.  It would be nice
> if I could write the code rule like it is above, rather than say
> this:
>
>  (lambda (x op y EOF) (* x y))
>
> That means accounting for all the rules in the sequence while really
> only caring about two of them.  Here is the example rewritten
> with '^' expressing "match the rule, advance the input, but don't
> modify the parse tree":
>
>  mulexp <- digit ^'*' digit ^EOF -> {(lambda (x y) (* x y))}
>
> Before I go inventing syntax for this use case, will you tell me if
> this is already being done with other parsers?  Have any of you had
> this problem and already solved it, and if so, what approach did you
> take?
>
> -Alan
> --
> .i ko djuno fi le do sevzi
>
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg
>