Re: [PEG] and-predicate with slightly different behavior?

Ondřej Bílka Sat, 11 Dec 2010 10:24:58 -0800

Sorry I was replaying to two things at once. My objectinion about your example 
was first thing. 
My approach how do that is something like 
exp <- digit:x '*' digit:y EOF -> { (* x y)}
Then I started replaying to parboiled that we don't need build parse tree as 
syntax tree suffices.
Also in my peg you can refer to subrule used only once by its name.
On Sat, Dec 11, 2010 at 10:14:36AM -0700, .alyn.post. wrote:
> Will you explain in this example what you do with the tree object
> afterwards?
> 
> In the example I gave, the production is converted to a number.
> 
> I think in your example, you're showing how to convert a parse tree
> to a syntax tree, and I assume you later operator over the syntax
> tree?
> 
> Thank you,
> 
> -Alan
> 
> On Sat, Dec 11, 2010 at 10:46:40AM +0100, Ondřej Bílka wrote:
> > exp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
> > This looks as step backwards
> > How it is different from old yacc
> > exp <- digit '*' digit EOF -> {$$ = (* $1 $3)}
> > 
> > My approach is different. I made tree structure explicit in rule by binding 
> > as
> > in example
> > 
> > tree = '(' number:value tree?:left tree?:rigth ')' result(Tree)
> > 
> > result(Tree) creates Tree object and sets value,left,rigth fields to 
> > corresponding values
> > 
> > On Fri, Dec 10, 2010 at 01:47:59PM +1300, Peter Cashin wrote:
> > >    Mathias:
> > >    A good plan to separate out the:
> > >    - grammar operators
> > >    - parse tree building
> > >    - parser action code
> > >    I agree with you, but in my case I want to have the grammar language
> > >    totally independent of the implementation programming language.
> > >    So the grammar has no parser action code, although this can be added 
> > > later
> > >    in a particular programming language implementation. Parboiled has much
> > >    closer integration with the programming language, which has advantages,
> > >    but I really want grammar specifications that are totally independent.
> > >    So now for the grammar operators, and parse tree building annotations. 
> > > For
> > >    many years I kept these separate: tree building annotations in the rule
> > >    head (ie they annotate the rule name) or definition syntax (= for 
> > > interior
> > >    nodes, : for leaf nodes or terminals). The right hand side of the rule 
> > > had
> > >    the grammar expression body withs grammar operations only. One more 
> > > head
> > >    notation allows pruning the tree.
> > >    This works fine, and maybe I should have stuck to that separation, but 
> > > I
> > >    introduced the `x prefix notation because I found that it was an 
> > > advantage
> > >    to be able to see the children nodes that would be generated by 
> > > looking at
> > >    the parent rule alone, and not having to refer to the children rule
> > >    definitions to see how they are annotated.
> > >    I have found that about half of many grammars turn out to be leaf 
> > > nodes,
> > >    so the ":" rule definition notation has a huge payoff in tree size and
> > >    performance.
> > >    Because of this the `x rule is not vital, but over time I found it was
> > >    convenient, and a good trade-off for me.
> > >    Cheers,
> > >    Peter.
> > > 
> > >    On Fri, Dec 10, 2010 at 11:44 AM, Mathias <[1]math...@parboiled.org>
> > >    wrote:
> > > 
> > >      Over the course of developing parboiled (PEG parsing engine for Java 
> > > and
> > >      Scala using internals DSLs for grammar specification) I found that 
> > > it's
> > >      best to clearly separate the following three things:
> > >      - grammar operators
> > >      - parse tree building
> > >      - parser action code
> > > 
> > >      The operators, together with the rules they are applied to, define 
> > > the
> > >      language your grammar recognizes. Nothing more, nothing less.
> > >      Whether your engine builds a parse tree for input of that language or
> > >      whether it doesn't should likely not be specified at the operator 
> > > level,
> > >      but it some other way.
> > >      In "parboiled for Java" operators and rules are modeled as method 
> > > calls
> > >      whereas parse tree building is controlled via annotations on those
> > >      methods. You can selectively enable or disable the creation of parse
> > >      tree nodes per grammar rule, which allows you to tweak parse tree
> > >      building exactly to your needs.
> > > 
> > >      The interface between the parsing engine and custom parser action 
> > > code
> > >      is the thing that has changed the most over the course of parboileds
> > >      life cycle so far.
> > >      Initially parser action code had to access the parse tree nodes in 
> > > order
> > >      to get to matched input. Creating a custom object structure during 
> > > the
> > >      parsing run (e.g. an AST) was done by decorating the parse tree. So 
> > > the
> > >      parse tree was the "work bench" of the parsing process.
> > >      Over time this heavy centering around the parse tree turned out to 
> > > have
> > >      two problems:
> > > 
> > >      1. Low Performance.
> > >      Always having to create a parse tree even when all you care about is 
> > > the
> > >      AST is just wasteful. Especially since the parse tree can contain a 
> > > huge
> > >      number of nodes for larger inputs (sometimes more nodes than input
> > >      characters).
> > > 
> > >      2. Less room for automatic optimizations.
> > >      The structure of the parse tree is dictated by the structure of the
> > >      grammar rules. If your parser action code is built under the 
> > > assumption
> > >      that the parse tree has a given structure there is little leeway for 
> > > the
> > >      parsing engine to apply automatic rule optimizations. Decoupling the
> > >      parser action code from the parse tree opens up the possibility to 
> > > apply
> > >      all kinds of automatic grammar tweaking before running the parser the
> > >      first time. The engine might decide to completely change the rule
> > >      structure, as long as all changes do not change the recognized 
> > > language
> > >      and are transparent to the parser action code.
> > > 
> > >      Currently parboiled implements the interface between the parsing 
> > > engine
> > >      and the action code in the following way:
> > >      1. Actions can appear anywhere in a rule.
> > >      2. Actions can access the matched input text of the sub rule 
> > > immediately
> > >      preceding the action but not of any other rule (so there is no sub 
> > > rule
> > >      labeling required).
> > >      3. For working with custom objects (e.g. AST nodes) the engine 
> > > provides
> > >      a "Value Stack", which is a simple stack structure that serves as a 
> > > fast
> > >      work bench for a parsing run. Actions can push objects onto this 
> > > stack,
> > >      pop them off, swap them around, and so on.
> > > 
> > >      This solution completely decouples the parse tree from everything 
> > > else.
> > >      You can enable or disable parse tree building without any effect on 
> > > the
> > >      rest of the parser. There is no need for addressing sub rules in 
> > > action
> > >      expressions and given a somewhat efficient value stack implementation
> > >      the whole thing is quite fast. Additionally, in "parboiled for 
> > > Scala",
> > >      the action code with its manipulations of the value stack can be
> > >      statically type-checked at compile time, which is a huge plus.
> > > 
> > >      In case you are interested in more details or broader explanations, 
> > > the
> > >      parboiled documentation is quite complete.
> > > 
> > >      Cheers,
> > >      Mathias
> > > 
> > >      ---
> > >      [2]math...@parboiled.org
> > >      [3]http://www.parboiled.org
> > >      On 09.12.2010, at 21:01, Alan Post wrote:
> > > 
> > >      > I'm working on my PEG parser, in particular the interface between
> > >      > the parse tree and the code one can attach to productions that
> > >      > are executed on a successful parse.
> > >      >
> > >      > I've arranged for the two predicate operations, & and !, to not add
> > >      > any output to the parse tree. �That means that the following
> > >      > production:
> > >      >
> > >      > �rule <- &a !b "c"
> > >      >
> > >      > Produces the same parse tree as:
> > >      >
> > >      > �rule <- "c"
> > >      >
> > >      > Internally, this means that I recognize that the sequence operator
> > >      > (which contains the productions '&a', '!b', and '"c"' in this
> > >      > example) is being called with predicates in every position but one,
> > >      > and rather than returning a list containing that single element,
> > >      > I return just the single element.
> > >      >
> > >      > As I've been doing this, I've found that I want a new operator 
> > > similar
> > >      > to '&'. �'&' matches the production it is attached to, but it does 
> > > not
> > >      > advance the position of the input buffer.
> > >      >
> > >      > I'd like an operator that matches the production it is attached to,
> > >      > advances the input buffer, but doesn't add anything to the parse
> > >      > tree.
> > >      >
> > >      > Here's an example:
> > >      >
> > >      > �mulexp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
> > >      >
> > >      > the mulexp production is a sequence of four other rules, but only
> > >      > two of them are needed by the associated code. �It would be nice
> > >      > if I could write the code rule like it is above, rather than say
> > >      > this:
> > >      >
> > >      > �(lambda (x op y EOF) (* x y))
> > >      >
> > >      > Having to account for all the rules in the sequence, but really
> > >      > only caring about two of them. �Here is the example rewritten
> > >      > with '^' expressing "match the rule, advance the input, but don't
> > >      > modify the parse tree":
> > >      >
> > >      > �mulexp <- digit ^'*' digit ^EOF -> {(lambda (x y) (* x y))}
> > >      >
> > >      > Before I go inventing syntax for this use case, will you tell me if
> > >      > this is already being done with other parsers? �Have any of you had
> > >      > this problem and already solved it, and if so, what approach did 
> > > you
> > >      > take?
> > >      >
> > >      > -Alan
> > >      > --
> > >      > .i ko djuno fi le do sevzi
> > >      >
> > >      > _______________________________________________
> > >      > PEG mailing list
> > >      > [4]...@lists.csail.mit.edu
> > >      > [5]https://lists.csail.mit.edu/mailman/listinfo/peg
> > > 
> > >      _______________________________________________
> > >      PEG mailing list
> > >      [6]...@lists.csail.mit.edu
> > >      [7]https://lists.csail.mit.edu/mailman/listinfo/peg
> > > 
> > > References
> > > 
> > >    Visible links
> > >    1. mailto:math...@parboiled.org
> > >    2. mailto:math...@parboiled.org
> > >    3. http://www.parboiled.org/
> > >    4. mailto:PEG@lists.csail.mit.edu
> > >    5. https://lists.csail.mit.edu/mailman/listinfo/peg
> > >    6. mailto:PEG@lists.csail.mit.edu
> > >    7. https://lists.csail.mit.edu/mailman/listinfo/peg
> > 
> > > _______________________________________________
> > > PEG mailing list
> > > PEG@lists.csail.mit.edu
> > > https://lists.csail.mit.edu/mailman/listinfo/peg
> > 
> > 
> > -- 
> > 
> > Plasma conduit breach
> > 
> > _______________________________________________
> > PEG mailing list
> > PEG@lists.csail.mit.edu
> > https://lists.csail.mit.edu/mailman/listinfo/peg
> 
> -- 
> .i ko djuno fi le do sevzi


-- 

Internet exceeded Luser level, please wait until a luser logs off before 
attempting to log back on.

_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Re: [PEG] and-predicate with slightly different behavior?

Reply via email to