I've recently finished up a round of refactors and feature additions to Parrot's tree grammar engine (TGE). (TGE is the part of the compiler tools that takes a raw parse tree from a language parser written in PGE and transforms that parse tree into an abstract syntax tree, then an opcode syntax tree, and ultimately into a PIR source file.)

In addition to general cleanups that make the code saner and more maintainable, these are the significant feature changes:

* The TGE compiler, the component that compiles tree-grammar source files (.tg) down to PIR code, now supports a new, more sensible syntax for defining tree-grammar rules:

  transform <rulename> (<pattern>) <modifiers> { <transformation code> }

(The word "transform" is both a verb "to alter something" and a noun "a
rule for making a transformation".)

For example:

  transform result (PAST::Var) :language('PIR') {
    .local pmc result
    result = new 'POST::Var'
    # ...
  }

This defines a transform named 'result' that applies to nodes of type 'PAST::Var'. In addition, it declares that the syntax used inside the body of the rule is PIR. (PIR is the only valid language at the moment, but others are on the way...)

* The TGE compiler now handles a 'grammar' keyword, which compiles down to the appropriate PIR instructions to create a class that inherits from another class:

  grammar MyTreeGrammar is TGE::Grammar;

This means that .tg files now compile down to complete PIR (or bytecode) class libraries.

TGE::Grammar is the base class of all tree grammars, but you might inherit from your own custom subclass:

  grammar TweenGrammar is TGE::Grammar;

  # ...
  grammar MyTreeGrammar is TweenGrammar;

* The TGE compiler now supports POD, so you can include POD documentation in your .tg files.

* Transforms are now methods on the tree-grammar class. You can define utility methods or object attributes in the grammar. (For example, the TGE::Grammar base class now has an attribute 'symbols' that can be used to track symbols during transformations. This is most useful on the AST->OST transformation.)
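
As a rough analogy only, here is the "transforms are methods on the grammar class" idea sketched in Python rather than in TGE or PIR. Every name in it (Node, Grammar, apply, fresh_temp, the "$P" register names, and the "<rulename>_<node kind>" method naming) is hypothetical and chosen for illustration; it does not show how TGE is implemented or what the compiled PIR looks like. It only shows a rule being looked up by rule name and node type, a utility method inherited from an intermediate grammar class, and a shared 'symbols' attribute used during the transformation.

```python
class Node:
    """A minimal tree node; `kind` stands in for a node type such as PAST::Var."""

    def __init__(self, kind, name=None, children=()):
        self.kind = kind
        self.name = name
        self.children = list(children)


class Grammar:
    """Hypothetical base grammar class (an analogy, not TGE::Grammar itself)."""

    def __init__(self):
        # Shared state visible to every transform, similar in spirit to 'symbols'.
        self.symbols = {}

    def apply(self, rulename, node):
        # Find the transform as a method named "<rulename>_<node kind>".
        method = getattr(self, f"{rulename}_{node.kind}", None)
        if method is None:
            raise LookupError(f"no transform {rulename!r} for {node.kind!r}")
        return method(node)


class TweenGrammar(Grammar):
    """Intermediate grammar that contributes a utility method."""

    def fresh_temp(self):
        # Hand out a new placeholder name for each symbol not seen before.
        return f"$P{len(self.symbols)}"


class MyTreeGrammar(TweenGrammar):
    """Concrete grammar: each transform is an ordinary method."""

    def result_Var(self, node):
        # Record the variable on first sight, then reuse the same entry.
        if node.name not in self.symbols:
            self.symbols[node.name] = self.fresh_temp()
        return ("POST::Var", self.symbols[node.name])

    def result_Op(self, node):
        # Transform the children by applying the same rule recursively.
        return ("POST::Op", [self.apply("result", c) for c in node.children])


grammar = MyTreeGrammar()
tree = Node("Op", children=[Node("Var", "x"), Node("Var", "y"), Node("Var", "x")])
output = grammar.apply("result", tree)
```

In this sketch the two occurrences of "x" map to the same entry because the lookup goes through the grammar object's shared attribute, which is the kind of bookkeeping the 'symbols' attribute is meant for.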

Allison
