Re: Syntax for PAST Optimizations

Allison Randal Mon, 31 May 2010 04:41:17 -0700

On 5/31/10 3:11 AM, Tyler Curtis wrote:

Part of the purpose of my GSoC project is to develop tools to make
writing PAST::Optimizations not just possible, but actually
convenient.


Excellent, it's a much needed task.


There are existing special tools for tree transformations: for
example, the TGE or Tree Grammar Engine.
TGE provides a syntax for transformations on trees. To some extent.
More specifically, it provides a syntax for dispatch based on the type
of node in the tree to a PIR sub that does the transformation work. It
also has the ability to define subs to be executed on the attributes
of nodes of a certain type whenever a transform is performed on such a
node.

Agreed, TGE doesn't have a syntax, so you can't look there for syntax.In fact, a syntax is the single biggest thing TGE needs, and you couldbe the person to write that syntax. It also doesn't have the treematching syntax that was planned for it (allowing more detailed matchingthan "this is a PAST::Op" node, to selecting individual features).

If working on the next version of TGE inspires you, what you should knowis that it's just an implementation of basic Attribute Grammars, whichhave been written about since the dawn of computing theory (firstmentioned in an ancient paper by Don Knuth). They are a way oftransforming trees by traversing up and down the tree building up"attributes" (i.e. annotations) on the tree.

In actual practice of writing compilers, I've found that you pretty muchnever need to traverse up the tree, so a huge simplification is possibleby only implementing downward traversals. For optimizers there's achance you might need some upward traversals.

Caveat: Even though they've got a great deal of history behind them, I'mno longer convinced that attribute grammars are the best way to writecompilers. They push you into thinking about annotating the existingtree, when you really should be thinking about how to build thereplacement tree (or replacement node for optimizations). That is, theyforce your brain into the wrong flow.

What other options do we have for syntaxes?

One that has been mentioned to me is XPath. XPath is a syntax for
matching sub-trees of an XML document.

Yes, I also considered XPath and agreed, it's not the right fit. We cando better.

http://pastie.org/985050 is an example of how a HLL compiler could
apply optimizations to its generated code. Line 8 in Compiler.pm is
commented out, but if some of the language-specific optimizations in
Optimizer.pm were particularly complex, there could be a separate
"SomeLang::Optimizer::Actions" class that would hold action methods
similar to those used for normal NQP-rx grammars.

HLL::Compiler is definitely the place to hook in your optimizationstages when run as part of a larger compiler. But, HLL::Compiler acceptsany arbitrary stage of any arbitrary code, so this convenience ispossible no matter what the optimizations are written in.

* Not easy to programmatically construct optimizations outside of NQP.
  Is this actually useful? I'm not sure.


Yes, it's useful.

I'm curious what others think of the two syntax proposals
included(XPath and Regexes), and if anyone has any other ideas of
possible syntaxes.

If you are inspired by Perl 6 Regexes, then go ahead and do it that way.As pmichaud said, it is part of the plan for Regexes so someone will doit eventually, and that someone could be you.

There are really two parts of a tree transformation engine, one ismatching (finding) the nodes of the existing tree, and the other isbuilding the nodes of the new tree.

I have doubts at how well Perl 6 Regexes will do at matching nodes. Notfrom practical experience of trying it out, mind you, but from atheoretical perspective. In general, a system that tries to do twocompletely separate things will end up doing both of them poorly. And,from a straight complexity perspective, Perl 6 Regexes are alreadyenormously complex from trying to be good at natural language processingand computer language processing. Adding another dimension of being goodat tree processing is a combinatorial explosion of complexity, whichjust may push it over the edge. Also, even with Perl 6 Regexes, you'dreally be inventing new syntax anyway, you'd just be trying to force itinto a string matching mold. (http://pastie.org/983989 is a good exampleof this, where the intent of the rule is buried under the fact that it'strying to act like string matching.)

On the side of building up new nodes, I find the NQP syntax (not theRegex part, but the plain old tree transformation part, which is acrippled subset of Perl 6) hopelessly byzantine, and in my most recentcompilers have gone back to writing the tree transformation rules inpure PIR. (Not even TGE+PIR, just plain old PIR classes with methods foreach transformation rule.) Where I used to spend hours in frustrationtrying to get NQP to do what I wanted, I now spend minutes writing thePIR. Which is *not* to say that PIR is the best tool for the job, but tosay that we can do so much better than what we're doing now.


------

Rather than looking to adopt a existing syntax, can you take a step backand think "What do I want to do and how do I make it easy?" A quick listof the operations I regularly perform when writing a tree transformationare:

- Decide if this is the node I want to process (matching, which couldbe declarative syntax).

- Access attributes/children of a node: either class attribute names,hash keys, array keys, or iterating over hashes/arrays ofattributes/children. (Could be declarative, but want the option ofprocedural for more complex cases.)


 - Construct a new node (could be declarative syntax).

- Recursively transform children of this node (could be declarativesyntax), into the children of the new node.


And, unique to the optimization case:

- Replace an existing node (and the tree of nodes under it) with a newoptimized node (which may have a tree of nodes under it). (Might bedeclarative, certainly easy with procedural.)

I think if you had a simple declarative syntax for matching nodes,accessing attributes of nodes, and building new nodes, plus the abilityto drop in a block of procedural code (in PIR/NQP/random-HLL, basicallyanything that compiles to a Parrot Sub), you'd have made a hugecontribution to Parrot, and to compiler writing techniques all over theworld. It doesn't have to be fancy (in fact, better if it isn't fancy),just something plain like:



transform as_post
PAST::Op(:pirop('add'),:childcount(2),:childreturns('PAST::Val'|'PAST::Var'))
->

POST::Op(:name(tree.name),:pos(tree.pos),:children(tree.left.as_post,tree.right.as_post))


which translates down to something like:

.sub 'as_post' :method :multi('PAST::Op')
    .param pmc tree
    .local pmc post

    # because parrot only multidispatches on type, rather than
    # value, have to handle the values inside the multimethod.
    $S0 = tree.'pirop'()
    if $S0 == 'add' goto matched_pirop
        # not a match, try other alternates of type 'PAST::Op'
        # if the tree grammar has them, otherwise, return failed match
  matched_pirop:

     # test number of children and types of returns here

     post = new ['PAST';'Op']
     $S1 = tree.'name'()
     post.'name'($S1)

     $I0 = tree.'pos'()
     post.'pos'($I0)

     $P0 = tree.'left'()
     # or $P0 = tree[0]
     $P1 = self.'as_post'($P0)
     post.'push'($P1)

     # same for second child

    .return (post)
.end


Or:

replace op
PAST::Op(:pirop('add'),:childcount(2),:childreturns('PAST::Val'|'PAST::Var'))
->

{ # code that returns an optimized PAST::Op, to replace the existing onein the tree }

I intentionally kept the syntax here close to your original NQP-basedexample to show that it's possible to keep the familiarity ofParrot-style argument syntax without trying to merge two very differentbehaviors into one codebase. You could equally well go more towardregular-expressions like:


transform as_post { <nodetype: 'PAST::Op'> <childcount: 2> }
{ # there isn't really any good syntax for the replacement part in regexes }

or:
transform as_post / match / replace /

Or more toward attribute grammars, XSLT, Prolog, etc... or somethingcompletely new and different.


Allison
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev

Re: Syntax for PAST Optimizations

Reply via email to