2010/5/31 Allison Randal <[email protected]> > On 5/31/10 3:11 AM, Tyler Curtis wrote: > >> Part of the purpose of my GSoC project is to develop tools to make >> writing PAST::Optimizations not just possible, but actually >> convenient. >> > > Excellent, it's a much needed task. > > > >> There are existing special tools for tree transformations: for >> example, the TGE or Tree Grammar Engine. >> TGE provides a syntax for transformations on trees. To some extent. >> More specifically, it provides a syntax for dispatch based on the type >> of node in the tree to a PIR sub that does the transformation work. It >> also has the ability to define subs to be executed on the attributes >> of nodes of a certain type whenever a transform is performed on such a >> node. >> > > Agreed, TGE doesn't have a syntax, so you can't look there for syntax. In > fact, a syntax is the single biggest thing TGE needs, and you could be the > person to write that syntax. It also doesn't have the tree matching syntax > that was planned for it (allowing more detailed matching than "this is a > PAST::Op" node, to selecting individual features). > > If working on the next version of TGE inspires you, what you should know is > that it's just an implementation of basic Attribute Grammars, which have > been written about since the dawn of computing theory (first mentioned in an > ancient paper by Don Knuth). They are a way of transforming trees by > traversing up and down the tree building up "attributes" (i.e. annotations) > on the tree. > > In actual practice of writing compilers, I've found that you pretty much > never need to traverse up the tree, so a huge simplification is possible by > only implementing downward traversals. For optimizers there's a chance you > might need some upward traversals. > > Caveat: Even though they've got a great deal of history behind them, I'm no > longer convinced that attribute grammars are the best way to write > compilers. They push you into thinking about annotating the existing tree, > when you really should be thinking about how to build the replacement tree > (or replacement node for optimizations). That is, they force your brain into > the wrong flow. > > > > What other options do we have for syntaxes? >> >> One that has been mentioned to me is XPath. XPath is a syntax for >> matching sub-trees of an XML document. >> > > Yes, I also considered XPath and agreed, it's not the right fit. We can do > better. > > > http://pastie.org/985050 is an example of how a HLL compiler could >> apply optimizations to its generated code. Line 8 in Compiler.pm is >> commented out, but if some of the language-specific optimizations in >> Optimizer.pm were particularly complex, there could be a separate >> "SomeLang::Optimizer::Actions" class that would hold action methods >> similar to those used for normal NQP-rx grammars. >> > > HLL::Compiler is definitely the place to hook in your optimization stages > when run as part of a larger compiler. But, HLL::Compiler accepts any > arbitrary stage of any arbitrary code, so this convenience is possible no > matter what the optimizations are written in. > > > * Not easy to programmatically construct optimizations outside of NQP. >> Is this actually useful? I'm not sure. >> > > Yes, it's useful. > > > I'm curious what others think of the two syntax proposals >> included(XPath and Regexes), and if anyone has any other ideas of >> possible syntaxes. >> > > If you are inspired by Perl 6 Regexes, then go ahead and do it that way. As > pmichaud said, it is part of the plan for Regexes so someone will do it > eventually, and that someone could be you. > > There are really two parts of a tree transformation engine, one is matching > (finding) the nodes of the existing tree, and the other is building the > nodes of the new tree. > > I have doubts at how well Perl 6 Regexes will do at matching nodes. Not > from practical experience of trying it out, mind you, but from a theoretical > perspective. In general, a system that tries to do two completely separate > things will end up doing both of them poorly. And, from a straight > complexity perspective, Perl 6 Regexes are already enormously complex from > trying to be good at natural language processing and computer language > processing. Adding another dimension of being good at tree processing is a > combinatorial explosion of complexity, which just may push it over the edge. > Also, even with Perl 6 Regexes, you'd really be inventing new syntax anyway, > you'd just be trying to force it into a string matching mold. ( > http://pastie.org/983989 is a good example of this, where the intent of > the rule is buried under the fact that it's trying to act like string > matching.) > > On the side of building up new nodes, I find the NQP syntax (not the Regex > part, but the plain old tree transformation part, which is a crippled subset > of Perl 6) hopelessly byzantine, and in my most recent compilers have gone > back to writing the tree transformation rules in pure PIR. (Not even > TGE+PIR, just plain old PIR classes with methods for each transformation > rule.) Where I used to spend hours in frustration trying to get NQP to do > what I wanted, I now spend minutes writing the PIR. Which is *not* to say > that PIR is the best tool for the job, but to say that we can do so much > better than what we're doing now. > > ------ > > Rather than looking to adopt a existing syntax, can you take a step back > and think "What do I want to do and how do I make it easy?" A quick list of > the operations I regularly perform when writing a tree transformation are: > > - Decide if this is the node I want to process (matching, which could be > declarative syntax). > > - Access attributes/children of a node: either class attribute names, hash > keys, array keys, or iterating over hashes/arrays of attributes/children. > (Could be declarative, but want the option of procedural for more complex > cases.) > > - Construct a new node (could be declarative syntax). > > - Recursively transform children of this node (could be declarative > syntax), into the children of the new node. > > And, unique to the optimization case: > > - Replace an existing node (and the tree of nodes under it) with a new > optimized node (which may have a tree of nodes under it). (Might be > declarative, certainly easy with procedural.) > > > I think if you had a simple declarative syntax for matching nodes, > accessing attributes of nodes, and building new nodes, plus the ability to > drop in a block of procedural code (in PIR/NQP/random-HLL, basically > anything that compiles to a Parrot Sub), you'd have made a huge contribution > to Parrot, and to compiler writing techniques all over the world. It doesn't > have to be fancy (in fact, better if it isn't fancy), just something plain > like: > > > transform as_post > > PAST::Op(:pirop('add'),:childcount(2),:childreturns('PAST::Val'|'PAST::Var')) > -> > POST::Op(:name(tree.name),:pos(tree.pos),:children(tree.left.as_post, > tree.right.as_post)) > > which translates down to something like: > > .sub 'as_post' :method :multi('PAST::Op') > .param pmc tree > .local pmc post > > # because parrot only multidispatches on type, rather than > # value, have to handle the values inside the multimethod. > $S0 = tree.'pirop'() > if $S0 == 'add' goto matched_pirop > # not a match, try other alternates of type 'PAST::Op' > # if the tree grammar has them, otherwise, return failed match > matched_pirop: > > # test number of children and types of returns here > > post = new ['PAST';'Op'] > $S1 = tree.'name'() > post.'name'($S1) > > $I0 = tree.'pos'() > post.'pos'($I0) > > $P0 = tree.'left'() > # or $P0 = tree[0] > $P1 = self.'as_post'($P0) > post.'push'($P1) > > # same for second child > > .return (post) > .end > > > Or: > > replace op > > PAST::Op(:pirop('add'),:childcount(2),:childreturns('PAST::Val'|'PAST::Var')) > -> > { # code that returns an optimized PAST::Op, to replace the existing one in > the tree } > > > I intentionally kept the syntax here close to your original NQP-based > example to show that it's possible to keep the familiarity of Parrot-style > argument syntax without trying to merge two very different behaviors into > one codebase. You could equally well go more toward regular-expressions > like: > > transform as_post { <nodetype: 'PAST::Op'> <childcount: 2> } > { # there isn't really any good syntax for the replacement part in regexes > } > > or: > transform as_post / match / replace / > > Or more toward attribute grammars, XSLT, Prolog, etc... or something > completely new and different. > > Allison > > _______________________________________________ > http://lists.parrot.org/mailman/listinfo/parrot-dev >
Hi, Take a look to http://github.com/fperrad/lua, which still uses 2 stages with TGE. Real examples give some inspiration. François
_______________________________________________ http://lists.parrot.org/mailman/listinfo/parrot-dev
