2010/5/31 Allison Randal <[email protected]>

> On 5/31/10 3:11 AM, Tyler Curtis wrote:
>
>> Part of the purpose of my GSoC project is to develop tools to make
>> writing PAST::Optimizations not just possible, but actually
>> convenient.
>>
>
> Excellent, it's a much needed task.
>
>
>
>> There are existing special tools for tree transformations: for
>> example, the TGE or Tree Grammar Engine.
>> TGE provides a syntax for transformations on trees. To some extent.
>> More specifically, it provides a syntax for dispatch based on the type
>> of node in the tree to a PIR sub that does the transformation work. It
>> also has the ability to define subs to be executed on the attributes
>> of nodes of a certain type whenever a transform is performed on such a
>> node.
>>
>
> Agreed, TGE doesn't have a syntax, so you can't look there for syntax. In
> fact, a syntax is the single biggest thing TGE needs, and you could be the
> person to write that syntax. It also doesn't have the tree matching syntax
> that was planned for it (allowing more detailed matching than "this is a
> PAST::Op" node, to selecting individual features).
>
> If working on the next version of TGE inspires you, what you should know is
> that it's just an implementation of basic Attribute Grammars, which have
> been written about since the dawn of computing theory (first mentioned in an
> ancient paper by Don Knuth). They are a way of transforming trees by
> traversing up and down the tree building up "attributes" (i.e. annotations)
> on the tree.
>
> In actual practice of writing compilers, I've found that you pretty much
> never need to traverse up the tree, so a huge simplification is possible by
> only implementing downward traversals. For optimizers there's a chance you
> might need some upward traversals.
>
> Caveat: Even though they've got a great deal of history behind them, I'm no
> longer convinced that attribute grammars are the best way to write
> compilers. They push you into thinking about annotating the existing tree,
> when you really should be thinking about how to build the replacement tree
> (or replacement node for optimizations). That is, they force your brain into
> the wrong flow.
>
>
>
>  What other options do we have for syntaxes?
>>
>> One that has been mentioned to me is XPath. XPath is a syntax for
>> matching sub-trees of an XML document.
>>
>
> Yes, I also considered XPath and agreed, it's not the right fit. We can do
> better.
>
>
>  http://pastie.org/985050 is an example of how a HLL compiler could
>> apply optimizations to its generated code. Line 8 in Compiler.pm is
>> commented out, but if some of the language-specific optimizations in
>> Optimizer.pm were particularly complex, there could be a separate
>> "SomeLang::Optimizer::Actions" class that would hold action methods
>> similar to those used for normal NQP-rx grammars.
>>
>
> HLL::Compiler is definitely the place to hook in your optimization stages
> when run as part of a larger compiler. But, HLL::Compiler accepts any
> arbitrary stage of any arbitrary code, so this convenience is possible no
> matter what the optimizations are written in.
>
>
>  * Not easy to programmatically construct optimizations outside of NQP.
>>  Is this actually useful? I'm not sure.
>>
>
> Yes, it's useful.
>
>
>  I'm curious what others think of the two syntax proposals
>> included(XPath and Regexes), and if anyone has any other ideas of
>> possible syntaxes.
>>
>
> If you are inspired by Perl 6 Regexes, then go ahead and do it that way. As
> pmichaud said, it is part of the plan for Regexes so someone will do it
> eventually, and that someone could be you.
>
> There are really two parts of a tree transformation engine, one is matching
> (finding) the nodes of the existing tree, and the other is building the
> nodes of the new tree.
>
> I have doubts at how well Perl 6 Regexes will do at matching nodes. Not
> from practical experience of trying it out, mind you, but from a theoretical
> perspective. In general, a system that tries to do two completely separate
> things will end up doing both of them poorly. And, from a straight
> complexity perspective, Perl 6 Regexes are already enormously complex from
> trying to be good at natural language processing and computer language
> processing. Adding another dimension of being good at tree processing is a
> combinatorial explosion of complexity, which just may push it over the edge.
> Also, even with Perl 6 Regexes, you'd really be inventing new syntax anyway,
> you'd just be trying to force it into a string matching mold. (
> http://pastie.org/983989 is a good example of this, where the intent of
> the rule is buried under the fact that it's trying to act like string
> matching.)
>
> On the side of building up new nodes, I find the NQP syntax (not the Regex
> part, but the plain old tree transformation part, which is a crippled subset
> of Perl 6) hopelessly byzantine, and in my most recent compilers have gone
> back to writing the tree transformation rules in pure PIR. (Not even
> TGE+PIR, just plain old PIR classes with methods for each transformation
> rule.) Where I used to spend hours in frustration trying to get NQP to do
> what I wanted, I now spend minutes writing the PIR. Which is *not* to say
> that PIR is the best tool for the job, but to say that we can do so much
> better than what we're doing now.
>
> ------
>
> Rather than looking to adopt a existing syntax, can you take a step back
> and think "What do I want to do and how do I make it easy?" A quick list of
> the operations I regularly perform when writing a tree transformation are:
>
>  - Decide if this is the node I want to process (matching, which could be
> declarative syntax).
>
>  - Access attributes/children of a node: either class attribute names, hash
> keys, array keys, or iterating over hashes/arrays of attributes/children.
> (Could be declarative, but want the option of procedural for more complex
> cases.)
>
>  - Construct a new node (could be declarative syntax).
>
>  - Recursively transform children of this node (could be declarative
> syntax), into the children of the new node.
>
> And, unique to the optimization case:
>
>  - Replace an existing node (and the tree of nodes under it) with a new
> optimized node (which may have a tree of nodes under it). (Might be
> declarative, certainly easy with procedural.)
>
>
> I think if you had a simple declarative syntax for matching nodes,
> accessing attributes of nodes, and building new nodes, plus the ability to
> drop in a block of procedural code (in PIR/NQP/random-HLL, basically
> anything that compiles to a Parrot Sub), you'd have made a huge contribution
> to Parrot, and to compiler writing techniques all over the world. It doesn't
> have to be fancy (in fact, better if it isn't fancy), just something plain
> like:
>
>
> transform as_post
>
> PAST::Op(:pirop('add'),:childcount(2),:childreturns('PAST::Val'|'PAST::Var'))
> ->
> POST::Op(:name(tree.name),:pos(tree.pos),:children(tree.left.as_post,
> tree.right.as_post))
>
> which translates down to something like:
>
> .sub 'as_post' :method :multi('PAST::Op')
>    .param pmc tree
>    .local pmc post
>
>    # because parrot only multidispatches on type, rather than
>    # value, have to handle the values inside the multimethod.
>    $S0 = tree.'pirop'()
>    if $S0 == 'add' goto matched_pirop
>        # not a match, try other alternates of type 'PAST::Op'
>        # if the tree grammar has them, otherwise, return failed match
>  matched_pirop:
>
>     # test number of children and types of returns here
>
>     post = new ['PAST';'Op']
>     $S1 = tree.'name'()
>     post.'name'($S1)
>
>     $I0 = tree.'pos'()
>     post.'pos'($I0)
>
>     $P0 = tree.'left'()
>     # or $P0 = tree[0]
>     $P1 = self.'as_post'($P0)
>     post.'push'($P1)
>
>     # same for second child
>
>    .return (post)
> .end
>
>
> Or:
>
> replace op
>
> PAST::Op(:pirop('add'),:childcount(2),:childreturns('PAST::Val'|'PAST::Var'))
> ->
> { # code that returns an optimized PAST::Op, to replace the existing one in
> the tree }
>
>
> I intentionally kept the syntax here close to your original NQP-based
> example to show that it's possible to keep the familiarity of Parrot-style
> argument syntax without trying to merge two very different behaviors into
> one codebase. You could equally well go more toward regular-expressions
> like:
>
> transform as_post { <nodetype: 'PAST::Op'> <childcount: 2> }
> { # there isn't really any good syntax for the replacement part in regexes
> }
>
> or:
> transform as_post / match / replace /
>
> Or more toward attribute grammars, XSLT, Prolog, etc... or something
> completely new and different.
>
> Allison
>
> _______________________________________________
> http://lists.parrot.org/mailman/listinfo/parrot-dev
>


Hi,

Take a look to http://github.com/fperrad/lua, which still uses 2 stages with
TGE.
Real examples give some inspiration.

François
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev

Reply via email to