I'm still fooling around with some experiments. The following is now working (in my local repo) as a way of transforming for-froms:

class ForInToForFrom(XPathTransform):
    @template("pyr:ForInStatNode[iterator/pyr:IteratorNode/sequence/pyr:SimpleCallNode/function/pyr:NameNode/@name = 'range']")
    def for_in_range_to_for_from_range(self, node):
        result = Nodes.ForFromStatNode(...
        ...
        return result

Everything happens on the Pyrex tree; there's no translation to XML or anything like that. Example attached (though you can't run it outside my repo, it's just for demonstration).

The question is: Is this a way forward for transforms? For some more examples, consider that one could for instance select all assignment statements that need some coercion by

"pyr:SimpleAssignmentNode[lhs/@type != rhs/@type]"

This one is contrived, since coercion won't work this way. But also consider that one can select inner functions by

"pyr:FuncDefNode//pyr:FuncDefNode"

and outer functions only by

"pyr:ModuleNode/body/pyr:FuncDefNode"

and so on.
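
To make the two function selections concrete, here is a minimal sketch that runs them with the standard library's ElementTree instead of webpath, on a hand-written XML stand-in for a streamed tree. The namespace URI, the <doc> wrapper element and the function names are assumptions made up for the illustration. ElementTree only understands a small XPath subset, so the predicate-based selections above would not work with it.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the "pyr" prefix; the real one is not given.
NS = {"pyr": "urn:example:pyr"}

# Hand-written stand-in for a streamed parse tree: a module with two outer
# functions, "foo" and "baz", where "foo" contains an inner function "bar".
XML = """
<doc xmlns:pyr="urn:example:pyr">
  <pyr:ModuleNode>
    <body>
      <pyr:FuncDefNode name="foo">
        <body>
          <pyr:FuncDefNode name="bar"/>
        </body>
      </pyr:FuncDefNode>
      <pyr:FuncDefNode name="baz"/>
    </body>
  </pyr:ModuleNode>
</doc>
"""

doc = ET.fromstring(XML)

# Inner functions: any FuncDefNode somewhere below another FuncDefNode.
inner = doc.findall(".//pyr:FuncDefNode//pyr:FuncDefNode", NS)

# Outer functions: FuncDefNodes directly in the module body.
outer = doc.findall("pyr:ModuleNode/body/pyr:FuncDefNode", NS)

inner_names = [f.get("name") for f in inner]
outer_names = [f.get("name") for f in outer]
print(inner_names)
print(outer_names)
```

With functions nested more than two levels deep, the first query can return the same element more than once in ElementTree, so a real use would need to deduplicate.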

The gains are highest if XPath selections are used for all transforms, because then the finite state machines (see below) can, in principle at least, be combined so that only one tree traversal per phase is needed, regardless of how the code is modularized into multiple transforms. (When combining, one must use a subset of XPath where only the descendant axis is available outside of predicates. I guess this is the same as XSLT match patterns?)
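
As a rough illustration of the "one traversal for many transforms" idea, the sketch below walks a toy tree once and offers every node to every registered pattern. Everything in it (the Node class, the tuple patterns, run_combined) is hypothetical and not part of Cython or webpath. The patterns are plain parent/child chains of node kinds, and the per-node loop over all patterns is the naive part that a combined automaton would replace.

```python
class Node:
    """Toy stand-in for a parse tree node (hypothetical, not a Cython class)."""
    def __init__(self, kind, *children):
        self.kind = kind
        self.children = list(children)


def matches(pattern, path):
    """True if the kinds on the path from the root to the node end with pattern."""
    return len(path) >= len(pattern) and tuple(path[-len(pattern):]) == pattern


def run_combined(root, templates):
    """Visit each node exactly once, trying every (pattern, handler) pair on it."""
    visits = 0
    path = []

    def visit(node):
        nonlocal visits
        visits += 1
        path.append(node.kind)
        for pattern, handler in templates:
            if matches(pattern, path):
                handler(node)
        for child in node.children:
            visit(child)
        path.pop()

    visit(root)
    return visits


# Two independent "transforms" that only record what they would rewrite.
inner_funcs = []
loops = []
templates = [
    (("FuncDef", "FuncDef"), inner_funcs.append),  # function directly inside a function
    (("ForIn",), loops.append),                    # any for-in loop
]

tree = Node("Module",
            Node("FuncDef", Node("FuncDef"), Node("ForIn")),
            Node("ForIn"))

visits = run_combined(tree, templates)
print(visits, len(inner_funcs), len(loops))
```

The number of node visits stays the same however many templates are registered; only the work done per node grows.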

What I've done:
- Put a subset of the W3C DOM API on top of the tree. No modifications to the Cython code tree were necessary except adding a base class (and I finally had a legitimate use for a metaclass or two. Yay!). A "side-effect" is that the tree can be streamed to XML (see example code).
- Use the webpath XPath 2.0 transform to select nodes (http://sourceforge.net/projects/webpath), and act on them on traversal.

Questions:
- Anyone know of good DOM transformation libraries for Cython?
- Does anyone think this would be useful?
- Does anyone think this could be a standard for writing transforms?
- Any other good uses for a W3C DOM on our parse trees? (It is a separate component.) I'll assume that streaming in and out of XSLT is not going to be convenient, but something else perhaps?

Some notes:
- It currently scales horribly with the number of "templates": one full tree traversal per match. To fix this, one either has to find a better XPath library (which must be hacked a bit), or an XSLT processor or similar implemented entirely in Python, or spend a full-time week improving webpath by using a finite-state automaton library (one that does the standard conversion from non-deterministic to deterministic automata; there are several good ones and this is not too hard to do).

Does it matter if we do 30 traversals on the tree rather than 2-3? As long as it can be optimized "in principle"?

- On the other hand, once that is done, one can "combine" tree traversals so that multiple transforms work in the same traversal, meaning that the number of traversals will be reduced compared to what is in sight now.

- But, the current less efficient implementation is working.
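
For reference, the standard conversion from a non-deterministic to a deterministic automaton mentioned in the notes is the subset construction. Below is a minimal sketch of it for an NFA without epsilon moves. The automaton is a made-up example over node kinds "F" (a function definition) and "x" (anything else) that accepts paths ending in two function definitions. None of this is taken from webpath.

```python
from collections import deque


def nfa_to_dfa(transitions, start, accepting):
    """Subset construction for an NFA without epsilon moves.

    transitions maps (state, symbol) to a set of next states.
    Returns (dfa_transitions, dfa_start, dfa_accepting); every DFA state
    is a frozenset of NFA states.
    """
    symbols = {sym for (_, sym) in transitions}
    dfa_start = frozenset([start])
    dfa_transitions = {}
    dfa_accepting = set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for sym in symbols:
            target = frozenset(
                nxt
                for state in current
                for nxt in transitions.get((state, sym), ())
            )
            if not target:
                continue  # no move on this symbol: the DFA rejects here
            dfa_transitions[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_transitions, dfa_start, dfa_accepting


def dfa_accepts(dfa, word):
    """Run the DFA over a sequence of symbols."""
    dfa_transitions, state, dfa_accepting = dfa
    for sym in word:
        state = dfa_transitions.get((state, sym))
        if state is None:
            return False
    return state in dfa_accepting


# Hypothetical NFA: stay in 0 on anything, guess that the last two kinds are F, F.
nfa = {
    (0, "F"): {0, 1},
    (0, "x"): {0},
    (1, "F"): {2},
}
dfa = nfa_to_dfa(nfa, start=0, accepting={2})
dfa_states = {s for (s, _) in dfa[0]} | set(dfa[0].values())

print(len(dfa_states))
print(dfa_accepts(dfa, ["x", "F", "F"]))
print(dfa_accepts(dfa, ["F", "x", "F"]))
```

Several patterns could in principle be merged into one such automaton and fed the kinds seen along the current root-to-node path, which is what would make the single combined traversal cheap.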

I will probably leave it at this for now because the gains seem smaller than the effort, but if anyone thinks this is interesting then speak up and we can see.

--
Dag Sverre

import Cython.Compiler.Main
import Cython.Compiler.Errors
from Cython.Compiler.Symtab import BuiltinScope, ModuleScope
import tempfile
import os

from Cython.Compiler.CythonNode import Document, walkdom

from Cython.Compiler.XPathTransform import XPathTransform, template

def w3cfix(tree):
    doc = Document()
    for node in walkdom(tree):
        if node.ownerDocument != doc:
            node.ownerDocument = doc
    doc.childNodes = [tree]


# The below duplicates enough of Main.py behaviour in order for isolated experimentation of
# parsing and playing with the tree. Sigh.
def parse_string_to_pyrex_tree(s):
    cython = Cython.Compiler.Main.Context([])
    fileno, source = tempfile.mkstemp(text=True)
    f = os.fdopen(fileno, "w")
    print >>f, s
    f.close()
    module_name = "testbed"
    initial_pos = (source, 1, 0)
    Cython.Compiler.Errors.open_listing_file(None)
    scope = cython.find_module(module_name, pos = initial_pos, need_pxd = 0)
    tree = cython.parse(source, scope.type_names, pxd = 0, full_module_name = module_name)
    os.unlink(source)
    w3cfix(tree)
    return tree


def dumpxml(tree):
    from xml.dom.ext import PrettyPrint
    from xml.dom.minidom import Document
    doc = Document()
    doc.appendChild(doc.importNode(tree, deep=True))
    PrettyPrint(doc)

import Cython.Compiler.Nodes as Nodes
import Cython.Compiler.ExprNodes as ExprNodes

class ForInToForFrom(XPathTransform):
    @template("pyr:ForInStatNode[iterator/pyr:IteratorNode/sequence/pyr:SimpleCallNode/function/pyr:NameNode/@name = 'range']")
    def for_in_range_to_for_from_range(self, node):
        result = Nodes.ForFromStatNode(
            pos=node.pos,
            target=node.target,
            body=node.body,
            else_clause=node.else_clause,
            step=None,
            relation1 = "<=",
            relation2 = "<"
        )

        range_func = node.iterator.sequence
        if len(range_func.args) >= 2:
            result.bound1 = range_func.args[0]
            result.bound2 = range_func.args[1]
            if len(range_func.args) == 3:
                result.step = range_func.args[2]
        else:
            result.bound1 = ExprNodes.IntNode(pos=node.pos, value=0)
            result.bound2 = range_func.args[0]
        return result

A = parse_string_to_pyrex_tree("""
a = True
def foo():
    for i in range(10):
        print "Hello"
""")


pt = ForInToForFrom()

pt.initialize("testbed")
pt.process_tree(A)

dumpxml(A)
