Re: [Cython] Results of XPathTransform / W3CDOM experiments

Robert Bradshaw Mon, 07 Apr 2008 11:19:30 -0700

On Apr 6, 2008, at 4:43 AM, Dag Sverre Seljebotn wrote:
> I'm still fooling around with some experiments. The following is  
> now working (in my local repo) as a way of transforming for-froms:
>
> class ForInToForFrom(XPathTransform):
>    @template("pyr:ForInStatNode[iterator/pyr:IteratorNode/sequence/pyr:SimpleCallNode/function/pyr:NameNode/@name = 'range']")
>    def for_in_range_to_for_from_range(self, node):
>       result = Nodes.ForFromStatNode(...
>       ...
>       return result
>
> Everything happens on the Pyrex tree, there's no translation to XML  
> or anything like that. Example attached (though you can't run it  
> outside my repo, it's just for demonstration).
>
> The question is: Is this a way forward for transforms?


Perhaps, but I doubt it. I would echo Fabrizio's call for a dozen or
so transformations that one would want to do, with examples showing
that the XPath way of going about it is cleaner. Also, I don't see the
end user writing their own transformations much (nor do I think it's a
good idea to encourage it; it makes the language much more obscure).
It also greatly increases the dependencies needed to run Cython.
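
For comparison, here is a rough sketch of the same kind of match (a for
loop whose iterator is a call to the name 'range') written as a plain
visitor with no XPath library involved. It is an analogy only: it runs on
Python's standard `ast` module, not on the Pyrex/Cython tree, and the names
`RangeLoopFinder` and `find_range_loops` are made up for illustration.

```python
import ast


class RangeLoopFinder(ast.NodeVisitor):
    """Collect for-loops whose iterator is a direct call to the name 'range'."""

    def __init__(self):
        self.matches = []

    def visit_For(self, node):
        it = node.iter
        # Rough equivalent of the predicate
        # [iterator/.../SimpleCallNode/function/NameNode/@name = 'range']
        if (isinstance(it, ast.Call)
                and isinstance(it.func, ast.Name)
                and it.func.id == "range"):
            self.matches.append(node)
        # Keep descending so nested loops are examined as well.
        self.generic_visit(node)


def find_range_loops(source):
    """Return the loop-variable names of all matching for-loops, in visit order."""
    finder = RangeLoopFinder()
    finder.visit(ast.parse(source))
    return [n.target.id for n in finder.matches if isinstance(n.target, ast.Name)]
```

The predicate is spelled out as ordinary `isinstance` checks, which is more
verbose than the XPath string but needs nothing beyond Python itself.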

> For some more examples, consider that one could for instance select  
> all equality statements that must have some coercion by
>
> "pyr:SimpleAssignmentNode[lhs/@type != rhs/@type]"
>
> But this is contrived, coercion won't work this way.

I think coercion (especially the relationship between types) is too  
complicated to be done this way. (OK, maybe it's possible, but it  
would almost certainly be horribly inefficient and hard to understand).

> But also consider that one can select inner functions by
>
> "pyr:FuncDefNode//pyr:FuncDefNode"
>
> and outer functions only by
>
> "pyr:ModuleNode/body/pyr:FuncDefNode"
>
> and so on.
>
> The gains are highest if XPath selections are used for all  
> transforms written, because then the finite state machines (see  
> below) can (in principle at least) be combined so that only one  
> tree traversal per phase is needed regardless of how the code is  
> modularized into multiple transforms. (If combining, one must use a  
> subset of XPath where only the descendants axis is available  
> outside of predicates, I guess this is the same as XSLT match  
> statements?).

I think most of the code processing is done in "phases" rather than a  
bunch of transforms that can be done all at once. Optimizations are  
perhaps an exception.
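
To make the traversal-count argument concrete, here is a toy sketch of
running several node handlers one traversal each versus merging them into
a single traversal. `Node`, `run_separately` and `run_combined` are
hypothetical names, matching is by a simple node-kind string rather than by
XPath, and the handlers only observe nodes. Handlers that rewrite the tree
would interact with each other and would need more care than this shows.

```python
class Node:
    """Toy tree node; a stand-in for a compiler tree node (hypothetical)."""

    def __init__(self, kind, *children):
        self.kind = kind
        self.children = list(children)


def run_separately(root, handlers):
    """One full traversal per (kind, handler) pair; returns total node visits."""
    visits = 0
    for kind, handler in handlers:
        stack = [root]
        while stack:
            node = stack.pop()
            visits += 1
            if node.kind == kind:
                handler(node)
            stack.extend(reversed(node.children))
    return visits


def run_combined(root, handlers):
    """A single traversal; each node is offered to every handler for its kind."""
    by_kind = {}
    for kind, handler in handlers:
        by_kind.setdefault(kind, []).append(handler)
    visits = 0
    stack = [root]
    while stack:
        node = stack.pop()
        visits += 1
        for handler in by_kind.get(node.kind, ()):
            handler(node)
        stack.extend(reversed(node.children))
    return visits
```

With N nodes and T handlers the first function makes N*T node visits and
the second makes N, which is the gain being discussed. Whether it matters
in practice depends on how many transforms really can share a phase.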

> What I've done:
> - Put a subset of the W3C DOM API on top of the tree. No
> modifications to the Cython code tree were necessary except adding a
> base class (and I finally had a legitimate use for a metaclass or
> two. Yay!). A "side-effect" is that the tree can be streamed to XML
> (see example code).
> - Use the webpath XPath 2.0 transform to select nodes
> (http://sourceforge.net/projects/webpath), and act on them on traversal.
>
> Questions:
> - Anyone know of good DOM transformation libraries for Cython?

Perhaps lxml does this?
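
As a small illustration of selecting nodes on an XML view of a tree, the
inner-function and module-level-function selections quoted earlier can be
tried with the standard library's `xml.etree.ElementTree`. This is a sketch
under assumptions: the XML below is a hand-written stand-in (not actual
output of the DOM layer), the `pyr:` namespace prefix is dropped, and
ElementTree supports only a limited subset of XPath, not the XPath 2.0 that
webpath targets.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of a tiny tree; element names only mimic
# the node class names used in the quoted XPath expressions.
DOC = """
<ModuleNode>
  <body>
    <FuncDefNode name="outer">
      <body>
        <FuncDefNode name="inner"/>
      </body>
    </FuncDefNode>
    <FuncDefNode name="other"/>
  </body>
</ModuleNode>
"""


def select_names(xml_text, path):
    """Return the 'name' attribute of every element matched by path."""
    root = ET.fromstring(xml_text)
    return [el.get("name") for el in root.findall(path)]


def inner_functions(xml_text):
    # Analogue of "pyr:FuncDefNode//pyr:FuncDefNode".
    return select_names(xml_text, ".//FuncDefNode//FuncDefNode")


def module_level_functions(xml_text):
    # Analogue of "pyr:ModuleNode/body/pyr:FuncDefNode"; the root element
    # is already the ModuleNode, so the path starts at its "body" child.
    return select_names(xml_text, "body/FuncDefNode")
```

lxml provides a fuller XPath 1.0 implementation, but it is an extra
dependency, so the concern about requirements applies to it as well.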

> - Does anyone think this would be useful?
> - Does anyone think this could be a standard for writing transforms?
> - Any other good uses for a W3C DOM on our parse trees? (it is a
> separate component) I'll assume that streaming in and out of XSLT
> is not going to be convenient, but something else perhaps?
>
> Some notes:
> - It currently scales horribly with the number of "templates": one
> full tree traversal per match. To fix this, one either has to find
> a better XPath library (which must be hacked a bit), an XSLT
> processor or similar implemented entirely in Python, or spend a
> full-time week improving webpath by using a finite-state automata
> library (one that does the standard conversion of non-deterministic
> automata to deterministic automata; there are several good ones
> and this is not too hard to do).

Again, there's the question of dependencies. I'd rather not require
anything but Python itself.

> Does it matter if we do 30 traversals on the tree rather than 2-3?  
> As long as it can be optimized "in principle"?
>
> - On the other hand, once that is done, one can "combine" tree  
> traversals so that multiple transforms work in the same traversal,  
> meaning that the number of traversals will be reduced compared to  
> what is in sight now.
>
> - But, the current less efficient implementation is working.
>
> I will probably leave it at this for now because the gains seem
> smaller than the effort, but if anyone thinks this is interesting
> then speak up and we can see.

Perhaps it can be offered as a plugin, so people who want to do  
things like this can use it. But I'm not convinced that this is the  
direction we want to take for the Cython core.

- Robert
