Re: [Cython] Assignment nodes, buffers and transforms

Dag Sverre Seljebotn Wed, 23 Jul 2008 07:13:24 -0700

Note (forgot where you touched upon it) that I consider the current temp 
handling in transforms a temporary solution - temps (also those written as cdef 
in my examples for notation) should be regular allocate_temps things. In fact I 
have something locally and a proposal for that which I will post when I get 
around to it (tomorrow).


Robert Bradshaw wrote:
> We have talked about this already, but I think this is valuable for  
>public record and input.
>
>On Jul 20, 2008, at 6:30 AM, Dag Sverre Seljebotn wrote:
>
>> I've had some skirmishes with Robert and Stefan on this and I think it
> is time for a thread.
>
>> Currently, for buffers, I must somehow write code for two different
> cases that have to do with assignment:
>   1. When buffers variables are somehow assigned to
>   2. When buffers have items assigned to
>
>> The question is whether to write more code in Nodes.py to do this, or
> make it easier for transforms to react to assignments. The latter seem
> more intrusive to how you usually do things, however I also see more
> usecases than just buffers (especially for optimizing refcount in a
> better way while keeping the algorithmic code simple, as well as for
> parts of type inference, as well as compile-time evaluation, and just
> overall transform complexity in number of node types to deal with).
>
>> I'm not sure if I'm suggesting that there should only be one type of
> assignment node, but I'd love for them to be a bit different:
>
>> - SetItemNode
> - SetAttributeNode
> - SetCVarNode
>
>> ...so that the node type starts corresponding to what is going on in C
> without having to examine the lhs. I feel that would simplify a lot of
> code, also existing code (and it could be made "backwards-compatible"
> through inheritance anyway). However that is a second step that relies
> on reducing to a single assignment node first (to avoid combinatorial
> explosion).
>
>> Current ways of assigning a variable x:
> a) def f(x):
> b) cdef type x = 2
> c) x = 2
> d) x, (y, z) = (2, (3, 4))
> e) x = y = 2
> f) except Exception, x
> g) for x in C:
>
>And, (not fully supported yet)
>
>h) def x(...):
>
>>
> That's all I can think of (with is turned into c) already). All of  
> these
> cases must be covered for both 1 and 2 above. What I've done so far is
> support a) and c) directly, and transformed b) into c) in the  
> parsing stage.
>
>> What I *could* do now is move on and turn d) and e) into c) as well.
>
>(d) is more subtle then the example you give suggests, as the RHS  
>could be an iterator (would you expand it to a loop) or a tuple  
>(which is common and produces highly-optimized c). Of course it's  
>possible that two could be expressed in tree form before outputting  
>the C code.
>
>For (e) (and probably several of the others, one should keep in mind  
>that using temp variables is better than creating (uniquely-named of  
>course) local variables that last the whole scope.
>
>> The effect of all these things is to i) reduce the number of  
> different node
> types, and ii) remove code from Nodes.py that requires you to know  
> about
> how all code generation/analysis work and instead add code that  
> requires
> you to know about writing tree transforms (I think that even if the
> latter has usually more lines of code, they all follow a fundamental
> principle and results in code that is more robust to changes other
> places in the code, more "isolated" code).
>
>> f) and g) can also be turned into c) but with a bit extra work and in
> less obvious ways. If you think about SetItemNode, SetAttributeNode  
> etc.
> you see that it isn't wholly unatural for a loop to reduce into such
> instructions (that is of course what is happening today too, but less
> explicit).
>
>I am curious if C compilers are better at optimizing/unrolling for  
>loops (especially with static bounds) then while loops with an  
>implicit counter...

Yes, For*From*StatNode, which would be the target node for that kind of thing. 
So add that to the list as a seperate case... and it probably shouldn't turn 
into singleassignment. 

It doesn't include object assignments so tracking it is less crucial, a lot of 
algorithms could do without it. And it cannot assign to module scope etc either.

>> All in all I'm unsure about this point -- I have to transform them  
> all,
> or write BufferNodes to wrap namenodes in the tree anyway, in which  
> case
> the transforms weren't really necesarry. So I ask for input. (Help me
> counter the exhilirating feeling I get from having Buffer.py "just  
> work"
> in more cases by refactoring unrelated pieces of code :-) )
>
>>  From my own experiences, whenever I have to do something in  
> Nodes.py or
> ExprNodes.py I invariably introduce bugs that it takes hours to find,
> because I do not really understand it. Of course I could learn, but  
> I'd
> rather write the transforms _if that is a preferred direction anyway_.
> So the ideal preferred direction is what I'd like input on, and then
> I'll take the final call on work amount myself (and on how smart  
> things
> I can cook up for f) and g), if they end up being hopeless then I'll
> drop it).
>
>I think transformations to simplify/consolidate some of the  
>assignments is a good direction, but it seems things get increasingly  
>messy and unnatural (for example with the exception catching). We are  

Good point.

>also potentially loosing information that could be used for  
>optimization (which can be recovered, e.g. as you did for (b), but  
>what about

I prefer to think about it as "refined" rather than recovered :-)

In general, with each transform you loose some and get some info, and each 
algorithm that needs the information must figure out where in the pipeline it 
wants to be. I didn't say these transforms would be done very early, just that 
I wanted buffers to be after them. You just have added flexibility rather than 
having to look at them as seperate concepts..

>
>cdef object x
>cdef int a, b
>a = b = x # currently x gets converted twice, but one could imagine  
>wanting to optimize so that it happens only once...

Two options: you either do it before/as part of the splitting up, as in:
cdef int i, j
x = i = j = somecall()

turns into
cdef object tmp1
cdef int tmp2
tmp1 = somecall()
x = tmp1
tmp2 = tmp1
i = tmp2
j = tmp2

Option 2: you realize that your time is better spent at optimizing the general 
case that could arise if one wrote the thing using seperate assignments (it is 
still optimizeable in theory) and stick a (much more advanced) algorithm after 
the splitting up, where it can benefit more assignment types.

>
>I do, however, agree with you that using transformations can make for  
>more "isolated" code, and a smaller set of Nodes to understand at the  
>end.
>
>
>Switching gears a bit, what you state you (might) want is the nodes
>
>- SetItemNode
>- SetAttributeNode
>- SetCVarNode
>
>Now we already have
>
>- ItemNode
>- AttributeNode
>- NameNode
>
>And each of them comes with a analyse_target_types(),  
>generate_assignment_code(), and generate_post_assignment_code()  
>method. All types of assignments call these methods, so you only need  
>to worry about putting the buffer code there, instead of worrying  
>about all the current (or possibly future) ways of making an  
>assignment. 

Yes, this is what I do now. (and as you know at this point this discussion is 
not for gsoc any longer) 

My point is still that you cannot so easily write a visitor to do general code 
analysis. Also, each of these function you mention often read like a lots of 
top-level if-tests to see which situation we are in (setting a module dict 
entry or a local var? And so on. So there is no correspondance in the tree with 
what is vastly different operations in CPython).

Consider the old (obsolete?) idea of Cython-inlineable compile-time operator 
overloads. Then SetItem and GetItem could be transformed into method call 
nodes. And so on (of course, it is entirely possible to just add yet another if 
test and duplicate/call into some SimpleCallNode code instead).

> Another way to go is to treat it like a type test with  
>side effects (so stuff like "(<object[int,2]>x)[0,0]" could possibly  
>work), though ignore that if you're onto a working solution.

On a tangential note, this would mean that all the auxiliary vars and buffer 
acquisiton would happen again for each such cast, so I'd rather not support it 
because it should be avoided :-) which is why I've said that I think of this as 
acting on the variable rather than the type (though there is  BufferType 
similar to TypedefType now).

Dag Sverre

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Re: [Cython] Assignment nodes, buffers and transforms

Reply via email to