Re: [Cython] bug in optimize_min_max

Stefan Behnel Thu, 15 Jul 2010 20:52:48 -0700

Robert Bradshaw, 16.07.2010 04:59:
> On Thu, Jul 15, 2010 at 1:34 PM, Stefan Behnel wrote:
>> The min/max transformation is a perfect example here. It expands this code
>>
>>      x = min(a,b,c)
>>
>> into this code
>>
>>      x = a
>>      if x>  b: x = b
>>      if x>  c: x = c
>>
>> Now, you have to evaluate each input parameter exactly once, and each
>> evaluation can potentially throw an exception. How would you do this in a
>> source-level transformation? The only way I see is to introduce temp nodes
>> already at this point,
>
> The expanded code would be
>
> res, _b, _c = a, b, c
> if res>  _b: res = _b
> if res>  _c: res = _c
> # cleanup _b, _c
> x = res
>
> so the assignment (and type of the user-visible x) is orthogonal. Of
> course, a could be an object too...One could even do
>
> r1, _b, _c = a, b, c
> r2 = r1 if r1<  _b else _b
> r3 = r2 if r2<  _c else _c
>
> so it stays in C land as long as possible, but that may not be worth it.


Hmmm, I didn't think of that. That might even fix the problem at hand. But 
it also means that we may (currently) end up with a lot more ref-counting 
than strictly necessary for Python values. I'll have to look at the details.


>> Otherwise, both the comparison and the
>> assignment would lead to redundant coercions. How would you assure that?
>> Note that type inference will not work here, as it only impacts the type of
>> x. All other type analysis steps are local.
>>
>> I can see us making it less local by injecting back-references to those
>> nodes that depend on a node's type. That might allow us to minimise the
>> number of neccessary coercions for a value. However, to do this right, we
>> need flow analysis to even see which code paths actually lead to a coercion.
>
> Hopefully a simple kind of common subexpression elimination could
> remove the redundant coercions (as well as help out in a lot of other
> places).

You can't really do common subexpression elimination on Python code as even 
operators may have side effects. You can't even know if there may be *no* 
be side effects before the type analysis tells you that you are only 
dealing with plain C values.

No, I think the only way to deal with this early enough is to introduce a 
ReusedExpression node, and to handle that basically everywhere, also during 
type analysis. We currently do a lot of node type checks in the optimiser, 
so we'll need a better way to deal with that, too.


> Also, the 1-value min/max could be optimized away

Nope:

   l = [1,2,3]
   assert min(l) == 1

can't be optimised away at all, whereas

   min(1)

raises a TypeError. The third use case, generator expressions, are not 
currently supported in Cython, but would also be nice to have, even if 
they'd only deal with Python values in most cases. The reduced looping 
overhead and the inlined generator iteration is always worth it.

Note that min() is different from sum() here, as

    cdef int x = min(i*2 for i in xrange(2, 1000000000000000000000))

requires Python long values to evaluate correctly before it can assign the 
C int 4 to x, whereas

    cdef int x = sum(i*2 for i in xrange(2, 1000000000000000000000))

only requires i to be a Python variable but can otherwise safely overflow 
several times and thus use a plain C int for the intermediate sums.


>, and the 2-value one could be
>
>      x = a if a<  b else b

Right, that works. Absolutely worth doing as it's a pretty common case and 
the resulting code avoids an unnecessary initial assignment.


> in conjunction with another possible optimization
>
>      (a if X else b).coerce_to(T)
>
> becomes
>
>      a.coerce_to(T) if X else b.coerce_to(T)

I thought we did that already.

Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Re: [Cython] bug in optimize_min_max

Reply via email to