Dag Sverre Seljebotn wrote:
> Stefan Behnel wrote:
>> now that we can guarantee Python semantics for the optimised for-in-range
>> loop, should we enable the optimisation also when the loop variable is
>> not explicitly typed, but when the range is constant and clearly within
>> int bounds?
>
> If this is done, I'd go even further and always do it. At least under
> Python 2:
>
>  >>> for i in xrange(10**90): pass
> ...
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> OverflowError: long int too large to convert to int

Well, yes, that might be the case in Py2.x. Py3 allows you to do that -
although it would likely take at least a couple of hours to run. I bet the
same thing in Cython would just be optimised away and run in no time. ;)

We could auto-type the (hidden!) temp loop variable to "unsigned long long
int" if the value range is not known at compile time (i.e. having
non-constant, untyped range parameters). That would still be a lot faster
than iterating through Python int/long objects, but it wouldn't break
for-loops that just slip slightly over the 2**31 border (which might not
be that uncommon anymore these days).
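As a minimal sketch of that idea (purely hypothetical, not Cython's actual implementation), the type choice for the hidden temp variable could look something like this, assuming a 32-bit C int and a 64-bit unsigned long long:

```python
def pick_loop_var_type(stop, known_at_compile_time):
    """Pick a C type for the hidden temp variable of 'for i in range(stop)'.

    Hypothetical helper: 'stop' is the range bound (or None if unknown),
    'known_at_compile_time' says whether it is a compile-time constant.
    """
    INT_MAX = 2**31 - 1       # assuming 32-bit C int
    ULLONG_MAX = 2**64 - 1    # assuming 64-bit unsigned long long int
    if known_at_compile_time:
        if stop <= INT_MAX:
            return "int"
        if stop <= ULLONG_MAX:
            return "unsigned long long int"
        return "object"       # too large: fall back to Python objects
    # bound unknown at compile time: be generous, as proposed above
    return "unsigned long long int"
```

This keeps the common small-range case on a plain C int while still surviving loops that slip past 2**31.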


> But I really think that range() is such a common use, and super-ints so
> seldom, that I'd propose this as the one place where it would make sense
> to break Python compatibility even if Py3 fixes this (but keep a
> cython.bigrange which isn't optimized -- Sage has a corresponding srange
> I think).

For now, you can do

    cdef object large_loop_range = xrange(10**90)
    for i in large_loop_range: pass

It'll be a while until Cython optimises that...

This might also work:

    for i in <object>xrange(10**90): pass

and I even find it kind of explicit.


> My instinct is to wait until we can start implementing some kind of
> simplistic type inference transform, with this as the first simple case
> to deal with, rather than specific code for the for-in-range. That would
> depend on flow analysis though.

Not necessarily. It's actually extremely common to have a single point of
assignment to a variable in a function, e.g. when building a list, or when
running a single "for i in ..." loop. It shouldn't be too hard to optimise
just this specific case in a transform, and it would give you a simplistic
kind of type inference that should hit in quite a number of cases. Even
running it before type analysis would catch code like "l = [...]".
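To make the single-assignment idea concrete, here is a rough sketch (using CPython's ast module, not Cython's own tree) of how a transform could spot variables with exactly one assignment point, counting for-loop targets as assignments:

```python
import ast

def single_assignment_targets(source):
    """Return the set of names assigned exactly once in the given source,
    counting both plain assignments and for-loop targets.

    Sketch only: a real transform would work on Cython's own tree and
    handle tuple targets, augmented assignment, etc.
    """
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, ast.For):
            targets = [node.target]
        else:
            continue
        for t in targets:
            if isinstance(t, ast.Name):
                counts[t.id] = counts.get(t.id, 0) + 1
    return {name for name, n in counts.items() if n == 1}

src = """
def f(x):
    l = [1, 2, 3]        # single assignment: l could be typed as list
    for i in range(x):   # one assignment point for i ...
        l.append(i)
    i = "reused"         # ... but this second one disqualifies i
"""
```

Here only "l" qualifies; "i" is assigned twice and would have to stay untyped.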

It would also lead to funny effects, such as having a loop inside a
function run slower when you add a second one that uses the same loop
variable :) However, this could only be fixed by live variable analysis
(which isn't all that trivial for Python, given that e.g. "for i in
range(X)" may not assign to i at all when the range turns out empty!)

JIC: http://en.wikipedia.org/wiki/Live_variable_analysis
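For illustration, the core of such an analysis is a backward pass computing live-in = uses ∪ (live-out − defs) per statement. A toy version over straight-line code (hypothetical (defs, uses) pairs standing in for real statements) shows how a redefinition of "i" kills the earlier one, so the two loops could get independent typed temps:

```python
def liveness(stmts):
    """Backward live-variable pass over straight-line code.

    Each statement is a (defs, uses) pair of name sets; returns the
    live-in set for each statement, in program order.
    """
    live = set()
    live_in = []
    for defs, uses in reversed(stmts):
        live = (live - set(defs)) | set(uses)  # live-in = uses | (out - defs)
        live_in.append(set(live))
    return list(reversed(live_in))

# Two loops reusing the same loop variable:
stmts = [
    ({"i"}, {"n"}),   # for i in range(n): ...
    (set(), {"i"}),   # ... body uses i
    ({"i"}, {"m"}),   # for i in range(m): ...  (redefines i)
    (set(), {"i"}),   # ... body uses i
]
```

At the second loop's header the first "i" is no longer live, which is exactly the information the transform above would need.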

Stefan

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
