Robert Bradshaw, 26.10.2009 19:09:
> On Oct 25, 2009, at 7:33 AM, Stefan Behnel wrote:
>> cdef int* x = ...
>> py_int_list = x[4:20:-1]
>>
>> basically a short form for this:
>>
>> py_int_list = [ int(x[i]) for i from 20 >= i > 4 by -1 ]
>
> While turning them into a list may be useful, we should consider if
> this will be backwards incompatible with conversion to a SIMD/memory
> type. What could be more safely (and explicitly) done is
>
> cdef int* x
> py_int_list = list(x[4:20:-1])
>
> or
>
> py_int_list = tuple(x[4:20:-1])
Makes sense to me. We optimise many similar calls already, so IMO that's
the perfect way of spelling it.
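For comparison, here is roughly what one writes by hand today for the
forward case (just a sketch; the get_data() helper and the bounds are
purely illustrative, not part of any proposal):

cdef int* x = get_data()   # get_data(): made-up helper returning an int*
# hand-written equivalent of the proposed list(x[4:20]):
py_int_list = [x[i] for i in range(4, 20)]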
> with conversion to object still left for future debate.
ACK. That's easy to defer, given that it currently fails at compile time.
>> Using this in a loop context would also be easy to optimise, i.e.
>>
>> cdef int* x = ...
>> cdef Py_ssize_t length = ...
>> cdef int i
>>
>> for i in x[:length]:
>>     ...
>>
>> would be implemented as a C for loop from 0 to length-1, and pointer
>> index access to get a value for i in each step (or maybe a running
>> pointer variable, not sure what's better in C code).
>
> I see no reason why we couldn't already optimize such a loop, even for
> non-python compatible types i.
Well, I have working code to make this efficient:
cdef char* s = "...."
for c in s[:100]:
    print c
and, for example, this also runs as a C loop:
cdef char* s = "...."
for i,c in enumerate(s[:100]):
    print i, c
However, both are cases that already work today, except that Cython
currently creates a Python byte string first and then iterates over that.
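In other words, what effectively happens today is something like this (a
sketch; the temporary name is only for illustration):

cdef char* s = "...."
py_tmp = s[:100]      # builds a temporary Python byte string first
for c in py_tmp:      # iteration then happens at the Python level
    print c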
It doesn't currently work for types like int[], because the ForInLoopNode
already tries to coerce the iterable to a Python object before we even get
to optimise it. Removing that restriction for C arrays would mean
duplicating the decision whether this can be optimised into both the loop
node and the optimisation transformation. Not exactly beautiful. Plus,
preventing "enumerate(int_array[:100])" from failing during type analysis
is a very non-local change.
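To make the int[] case concrete, this is the kind of loop I mean (a
sketch; today it fails at compile time because the sliced array cannot be
coerced to a Python object):

cdef int data[100]    # filled elsewhere
cdef int v
for v in data[:100]:  # the ForInLoopNode wants a Python iterable here
    print v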
The only two solutions I see here are 1) to somehow do the optimisations
between analysing and coercing the iterable subtree type, or 2) to exclude
some subtree types from coercion and handle them entirely in the
optimisation phase (or a later error detection phase, etc.).
The latter is certainly hard to keep within tolerable bounds. 1) sounds
cleaner, but may mean moving the loop optimisations from the current
transformation back into the type analysis phase and using special
NextNode/IteratorNode classes for the special cases. Not exactly something
for a happy hour of coding.
Since none of this is actually trivial (nor entirely safe), I guess this
is a candidate for 0.12.1 or 0.13 at the earliest, not 0.12.
Any objections to adding at least the char* optimisation above for 0.12? The
only thing it adds is that this:
cdef char* s = "...."
cdef int c
for c in s[:100]:
    print c
will now print the byte value of each char in s (e.g. 97 for an 'a'),
which IMO makes sense, whereas it failed before because the loop iterable
was a Python string.
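Behind the scenes, I'd expect the optimised form to behave roughly like
this hand-written loop (only a sketch, not the generated code):

cdef char* s = "...."
cdef Py_ssize_t i
cdef int c
for i in range(100):  # plain C counting loop over the slice bounds
    c = s[i]          # direct pointer indexing, no byte string created
    print c           # prints the byte value of each char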
> I'd like to note that stack-allocated memory is a very non-pythonic
> thing, and I've seen a lot of Cython users tripped up by it, so I'm
> not sure how much we should encourage it/make it easy (though
> optimizing it when we know it's safe could be valuable to do).
At least, it's been a source of vulnerable programs for ages, so I agree
that encouraging it isn't really the right thing to do.
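For illustration, a typical trip-up of that kind (nothing to do with the
slicing proposal itself) is returning a pointer into a stack-allocated
buffer:

cdef int* broken():
    cdef int buf[4]   # lives on the C stack of this function
    buf[0] = 1
    return buf        # dangling pointer once the function returns

Whoever holds on to the returned pointer reads memory that is no longer
valid.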
Stefan