Den 15.07.2010 03:59, skrev Robert Bradshaw:
>
>> In my experience, I always have to resort to Fortran or C++. Cython cannot
>> use NumPy arrays efficiently as function arguments. That is a big show
>> stopper.
>>
>
> Could you clarify? I suppose for very small arrays, there's the extra
> O(1) type-check/stride extraction overhead. Is there anything else? Of
> course there's always more room for improvement for the array type.
>
>
First there is always the overhead of refcounting and unboxing the
buffer, even if we pass the array unmodified. Second, if we want to pass
a slice/subarray (a very common case), we get:
- creation of a Python slice object
- more refcounting
- a Python call to the (unoptimized) [] operator
- recounting
- creation of a new Python numpy.ndarray object
- more refcoutning
- passing the new view array to the function
- more refcounting
- unboxing the buffer
- etc.
Imagine this happening in a tight loop. It's close to useless. (Yes, and
we cannot release the GIL, but that's a minor point.)
To get the performance we need, we have to use C pointers instead, e.g.
pass &(array[n,0,0]) instead of array[n,:,:]. But while array[n,:,:] has
two dimensions, the C pointer &(array[n,0,0]) only has only one. A C
pointer also carries no information about the subarray shape. So
numerical Cython code quickly degenerates to C.
In Fortran 95, we have arrays that works very similarly to NumPy.
array[n,:,:] in NumPy is array(:,:,n) in Fortran 95 (assuming C order in
NumPy.) Fortran 95 arrays boradcast like NumPy: a*b in NumPy is a*b in
Fortran 95. Arrays can be repeated, transposed, and reshaped; they can
be queried for size and shape; indexed with integer of boolean arrays;
there are numerical functions in the RTL than work like ufuncs (sin,
cos, etc.). We can create ufunc-like functions using the "elemental"
keyword. Fortran has all the niceness of NumPy, but with the speed of C
(often better). Also, array expressions in Fortran 95 are easy to
auto-vectorize to use simd extensions or multithreading. We can also use
OpenMP directly in Fortran. There is a "where" construct that
corresponds to masked "fancy indexing" in NumPy (or numpy.where), and a
"forall" keyword that lets the compiler implement nested loops more
efficiently (i.e. no specific order of loop counters are requested, so
it can easily be vectorized to simd or multithreading.)
Fortran 95 is easy to read and just as powerful as C. I is nothing like
Fortran 77, which is commonly understood by "Fortran" among C/C++
developers. Unlike Fortran 77, there are dynamic allocation, and
pointers which are more safe than C pointers. Fortran 95 pointers can
even reference a strided subarray (just like a slice returns a view in
NumPy), and carry shape information about the subarray. Fortran 95 code
can be organized into modules (that are compiled separately), and one
module can import another, just like Python. (Fortran 2003 will have OOP
with inheritance and polymorphism as well, but compiler vendors are
still catching up.) So there is a reason why scientists are keeping
Fortran alive. Computer scientists don't understand it, but they usually
have no idea how important array processing is to scientific computing.
And they generally believe Fortran today is still FORTRAN 77 and g77,
which is a ccompletely false assumption.
Just consider what this would do in Cython
for i in range(N):
res[i] = foobar(b0*array0[i::2,:,:] + b1*array1[i,:,::2] +
b2*array2[i,:,:])
and compare with Fortran 95:
do i = 1,n
res(i) = foobar(b0*array0(:,:,i::2) + b1*array1(::2,:,i) +
b2*array2(:,:,i))
end do
The differences are () instead of [], do instead of for, and indexes are
usually tranposed (due to column-major ordering). But the Fortran 95
code would run as fast as hand-written C, while the Cython would hardly
be any better than Python.
Sturla
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev