On Thu, Jul 15, 2010 at 9:54 AM, Sturla Molden <[email protected]> wrote:
> Den 15.07.2010 03:59, skrev Robert Bradshaw:
>>
>>> In my experience, I always have to resort to Fortran or C++. Cython cannot
>>> use NumPy arrays efficiently as function arguments. That is a big show 
>>> stopper.
>>>
>>
>
>> Could you clarify? I suppose for very small arrays, there's the extra
>> O(1) type-check/stride extraction overhead. Is there anything else? Of
>> course there's always more room for improvement for the array type.
>>
>>
>
> First there is always the overhead of refcounting and unboxing the
> buffer, even if we pass the array unmodified. Second, if we want to pass
> a slice/subarray (a very common case), we get:
>
> - creation of a Python slice object
> - more refcounting
> - a Python call to the (unoptimized) [] operator
> - recounting
> - creation of a new Python numpy.ndarray object
> - more refcoutning
> - passing the new view array to the function
> - more refcounting
> - unboxing the buffer
> - etc.
>
> Imagine this happening in a tight loop. It's close to useless. (Yes, and
> we cannot release the GIL, but that's a minor point.)
>
> To get the performance we need, we have to use C pointers instead, e.g.
> pass &(array[n,0,0]) instead of array[n,:,:]. But while array[n,:,:] has
> two dimensions, the C pointer &(array[n,0,0]) only has only one. A C
> pointer also carries no information about the subarray shape. So
> numerical Cython code quickly degenerates to C.
>
> In Fortran 95, we have arrays that works very similarly to NumPy.
> array[n,:,:] in NumPy is array(:,:,n) in Fortran 95 (assuming C order in
> NumPy.) Fortran 95 arrays boradcast like NumPy: a*b in NumPy is a*b in
> Fortran 95. Arrays can be repeated, transposed, and reshaped; they can
> be queried for size and shape; indexed with integer of boolean arrays;
> there are numerical functions in the RTL than work like ufuncs (sin,
> cos, etc.). We can create ufunc-like functions using the "elemental"
> keyword. Fortran has all the niceness of NumPy, but with the speed of C
> (often better).  Also, array expressions in Fortran 95 are easy to
> auto-vectorize to use simd extensions or multithreading. We can also use
> OpenMP directly in Fortran. There is a "where" construct that
> corresponds to masked "fancy indexing" in NumPy (or numpy.where), and a
> "forall" keyword that lets the compiler implement nested loops more
> efficiently (i.e. no specific order of loop counters are requested, so
> it can easily be vectorized to simd or multithreading.)
>
> Fortran 95 is easy to read and just as powerful as C. I is nothing like
> Fortran 77, which is commonly understood by "Fortran" among C/C++
> developers. Unlike Fortran 77, there are dynamic allocation, and
> pointers which are more safe than C pointers. Fortran 95 pointers can
> even reference a strided subarray (just like a slice returns a view in
> NumPy), and carry shape information about the subarray. Fortran 95 code
> can be organized into modules (that are compiled separately), and one
> module can import another, just like Python. (Fortran 2003 will have OOP
> with inheritance and polymorphism as well, but compiler vendors are
> still catching up.) So there is a reason why scientists are keeping
> Fortran alive. Computer scientists don't understand it, but they usually
> have no idea how important array processing is to scientific computing.
> And they generally believe Fortran today is still FORTRAN 77 and g77,
> which is a ccompletely false assumption.

Well, many projects are still using Fortran 77 (just like many C
projects still are based on on c89 rather than moving to c99). It does
make me happy to hear that the language is (will be?) catching up to
modern times, but the lag time of old projects moving to new standards
is often measured in decades (!).

Totally agree with you about array processing being undervalued, IMHO
this is one of the primary long-term advantages of Fortran.

> Just consider what this would do in Cython
>
>    for i in range(N):
>       res[i] = foobar(b0*array0[i::2,:,:] + b1*array1[i,:,::2] +
> b2*array2[i,:,:])
>
> and compare with Fortran 95:
>
>    do i = 1,n
>       res(i) = foobar(b0*array0(:,:,i::2) + b1*array1(::2,:,i) +
> b2*array2(:,:,i))
>    end do
>
> The differences are () instead of [], do instead of for, and indexes are
> usually tranposed (due to column-major ordering). But the Fortran 95
> code would run as fast as hand-written C, while the Cython would hardly
> be any better than Python.

Yep. If the arrays are large, the speed difference should be
negligible, but there's a huge penalty in object allocation for small
ones.

It sounds like what we need is a non-object-backed array (e.g. a raw
NumPy array struct) where some of the metadata (e.g. type) is carried
around by the compiler rather than at runtime. The question here is
how to handle memory allocation--is this done manually in Fortran? I
suppose slices could just be structs passed around by value.

- Robert
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to