Ondrej Certik wrote:
> On Mon, Oct 13, 2008 at 3:38 PM, Dag Sverre Seljebotn
> <[EMAIL PROTECTED]> wrote:
>> Ondrej Certik wrote:
>>> Hi,
>>>
>>> what is now the canonical way to convert between (python) numpy array and
>>>
>>> int *a
>>>
>>> or
>>>
>>> double *a
>>>
>>> ?
>>>
>> Yes, this should probably be fixed/be made more available.
>>
>> The simplest thing is something like the (untested) code below. It is
>> certainly possible to do some C calls instead of Python calls here (like
>> petsc4py does); however I think this definitely belongs to the 80% of the
>> code that only takes 20% of the time; Cython seems to be more about
>> optimizing the other part that tends to get repeated :-)
>>
>> import numpy as np
>> cimport numpy as npy  # needed for npy.ndarray and npy.float64_t below
>> arr = some generic numpy array
>> assert arr.dtype == np.float64 # or whatever
>> cdef npy.ndarray contarr
>> if not arr.flags['C_CONTIGUOUS']:
>>    # Array is not contiguous, need to make a contiguous copy
>>    contarr = arr.copy(order='C')
>> else:
>>    contarr = arr
>> # Get the data pointer. Important that contarr is a cdef-ed ndarray
>> cdef npy.float64_t* ptr = <npy.float64_t*>contarr.data
>> call_c_function(ptr)
>> # If the C function modifies the data, and the array was not contiguous,
>> # then the data of contarr must be copied back into arr using standard
>> # NumPy calls:
>> arr[:] = contarr[:]
>>
>> Comment: The only "shady" part here is "contarr.data", which accesses
>> implementation details of NumPy arrays. I'm guessing that this will
>> never change, but I once planned to make a generic cython function
>> "cython.buffer.bufptr" which would return the same pointer but it could
>> be acquired through the buffer API.
>>
>> Note that the code above does not make use of the buffer API, as it is
>> written to work regardless of number of dimensions.
> 
> Currently I use these methods:
> 
> cdef inline ndarray array_d(int size, double *data):
>     #cdef ndarray ary2 = PyArray_ZEROS(1, &size, 12, 0)
>     cdef ndarray ary = zeros(size, dtype=float64)
>     if data != NULL: memcpy(ary.data, data, size*sizeof(double))
>     return ary
> 
> cdef inline int iarray_d(ndarray a, int *size, double **data) except -1:
>     if a.dtype != float64:
>         raise TypeError("The array must have the dtype=float64.")
>     if size!=NULL: size[0] = a.dimensions[0]
>     if data!=NULL: data[0] = <double *> (a.data)
> 
> 
> 
> If I uncomment the commented out line (i.e. to call numpy C/API
> directly instead of through Python), I'll get a segfault. I haven't
> figured out yet why. But nevertheless, the current version works.

The issue you are running into is likely that the size passed to 
PyArray_ZEROS must be an "npy_intp" rather than "int", which makes it 
64-bit on 64-bit platforms. Since you pass it by its address, this is a 
rather crucial point (otherwise you end up trying to allocate a rather 
random number of gigabytes).
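
For concreteness, here is an untested sketch of a corrected array_d (it 
assumes numpy.pxd declares PyArray_ZEROS, npy_intp and NPY_DOUBLE, and 
that memcpy is cimported from libc.string):

```cython
from libc.string cimport memcpy
cimport numpy as npy

cdef inline npy.ndarray array_d(int size, double *data):
    # The dimensions argument of PyArray_ZEROS is npy_intp*, not int*;
    # copy the size into a properly typed variable before taking its
    # address, rather than passing &size directly.
    cdef npy.npy_intp n = size
    cdef npy.ndarray ary = npy.PyArray_ZEROS(1, &n, npy.NPY_DOUBLE, 0)
    if data != NULL:
        memcpy(ary.data, data, size * sizeof(double))
    return ary
```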

This actually came up earlier on this mailing list. It would be great if 
you could make a documentation patch against ndarrayobject.h and submit 
it on the NumPy mailing list since it obviously seems like a common 
problem. Patches for numpy.pxd (including PyArray_ZEROS etc. declared 
with the right pointer types :-) ) are also welcome, currently numpy.pxd 
is geared towards high-level Python use and excludes a lot of C-level 
functions (because of lack of time/laziness only, throwing them in there 
doesn't hurt).
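
If someone wants to throw them in, the declarations might look roughly 
like this (a sketch only; the real npy_intp typedef matches Py_ssize_t, 
see ndarrayobject.h for the authoritative signatures):

```cython
cdef extern from "numpy/arrayobject.h":
    ctypedef Py_ssize_t npy_intp
    object PyArray_ZEROS(int nd, npy_intp *dims, int typenum, int fortran)
    object PyArray_SimpleNew(int nd, npy_intp *dims, int typenum)
```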

HOWEVER, note that your iarray_d will only work if the array that is 
passed in is contiguous. So you should either a) insert an assertion on 
a.flags['C_CONTIGUOUS'] at the beginning, or b) fix it so that it works 
with all arrays (as in my example above).

Any non-trivial NumPy usage is likely to end up with non-contiguous 
arrays, so if the "a" argument could come from the end-user somehow then 
choking on non-contiguous arrays is not going to work well.
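
To make the pitfall concrete in plain Python/NumPy (an illustrative 
sketch, not code from this thread): even a simple slice produces a 
non-contiguous array, and numpy.ascontiguousarray performs exactly the 
copy-only-if-needed step of option b):

```python
import numpy as np

a = np.arange(10, dtype=np.float64)
b = a[::2]  # strided view: every other element

print(a.flags['C_CONTIGUOUS'])  # True
print(b.flags['C_CONTIGUOUS'])  # False: element stride is 2

# ascontiguousarray copies only when needed, matching the
# copy-if-not-contiguous pattern quoted earlier in this thread
c = np.ascontiguousarray(b)
print(c.flags['C_CONTIGUOUS'])  # True
print(c is b)                   # False: b required a copy
print(np.ascontiguousarray(a) is a)  # True: a was already contiguous
```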


-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev