Thanks,

I followed your advice and got my first useful Cython program working
cross-platform. It is a kd-tree implementation that takes a numpy
array as input and supports k-nearest-neighbor and
points-within-radius queries. A crude benchmark shows that for 7000 3D
points (roughly the number of heavy atoms in a protein) it is almost
as fast as ANN via scikits-ANN and more than 3x faster than the current
implementation of Bio.KDTree from Biopython (SWIGed C++), both of which
have problems on 64-bit machines.

pyx file: http://pastebin.com/maade362
pxd file: http://pastebin.com/m51effa8c
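
For anyone who cannot open the links, here is a minimal pure-Python
sketch of the two query types. It is not the Cython code from the
pastebin files; the names KDNode, build, knn and within_radius are
made up for this illustration, and it works on plain tuples rather
than a numpy array.

```python
import heapq
import random


class KDNode:
    """One tree node: a point, its splitting axis, and two subtrees."""
    __slots__ = ("point", "axis", "left", "right")

    def __init__(self, point, axis, left, right):
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build(points, depth=0):
    """Build a kd-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(points[mid], axis,
                  build(points[:mid], depth + 1),
                  build(points[mid + 1:], depth + 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(node, query, k, heap=None):
    """Return the k nearest points as (squared distance, point), nearest first."""
    top = heap is None
    if top:
        heap = []  # max-heap of the best k so far, stored as negated distances
    if node is not None:
        d = sq_dist(node.point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, node.point))
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        knn(near, query, k, heap)
        # Cross the splitting plane only if it is closer than the current k-th best.
        if len(heap) < k or diff * diff < -heap[0][0]:
            knn(far, query, k, heap)
    if top:
        return sorted((-neg_d, p) for neg_d, p in heap)
    return heap


def within_radius(node, query, radius, found=None):
    """Return all points whose distance to query is at most radius."""
    if found is None:
        found = []
    if node is not None:
        if sq_dist(node.point, query) <= radius * radius:
            found.append(node.point)
        diff = query[node.axis] - node.point[node.axis]
        if diff <= radius:    # the query ball reaches into the left half-space
            within_radius(node.left, query, radius, found)
        if diff >= -radius:   # the query ball reaches into the right half-space
            within_radius(node.right, query, radius, found)
    return found


random.seed(0)
points = [tuple(random.random() for _ in range(3)) for _ in range(7000)]
tree = build(points)
center = (0.5, 0.5, 0.5)
nearest = knn(tree, center, 20)
close = within_radius(tree, center, 0.04)
```

The pruning test in knn is what makes the tree faster than a linear
scan: a subtree is skipped whenever the splitting plane is already
farther away than the worst of the k candidates found so far.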


For an array of 7000 random 3D points with coordinates in (0.0, 1.0):
1) find the k=20 nearest neighbors of (0.5, 0.5, 0.5)
# scikits-ann: best of 3: 8.19 ms per loop
# cython kd-tree: best of 3: 9.3 ms per loop

2) find all within radius 0.04 (~300 points):
# Bio.KDTree: best of 3: 30.9 ms per loop
# cython kd-tree: best of 3: 9.35 ms per loop
The numpy overhead from generating the random matrix is ~1 ms.
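
A "best of 3" figure like the ones above can be reproduced with the
standard timeit module. The sketch below is only a brute-force
baseline in pure Python, not the code that produced the numbers
above; brute_knn, brute_radius and best_of are hypothetical helper
names, and 0.04 is treated here as a plain Euclidean radius, which is
an assumption.

```python
import heapq
import random
import timeit

random.seed(0)
# 7000 points with coordinates drawn uniformly from [0.0, 1.0)
points = [(random.random(), random.random(), random.random())
          for _ in range(7000)]
query = (0.5, 0.5, 0.5)


def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_knn(k=20):
    """Linear scan for the k points nearest to query."""
    return heapq.nsmallest(k, points, key=lambda p: sq_dist(p, query))


def brute_radius(radius=0.04):
    """Linear scan for all points within radius of query."""
    r2 = radius * radius
    return [p for p in points if sq_dist(p, query) <= r2]


def best_of(func, repeat=3, number=10):
    """Best time per call in milliseconds over `repeat` runs of `number` calls."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number * 1000.0


print("brute-force k=20:   best of 3: %.2f ms per loop" % best_of(brute_knn))
print("brute-force radius: best of 3: %.2f ms per loop" % best_of(brute_radius))
```

Taking the minimum over the repeats, rather than the mean, filters
out runs that were slowed down by other activity on the machine.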

Yours,
Marcin



>> This isn't related to the problem, but when you are using the NPY_DOUBLE
>> etc. constants directly you should instead do
>>
>> ctypedef np.npy_double DTYPE_t
>>
>> This is because there isn't a 1:1 correspondence between the "shorthand"
>> types and the NPY_TYPE constants (see bottom of
>> Cython/Includes/numpy.pxd for details. BTW, patches for numpy.pxd to add
>>   the PyArray_SimpleNew... calls are welcome if submitted).
>>
>>> from numpy cimport NPY_DOUBLE, NPY_UINT
>>> from stdlib cimport malloc, realloc, free
>>>
>>> cdef extern from "arrayobject.h":
>>>     cdef object PyArray_SimpleNewFromData(int nd, int *dims,\
>>>                                             int typenum, void *data)
>>
>> dims should be declared with the type npy_intp instead (and this is also
>> what your compiler complained about).
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
