Thanks, I followed your advice and got my first useful Cython program working cross-platform. It is a kd-tree implementation that takes a numpy array as input and supports k-nearest-neighbor and points-within-radius queries. A crude benchmark shows that for 7000 3D points (roughly the number of heavy atoms in a protein) it is almost as fast as ANN via scikits-ANN and more than 3x faster than the current implementation of Bio.KDTree from Biopython (SWIGed C++), both of which have problems on 64-bit machines.
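
For anyone who wants the two query types spelled out, here is a minimal pure-Python sketch. It is not the Cython code from the pastebin links and says nothing about its speed. The names build, sqdist, knn and within_radius are hypothetical, chosen only for illustration, and the tree stores plain tuples rather than a numpy array. It assumes k >= 1 and Euclidean distance.

```python
import heapq
import random


def build(points, depth=0):
    """Build a kd-tree as nested tuples: (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median point becomes the node
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))


def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(node, query, k, heap=None):
    """Return the k nearest points as (squared distance, point), nearest first."""
    top = heap is None
    if top:
        heap = []
    if node is not None:
        point, axis, left, right = node
        d = sqdist(point, query)
        # heapq is a min-heap, so store negated distances to keep the
        # current worst candidate at heap[0].
        if len(heap) < k:
            heapq.heappush(heap, (-d, point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, point))
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        knn(near, query, k, heap)
        # Visit the far side only if the splitting plane is closer than
        # the current worst candidate (or we still have fewer than k).
        if len(heap) < k or diff * diff < -heap[0][0]:
            knn(far, query, k, heap)
    if top:
        return sorted((-neg, p) for neg, p in heap)
    return heap


def within_radius(node, query, radius, found=None):
    """Return all points whose distance to query is <= radius."""
    if found is None:
        found = []
    if node is not None:
        point, axis, left, right = node
        if sqdist(point, query) <= radius * radius:
            found.append(point)
        diff = query[axis] - point[axis]
        if diff <= radius:                 # query ball reaches the left half
            within_radius(left, query, radius, found)
        if diff >= -radius:                # query ball reaches the right half
            within_radius(right, query, radius, found)
    return found


# Small demo on random points in the unit cube (not the benchmark data).
random.seed(1)
points = [tuple(random.random() for _ in range(3)) for _ in range(1000)]
tree = build(points)
center = (0.5, 0.5, 0.5)
print(len(knn(tree, center, 5)), "nearest neighbours found")
print(len(within_radius(tree, center, 0.1)), "points within radius 0.1")
```

Both queries prune a subtree whenever the splitting plane lies farther from the query point than the current search distance, which is where a kd-tree gains over a brute-force scan.
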

pyx file: http://pastebin.com/maade362
pxd file: http://pastebin.com/m51effa8c

For an array of 7000 3D points (0.0; 1.0):

1) find k=20 from (0.5, 0.5, 0.5)
# scikits-ann: best of 3: 8.19 ms per loop
# cython kd-tree: best of 3: 9.3 ms per loop

2) find all within radius 0.04 (~300 points):
# Bio.KDTree: best of 3: 30.9 ms per loop
# cython kd-tree: best of 3: 9.35 ms per loop

The numpy overhead from random matrix computation is ~1 ms.

Yours,
Marcin

>> This isn't related to the problem, but when you are using the NPY_DOUBLE
>> etc. constants directly you should instead do
>>
>> ctypedef np.npy_double DTYPE_t
>>
>> This is because there isn't a 1:1 correspondence between the "shorthand"
>> types and the NPY_TYPE constants (see bottom of
>> Cython/Includes/numpy.pxd for details. BTW, patches for numpy.pxd to add
>> the Py_Array_SimpleNew... calls are welcome if submitted).
>>
>>> from numpy cimport NPY_DOUBLE, NPY_UINT
>>> from stdlib cimport malloc, realloc, free
>>>
>>> cdef extern from "arrayobject.h":
>>>     cdef object PyArray_SimpleNewFromData(int nd, int *dims,\
>>>             int typenum, void *data)
>>
>> dims should be declared with the type npy_intp instead (and this is also
>> what your compiler complained about).

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
