Marcin Cieślik wrote: > Hello, > >> Nice. Will you contribute it to SciPy/any other library? > > I'd love to. It will find it's way into my zenpdb: > http://code.google.com/p/zenpdb/ which is the reason i developed it. I > do not know if projects like scipy/biopython accept cython code. > Biopython has a very fresh (not yet released) implementation of > kd-trees in C, but without knn look-up. further i've been told that > the maker of h-cluste/scipy-cluster is involved in some big effort to > create a partitioning / clustering + nearest neighbor look-up / > classification C-library with bindings to everything. Is there any > scientific cython code besides sage?
SciPy had a longish Cython thread some time ago (June-ish) though I never checked back to see the results. >> cdef np.ndarray[unsigned int, mode="c"] ridx = np.empty((k,), dtype='I') >> ... >> for 0 <= i < k: >> ridx[i] = self.kdpnts[idx[i]].index >> return (ridx, ...) > > It looks fantastic, I think the direction of avoiding the PyArray_* > calls is the right one. Note that the only new thing is mode="c"; you can do this with strided in the latest release. (np.ndarray[...] can be assigned any numpy array objects, also those you create yourself). BTW you need to import numpy as a Python module as well: import numpy as np (for calling np.empty the cimport is not enough) > This should not be a problem since most of the time the algorithm > spends in qsort anyway. I don't know why but recursion overhead seems > quite high for cython generated code. The resursive version of qsort > was about 2x slower. The numpy qsort code (in C) is actually ~60% > faster than mine > (http://svn.scipy.org/svn/numpy/trunk/numpy/core/src/_sortmodule.c.src), > but using it would require to make qsort a cpdef function (and slice > arrays / concatenate ) which i assumed would be slower. I think that cpdef is not slower in itself (it uses a cdef calling from other Cython code), but when passing references out NumPy it would be horrible, yes. This is a long-standing concern of mine: There's no fast and convenient standard algorithmic library to use with Cython code, without going into Python land. Good wrappers for C++ and STL (and better C++ template support) would get us somewhere. Then you could just call C++ qsort. Recursion overhead *should* be almost the same as C in Cython (and if not we need to find a mechanism, or even compiler directive, to fix it). You could try to have a look at the generated C (use the -a switch to Cython!) and see if there's any slow operations perhaps? Does NumPy use recursion? And are you sure you compile your code with optimization turned on? (If gcc doesn't inline those "swap" for instance you'll have a problem). And BTW, you can write A[i], A[r] = A[r], A[i] and it will turn into efficient C for swapping elements :-) -- Dag Sverre _______________________________________________ Cython-dev mailing list [email protected] http://codespeak.net/mailman/listinfo/cython-dev
