Marcin Cieślik wrote:
> Hello,
> 
>> Nice. Will you contribute it to SciPy/any other library?
> 
> I'd love to. It will find it's way into my zenpdb:
> http://code.google.com/p/zenpdb/ which is the reason i developed it. I
> do not know if projects like scipy/biopython accept cython code.
> Biopython has a very fresh (not yet released) implementation of
> kd-trees in C, but without knn look-up. further i've been told that
> the maker of h-cluste/scipy-cluster is involved in some big effort to
> create a partitioning / clustering + nearest neighbor look-up /
> classification C-library with bindings to everything. Is there any
> scientific cython code besides sage?

SciPy had a longish Cython thread some time ago (June-ish) though I 
never checked back to see the results.

>> cdef np.ndarray[unsigned int, mode="c"] ridx = np.empty((k,), dtype='I')
>> ...
>> for 0 <= i < k:
>> ridx[i] = self.kdpnts[idx[i]].index
>> return (ridx, ...)
> 
> It looks fantastic, I think the direction of avoiding the PyArray_*
> calls is the right one.

Note that the only new thing is mode="c"; you can do this with strided 
in the latest release. (np.ndarray[...] can be assigned any numpy array 
objects, also those you create yourself).

BTW you need to import numpy as a Python module as well:

import numpy as np

(for calling np.empty the cimport is not enough)

> This should not be a problem since most of the time the algorithm
> spends in qsort anyway. I don't know why but recursion overhead seems
> quite high for cython generated code. The resursive version of qsort
> was about 2x slower. The numpy qsort code (in C) is actually ~60%
> faster than mine
> (http://svn.scipy.org/svn/numpy/trunk/numpy/core/src/_sortmodule.c.src),
>  but using it would require to make qsort a cpdef function (and slice
> arrays / concatenate ) which i assumed would be slower. I think that

cpdef is not slower in itself (it uses a cdef calling from other Cython 
code), but when passing references out NumPy it would be horrible, yes.

This is a long-standing concern of mine: There's no fast and convenient 
standard algorithmic library to use with Cython code, without going into 
Python land. Good wrappers for C++ and STL (and better C++ template 
support) would get us somewhere. Then you could just call C++ qsort.

Recursion overhead *should* be almost the same as C in Cython (and if 
not we need to find a mechanism, or even compiler directive, to fix it). 
You could try to have a look at the generated C (use the -a switch to 
Cython!) and see if there's any slow operations perhaps?

Does NumPy use recursion? And are you sure you compile your code with 
optimization turned on? (If gcc doesn't inline those "swap" for instance 
you'll have a problem).

And BTW, you can write

A[i], A[r] = A[r], A[i]

and it will turn into efficient C for swapping elements :-)

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to