Hello,

> Nice. Will you contribute it to SciPy/any other library?

I'd love to. It will find its way into my zenpdb:
http://code.google.com/p/zenpdb/ which is the reason I developed it. I
do not know whether projects like SciPy or Biopython accept Cython
code. Biopython has a very fresh (not yet released) implementation of
kd-trees in C, but without knn look-up. Further, I've been told that
the maker of hcluster/scipy-cluster is involved in some big effort to
create a partitioning / clustering + nearest neighbor look-up /
classification C library with bindings to everything. Is there any
scientific Cython code besides Sage?

> Your code inspired me to make some additions to the buffer code, up in
> the devel branch now. Your calls to PyArray_SimpleNewFromData should no
> longer be necessary; instead you can (if you so wish, of course...) do
>
> cdef np.ndarray[unsigned int, mode="c"] ridx = np.empty((k,), dtype='I')
> ...
> for 0 <= i < k:
> ridx[i] = self.kdpnts[idx[i]].index
> return (ridx, ...)

It looks fantastic; I think avoiding the PyArray_* calls is the right
direction.
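
For reference, this is roughly the pattern my PyArray_* calls follow
at the moment and which I'd like to drop (a sketch only, with
illustrative names, not my actual code):

    cimport numpy as np

    cdef extern from "stdlib.h":
        void* malloc(size_t size)

    np.import_array()

    cdef object wrap_indices(unsigned int *idx, int k):
        # copy the k result indices into a malloc'ed buffer and hand
        # it to numpy; note the ndarray does not own the memory, so
        # it has to be freed separately
        cdef np.npy_intp dims[1]
        cdef unsigned int *buf = <unsigned int*> malloc(k * sizeof(unsigned int))
        cdef int i
        for i in range(k):
            buf[i] = idx[i]
        dims[0] = k
        return np.PyArray_SimpleNewFromData(1, dims, np.NPY_UINT, <void*> buf)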

> without any penalty. I.e. mode="c" causes no stride multiplication to
> happen on the last dimension (the only one in 1D). (The only additional
> overhead should be calling np.empty vs. PyArray_SimpleNewFromData + malloc).

This should not be a problem, since the algorithm spends most of its
time in qsort anyway. I don't know why, but recursion overhead seems
quite high for Cython-generated code: the recursive version of qsort
was about 2x slower. The numpy qsort code (in C) is actually ~60%
faster than mine
(http://svn.scipy.org/svn/numpy/trunk/numpy/core/src/_sortmodule.c.src),
but using it would require making qsort a cpdef function (and slicing
/ concatenating arrays), which I assumed would be slower. I think that
if Cython had a do ... while loop (I don't think it should), which is
at the heart of qsort, my code would be closer to numpy. What I could
improve performance-wise is to make tree creation non-recursive, but
in my first application I will create a static tree and query it
multiple times, so it's not high on my priority list.
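
Just to illustrate what I mean: the do/while pairs at the heart of the
partition step can be written in Cython by pulling the first iteration
out of the loop. A rough sketch of a Hoare-style partition (names are
illustrative, not my actual qsort):

    cdef int partition(double *a, int lo, int hi):
        cdef double pivot = a[(lo + hi) // 2]
        cdef double tmp
        cdef int i = lo - 1
        cdef int j = hi + 1
        while True:
            # do { i++; } while (a[i] < pivot);
            i += 1
            while a[i] < pivot:
                i += 1
            # do { j--; } while (a[j] > pivot);
            j -= 1
            while a[j] > pivot:
                j -= 1
            if i >= j:
                return j
            # swap a[i] and a[j]
            tmp = a[i]
            a[i] = a[j]
            a[j] = tmp

The recursive driver then just partitions and recurses on the two
halves, which is where the recursion overhead shows up.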

> If you do this I'm very interested in the benchmarks and whether there's
> a performance regression, and also what happens to the timings if mode
> is set to "strided" instead. (With "strided" you have the advantage that
> you can take an optional "out" parameter like numpy often does...)

Here are timings for my non-strided version.

For an array of 7000 3D points with coordinates in (0.0, 1.0):
1) find k=20 from (0.5, 0.5, 0.5)
# scikits-ann: best of 3: 8.19 ms per loop
# cython kd-tree: best of 3: 9.3 ms per loop

2) find all within squared radius 0.04 (~300 points):
# Bio.KDTree: best of 3: 30.9 ms per loop
# cython kd-tree: best of 3: 9.35 ms per loop
The numpy overhead from the random matrix computation is ~1 ms.
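
Roughly how I set the timings up (a sketch; the tree/query names are
placeholders, not the real APIs being compared):

    import numpy as np

    pts = np.random.random((7000, 3))   # 7000 points, coords in [0.0, 1.0)
    # the tree is built once from pts; each query is then timed with
    # IPython's %timeit, e.g.
    #   %timeit tree.knn((0.5, 0.5, 0.5), 20)          # 20 nearest neighbours
    #   %timeit tree.in_radius((0.5, 0.5, 0.5), 0.04)  # squared radius 0.04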

Strided timings, using the devel Cython API, will come :)
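
What I have in mind is something along these lines (just a sketch of
the out-parameter pattern against the devel buffer API; the names are
illustrative, not my real method):

    cimport numpy as np
    import numpy as np

    def knn_indices(int k, out=None):
        # with mode="strided" any unsigned int array works as out,
        # including non-contiguous views such as slices
        if out is None:
            out = np.empty((k,), dtype='I')
        cdef np.ndarray[unsigned int, mode="strided"] ridx = out
        cdef int i
        for i in range(k):
            ridx[i] = i   # placeholder for the real neighbour index
        return ridx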

Marcin
