Hi Brett,
There are a few small ways to speed up the code (I'm not thinking
about algorithmic ones; there may be some of those too).
> @cython.boundscheck(False) # turn off bounds-checking for entire function
> def table_bincount(int max_elements, np.ndarray[long, ndim=2] table):
> cdef long rows = table.shape[0]
> cdef long columns = table.shape[1]
> cdef long i, j
Use size_t (in the latest Cython version) or unsigned long instead of
long where appropriate. If an index has a signed type, Cython inserts
a check for negative values so that things like a[-1] work; unsigned
types skip that check, so they're faster.
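For example, here the shapes and counters can simply become unsigned,
since they can never be negative in this function:

    cdef unsigned long rows = table.shape[0]
    cdef unsigned long columns = table.shape[1]
    cdef unsigned long i, j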
> cdef np.ndarray[double, ndim=2] retval = np.zeros(
> (rows, max_elements), dtype=np.float64)
You can add mode="c" to this array as well, since you're the one
creating it. That declares it as a C-contiguous array, so the last
stride is assumed to be one, which can speed up the inner loops. The
new version is
> cdef np.ndarray[double, ndim=2, mode="c"] retval = np.zeros(
> (rows, max_elements), dtype=np.float64)
> for 0 <= i < rows:
> # Is there any way of typing these intermediate values so that they
> # do not get turned into python code?
> current_in_row = table[i]
> current_out_row = retval[i]
Any reason you can't use table[i, j] and retval[i, j] instead of
creating intermediate slices? Note that you should usually write
A[i, j] instead of A[i][j] in regular Python as well.
> current_out_row[current_in_row[j]] += 1
Can't this just be something like:
retval[i, table[i, j]] += 1
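Putting all of the above together, the whole function might look
something like this (a quick, untested sketch, so double-check it):

    cimport cython
    cimport numpy as np
    import numpy as np

    @cython.boundscheck(False)  # turn off bounds-checking for entire function
    def table_bincount(int max_elements, np.ndarray[long, ndim=2] table):
        # unsigned counters skip the negative-index check
        cdef unsigned long rows = table.shape[0]
        cdef unsigned long columns = table.shape[1]
        cdef unsigned long i, j

        # mode="c" promises the last stride is one
        cdef np.ndarray[double, ndim=2, mode="c"] retval = np.zeros(
            (rows, max_elements), dtype=np.float64)

        for 0 <= i < rows:
            for 0 <= j < columns:
                # index directly; no intermediate row slices
                retval[i, table[i, j]] += 1

        return retval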
Hope this all helps!
--Hoyt
++++++++++++++++++++++++++++++++++++++++++++++++
+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ [email protected]
++++++++++++++++++++++++++++++++++++++++++
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev