On Mon, Feb 13, 2012 at 6:23 PM, Marcel Oliver <m.oli...@jacobs-university.de> wrote: > Hi, > > I have a short piece of code where the use of an index array "feels > right", but incurs a severe performance penalty: It's about an order > of magnitude slower than all other operations with arrays of that > size. > > It comes up in a piece of code which is doing a large number of "on > the fly" histograms via > > hist[i,j] += 1 > > where i is an array with the bin index to be incremented and j is > simply enumerating the histograms. I attach a full short sample code > below which shows how it's being used in context, and corresponding > timeit output from the critical code section. > > Questions: > > - Is this a matter of principle, or due to an inefficient > implementation? > - Is there an equivalent way of doing it which is fast? > > Regards, > Marcel > > ================================================================= > > #! /usr/bin/env python > # Plot the bifurcation diagram of the logistic map > > from pylab import * > > Nx = 800 > Ny = 600 > I = 50000 > > rmin = 2.5 > rmax = 4.0 > ymin = 0.0 > ymax = 1.0 > > rr = linspace (rmin, rmax, Nx) > x = 0.5*ones(rr.shape) > hist = zeros((Ny+1,Nx), dtype=int) > j = arange(Nx) > > dy = ymax/Ny > > def f(x): > return rr*x*(1.0-x) > > for n in xrange(1000): > x = f(x) > > for n in xrange(I): > x = f(x) > i = array(x/dy, dtype=int) > hist[i,j] += 1 > > figure() > > imshow(hist, > cmap='binary', > origin='lower', > interpolation='nearest', > extent=(rmin,rmax,ymin,ymax), > norm=matplotlib.colors.LogNorm()) > > xlabel ('$r$') > ylabel ('$x$') > > title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$') > > show() > > ==================================================================== > > In [4]: timeit y=f(x) > 10000 loops, best of 3: 19.4 us per loop > > In [5]: timeit i = array(x/dy, dtype=int) > 10000 loops, best of 3: 22 us per loop > > In [6]: timeit img[i,j] += 1 > 10000 loops, best of 3: 119 us per loop > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion
This suggests to me that fancy indexing could be quite a bit faster in this case: In [40]: timeit hist[i,j] += 110000 loops, best of 3: 58.2 us per loop In [39]: timeit hist.put(np.ravel_multi_index((i, j), hist.shape), 1) 10000 loops, best of 3: 20.6 us per loop I wrote a simple Cython method def fancy_inc(ndarray[int64_t, ndim=2] values, ndarray[int64_t] iarr, ndarray[int64_t] jarr, int64_t inc): cdef: Py_ssize_t i, n = len(iarr) for i in range(n): values[iarr[i], jarr[i]] += inc that does even faster In [8]: timeit sbx.fancy_inc(hist, i, j, 1) 100000 loops, best of 3: 4.85 us per loop About 10% faster if bounds checking and wraparound are disabled. Kind of a bummer-- perhaps this should go high on the NumPy 2.0 TODO list? - Wes _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion