Hi Bruce,

In the actual problem, I have a long series of irregularly spaced floats, and I have to select the values that lie between given limits while keeping a minimal separation between them. Option 2 just misses the first value of the input array if it is within the limits, but for my purposes (performing a fit with a given function) that is acceptable. I said "this seems to be quite close to what I need" because I do not like missing the first point: doing so gives equivalent but not exactly identical solutions.
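A minimal sketch of the selection described above, keeping the first in-range point. It assumes the series is sorted in ascending order, which the thread does not state explicitly. The name select_separated and the parameters xmin and xmax are hypothetical and do not appear in the thread. It uses the same strict "greater than delta" test as option1.

```python
import numpy as np


def select_separated(x, xmin, xmax, delta):
    """Keep values within [xmin, xmax] that are more than delta apart.

    Assumes x is a 1-D array sorted in ascending order (an assumption,
    not something stated in the thread).
    """
    inside = x[(x >= xmin) & (x <= xmax)]
    if inside.size == 0:
        return inside
    kept = [inside[0]]  # the first in-range point is always kept
    for value in inside[1:]:
        # strict comparison, as in option1 from the thread
        if value - kept[-1] > delta:
            kept.append(value)
    return np.array(kept)


x = np.array([0.0, 0.1, 0.25, 0.3, 0.6, 1.5])
selected = select_separated(x, xmin=0.0, xmax=1.0, delta=0.2)
print(selected)
```

This is the plain loop, so it gives the exact greedy selection but not the speed of a vectorized expression.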
By the way, thanks for the % hint. That should make the .astype(int) disappear and make the expression look nicer.

Armando

Bruce Southey wrote:
> On 06/09/2010 10:24 AM, Vicente Sole wrote:
>>>> ? Well a loop or list comparison seems like a good choice to me. It is
>>>> much more obvious at the expense of two LOCs. Did you profile the two
>>>> possibilities and are they actually performance-critical?
>>>>
>>>> cheers
>>>>
>>>>
>> The second is between 8 and ten times faster on my machine.
>>
>> import numpy
>> import time
>> x0 = numpy.arange(10000.)
>> niter = 2000 # I expect between 10000 and 100000
>>
>>
>> def option1(x, delta=0.2):
>>     y = [x[0]]
>>     for value in x:
>>         if (value - y[-1]) > delta:
>>             y.append(value)
>>     return numpy.array(y)
>>
>> def option2(x, delta=0.2):
>>     y = numpy.cumsum((x[1:]-x[:-1])/delta).astype(numpy.int)
>>     i1 = numpy.nonzero(y[1:]> y[:-1])
>>     return numpy.take(x, i1)
>>
>>
>> t0 = time.time()
>> for i in range(niter):
>>     t = option1(x0)
>> print "Elapsed = ", time.time() - t0
>> t0 = time.time()
>> for i in range(niter):
>>     t = option2(x0)
>> print "Elapsed = ", time.time() - t0
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
> For integer arguments for delta, I don't see any different between
> using option1 and using the '%' operator.
> >>> (x0[(x0*10)%2==0]-option1(x0)).sum()
> 0.0
>
> Also option2 gives a different result than option1 so these are not
> equivalent functions. You can see that from the shapes
> >>> option2(x0).shape
> (1, 9998)
> >>> option1(x0).shape
> (10000,)
> >>> ((option1(x0)[:9998])-option2(x0)).sum()
> 0.0
>
> So, allowing for shape difference, option2 is the same for most of
> output from option1 but it is still smaller than option1.
>
> Probably the main reason for the speed difference is that option2 is
> virtually pure numpy (and hence done in C) and option1 is using a lot
> of array lookups that are always slow. So keep it in numpy as most as
> possible.
>
>
> Bruce
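The shape difference reported above can be reproduced with a current NumPy and Python 3. The sketch below is an adaptation, not the code from the thread: option2_1d is a hypothetical name, numpy.int (removed in recent NumPy releases) is replaced by the builtin int, and the index tuple returned by numpy.nonzero is unpacked so that the result is 1-D instead of having a leading axis of length 1.

```python
import numpy as np


def option1(x, delta=0.2):
    """Loop version from the thread: keep a value once it is more than
    delta beyond the last kept value."""
    kept = [x[0]]
    for value in x:
        if value - kept[-1] > delta:
            kept.append(value)
    return np.array(kept)


def option2_1d(x, delta=0.2):
    """Adapted option2 (hypothetical name): same binning idea, 1-D output."""
    # The builtin int stands in for the removed numpy.int alias.
    bins = np.cumsum(np.diff(x) / delta).astype(int)
    # np.nonzero returns a tuple of index arrays; passing that tuple to
    # np.take is what produced the (1, 9998) shape, so take element [0].
    idx = np.nonzero(bins[1:] > bins[:-1])[0]
    return x[idx]


x0 = np.arange(10000.0)
a = option1(x0)
b = option2_1d(x0)
print(a.shape, b.shape)
print(np.array_equal(a[:b.size], b))
```

For this x0 the two results agree on their common prefix and the vectorized version is two elements shorter, which matches the shapes quoted above apart from the extra axis.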