David Cournapeau wrote:
> Hi,
>
> While profiling some code, I noticed that sum in numpy is kind of
> slow once you use axis argument:
>
Yes, this is expected, because when you use an axis argument the following two things can happen:

1) You may be skipping over large chunks of memory to get to the next
   available number, and out-of-cache memory access is slow.
2) You have to allocate a result array.

> import numpy as N
> a = N.random.randn(100000, 30)
> %timeit N.sum(a) #-> 26.8ms
> %timeit N.sum(a, 1) #-> 65.5ms
> %timeit N.sum(a, 0) #-> 141ms
>
> Now, if I use some tricks, I get:
>
> %timeit N.sum(a) #-> 26.8 ms
> %timeit N.dot(a, N.ones(a.shape[1], a.dtype)) #-> 11.3ms
> %timeit N.dot(N.ones((1, a.shape[0]), a.dtype), a) #-> 15.5ms
>
> I realize that dot uses optimized libraries (atlas in my case) and all,
> but is there any way to improve this situation ?
>
Sum does *not* use an optimized library, so it is not too surprising that you can get speed-ups using ATLAS. It would be nice to do something to optimize the reduction functions in NumPy, but nobody has come forward with suggestions yet.

Thanks for the reports, though.

-Travis

_______________________________________________
Numpy-discussion mailing list
http://projects.scipy.org/mailman/listinfo/numpy-discussion
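
For anyone reading along, here is a minimal sketch of the two points in this thread: the dot-with-ones trick gives the same result as an axis sum, and the strides of a C-ordered array show how far apart consecutive elements are along each axis. The smaller 1000 x 30 shape and the variable names are assumptions chosen so it runs quickly; it makes no claim about timings, which depend on the machine and the BLAS in use.

```python
import numpy as N

# Assumed shape: smaller than the array in the thread so this runs quickly.
a = N.random.randn(1000, 30)

# Plain axis reductions; each one allocates a new result array.
row_sums = N.sum(a, 1)   # shape (1000,)
col_sums = N.sum(a, 0)   # shape (30,)

# The same reductions written as products with a vector of ones,
# so that dot (and whatever BLAS it is linked against) does the work.
row_sums_dot = N.dot(a, N.ones(a.shape[1], a.dtype))
col_sums_dot = N.dot(N.ones((1, a.shape[0]), a.dtype), a)  # shape (1, 30)

# Both routes agree up to floating-point rounding.
assert N.allclose(row_sums, row_sums_dot)
assert N.allclose(col_sums, col_sums_dot[0])

# For a C-ordered float64 array, moving one step along axis 0 jumps a
# whole row (30 * 8 bytes), while a step along axis 1 moves 8 bytes.
print(a.strides)  # (240, 8)
```

Note that the second dot call returns a 2-d array of shape (1, 30), so it needs indexing (or a reshape) before it can replace N.sum(a, 0) directly.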
