Yeah, but it's not so obvious what's happening "under the hood". Consider this (on an old Win7 machine):

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)]
np.__version__
'1.11.1'
On Mon, Nov 14, 2016 at 10:38 AM, Jerome Kieffer <jerome.kief...@esrf.fr> wrote:

> On Fri, 11 Nov 2016 11:25:58 -0500
> Matthew Harrigan <harrigan.matt...@gmail.com> wrote:
>
> > I started a ufunc to compute the sum of square differences here
> > <https://gist.github.com/mattharrigan/6f678b3d6df5efd236fc23bfb59fd3bd>.
> > It is about 4x faster and uses half the memory compared to
> > np.sum(np.square(x-c)).
>
> Hi Matt,
>
> Using *blas* you win already a factor two (maybe more depending on your
> blas implementation):
>
> % python -m timeit -s "import numpy as np;x=np.linspace(0,1,int(1e7))"
> "np.sum(np.square(x-2.))"
> 10 loops, best of 3: 135 msec per loop
>
> % python -m timeit -s "import numpy as np;x=np.linspace(0,1,int(1e7))"
> "y=x-2.;np.dot(y,y)"
> 10 loops, best of 3: 70.2 msec per loop

x = np.linspace(0, 1, int(1e6))

timeit np.sum(np.square(x - 2.))
10 loops, best of 3: 23 ms per loop

y = x - 2.

timeit np.dot(y, y)
The slowest run took 18.60 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.78 ms per loop

timeit np.dot(y, y)
1000 loops, best of 3: 1.73 ms per loop

Best,
eat

> Cheers,
> --
> Jérôme Kieffer
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
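P.S. In case anyone wants to reproduce the comparison outside IPython, here is a minimal sketch (not anyone's actual benchmark script). The function names ssd_naive, ssd_dot and best_ms are made up for illustration. Note that in my session y = x - 2. was computed before the timed statement, so the np.dot timing there excludes the subtraction; the sketch times the dot variant both ways so the two can be told apart. Numbers will of course depend on the machine and the BLAS that NumPy is linked against.

```python
import timeit

import numpy as np


def ssd_naive(x, c):
    """Sum of squared differences with two temporaries: x - c and its square."""
    return np.sum(np.square(x - c))


def ssd_dot(x, c):
    """Same quantity with one temporary (x - c) followed by a dot product."""
    y = x - c
    return np.dot(y, y)


def best_ms(func, number=10, repeat=3):
    """Best-of-`repeat` time per call in milliseconds (hypothetical helper)."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e3


if __name__ == "__main__":
    x = np.linspace(0, 1, int(1e6))
    c = 2.0

    # Both formulations should agree up to floating-point rounding.
    assert np.isclose(ssd_naive(x, c), ssd_dot(x, c))

    # Subtraction included in the timed call for both variants.
    print("sum(square(x - c)):   %.2f ms" % best_ms(lambda: ssd_naive(x, c)))
    print("y = x - c; dot(y, y): %.2f ms" % best_ms(lambda: ssd_dot(x, c)))

    # Subtraction done once up front, as in the IPython session.
    y = x - c
    print("dot(y, y) only:       %.2f ms" % best_ms(lambda: np.dot(y, y)))
```

Timing the dot product alone says how fast the reduction is, but a fair comparison with np.sum(np.square(x - c)) should include the cost of forming x - c.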