This problem is linear, so it is probably RAM I/O bound. I do not think I would benefit much from multiple cores, but I will give it a try. In the short term this is good enough for me.
On May 22, 2012, at 1:57 PM, Francesc Alted wrote:

> On 5/22/12 8:47 PM, Dag Sverre Seljebotn wrote:
>> On 05/22/2012 04:54 PM, Massimo DiPierro wrote:
>>> For now I will be doing this:
>>>
>>> import numpy
>>> import time
>>>
>>> a = numpy.zeros(2000000)
>>> b = numpy.zeros(2000000)
>>> c = 1.0
>>>
>>> # naive solution
>>> t0 = time.time()
>>> for i in xrange(len(a)):
>>>     a[i] += c*b[i]
>>> print time.time()-t0
>>>
>>> # possible solution
>>> n = 100000
>>> t0 = time.time()
>>> for i in xrange(0, len(a), n):
>>>     a[i:i+n] += c*b[i:i+n]
>>> print time.time()-t0
>>>
>>> the second "possible" solution appears 1000x faster than the
>>> former in my tests and uses little extra memory. It is only 2x
>>> slower than b*=c.
>>>
>>> Any reason not to do it?
>>
>> No, this is perfectly fine, you just manually did what numexpr does.
>
> Yeah. You basically re-discovered the blocking technique. For a more
> general example on how to apply the blocking technique with NumPy see
> the section "CPU vs Memory Benchmark" in:
>
> https://python.g-node.org/python-autumnschool-2010/materials/starving_cpus
>
> Of course numexpr has less overhead (and can use multiple cores) than
> using plain NumPy.
>
> --
> Francesc Alted
>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
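Since the snippets in the thread are Python 2, here is a self-contained Python 3 sketch of the same blocking technique for reference. The block size n = 100000 is carried over from the thread and is illustrative, not tuned; the right value depends on the CPU cache size.

```python
import time

import numpy

a = numpy.zeros(2_000_000)
b = numpy.ones(2_000_000)  # ones so the result is easy to check
c = 1.0

# Naive solution: one Python-level iteration per element.
t0 = time.time()
for i in range(len(a)):
    a[i] += c * b[i]
naive = time.time() - t0

a[:] = 0.0  # reset for a fair comparison

# Blocked solution: each slice operation runs in C over n elements,
# and the temporary array c*b[i:i+n] stays small enough to fit in cache.
n = 100_000
t0 = time.time()
for i in range(0, len(a), n):
    a[i:i + n] += c * b[i:i + n]
blocked = time.time() - t0

print(f"naive: {naive:.4f}s  blocked: {blocked:.4f}s")
```

The speedup comes from replacing per-element Python interpreter overhead with vectorized C loops, while the blocking keeps the temporary `c*b[i:i+n]` from spilling out of cache the way a single full-array `a += c*b` would with its 2,000,000-element temporary.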
