The inefficiency comes from the generic iteration and the construction of int objects required by the builtin sum function. Calling the array's native sum method on each row is much faster, and summing over the axis directly is faster still:

t1=time.time()
highEnough=myMat>0.6
greaterPerLine=[x.sum() for x in highEnough]
elapsed1=time.time()-t1
print("method 1a took %f seconds"%elapsed1)

t1=time.time()
highEnough=myMat>0.6
greaterPerLine=highEnough.sum(axis=1)
elapsed1=time.time()-t1
print("method 1b took %f seconds"%elapsed1)

method 1 took 1.503523 seconds
method 2 took 0.163641 seconds
method 1a took 0.006665 seconds
method 1b took 0.004070 seconds

-Kevin

On 3/11/07, Dan Becker <[EMAIL PROTECTED]> wrote:
As soon as I posted that I realized it's due to the type conversions from True to 1. For some reason, this

---
myMat=scipy.randn(500,500)
t1=time.time()
highEnough=(myMat>0.6)+0
greaterPerLine=[sum(x) for x in highEnough]
elapsed1=time.time()-t1
print("method 1 took %f seconds"%elapsed1)
---

remedies that to some extent. It is only 20% slower than the map. Still, there must be some way for me to make the clean way faster than

greaterPerLine2=map(lambda(x):len(filter(lambda(y):y>0.6,x)),myMat)

I appreciate any advice on how to do that.

Thanks again,
Dan
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
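
For anyone rerunning this comparison on a current NumPy, here is a minimal self-contained sketch of the three approaches discussed in the thread. The function names, the fixed seed, and the use of numpy.random.default_rng and time.perf_counter are illustrative choices, not taken from the original posts. Timings depend on the machine and NumPy version, so none are quoted here.

```python
import time

import numpy as np


def count_builtin_sum(mat, threshold):
    """Builtin sum over each boolean row (the slow 'method 1' style)."""
    mask = mat > threshold
    return [sum(row) for row in mask]


def count_row_method(mat, threshold):
    """Array sum method called once per row ('method 1a' style)."""
    mask = mat > threshold
    return [row.sum() for row in mask]


def count_axis_sum(mat, threshold):
    """Single reduction along axis 1 ('method 1b' style)."""
    return (mat > threshold).sum(axis=1)


def timed(func, *args):
    """Return (result, elapsed seconds) for one call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Hypothetical test data: a 500x500 standard-normal matrix, as in the thread.
rng = np.random.default_rng(0)
my_mat = rng.standard_normal((500, 500))

for func in (count_builtin_sum, count_row_method, count_axis_sum):
    counts, elapsed = timed(func, my_mat, 0.6)
    print("%s took %f seconds" % (func.__name__, elapsed))
```

All three functions return the same per-row counts; only the amount of work done in the Python interpreter differs, which is the point made above about generic iteration versus a single reduction along the axis.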