The inefficiency comes from the generic iteration and the construction of int
objects that the builtin sum function requires.  Using the array's native sum
method on each row is much faster, and summing over the axis directly is
faster still:

t1=time.time()
highEnough=myMat>0.6
greaterPerLine=[x.sum() for x in highEnough]
elapsed1=time.time()-t1
print("method 1a took %f seconds"%elapsed1)

t1=time.time()
highEnough=myMat>0.6
greaterPerLine=highEnough.sum(axis=1)
elapsed1=time.time()-t1
print("method 1b took %f seconds"%elapsed1)

method 1 took 1.503523 seconds
method 2 took 0.163641 seconds
method 1a took 0.006665 seconds
method 1b took 0.004070 seconds
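
For anyone who wants to reproduce the comparison, here is a minimal
self-contained sketch.  It assumes a current NumPy (np.random.default_rng
instead of scipy.randn); the helper name count_above is mine, purely for
illustration, and the timings will of course differ from the ones above.

```python
import time
import numpy as np

def count_above(mat, threshold=0.6):
    """Count, per row, the entries strictly greater than threshold.

    Hypothetical helper: the comparison builds a boolean array and the
    axis sum runs entirely in compiled code, with no per-element
    Python objects.
    """
    return (mat > threshold).sum(axis=1)

rng = np.random.default_rng(0)
my_mat = rng.standard_normal((500, 500))

# Builtin sum: iterates each row element by element in Python.
t0 = time.perf_counter()
slow = [sum(row) for row in (my_mat > 0.6)]
t_slow = time.perf_counter() - t0

# Axis sum: a single vectorized reduction.
t0 = time.perf_counter()
fast = count_above(my_mat)
t_fast = time.perf_counter() - t0

# Both approaches should agree on every row.
assert fast.tolist() == [int(v) for v in slow]
print("builtin sum took %f seconds" % t_slow)
print("axis sum took %f seconds" % t_fast)
```

The threshold is a parameter only so the helper can be reused; 0.6 matches
the value in the original question.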

-Kevin


On 3/11/07, Dan Becker <[EMAIL PROTECTED]> wrote:

As soon as I posted that I realized it's due to the type conversions from
True to 1.  For some reason, this

---
myMat=scipy.randn(500,500)
t1=time.time()
highEnough=(myMat>0.6)+0
greaterPerLine=[sum(x) for x in highEnough]
elapsed1=time.time()-t1
print("method 1 took %f seconds"%elapsed1)
---

remedies that to some extent.  It is only 20% slower than the map.  Still,
there must be some way for me to make the clean way faster than

greaterPerLine2=map(lambda(x):len(filter(lambda(y):y>0.6,x)),myMat)

I appreciate any advice on how to do that.

Thanks again,
Dan







_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
