Sturla Molden wrote:
> IMHO, trying to beat Intel or AMD performance library developers with
> Python, NumPy and multiprocessing is just silly. Nothing we do with
> array operator * and np.sum is ever going to compare with BLAS functions
> from these libraries.

I think the issue got confused -- the OP was not looking to speed up a single matrix multiply, but rather to speed up a whole bunch of independent matrix multiplies.

> Sometimes we need a little bit more coarse-grained parallelism. Then
> it's time to think about Python threads and releasing the GIL or use
> OpenMP with C or Fortran.
>
> multiprocessing is the last tool to think about. It is mostly
> appropriate for 'embarrassingly parallel' paradigms, and certainly not
> the tool for parallel matrix multiplication.

I think this was, in fact, an embarrassingly parallel example. But when the OP put it on two processors, it was slower than on one -- hence his question. I got the same result on my machine as well.

I'm not sure he tried Python threads; that may be worth a shot.

It would also be great if someone who actually understands this stuff could look at his code and explain why the slowdown occurs (hint, hint!)

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
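
For anyone who wants to try the thread-based variant mentioned above, here is a minimal sketch. It is not the OP's code: the function names, matrix sizes, and worker count are made up for illustration. It assumes that np.dot releases the GIL while it runs in BLAS, which is typically the case for floating-point arrays but depends on the NumPy build. It only checks that the threaded and serial results agree; whether threads are actually faster depends on the machine and on whether the BLAS library is already multithreaded. One plausible (unconfirmed) reason for the multiprocessing slowdown is the cost of pickling arrays to and from worker processes, which threads avoid because they share memory.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def multiply_pairs_serial(a_stack, b_stack):
    """Multiply each (a, b) pair one after another."""
    return [np.dot(a, b) for a, b in zip(a_stack, b_stack)]


def multiply_pairs_threaded(a_stack, b_stack, n_workers=2):
    """Multiply each (a, b) pair in a thread pool.

    pool.map preserves input order, so result i is a_stack[i] @ b_stack[i].
    n_workers=2 is an arbitrary choice for illustration.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(np.dot, a_stack, b_stack))


# Hypothetical workload: 8 independent 200x200 matrix products.
rng = np.random.default_rng(0)
a_stack = rng.standard_normal((8, 200, 200))
b_stack = rng.standard_normal((8, 200, 200))

serial = multiply_pairs_serial(a_stack, b_stack)
threaded = multiply_pairs_threaded(a_stack, b_stack)

# Both approaches should give the same products.
print(all(np.allclose(s, t) for s, t in zip(serial, threaded)))
```

To compare timings, wrap each call with time.perf_counter() and vary n_workers; the matrices need to be large enough that the work in np.dot dominates the thread overhead.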
