On 5/6/2010 6:33 PM, Daniel Carrera wrote:
>
>> That said, if PDL is relatively slower than another package at
>> matmult (e.g.) then that could indicate a direction for
>> optimization.
>
> Yes. I would really like to know how Octave and Scilab do matrix
> multiplication. For a 200x200 matrix they are 3x faster than PDL, and
> for a 2000x2000 matrix they are over an order of magnitude faster.
> What this tells me is that they are probably using a different
> algorithm.
Well, if I cared about optimizing matmul performance, I would put in an
optimized kernel for it. If PDL is 3-10x slower than other packages at
matrix multiplication, that is probably something we want to work on.

I took a look at our default matrix multiply in Primitive.pm. This is
the kernel:

  $a->dummy(1)->inner($b->xchg(0,1)->dummy(2),$c);

so you can see that the kernel is an inner product operation, and the
total op count is something like:

  O(N**3) memory ops
  O(N**3) float ops

The optimal matrix multiply (ignoring fancy FFT-type algorithms and the
like) comes in at:

  O(N**2) memory ops
  O(N**3) float ops

so for large matrices, the PDL matmult will be dominated by memory
access. As a side effect, it most likely breaks caching, so you lose
some performance there too.

I'm not doing a lot of matrix ops at the moment, but anyone who is
could be interested in improved performance here.

--Chris

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
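[Editor's note: the access-pattern argument above can be made concrete with a
small sketch. This is not PDL code and not how any of these packages actually
implement matmult; it is a pure-Python model (function names and the blocking
scheme are mine) contrasting the inner-product formulation with the standard
cache-blocked formulation that optimized kernels use to cut memory traffic.]

```python
def matmul_inner(a, b):
    """Inner-product formulation, like the PDL kernel above:
    each element of C is a full row-by-column inner product,
    so every row of A and column of B is re-read N times and
    memory reads scale as O(N**3)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c

def matmul_blocked(a, b, bs=2):
    """Cache-blocked formulation (the usual trick in optimized
    kernels): work on bs x bs tiles so each tile of A and B is
    reused many times while it is still in cache, reducing
    traffic to main memory."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, bs):
        for kk in range(0, n, bs):
            for jj in range(0, n, bs):
                for i in range(ii, min(ii + bs, n)):
                    for k in range(kk, min(kk + bs, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + bs, n)):
                            c[i][j] += aik * b[k][j]
    return c
```

Both functions compute the same product; the point is purely the order in
which memory is touched, which is where the 3-10x gap could come from.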
