Hi Dima,

On 05/22/2011 11:45 PM, Dima Kogan wrote:
The new functionality in PDL is able to distribute operations created
by PDL threading into separate processor threads. This takes effect
if, for example, you use PDL to multiply a 5000x5000x5 piddle by a
5000x5000 piddle. PDL threading treats this as 5 separate
multiplications of 5000x5000 matrices, and the new code will
parallelize this. However, if you're simply multiplying two 5000x5000
matrices together, there is no PDL threading involved, so the new patch
will do nothing.
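For concreteness, here is a rough NumPy analogue of what PDL threading does in that example (not PDL code; NumPy puts the "thread" axis first where PDL's 5000x5000x5 notation puts it last, and the sizes are shrunk so this runs quickly):

```python
import numpy as np

# Small stand-in sizes (3 matrices of 4x4 instead of 5 of 5000x5000).
# In PDL terms: a 4x4x3 piddle multiplied elementwise by a 4x4 piddle is
# treated as 3 separate 4x4 multiplications over the thread dimension,
# and the new code can hand those 3 slices to separate processor threads.
stack = np.arange(3 * 4 * 4, dtype=float).reshape(3, 4, 4)  # 3 matrices
single = np.full((4, 4), 2.0)                               # one matrix

# Broadcasting applies `single` to each of the 3 slices of `stack`.
result = stack * single

# Equivalent explicit loop over the thread dimension:
looped = np.stack([stack[i] * single for i in range(3)])
assert np.allclose(result, looped)
print(result.shape)  # (3, 4, 4)
```

With only two 2-D operands there is no extra dimension to loop over, which is why the patch has nothing to parallelize in the plain 5000x5000 case.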


Ah, thanks. That makes everything a lot clearer now.


It COULD do something if we define matrix multiplication as a bunch of
matrix-vector multiplications threaded together. Then the
parallelization will 'just work', but we don't define matrix
multiplication this way. (Sorta off-topic: should we change the
multiplication definition to this?)
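The proposed redefinition amounts to the following identity, sketched here in NumPy rather than PDL: a matrix product is one matrix-vector multiply per column of the right-hand operand, and each column could then be a separate thread unit.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
B = rng.standard_normal((5, 4))

# Ordinary matrix product.
direct = A @ B

# The same product expressed as one matrix-vector multiply per column of B.
# Defined this way, each column is an independent piece of work that a
# threading engine could parallelize automatically.
by_columns = np.stack([A @ B[:, j] for j in range(B.shape[1])], axis=1)

assert np.allclose(direct, by_columns)
```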


This may not apply to PDL, but last year I tried something like this using OpenMP threads in Fortran, and the "parallel" code was actually slower.

When I just called "matmul(A,B)" in Fortran, the compiler generated a loop that accessed memory very efficiently; by forcing matrix-vector products I defeated that optimization and made the code slower. I have no idea whether this has any relevance to PDL, though.

--
I'm not overweight, I'm undertall.

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl