Hi Luis,

Glad to hear that more substantial speedups are possible with matrix multiplication! What results are you seeing?

The “if you hack” bit was supposed to be a suggestion/request that you take my code and modify it to add set_autopthread_targ – sorry to be unclear.

Best regards,
Ed

From: Luis Mochán <moc...@icf.unam.mx>
Sent: 07 July 2022 22:43
To: Ed . <ej...@hotmail.com>
Cc: Eric Wheeler <p...@lists.ewheeler.net>; pdl-general@lists.sourceforge.net
Subject: Re: [Pdl-general] How do you create a set of cdouble matrices from (real, imag) values?

Hi Ed,

On Thu, Jul 07, 2022 at 03:38:23PM +0000, Ed . wrote:
> ...
> A wrinkle to your specified task is that as mentioned above, simple numerical
> addition is a very memory-bound activity. If the task were a bit more complex
> (using the registers / CPU caches more), the upper bound of beneficial
> thread-use might well be higher than the 2 your results show.

I see. It goes much better if I broadcast, for example, a small matrix
multiplication (as in the case of Eric) instead of a simple scalar
multiplication.

> A very contrived example that I’ve used to explore this, while investigating
> the floating point benchmark stuff
> (https://github.com/Fourmilab/floating_point_benchmarks/pull/1/files) (if you
> hack it up with your additional set_autopthread_targ that would be very
> valuable):

I'm not sure I understand what you mean by 'if you hack...'

> ...

Regards,
Luis

--
                                                                  o
W. Luis Mochán,                      | tel:(52)(777)329-1734     /<(*)
Instituto de Ciencias Físicas, UNAM  | fax:(52)(777)317-5388     `>/   /\
Av. Universidad s/n CP 62210         |                           (*)/\/  \
Cuernavaca, Morelos, México          | moc...@fis.unam.mx   /\_/\__/
GPG: 791EB9EB, C949 3F81 6D9B 1191 9A16  C2DF 5F0A C52B 791E B9EB

_______________________________________________
pdl-general mailing list
pdl-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pdl-general