Hello,

On Saturday, 2 April 2005 14:01, Greg Edwards wrote:
> Thanks Guillermo, I didn't spot those options.
> But I'm still confused as to what they really mean.
> Can you split an L-L test into parallel threads over say 8 cpu's, and get
> the answer ~ 8 times faster ?

Actually, I split each L-L iteration into several steps. Before the next step
can begin, all threads have to finish the previous one. Most of these steps are
FFT passes; another one performs the last inverse decimation-in-time FFT pass,
the carry phase, the DWT transform and the first forward
decimation-in-frequency FFT pass.

In short, all threads work simultaneously on the same L-L iteration, but each
thread works on a different memory area of the data array. And yes, with
8 CPUs it runs about 8 times faster (ideally). When the number of CPUs goes
above 16-32, the efficiency per CPU decreases because of some problems still
to be resolved (memory bandwidth, OS issues, ...).
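
To make the structure concrete, here is a rough sketch in Python. It is not the
real code: the function name, the `bits` digit size and the thread count are
made up for illustration, and a plain schoolbook convolution stands in for the
FFT/DWT passes. It only shows the pattern: every thread fills its own block of
a shared array, a barrier separates the steps, and the iterations themselves
still run one after another. Because of the Python interpreter lock it shows
the synchronisation, not the speedup.

```python
import threading


def lucas_lehmer_threaded(p, n_threads=4, bits=8):
    """Return True if 2**p - 1 is prime, for an odd prime p (illustrative only)."""
    m = (1 << p) - 1
    n = -(-p // bits)                 # number of base-2**bits digits for a residue
    mask = (1 << bits) - 1
    digits = [0] * n
    digits[0] = 4                     # starting value s_0 = 4
    total = 2 * n - 1
    coeffs = [0] * total              # shared array of convolution coefficients
    barrier = threading.Barrier(n_threads)

    def worker(tid):
        # Each thread owns one contiguous block of the output array.
        first = tid * total // n_threads
        last = (tid + 1) * total // n_threads
        for _ in range(p - 2):        # iterations are strictly sequential
            # Step 1: squaring, done here as a schoolbook convolution
            # (a stand-in for the FFT passes).
            for k in range(first, last):
                lo = max(0, k - n + 1)
                hi = min(k, n - 1)
                coeffs[k] = sum(digits[i] * digits[k - i]
                                for i in range(lo, hi + 1))
            barrier.wait()            # nobody starts step 2 before step 1 is done
            # Step 2: carry and reduction; kept on one thread for simplicity.
            if tid == 0:
                s = sum(c << (bits * k) for k, c in enumerate(coeffs))
                s = (s - 2) % m
                for i in range(n):
                    digits[i] = (s >> (bits * i)) & mask
            barrier.wait()            # nobody starts the next iteration early

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(d == 0 for d in digits)


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, lucas_lehmer_threaded(p))
```

In this sketch the carry step runs on a single thread, which is a simplification
of mine; a step like that limits the scaling as the number of threads grows.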

> I had thought that the L-L algorithm was strictly sequential.
> I guess you can exploit OpenMP and get a 2x or 3x/4x speedup in the short
> pieces of code where stuff is really happening in classical "parallel", but
> my guess would have been that would end up getting you about a 1.2X - 1.5X
> speedup overall. I apologize I'm making these wild generalisations with no
> knowledge of the L-L code.
>
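You are not wrong, the L-L test is sequential :-) Each iteration needs the
result of the previous one; the threads only share the work inside a single
iteration.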

Cheers,

Guillermo

-- 
Guillermo Ballester Valor
[EMAIL PROTECTED]
Ogijares, Granada  SPAIN
Linux user #117181. See http://counter.li.org/
Public GPG KEY http://www.oxixares.com/~gbv/pubgpg.html

 
