Hello,

On Saturday, 2 April 2005 14:01, Greg Edwards wrote:
> Thanks Guillermo, I didn't spot those options.
> But I'm still confused as to what they really mean.
> Can you split an L-L test into parallel threads over say 8 cpu's, and get
> the answer ~ 8 times faster ?

Actually, I split each L-L iteration into several steps. Before the next
step can begin, all threads have to finish the previous one. Most of these
steps are FFT passes; the rest covers the last inverse decimation-in-time
FFT pass, the carry phase, the DWT transform and the first forward
decimation-in-frequency FFT pass. In short, all threads work simultaneously
on the same L-L iteration, but each thread works on a different memory area
of the data array.

And yes, with 8 cpu's it runs about 8 times faster (ideally). When the
number of cpu's goes beyond 16-32, the efficiency per cpu decreases because
of some problems still to be resolved (memory bandwidth, OS issues ...).

> I had thought that the L-L algorithm was strictly sequential.
> I guess you can exploit OpenMP and get a 2x or 3x/4x speedup in the short
> pieces of code where stuff is really happening in classical "parallel", but
> my guess would have been that would end up getting you about a 1.2X - 1.5X
> speedup overall. I apologize I'm making these wild generalisations with no
> knowledge of the L-L code.
>

You are not wrong, the L-L test is sequential :-) Each iteration needs the
result of the previous one; the parallelism is only inside an iteration.

Cheers,

Guillermo

--
Guillermo Ballester Valor
[EMAIL PROTECTED]
Ogijares, Granada SPAIN
Linux user #117181. See http://counter.li.org/
Public GPG KEY http://www.oxixares.com/~gbv/pubgpg.html
_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
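
The scheme described in the reply above (steps separated by barriers, each thread on its own memory area, iterations still strictly sequential) can be illustrated with a small sketch. This is not the code discussed in the thread: the function name `lucas_lehmer_threaded` and its parameters are hypothetical, and the FFT/DWT squaring is replaced by a plain schoolbook convolution so that the example stays short. Python threads also give no real speedup here; the point is only the structure.

```python
import threading


def lucas_lehmer_threaded(p, n_threads=4, base_bits=8):
    """Lucas-Lehmer test of 2**p - 1 for an odd prime p (illustrative sketch)."""
    m = (1 << p) - 1
    mask = (1 << base_bits) - 1
    n_digits = -(-p // base_bits)      # ceil(p / base_bits) digits hold any s < 2**p
    n_cols = 2 * n_digits              # number of columns in the square's convolution
    chunk = -(-n_cols // n_threads)    # contiguous block of columns per thread
    digits = [0] * n_digits            # shared input array (little-endian digits of s)
    columns = [0] * n_cols             # shared output array (un-carried columns of s*s)
    state = {"s": 4 % m}
    barrier = threading.Barrier(n_threads)

    def worker(tid):
        # The p - 2 iterations are sequential: each one needs the previous s.
        for _ in range(p - 2):
            # Step 1 (serial): thread 0 unpacks s into the shared digit array.
            if tid == 0:
                s = state["s"]
                for i in range(n_digits):
                    digits[i] = (s >> (i * base_bits)) & mask
            barrier.wait()
            # Step 2 (parallel): each thread fills only its own block of columns.
            for k in range(tid * chunk, min((tid + 1) * chunk, n_cols)):
                lo, hi = max(0, k - n_digits + 1), min(k, n_digits - 1)
                columns[k] = sum(digits[i] * digits[k - i] for i in range(lo, hi + 1))
            barrier.wait()
            # Step 3 (serial): carry phase and reduction modulo 2**p - 1.
            if tid == 0:
                square = sum(c << (k * base_bits) for k, c in enumerate(columns))
                state["s"] = (square - 2) % m
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["s"] == 0
```

For example, `lucas_lehmer_threaded(13)` returns `True` (2**13 - 1 = 8191 is prime) and `lucas_lehmer_threaded(11)` returns `False` (2**11 - 1 = 2047 = 23 * 89). No thread can start a step before all threads have finished the previous one, which is the role the barriers play here.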
