Hello Peter,
On Friday, 1 September 2006 at 15:04 +0200, Moreton, Peter wrote:
> I'm trying to compile Prime95, v24.14 using Visual Studio 2005, X64. The
> reason?
> - well, I'm playing about with a Quad-core Opteron workstation, and wanted to
> see if it might be possible to 'decompose' the Lucas-Lehmer code over 4 CPU
> cores.
> Also, I have an FPGA co-processor device to try out....
As Matthias and George said, "'decompose' the Prime95 Lucas-Lehmer code
over 4 CPU cores" will certainly be very difficult.
I've already suggested to George that he think about this, because more
and more processors have several cores and because it will soon take
many months on a fast machine to crunch Mersenne numbers with large
exponents.
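
For context on why spreading the test over several cores is hard, here
is a minimal Python sketch of the textbook Lucas-Lehmer test. It is only
an illustration and not Prime95's code, which uses hand-tuned FFT
squaring; the function name lucas_lehmer is mine. Each iteration needs
the result of the previous one, so the iterations cannot be handed to
different cores; any parallelism has to come from inside the big
squaring itself.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime.

    Valid for odd prime exponents p. Illustrative only: real programs
    replace the plain multiplication below with an FFT-based squaring.
    """
    m = (1 << p) - 1          # the Mersenne number 2**p - 1
    s = 4                     # starting value of the sequence
    for _ in range(p - 2):    # p - 2 strictly sequential iterations
        s = (s * s - 2) % m   # each step depends on the previous s
    return s == 0


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, lucas_lehmer(p))
```

For large exponents nearly all the time goes into the s * s step, which
is where an FFT, and any multi-threading, would come in.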
However, "'decompose' the C Lucas-Lehmer code over N CPU cores" has
already been done by Ernst Mayer (MLucas, with MPI/OpenMP) and
Guillermo Valor (GLucas, with POSIX threads and either MPI or OpenMP, I
don't remember which one).
I helped Guillermo fix some small problems with threads, and I've
done some measurements of the scalability of GLucas on ia64 and PPC NUMA
architectures.
If you look at this thread:
http://www.mersenneforum.org/showthread.php?t=5427
and at the figure in post 11, you'll see that parallelizing an FFT is
not 100% scalable, even on very scalable IBM PPC machines, because some
work has to be done by a single thread in each iteration.
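
To put rough numbers on that single-threaded part, here is a small
sketch using Amdahl's law. The model is my assumption and is not taken
from the measurements in that thread: it simply supposes that a fixed
fraction of each iteration runs on one thread and the rest splits
perfectly across the cores. The function names are made up for the
illustration.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Speedup on `cores` cores when `serial_fraction` of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def efficiency(serial_fraction: float, cores: int) -> float:
    """Parallel efficiency: speedup divided by the number of cores."""
    return amdahl_speedup(serial_fraction, cores) / cores


def max_serial_fraction(target_efficiency: float, cores: int) -> float:
    """Largest serial fraction that still reaches the target efficiency.

    Derived from efficiency = 1 / (1 + (cores - 1) * serial_fraction).
    """
    return (1.0 / target_efficiency - 1.0) / (cores - 1)


if __name__ == "__main__":
    # Example: what does 95% efficiency on 4 cores require?
    s = max_serial_fraction(0.95, 4)
    print(f"serial fraction must stay below {s:.4f}")
    print(f"check: efficiency = {efficiency(s, 4):.3f}")
```

Under this simple model, 95% efficiency on 4 cores leaves room for only
about 1.75% of each iteration to be single-threaded, and it ignores
memory and synchronization costs, which make things worse in practice.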
So, if you have time to spend on this, I guess you should at least talk
with Ernst, Guillermo and George.
I would be very pleased to use such a parallelized Prime95, even more so
if scalability is quite good (more than 95% on 4 CPUs).
Good luck!
Tony
_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime