At 05:56 AM 9/5/2006, you wrote:
>But, I'm sure this is a problem that should be tackled sometime, since
>Moore's law of increasing performance is now being realised not by ever
>higher clock rates, but instead by replication of processing cores. So,
>a fast PC in 2008/9 might still clock at 3 GHz or so, but could have
>eight CPU cores, each executing instructions at 3 GHz.
>
>It seems to me that as the prime numbers being searched become ever more
>gigantic, and the time to run a single Lucas-Lehmer test stretches into
>months, it would be really beneficial to be able to devote several CPU
>cores to the task of testing one number. But I can see this is
>technically quite a challenge...

It isn't too challenging.  Each FFT is done in two passes.  Each pass performs
several levels of the FFT on independent data blocks.  It isn't hard to use
threads here because of the data independence.  The pre-fetching is tricky but
do-able.  The carry-propagation step is the hardest place to use threads
effectively.  I don't have any great ideas here -- not that I've thought about
it in any great detail.
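
To make the data-independence point concrete, here is a rough Python sketch of
a generic two-pass ("four-step") FFT. It is not the Prime95 assembly code, and
the names dft, two_pass_fft, n1, n2 and workers are made up for illustration.
The naive DFT stands in for "several levels of the FFT" on one block. Each
block within a pass touches only its own data, so the blocks can be handed to
separate threads with no locking. (Python threads will not actually run this
faster because of the interpreter lock; the sketch only shows where the
parallelism is.)

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def dft(block):
    """Naive O(n^2) DFT of one small block (stand-in for several FFT levels)."""
    n = len(block)
    return [
        sum(block[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def two_pass_fft(x, n1, n2, workers=4):
    """Length n1*n2 DFT done as two passes over independent blocks."""
    n = n1 * n2
    assert len(x) == n

    def pass1(c):
        # Block c is the stride-n2 slice x[c], x[c+n2], ... (length n1).
        col = dft(x[c::n2])
        # Twiddle factors applied between the two passes.
        return [col[k1] * cmath.exp(-2j * cmath.pi * c * k1 / n)
                for k1 in range(n1)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Pass 1: n2 independent blocks, one task each.
        a = list(pool.map(pass1, range(n2)))  # a[c][k1]

        def pass2(k1):
            # Block k1 gathers one element from every pass-1 result.
            return dft([a[c][k1] for c in range(n2)])  # indexed by k2

        # Pass 2: n1 independent blocks, started only after pass 1 is done.
        b = list(pool.map(pass2, range(n1)))  # b[k1][k2]

    # Scatter results into natural order: output index k = k1 + n1*k2.
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[k1 + n1 * k2] = b[k1][k2]
    return out
```

The only synchronisation point is the barrier between the two passes. The
carry-propagation step does not split up this neatly, since each carry depends
on the result for the word before it.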

If you are still interested in working on the code, try modifying the routine
xpass2_10_levels in xmult3a.asm.  Let us know what you learn.


_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
