Here is an idea that may be useful in increasing the throughput of GIMPS.

In the June 1999 issue of the IEEE Signal Processing Letters, Shay Moshe
and David Hertz present the article "On Computing DFT of Real N-Point
Vector and IDFT of DFT-Transformed Real N-Point Vector via Single DFT".

Basically, the article gives a simple method of computing an n-point DFT
and an n-point IDFT using only a single complex n-point DFT (and minor
additional computations of complexity O(n)).  

I was thinking that this might be useful for GIMPS, in the following scenario:

Suppose a user has two exponents, p and q, both of which use the same FFT
size for the squaring operations of the Lucas-Lehmer calculation.  Let
{L(i)} be the Lucas-Lehmer sequence for exponent p, and let {L'(i)} be the
Lucas-Lehmer sequence for exponent q.  Then both exponents could be tested
in parallel, using the above algorithm, as follows:

Exponent p        Exponent q
-----------       -----------
L(1) = 4          L'(1) = 4
perform FFT
square elements
perform IFFT      perform FFT     <--- perform in parallel
subtract 2        square elements
  to get L(2)
perform FFT       perform IFFT    <--- perform in parallel
square elements   subtact 2 
                    to get L'(2)
perform IFFT      perform FFT     <--- perform in parallel
subtract 2        square elements
  to get L(3)
perform FFT       perform IFFT    <--- perform in parallel
   ...               ...

When one of the two exponents is finished, a third exponent could be
started and its Lucas-Lehmer sequence calculated in parallel with the
remaining exponent, and so on.

In this way, two exponents could be tested using the same number of
FFT/IFFT operations as is currently being used to test 1 exponent.  So if I
haven't overlooked anything (a big "if"), this could nearly double the
throughput of GIMPS.  (I have assumed that the time for a single squaring
operation in the current implementation is approximately the time for an
FFT + the time for an IFFT-- correct me if I'm wrong.)

Mike

P.S. If there is sufficient interest, I can post the details of the
simultaneous FFT/IFFT algorithm to the list for those who don't have access
to the original article.


_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

Reply via email to