Quoting John R Pierce <[EMAIL PROTECTED]>:

> the amd64 PMULUDQ (whatever it was, I closed those PDFs hours ago) claims to
> do two 32x32->64 in one instruction (using 128 bit XMM registers of which
> there are 16), has a latency of 4 clocks, and can execute a pair of multiplies
> every two clocks.   the XMM adders and other stuff can run in parallel.
> Did your integer FFT use the amd64's SSE 128bit extensions?

Fast Galois Transforms use complex multiplies; if the special prime is
2^61-1, then a complex multiply requires four 64x64->128 multiplies, or
16 32x32->64 multiplies plus extra instructions to handle carries if you
use SSE2. I didn't experiment with SSE2 at the time because

- the instruction issue rate to the ALUs was so high that I didn't think
  additional SSE2 instructions could issue fast enough
- those complex multiplies were enough of a problem without quadrupling
  the number of instructions (doubling the latency) in each one
- the low-level FGTs were built using a code generator that spit out
  huge amounts of C, and I didn't want to retool it. Heck, Matthias
  Waldhauer and I spent a *lot* of time getting it to generate efficient
  code in the first place
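
To make the multiply count above concrete, here is a rough sketch in
Python (not the generated C, and not how the real code was organized).
It builds one 64x64->128 product from four 32x32->64 partial products,
reduces modulo 2^61-1 by folding, and does a complex multiply with four
such products, i.e. 16 of the 32-bit multiplies. All function names are
made up for illustration. Python integers are arbitrary precision, so the
carry handling that SSE2 would need explicitly is hidden inside the
additions here.

```python
P = (1 << 61) - 1          # the prime 2^61 - 1 named in the text
MASK32 = (1 << 32) - 1


def mul64_via_32(a, b):
    """64x64->128 product assembled from four 32x32->64 partial products."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    low = a_lo * b_lo                  # partial product 1
    mid = a_lo * b_hi + a_hi * b_lo    # partial products 2 and 3
    high = a_hi * b_hi                 # partial product 4
    # In fixed-width registers the shifted sums below generate carries
    # that must be propagated by hand; Python does it implicitly.
    return low + (mid << 32) + (high << 64)


def reduce61(x):
    """Reduce x < 2^122 modulo 2^61 - 1 using 2^61 == 1 (mod P)."""
    x = (x & P) + (x >> 61)   # fold the high bits onto the low bits
    x = (x & P) + (x >> 61)   # second fold absorbs the carry of the first
    return 0 if x == P else x


def mulmod(a, b):
    """Product of two residues modulo 2^61 - 1."""
    return reduce61(mul64_via_32(a, b))


def cmul(x, y):
    """(a + b*i) * (c + d*i) with i^2 = -1, coefficients modulo 2^61 - 1.

    Uses four modular multiplies, hence 16 of the 32x32->64 kind.
    """
    (a, b), (c, d) = x, y
    re = (mulmod(a, c) - mulmod(b, d)) % P
    im = (mulmod(a, d) + mulmod(b, c)) % P
    return re, im
```

For example, cmul((3, 4), (5, 6)) gives (P - 9, 38), matching
(3+4i)(5+6i) = -9 + 38i with the real part taken modulo 2^61-1.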

One idea I didn't try was to split the computations so that the part of
the FGT that did multiplies ran in the ALU and the part that did not
(half of each FGT) ran concurrently using SSE2 instructions. Doing that
right would have required dropping completely into assembly language.
I would think the increased instructions per cycle would save about 25%
of the time, assuming the processor can decode an SSE2 instruction and
all of the ALU instructions without slowing down.

A completely different option is to use two 31-bit primes, compute two
FGTs in parallel (using all SSE2), and combine with CRT reconstruction. 
It may be faster than a single 61-bit prime, but it may not be, for a
variety of technical reasons.
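
The CRT reconstruction step for the two-prime idea could look roughly
like the sketch below (Python 3.8+ for the modular-inverse form of pow).
The moduli are placeholders picked for illustration: 2^31-1 is prime, and
2147483629 is assumed to be a second 31-bit prime. Whether either one is
actually suitable for an FGT is a separate question not addressed here.
The function name is made up too.

```python
P1 = (1 << 31) - 1    # 2147483647, the Mersenne prime 2^31 - 1
P2 = 2147483629       # assumed second 31-bit prime (illustrative choice)

# Inverse of P1 modulo P2; only requires that P1 and P2 are coprime.
P1_INV_MOD_P2 = pow(P1, -1, P2)


def crt_combine(r1, r2):
    """Return the unique x in [0, P1*P2) with x % P1 == r1 and x % P2 == r2.

    Garner's form: x = r1 + P1 * t, where t is chosen modulo P2 so that
    the second congruence holds.
    """
    t = ((r2 - r1) * P1_INV_MOD_P2) % P2
    return r1 + P1 * t
```

For instance, with x = 1234567890123456789, crt_combine(x % P1, x % P2)
returns x again. The true result has to be smaller than P1*P2 (a bit
under 2^62) for the reconstruction to be unambiguous.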

jasonp

_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
