Quoting John R Pierce <[EMAIL PROTECTED]>:

> the am64 PMULUDQ (whatever it was, I closed those PDFs hours ago) claims to
> do two 32x32->64 in one instruction (using 128 bit XMR registers of which
> there are 16), has a latency of 4 clocks, and can execute a pair of multiplies
> every two clocks. the XMR adders and other stuff can run in parallel.
> Did your integer FFT use the am64's SSE 128bit extensions?

Fast Galois Transforms use complex multiplies; if the special prime is
2^61-1, then a complex multiply requires four 64x64->128 multiplies, or,
if you use SSE2, 16 32x32->64 multiplies plus extra instructions to
handle carries. I didn't experiment with SSE2 at the time because

- the instruction issue rate to the ALUs was already so high that I didn't
  think additional SSE2 instructions could issue fast enough

- those complex multiplies were enough of a problem without quadrupling
  the number of instructions (and doubling the latency) in each one

- the low-level FGTs were built using a code generator that spit out
  huge amounts of C, and I didn't want to retool it. Heck, Matthias
  Waldhauer and I spent a *lot* of time getting it to generate efficient
  code in the first place

One idea I didn't try was to split the computations so that the part of
the FGT that does multiplies runs in the ALU and the part that does not
(half of each FGT) runs concurrently using SSE2 instructions. Doing that
right would have required dropping completely into assembly language. I
would expect the increased instructions per cycle to save about 25% of
the time, assuming the processor can decode an SSE2 instruction and all
of the ALU instructions without slowing down.

A completely different option is to use two 31-bit primes, compute two
FGTs in parallel (using all SSE2), and combine the results with CRT
reconstruction. It may be faster than a single 61-bit prime, but it may
not be, for a variety of technical reasons.

jasonp

_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
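
The arithmetic discussed in this post can be sketched roughly as follows. This is only an illustration in Python, not the generated C described above: Python's big integers stand in for the 128-bit product, and the function names (reduce_p, mul64_from_32, cmul, crt_pair) and the second 31-bit modulus Q2 are assumptions made for the example. It shows a complex multiply modulo 2^61-1 built from four 64-bit multiplies, how one 64x64->128 product breaks into four 32x32->64 partial products (hence 16 per complex multiply), and a two-modulus CRT reconstruction.

```python
P = (1 << 61) - 1  # the special prime 2^61 - 1

def reduce_p(x):
    """Reduce 0 <= x < 2^122 modulo P without a division."""
    # 2^61 == 1 (mod P), so the high bits fold onto the low bits.
    x = (x & P) + (x >> 61)   # now x < 2^62
    x = (x & P) + (x >> 61)   # now x <= P
    return 0 if x == P else x

def mul64_from_32(x, y):
    """64x64->128 product from four 32x32->64 partial products."""
    m = 0xFFFFFFFF
    x0, x1 = x & m, x >> 32
    y0, y1 = y & m, y >> 32
    # The shifted additions are where the carry handling goes in SSE2.
    return x0 * y0 + ((x0 * y1 + x1 * y0) << 32) + ((x1 * y1) << 64)

def cmul(a, b):
    """(a0 + a1*i) * (b0 + b1*i) mod P with i^2 = -1: four multiplies."""
    a0, a1 = a
    b0, b1 = b
    re = (reduce_p(a0 * b0) - reduce_p(a1 * b1)) % P
    im = reduce_p(reduce_p(a0 * b1) + reduce_p(a1 * b0))
    return re, im

# Two 31-bit moduli for the CRT variant. Q2 is a hypothetical choice;
# the reconstruction only needs Q1 and Q2 to be coprime.
Q1 = (1 << 31) - 1
Q2 = 2147483629

def crt_pair(r1, r2):
    """Recover x in [0, Q1*Q2) from r1 = x mod Q1 and r2 = x mod Q2."""
    inv = pow(Q1, -1, Q2)            # requires Python 3.8+
    t = ((r2 - r1) * inv) % Q2
    return r1 + Q1 * t
```

For example, cmul((0, 1), (0, 1)) squares i and returns (P - 1, 0), that is, -1 mod P. In the two-prime scheme, each transform would run modulo one of the small moduli and crt_pair would be applied elementwise to the two result vectors.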
