Jason Stratos Papadopoulos wrote:
> 
> There was a paper in (I believe) Appl. Math and Comp. from about two
> years ago that reformulated radix-2,3,4 and 5 FFT butterfiles to use
> multiply-adds wherever possible. The savings can be impressive: a radix-2
> FFT butterfly has 4 multiplies and six adds, but this boils down to
> six multiply-adds. Can't the RS6000 do two multiply-adds per clock?

   I have found a paper titled "Implementation of Efficient FFT Algorithms
on Fused Multiply-Add Architectures" by E. Linzer and E. Feig, IEEE Trans.
on Signal Processing, vol. 41, no 1, Jan 93.  They call their method
scaled FFT.

   For a radix-8, with size 8, Cooley-Tukey FFT has 56 m/a ops and their
method has 52 m/a ops.  The number of m/a ops for radix 8 is:

        - Cooley-Tukey:   7/2 n log_2(n) - 57/14 n + 32/7
        - scaled FFT:    11/4 n log_2(n) - 57/28 n + 16/7.

   Regarding the errors (root-mean-square error per point):
                              DFT        inverse DFT
        - Cooley-Tukey:  5.3500 10^-15   3.3999 10^-15
        - scaled FFT:    5.3441 10^-15   3.2386 10^-15

They say the impovement in error is probably due to the fast that a m/a
op is more precise than a mult followed by an add.

   Their experiments were done on an RS/6000 model 530.

   I can't comment on that article, since I'm not mathematically-gifted
enough to just read it without a lot of time with a pencil!


                Laurent
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

Reply via email to