Jason Stratos Papadopoulos wrote:
> For really big FFTs you can get major gains by using FFTW as a building
> block in a bigger program, rather than have it do your entire FFT with
> a single function call. As Ernst mentioned, the building block approach
> lets you fold some of the forward and inverse FFT computations together,
> and this saves loads of time in cache misses avoided. On the UltraSPARC,
> using FFTW for the pieces of your computation rather than the whole thing
> is somewhere between 2 and 10 times faster than FFTW alone.
>
That could be terrific! I'll look into it.
> On the Pentium, assembly-coded small FFTs run more than twice as fast
> as FFTW. Even from C, you can do better on the Pentium (do a web search
> for djbfft, a free Pentium-optimized FFT package). For a recursive
> split-radix, you need about 200 lines of assembly; surely this is worth
> twice the speed!
>
I would like to write some general-purpose C code. For tuned
assembler we already have Woltman's fantastic prime95/mprime code.
Thank you very much for your comments. They will help me a lot.
Have a nice weekend.
| Guillermo Ballester Valor |
| [EMAIL PROTECTED] |
| c/ cordoba, 19 |
| 18151-Ogijares (Spain) |
| (Linux registered user 1171811) |