Jason Stratos Papadopoulos wrote:
> For really big FFTs you can get major gains by using FFTW as a building
> block in a bigger program, rather than have it do your entire FFT with
> a single function call. As Ernst mentioned, the building block approach
> lets you fold some of the forward and inverse FFT computations together,
> and this saves loads of time in cache misses avoided. On the UltraSPARC,
> using FFTW for the pieces of your computation rather than the whole thing
> is somewhere between 2 and 10 times faster than FFTW alone.
> 
That could be terrific! I'll look into it.
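
As I understand the building-block idea, a long cyclic convolution (here a squaring) can be split into short column FFTs plus one fused row pass, so that each row gets its forward FFT, pointwise squaring and inverse FFT while it is still in cache. Below is a minimal sketch in Python under stated assumptions: `fft` is a hypothetical stand-in for a small library FFT call (it is not FFTW's API), and `cyclic_square` with its `n1`/`n2` split are illustrative names of my own. It shows only the data flow, not a tuned implementation.

```python
import cmath

def fft(v, sign=-1):
    """Hypothetical stand-in for a small library FFT (unnormalized).
    Recursive radix-2, so len(v) must be a power of two.
    sign=-1 is the forward transform, sign=+1 the inverse."""
    n = len(v)
    if n == 1:
        return list(v)
    even = fft(v[0::2], sign)
    odd = fft(v[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def cyclic_square(x, n1, n2):
    """Cyclic self-convolution of x (length n1*n2, both powers of two)."""
    n = n1 * n2
    assert len(x) == n
    # View x as an n1-by-n2 matrix: a[r][c] = x[n2*r + c].
    a = [list(x[n2 * r: n2 * (r + 1)]) for r in range(n1)]

    # Pass 1: forward FFT of length n1 down every column.
    for c in range(n2):
        col = fft([a[r][c] for r in range(n1)], -1)
        for r in range(n1):
            a[r][c] = col[r]

    # Pass 2 (fused): for each row, apply twiddles, forward FFT, square,
    # inverse FFT and inverse twiddles before moving on to the next row.
    for r in range(n1):
        tw = [cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n2)]
        row = fft([a[r][c] * tw[c] for c in range(n2)], -1)
        row = [z * z for z in row]
        row = fft(row, +1)
        a[r] = [row[c] * tw[c].conjugate() for c in range(n2)]

    # Pass 3: inverse FFT down every column, then normalize by n.
    for c in range(n2):
        col = fft([a[r][c] for r in range(n1)], +1)
        for r in range(n1):
            a[r][c] = col[r] / n

    return [a[r][c] for r in range(n1) for c in range(n2)]
```

With this layout the data is swept only three times in total (columns, rows, columns), instead of doing a complete forward FFT, a separate squaring pass and a complete inverse FFT. Whether that reproduces the speedups quoted above will depend on the machine and the FFT sizes.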

> On the Pentium, assembly-coded small FFTs run more than twice as fast
> as FFTW. Even from C, you can do better on the Pentium (do a web search
> for djbfft, a free Pentium-optimized FFT package). For a recursive
> split-radix, you need about 200 lines of assembly; surely this is worth
> twice the speed!
> 
I would like to write some general-purpose C code. For tuned
assembler, we already have Woltman's fantastic prime95/mprime code.

Thank you very much for your comments. They will help me a lot.

Have a nice weekend.


| Guillermo Ballester Valor       |  
| [EMAIL PROTECTED]                      |  
| c/ cordoba, 19                  |
| 18151-Ogijares (Spain)          |
| (Linux registered user 1171811) |
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
