a: > I have updated the graph now.
Thank you, it's a very nice graph. The reference lines of FFTW help. From how much AVX improves the situation I see Intel engineers know their work; I didn't know AVX is able to bring such performance improvement (in carefully written code). > The C++ version of my FFT also supports AVX and gets to > about 24 GFLOPS when using it. So your FFT code is rather good, despite being so much simpler and shorter than FFTW code :-) Your code seems good to replace Phobos std.numeric.fft(), if you will ever want to donate your FFT code to Phobos. > If AVX types will be added to D, I will port that part too. Walter has added support for YMM registers too in D/DMD, so I presume having AVX1 instructions are quite an option. I presume we will have them too. But ask Walter for more details on this. And hopefully we'll see good implementations of this algorithm too :-) http://web.mit.edu/newsoffice/2012/faster-fourier-transforms-0118.html?tmpl=component&print=1 Bye, bearophile
