a:

> I have updated the graph now.

Thank you, it's a very nice graph. The FFTW reference lines help. Seeing how 
much AVX improves the situation, it's clear the Intel engineers know their work; 
I didn't know AVX could bring such a performance improvement (in carefully 
written code).


> The C++ version of my FFT also supports AVX and gets to
> about 24 GFLOPS when using it.

So your FFT code is rather good, despite being so much simpler and shorter than 
the FFTW code :-) It looks like a good candidate to replace Phobos 
std.numeric.fft(), if you ever want to donate your FFT code to Phobos.


> If AVX types will be added to D, I will port that part too.

Walter has added support for YMM registers too in D/DMD, so I presume AVX1 
instructions are a real option and that we will get them too. But ask Walter 
for more details on this.

And hopefully we'll see good implementations of this algorithm too :-)
http://web.mit.edu/newsoffice/2012/faster-fourier-transforms-0118.html?tmpl=component&print=1

Bye,
bearophile
