On Wed, Dec 9, 2015 at 5:57 AM, 博陈 <chenphysic...@gmail.com> wrote:

>
> <https://lh3.googleusercontent.com/-lTsIsN0BaAY/VmgIypsEQ2I/AAAAAAAAAAk/n-j-ZalGl5I/s1600/QQ%25E6%2588%25AA%25E5%259B%25BE20151209185519.png>
> the optimization strategy for fft given by the official documentation
> seems to fail. Why?
>

You didn't mention exactly which optimization strategy you are trying, so
I have to guess.

1. You should expect the first one to be no faster than the last one since
it's basically doing the same thing and the first one does it all in global
scope.
2. The in-place op doesn't make much of a difference here since the
operation you are doing is already very expensive (most of the time is
spent in FFTW).
3. It doesn't really matter for this one (since FFTW determines the
performance here) but you should benchmark the loop in a function and hoist
the creation of the plan out of the loop. For your actual code, you might
want to make the plan a global constant or a parametrized field of a type
since creating it is not particularly type stable.
4. You can use `plan_fft(...., flags=FFTW.MEASURE)` to let FFTW select the
best algorithm by actually measuring the time instead of guessing. It gives
me a 20% to 30% speedup for your example and IIRC a larger speedup for
small problems.
5. You can use `FFTW.flops(p)` to figure out how many floating point
operations are needed to perform your transformation. On my computer, a
MEASURE'd plan takes 4.3s (100 times) and the naive estimate from
assuming one operation per clock cycle is 2s (100 times), so it's the right
order of magnitude.
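
Points 3 to 5 can be put together roughly like this. It is only a sketch:
it assumes a current Julia with the FFTW.jl package (in 2015 these
functions lived in Base), and the array size, the names `x`, `y`, `p`,
`apply_many!` and the 3 GHz clock rate are made up for illustration, not
taken from your screenshot.

```julia
using FFTW            # assumes the FFTW.jl package is installed
using LinearAlgebra   # provides mul!

# Hypothetical problem size, chosen only for illustration.
const N = 2^9
const x = rand(ComplexF64, N, N)
const y = similar(x)

# Create the plan once, outside the loop, and bind it to a constant so
# that its type is known wherever it is used.
# FFTW may overwrite the array used for MEASURE planning, so plan on a
# scratch array of the same size instead of on the real data.
const p = plan_fft(similar(x); flags=FFTW.MEASURE)

# Keep the benchmark loop inside a function rather than in global scope.
function apply_many!(y, p, x, n)
    for _ in 1:n
        mul!(y, p, x)   # out-of-place transform into the preallocated y
    end
    return y
end

apply_many!(y, p, x, 1)            # warm up so compilation is not timed
@time apply_many!(y, p, x, 100)

# FFTW.flops returns FFTW's own count of (additions, multiplications,
# fused multiply-adds) for one execution of the plan.
adds, muls, fmas = FFTW.flops(p)
total_ops = adds + muls + 2 * fmas  # count each fused multiply-add as two ops

clock_hz = 3.0e9                    # assumed clock rate, adjust for your CPU
naive_seconds = 100 * total_ops / clock_hz
println("naive estimate for 100 transforms: ", naive_seconds, " s")
```

Comparing the `@time` output with the printed estimate gives the same kind
of order-of-magnitude check as in point 5; the numbers you get will depend
on your machine and on the plan FFTW ends up choosing.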
