Re: [julia-users] FFTW.REDFT00 and FFT.RODFT00 are ~10x slower than other routines?

Sheehan Olver Wed, 23 Sep 2015 00:49:06 -0700

Hi Steven,

        OK that makes sense.  But why is Julia 2x slower than Matlab for some 
values of n  (see below, the timing difference is consistent when looped over)? 
 Shouldn’t they both be using FFTW?



julia> r=rand(2000000);plan=plan_fft(r); @time plan*r;
  0.080122 seconds (12 allocations: 61.035 MB, 12.91% gc time)

julia> r=rand(2000001);plan=plan_fft(r); @time plan*r;
  0.287002 seconds (12 allocations: 61.036 MB, 3.76% gc time)

julia> r=rand(1999999);plan=plan_fft(r); @time plan*r;
  0.218389 seconds (12 allocations: 61.035 MB, 4.76% gc time)


>> r=rand(2000000,1);tic();fft(r);toc()
Elapsed time is 0.023956 seconds.
>> r=rand(2000001,1);tic();fft(r);toc()
Elapsed time is 0.303320 seconds.
>> r=rand(1999999,1);tic();fft(r);toc()
Elapsed time is 0.131212 seconds.





> On 23 Sep 2015, at 3:52 am, Steven G. Johnson <[email protected]> wrote:
> 
> 
> 
> On Tuesday, September 22, 2015 at 6:51:50 AM UTC-4, Sheehan Olver wrote:
> I get the following timings with the various FFTW routines, where I ran each 
> line multiple times to make sure it was accurate.  Why are REDFT00 and 
> RODFT00 almost 10x slower?
> 
> This is because a REDFT00 (DCT-I) of n points is exactly equivalent to a DFT 
> of N=2(n-1) points with real-even data.  Although FFTW uses a different 
> algorithm, not a size-N complex FFT, the fast algorithms are essentially 
> equivalent to pruned versions of the FFT algorithms for N, and depend on the 
> factorization of N, not of n.  Similarly for RODFT00 (DST-I), except N=2(n+1) 
> in that case.  In contrast, the other DCT/DST types correspond to DFTs of 
> length N=2n or N=4n or N=8n, so their speed depends on the factorization of n.
> 
> If n=100000 as in your example, then n-1 has large prime factors (41, 271), 
> so the FFT algorithms are much slower, although they are still O(n log n).   
> If you try n=100001, you will notice that REDFT00 becomes much faster (but 
> other DCT types become slower).
> 
> This is mentioned in the FFTW manual:
> 
> http://www.fftw.org/doc/Real-even_002fodd-DFTs-_0028cosine_002fsine-transforms_0029.html
> 
> "FFTW is most efficient when N is a product of small factors; note that this 
> differs from the factorization of the physical size n for REDFT00 and 
> RODFT00!"
>

Re: [julia-users] FFTW.REDFT00 and FFT.RODFT00 are ~10x slower than other routines?

Reply via email to