Re: [phobos] FFT

Andrei Alexandrescu Mon, 02 Aug 2010 10:52:33 -0700

I'll defer answer to this to others, as I haven't used FFT for a long time.

I do remember, however, that the discrete cosine transform was actuallymore popular in the circles I frequented. Would it be difficult to adaptyour implementation to offer dct?



Andrei

David Simcha wrote:

BTW, I've started thinking a little more about big picture issues here,and I'm debating whether it's a higher priority to improve performanceon power of 2 sizes, or to try to support other radix values.

There are two use cases for an FFT that I'm familiar with. Thepower-of-two limitation isn't severe in either of them.

1. Getting an idea of what the spectrum of a signal looks like. Here,it's common to pad with zeros because the plots become clearer looking,even if your FFT lib doesn't require it.

2. Computing a convolution. Here, padding with zeros is necessaryanyhow to prevent the signal from being "interpreted" as periodic.

Are there any major use cases where the power of two limitation issevere, or should I just focus on optimizing powers of 2 and call it a day?

On Mon, Aug 2, 2010 at 10:23 AM, Don Clugston <[email protected]<mailto:[email protected]>> wrote:


    On 2 August 2010 15:41, David Simcha <[email protected]
    <mailto:[email protected]>> wrote:
     > Unfortunately I just downloaded the benchmark program for FFTW
    and my FFT is
     > a ton slower, depending on how you look at it.  Using size 2^20 as my
     > benchmark, FFTW takes about 131 seconds to create its plan, even
    using
     > -oestimate, the fastest planner.  However, the plan can be reused if
     > performing multiple FFTs of the same size, and once the plan is
    created, it
     > can do an FFT of size 2^20 in about 53 milliseconds (which I find
    almost
     > unbelievable because even sorting an array of size 2^20 using a
     > well-optimized quick sort takes almost that long, and FFT seems
    like it
     > should should have a much larger constant than quick sort),
    compared to my
     > latest improvements to around 730 on the hardware I'm
    benchmarking on.
     >
     > On the other hand, for one-off use cases, the lack of needing to
    create a
     > plan is a big win, both from a speed and a simplicity of API
    point of view.
     >  Benchmarking against R, which doesn't appear to use plans,
    making the
     > comparison somewhat more relevant, things look better for my FFT:
     R takes
     > about 610 milliseconds for a size 2^20 pure real FFT.

    All you're seeing is the L2 cache. Did you see the link I posted to
    the NG about the internals of FFTW? The graph at the top shows FFTW is
    40 times faster than the 'numerical recipes' code for 2^^20. So your
    factor of 13 isn't so bad. Based on that graph, if you reduce it to
    say 2^^15, the factor should drop significantly. Adding a little bit
    of cache awareness (using core.cpuid) should be able to avoid the
    performance cliff.
    Also, DMD's floating point optimiser is so primitive, you lose up to a
    factor of two immediately.
    _______________________________________________
    phobos mailing list
    [email protected] <mailto:[email protected]>
    http://lists.puremagic.com/mailman/listinfo/phobos



------------------------------------------------------------------------

_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos

_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos

Re: [phobos] FFT

Reply via email to