> of course it won't have the ripple artifacts associated with FFT overlap
> windowing
>

What ripple artifact are you referring to? With constant-overlap-add (COLA)
windows, the STFT is a perfect-reconstruction filterbank. Likewise, block FFT
convolution can be used to implement any FIR filtering operation.
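
Both statements are easy to check numerically. The following is only a minimal sketch, assuming NumPy; the function names, the periodic Hann window at 50% overlap, and the block size of 256 are illustrative choices and do not come from any of the earlier messages.

```python
import numpy as np

def cola_sum(window, hop, n_frames=16):
    """Overlap-add shifted copies of `window`; return the steady-state part."""
    n = len(window)
    total = np.zeros(hop * (n_frames - 1) + n)
    for m in range(n_frames):
        total[m * hop : m * hop + n] += window
    # Drop one window length at each end, where fewer copies overlap.
    return total[n:-n]

def overlap_add_convolve(x, h, block=256):
    """FIR-filter x with h using block FFT convolution (overlap-add)."""
    nfft = 1
    while nfft < block + len(h) - 1:   # long enough to avoid circular wrap
        nfft *= 2
    H = np.fft.rfft(h, nfft)
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        seg = x[start : start + block]
        yseg = np.fft.irfft(np.fft.rfft(seg, nfft) * H, nfft)
        n_out = len(seg) + len(h) - 1
        y[start : start + n_out] += yseg[:n_out]   # add the overlapping tails
    return y

# Periodic Hann window at 50% overlap: the shifted copies sum to a constant.
N = 512
hann = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(N) / N)
steady = cola_sum(hann, hop=N // 2)

# Block FFT convolution against direct time-domain convolution.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
h = rng.standard_normal(101)
y_fft = overlap_add_convolve(x, h)
y_direct = np.convolve(x, h)
```

If the window really is COLA, `steady` is flat, and `y_fft` agrees with `y_direct` to within rounding error.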






> cheers,
> -ez
>
> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <g...@waxingwave.com>
> wrote:
>
>> Hello Spencer,
>>
>> You wrote:
>> > A while ago I read through some of the literature [1] on implementing
>> > an invertible CQT as a special case of the Nonstationary Gabor
>> > Transform. It's implemented by the essentia library [2] among other
>> > places probably.
>> >
>> > The main idea is that you take the FFT of your whole signal, then
>> > apply the filter bank in the frequency domain (just
>> > multiplication). Then you IFFT each filtered signal, which gives you
>> > the time-domain samples for each band of the filter bank. Each
>> > frequency-domain filter has a different bandwidth, so your IFFT is a
>> > different length for each one, which gives you the different sample
>> > rates for each one.
>>
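
The procedure quoted above (FFT the whole signal, multiply by a band window, then IFFT at a length matched to the band) can be sketched for a single band as follows. This is a minimal illustration assuming NumPy. The signal length, the band position `a`, the band width `L`, and the Hann-shaped window are all hypothetical choices, not values from the NSGT papers or from essentia.

```python
import numpy as np

N = 1024          # signal length (hypothetical)
a, L = 100, 64    # first bin and width of one band's window (hypothetical)
D = N // L        # decimation factor this band ends up with

rng = np.random.default_rng(1)
x = rng.standard_normal(N)
X = np.fft.fft(x)

# Frequency-domain window with compact support on bins a .. a+L-1.
g = np.zeros(N)
g[a : a + L] = np.hanning(L)

# Full-rate band signal: multiply, then IFFT at the original length.
y_full = np.fft.ifft(X * g)

# Band at its own rate: IFFT only the L bins that the window covers.
z = np.fft.ifft(X[a : a + L] * g[a : a + L])

# The short IFFT equals the full-rate band signal decimated by D,
# up to a gain of N/L and a shift of `a` bins down to baseband.
m = np.arange(L)
expected = (N / L) * np.exp(-2j * np.pi * a * m / L) * y_full[::D]
```

Here `z` has only `L` samples, so a narrower window gives a shorter IFFT and therefore a lower sample rate for that band.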
>> That's the basic idea, but the Gaborator rounds up each of the
>> per-band sample rates to the original sample rate divided by some
>> power of two.  This means all the FFT sizes can be powers of two,
>> which tend to be faster than arbitrary sizes.  It also results in a
>> nicely regular time-frequency sampling grid where many of the samples
>> coincide in time, as shown in the second plot on this page:
>>
>>
>> https://www.gaborator.com/gaborator-1.4/doc/overview.html
>>
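
The rounding of per-band sample rates described above can be illustrated with a small sketch. The function name and the example rates are hypothetical, and this is not the Gaborator's actual code; it only shows what "round up to the original rate divided by a power of two" means.

```python
def rounded_band_rate(fs, min_rate):
    """Return (rate, k) where rate = fs / 2**k is the lowest such rate
    that is still >= min_rate, the rate the band needs."""
    if not 0 < min_rate <= fs:
        raise ValueError("min_rate must be in (0, fs]")
    k = 0
    while fs / 2 ** (k + 1) >= min_rate:
        k += 1
    return fs / 2 ** k, k

fs = 48000.0
# Hypothetical minimum rates required by four bands, highest band first.
rates = [rounded_band_rate(fs, r) for r in (30000.0, 5000.0, 700.0, 90.0)]
```

Because every rate is `fs` divided by a power of two, the sample instants of a slower band always coincide with sample instants of every faster band, which is what produces the regular grid mentioned above.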
>> Also, the Gaborator makes use of multirate processing where the signal
>> is repeatedly decimated by 2 and the calculations for the lower
>> octaves run at successively lower sample rates.  These optimizations
>> help the Gaborator achieve a performance of millions of samples per
>> second per CPU core.
>>
>> > They also give an "online" version where you do
>> > the processing in chunks, but really for this to work I think you'd
>> > need large-ish chunks so the latency would be pretty bad.
>>
>> The Gaborator also works in chunks.  A typical chunk size might be
>> 8192 samples, but thanks to the multirate processing, in the lowest
>> frequency bands, each of those 8192 samples may represent the
>> low-frequency content of something like 1024 samples of the original
>> signal.  This gives an effective chunk size of some 8 million samples
>> without actually having to perform any FFTs that large.
>>
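
The arithmetic behind the "8 million" figure above, using the numbers from the message (the variable names are illustrative only):

```python
chunk = 8192        # samples per chunk, as quoted above
decimation = 1024   # original samples represented by one sample in the lowest bands
effective = chunk * decimation

# 1024 = 2**10, i.e. ten successive decimations by 2.
halvings = decimation.bit_length() - 1
```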
>> Latency is certainly high, but I would not say it is a consequence of
>> the chunk size as such.  Rather, both the high latency and the need
>> for a large (effective) chunk size are consequences of the lengths of
>> the band filter impulse responses, which get exponentially larger as
>> the constant-Q bands get narrower towards lower frequencies.
>>
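
To put numbers on the growth in impulse response length described above: a Gaussian with standard deviation sigma_f in frequency has a Gaussian envelope with standard deviation 1 / (2 * pi * sigma_f) in time. The sketch below assumes sigma_f = f / Q, with a hypothetical Q of 30 and hypothetical center frequencies one octave apart.

```python
import math

def gaussian_time_sd(center_hz, q):
    """Time-domain standard deviation (seconds) of a Gaussian band filter
    whose frequency-domain standard deviation is center_hz / q."""
    sigma_f = center_hz / q
    return 1.0 / (2.0 * math.pi * sigma_f)

q = 30.0                                        # hypothetical
centers = [1280.0 / 2 ** k for k in range(8)]   # 1280 Hz down to 10 Hz
durations = [gaussian_time_sd(f, q) for f in centers]
ratios = [b / a for a, b in zip(durations, durations[1:])]
```

Every entry of `ratios` is 2, so the impulse response doubles in length with each octave down: under these assumptions, from about 3.7 ms at 1280 Hz to about 0.48 s at 10 Hz.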
>> Latency in the Gaborator is discussed in more detail here:
>>
>>
>> https://www.gaborator.com/gaborator-1.4/doc/realtime.html
>>
>> > The whole process is in some ways dual to the usual STFT process,
>> > where we first window and then FFT. In the NSGT you first FFT and
>> > then window, and then IFFT each band to get a Time-Frequency
>> > representation.
>>
>> Yes.
>>
>> > For resynthesis you end up with a similar window overlap constraint
>> > as in STFT, except now the windows are in the frequency domain. It's
>> > a little more complicated because the window centers aren't
>> > evenly-spaced, so creating COLA windows is complicated. There are
>> > some fancier approaches to designing a set of synthesis windows that
>> > are complementary (inverse) to the analysis windows, which is what
>> > the frame-theory folks like that Austrian group seem to like to use.
>>
>> The Gaborator was inspired by the papers from that Austrian group and
>> uses complementary resynthesis windows, or "duals" as frame theorists
>> like to call them.  The analysis windows are Gaussian, and the dual
>> windows used for resynthesis end up being slightly distorted
>> Gaussians.
>>
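
A simplified illustration of dual windows, assuming NumPy. It leaves out the per-band subsampling entirely, so every band is kept at full rate, and the dual of each Gaussian analysis window is then just that window divided by the sum of squares of all the windows. The band layout (12 bands per octave from 100 Hz, plus a lowpass window covering DC) is hypothetical, and this is not claimed to be how the Gaborator computes its duals.

```python
import numpy as np

N = 4096
fs = 8000.0
freqs = np.fft.rfftfreq(N, d=1.0 / fs)    # 0 .. fs/2

# Hypothetical constant-Q layout: Gaussian windows in the frequency domain,
# standard deviation proportional to the center frequency.
centers = 100.0 * 2.0 ** (np.arange(64) / 12.0)
centers = centers[centers < fs / 2]
sds = centers * (2.0 ** (1.0 / 12.0) - 1.0)
windows = [np.exp(-0.5 * ((freqs - fc) / sd) ** 2)
           for fc, sd in zip(centers, sds)]
# Lowpass window for the region below the lowest band, including DC.
windows.append(np.exp(-0.5 * (freqs / 100.0) ** 2))
G = np.array(windows)                     # shape (bands, N // 2 + 1)

# Dual windows: each analysis window divided by the total squared response.
norm = np.sum(G ** 2, axis=0)
duals = G / norm

rng = np.random.default_rng(2)
x = rng.standard_normal(N)
X = np.fft.rfft(x)

coeffs = G * X                            # analysis: one spectrum per band
X_rec = np.sum(duals * coeffs, axis=0)    # synthesis with the dual windows
x_rec = np.fft.irfft(X_rec, N)
```

Since `norm` is close to flat but not exactly flat, each row of `duals` is a Gaussian with a small frequency-dependent correction, which fits the description above of "slightly distorted Gaussians".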
>> > One of the nice things about the NSGT is it lets you be really
>> > flexible in your filterbank design while still giving you
>> > invertibility.
>>
>> Agreed.
>>
>> In a later message, you wrote:
>> > Whoops, just clicked through to the documentation and it looks like
>> > this is the track you're on also. I'm curious if you have any
>> > insight into the window-selection for the analysis and synthesis
>> > process. It seems like the NSGT framework forces you to be a bit
>> > smarter with windows than just sticking to COLA, but the dual frame
>> > techniques should apply for regular STFT processing, right?
>>
>> I'm actually not that familiar with traditional STFTs and COLA, but as
>> far as I can tell, the STFT is a special case of the NSGT and the same
>> dual frame techniques should apply.
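
For the ordinary STFT the same idea can be tried with a window that is not COLA. The sketch below assumes NumPy; the Gaussian window, the hop size, and the function names are illustrative choices. The synthesis window is the analysis window divided by the sum of its squared, hop-shifted copies.

```python
import numpy as np

def stft(x, window, hop):
    n = len(window)
    starts = range(0, len(x) - n + 1, hop)
    return np.array([np.fft.rfft(x[s : s + n] * window) for s in starts])

def istft(frames, synth_window, hop):
    n = len(synth_window)
    out = np.zeros(hop * (len(frames) - 1) + n)
    for m, spec in enumerate(frames):
        out[m * hop : m * hop + n] += np.fft.irfft(spec, n) * synth_window
    return out

def dual_window(window, hop):
    """Return w[n] / sum_m w[n - m*hop]**2; hop must divide len(window)."""
    n = len(window)
    denom = np.zeros(hop)
    for start in range(0, n, hop):
        denom += window[start : start + hop] ** 2
    return window / np.tile(denom, n // hop)

N, hop = 512, 128
n = np.arange(N)
w = np.exp(-0.5 * ((n - N / 2) / (N / 8)) ** 2)   # Gaussian, not COLA
w_dual = dual_window(w, hop)

rng = np.random.default_rng(3)
x = rng.standard_normal(8192)
x_rec = istft(stft(x, w, hop), w_dual, hop)
```

Away from the first and last window length, where fewer frames overlap, `x_rec` matches `x` to within rounding error even though the Gaussian window does not satisfy the COLA condition.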
>> --
>> Andreas Gustafsson, g...@waxingwave.com
>> _______________________________________________
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>>
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
