hi again,
just wanted to chime in that this piece of software was released some
time ago and is the traditional FIR/IIR equivalent of what's being
discussed here, and is quite a breeze to use in the studio:

https://www.wavesfactory.com/trackspacer/

of course it won't have the ripple artifacts associated with FFT overlap
windowing, but i'm not sure how much delay there is or even what the
phase distortion sounds like.

cheers,
-ez

On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <g...@waxingwave.com> wrote:

> Hello Spencer,
>
> You wrote:
> > A while ago I read through some of the literature [1] on implementing
> > an invertible CQT as a special case of the Nonstationary Gabor
> > Transform. It's implemented by the essentia library [2] among other
> > places probably.
> >
> > The main idea is that you take the FFT of your whole signal, then
> > apply the filter bank in the frequency domain (just
> > multiplication). Then you IFFT each filtered signal, which gives you
> > the time-domain samples for each band of the filter bank. Each
> > frequency-domain filter has a different bandwidth, so your IFFT is a
> > different length for each one, which gives you the different sample
> > rates for each one.
>
> That's the basic idea, but the Gaborator rounds up each of the
> per-band sample rates to the original sample rate divided by some
> power of two. This means all the FFT sizes can be powers of two,
> which tend to be faster than arbitrary sizes.
> It also results in a nicely regular time-frequency sampling grid where
> many of the samples coincide in time, as shown in the second plot on
> this page:
>
>   https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html&d=DwICAg&c=slrrB7dE8n7gBJbeO0g-IQ&r=w_CiiFx8eb9uUtrPcg7_DA&m=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY&s=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow&e=
>
> Also, the Gaborator makes use of multirate processing where the signal
> is repeatedly decimated by 2 and the calculations for the lower
> octaves run at successively lower sample rates. These optimizations
> help the Gaborator achieve a performance of millions of samples per
> second per CPU core.
>
> > They also give an "online" version where you do
> > the processing in chunks, but really for this to work I think you'd
> > need large-ish chunks so the latency would be pretty bad.
>
> The Gaborator also works in chunks. A typical chunk size might be
> 8192 samples, but thanks to the multirate processing, in the lowest
> frequency bands, each of those 8192 samples may represent the
> low-frequency content of something like 1024 samples of the original
> signal. This gives an effective chunk size of some 8 million samples
> without actually having to perform any FFTs that large.
>
> Latency is certainly high, but I would not say it is a consequence of
> the chunk size as such. Rather, both the high latency and the need
> for a large (effective) chunk size are consequences of the lengths of
> the band filter impulse responses, which get exponentially larger as
> the constant-Q bands get narrower towards lower frequencies.
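
To make the arithmetic in the quoted message concrete, here is a minimal Python sketch of the power-of-two rate rounding and of the effective chunk size. The function names and the 48 kHz / 100 samples-per-second figures are made up for illustration and are not taken from the Gaborator's API; only the 8192 and 1024 figures come from the message itself.

```python
def decimation_exponent(fs, required_rate):
    """Largest k such that fs / 2**k is still >= required_rate.

    Choosing the largest such k rounds the band's sample rate *up* to the
    nearest value of the form fs / 2**k. (Hypothetical helper.)
    """
    if required_rate <= 0 or required_rate > fs:
        raise ValueError("required_rate must be in (0, fs]")
    k = 0
    while fs / 2 ** (k + 1) >= required_rate:
        k += 1
    return k


def effective_chunk_size(chunk_size, k):
    """Original-rate samples spanned by one chunk after decimating by 2**k."""
    return chunk_size * 2 ** k


# Assumed example: a band needing at least 100 samples/s at fs = 48000 Hz.
k = decimation_exponent(48000, 100)
print(k, 48000 / 2 ** k)               # 8 187.5

# Figures from the message: 8192-sample chunks, each sample standing in
# for about 1024 = 2**10 samples of the original signal.
print(effective_chunk_size(8192, 10))  # 8388608, i.e. "some 8 million"
```

The loop avoids a floating-point `log2`, so exact powers of two do not get rounded the wrong way.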
>
> Latency in the Gaborator is discussed in more detail here:
>
>   https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_realtime.html&d=DwICAg&c=slrrB7dE8n7gBJbeO0g-IQ&r=w_CiiFx8eb9uUtrPcg7_DA&m=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY&s=uuRzi0taGcXI9Sq63G_xTTrCjaz9cu3ewu8jfzIUcVc&e=
>
> > The whole process is in some ways dual to the usual STFT process,
> > where we first window and then FFT. In the NSGT you first FFT and
> > then window, and then IFFT each band to get a Time-Frequency
> > representation.
>
> Yes.
>
> > For resynthesis you end up with a similar window overlap constraint
> > as in STFT, except now the windows are in the frequency domain. It's
> > a little more complicated because the window centers aren't
> > evenly-spaced, so creating COLA windows is complicated. There are
> > some fancier approaches to designing a set of synthesis windows that
> > are complementary (inverse) of the analysis windows, which is what
> > the frame-theory folks like that Austrian group seem to like to use.
>
> The Gaborator was inspired by the papers from that Austrian group and
> uses complementary resynthesis windows, or "duals" as frame theorists
> like to call them. The analysis windows are Gaussian, and the dual
> windows used for resynthesis end up being slightly distorted
> Gaussians.
>
> > One of the nice things about the NSGT is it lets you be really
> > flexible in your filterbank design while still giving you
> > invertibility.
>
> Agreed.
>
> In a later message, you wrote:
> > Whoops, just clicked through to the documentation and it looks like
> > this is the track you're on also. I'm curious if you have any
> > insight into the window-selection for the analysis and synthesis
> > process. It seems like the NSGT framework forces you to be a bit
> > smarter with windows than just sticking to COLA, but the dual frame
> > techniques should apply for regular STFT processing, right?
>
> I'm actually not that familiar with traditional STFTs and COLA, but as
> far as I can tell, the STFT is a special case of the NSGT and the same
> dual frame techniques should apply.
> --
> Andreas Gustafsson, g...@waxingwave.com
> _______________________________________________
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.columbia.edu_mailman_listinfo_music-2Ddsp&d=DwICAg&c=slrrB7dE8n7gBJbeO0g-IQ&r=w_CiiFx8eb9uUtrPcg7_DA&m=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY&s=br6gIADk3PB9_kF8YoA7aZdcf5McFvCCOlyYso5D2BI&e=
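
For anyone who wants to poke at the "FFT, then window, then IFFT each band" idea and the dual-window resynthesis discussed in the quoted thread, here is a rough, self-contained Python sketch. It is a deliberately simplified assumption of mine and not the Gaborator's or essentia's implementation: every band is kept at the full sample rate (no per-band decimation), the band centers and the `q` value are arbitrary, and all function names are hypothetical. In this undecimated case the frame operator is diagonal in the frequency domain, so the dual windows are just the analysis windows divided by the sum of their squares.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(X):
    """Inverse FFT via conjugation: ifft(X) = conj(fft(conj(X))) / n."""
    n = len(X)
    return [v.conjugate() / n for v in fft([v.conjugate() for v in X])]


def gaussian_windows(n, centers, q=2.0):
    """One Gaussian per band, defined on FFT bins and mirrored so that
    positive and negative frequencies get the same weight. The width grows
    with the center bin (roughly constant Q); q is an assumed parameter."""
    windows = []
    for c in centers:
        sigma = max(1.0, c / q)
        windows.append([math.exp(-0.5 * ((min(b, n - b) - c) / sigma) ** 2)
                        for b in range(n)])
    return windows


def analyze(x, windows):
    """FFT the whole signal, multiply by each window, IFFT each band."""
    X = fft(x)
    return [ifft([g * v for g, v in zip(w, X)]) for w in windows]


def dual_windows(windows):
    """Duals for the undecimated case: g_k / sum_j g_j**2, per bin."""
    n = len(windows[0])
    s = [sum(w[b] ** 2 for w in windows) for b in range(n)]
    return [[w[b] / s[b] for b in range(n)] for w in windows]


def synthesize(bands, duals):
    """FFT each band, weight by its dual window, sum, and IFFT."""
    n = len(duals[0])
    X = [0j] * n
    for band, d in zip(bands, duals):
        B = fft(band)
        for b in range(n):
            X[b] += d[b] * B[b]
    return ifft(X)


n = 64
x = [math.sin(2 * math.pi * 5 * t / n) + 0.5 * math.cos(2 * math.pi * 13 * t / n)
     for t in range(n)]
windows = gaussian_windows(n, [0, 1, 2, 4, 8, 16, 32])
y = synthesize(analyze(x, windows), dual_windows(windows))
print(max(abs(a - b) for a, b in zip(x, y)))  # reconstruction error
```

The duals here are Gaussians divided by a slowly varying sum, which is one way to see why they come out as slightly distorted Gaussians. With real per-band decimation the computation of the duals is more involved than this sketch suggests.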