Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

Zhiguang Eric Zhang Tue, 23 Jun 2020 08:18:35 -0700

hi again,


just wanted to chime in that this piece of software was released some time
ago and is the traditional FIR/IIR equivalent of what's being discussed
here, and is quite a breeze to use in the studio

https://www.wavesfactory.com/trackspacer/

of course it won't have the ripple artifacts associated with FFT overlap
windowing but i'm not sure how much delay there is or even what the phase
distortion sounds like


cheers,
-ez

On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <g...@waxingwave.com>
wrote:

> Hello Spencer,
>
> You wrote:
> > A while ago I read through some of the literature [1] on implementing
> > an invertible CQT as a special case of the Nonstationary Gabor
> > Transform. It's implemented by the essentia library [2] among other
> > places probably.
> >
> > The main idea is that you take the FFT of your whole signal, then
> > apply the filter bank in the frequency domain (just
> > multiplication). Then you IFFT each filtered signal, which gives you
> > the time-domain samples for each band of the filter bank. Each
> > frequency-domain filter has a different bandwidth, so your IFFT is a
> > different length for each one, which gives you the different sample
> > rates for each one.
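
A minimal sketch of the FFT, multiply, IFFT idea described above, in Python with NumPy. The function name `analyze_band`, the Hann-shaped frequency-domain window, and the bin ranges are assumptions made for illustration; this is not the essentia or Gaborator code.

```python
import numpy as np

def analyze_band(x, lo_bin, hi_bin):
    """Band-limited complex samples for FFT bins [lo_bin, hi_bin) of x."""
    X = np.fft.fft(x)                  # FFT of the whole signal
    width = hi_bin - lo_bin
    window = np.hanning(width)         # hypothetical frequency-domain filter
    band = X[lo_bin:hi_bin] * window   # filtering is just multiplication
    # The IFFT length equals the band width in bins, so this band comes out
    # at a sample rate of fs * width / len(x).
    return np.fft.ifft(band) * (width / len(x))

n = 1024
t = np.arange(n)
tone = np.cos(2 * np.pi * 64 * t / n)  # sinusoid exactly on bin 64

narrow = analyze_band(tone, 32, 96)    # 64 output samples
wide = analyze_band(tone, 0, 256)      # 256 output samples
print(len(narrow), len(wide))
```

A wider band yields more output samples for the same input, which is where the per-band sample rates come from.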
>
> That's the basic idea, but the Gaborator rounds up each of the
> per-band sample rates to the original sample rate divided by some
> power of two.  This means all the FFT sizes can be powers of two,
> which tend to be faster than arbitrary sizes.  It also results in a
> nicely regular time-frequency sampling grid where many of the samples
> coincide in time, as shown in the second plot on this page:
>
>
> https://www.gaborator.com/gaborator-1.4/doc/overview.html
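
The rounding described above can be sketched as follows. This is only an illustration under the assumption that each band has some minimum usable sample rate `min_rate`; the function name is hypothetical and does not come from the Gaborator source.

```python
def round_up_band_rate(fs, min_rate):
    """Smallest fs / 2**k (integer k >= 0) that is still >= min_rate."""
    if not 0 < min_rate <= fs:
        raise ValueError("min_rate must be in (0, fs]")
    k = 0
    # Keep halving while the halved rate would still satisfy the band.
    while fs / 2 ** (k + 1) >= min_rate:
        k += 1
    return fs / 2 ** k, k

rate, k = round_up_band_rate(48000, 5000)
print(rate, k)  # 6000.0 3
```

Because every band rate is fs divided by a power of two, the sample times of a slower band are a subset of those of any faster band, which is why many samples coincide in time.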
>
> Also, the Gaborator makes use of multirate processing where the signal
> is repeatedly decimated by 2 and the calculations for the lower
> octaves run at successively lower sample rates.  These optimizations
> help the Gaborator achieve a performance of millions of samples per
> second per CPU core.
>
> > They also give an "online" version where you do
> > the processing in chunks, but really for this to work I think you'd
> > need large-ish chunks so the latency would be pretty bad.
>
> The Gaborator also works in chunks.  A typical chunk size might be
> 8192 samples, but thanks to the multirate processing, in the lowest
> frequency bands, each of those 8192 samples may represent the
> low-frequency content of something like 1024 samples of the original
> signal.  This gives an effective chunk size of some 8 million samples
> without actually having to perform any FFTs that large.
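
The arithmetic behind that figure, as a small sketch. Reading "something like 1024" as ten successive decimations by 2 is an assumption, and the function name is made up for the example.

```python
def effective_chunk_size(chunk_size, num_halvings):
    """Original-rate samples spanned by one chunk after repeated decimation by 2."""
    return chunk_size * 2 ** num_halvings

# 8192 samples per chunk, each standing in for 2**10 = 1024 original samples.
print(effective_chunk_size(8192, 10))  # 8388608
```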
>
> Latency is certainly high, but I would not say it is a consequence of
> the chunk size as such.  Rather, both the high latency and the need
> for a large (effective) chunk size are consequences of the lengths of
> the band filter impulse responses, which get exponentially larger as
> the constant-Q bands get narrower towards lower frequencies.
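
A rough sketch of why the filter lengths grow exponentially. It assumes that the duration of a band filter is inversely proportional to its bandwidth and leaves out the constant factor that depends on the window shape; the function name and the 12 bands per octave are illustrative choices only.

```python
def approx_filter_length(fs, f_center, bands_per_octave):
    """Impulse response length in samples, up to a constant factor."""
    # Constant Q: the bandwidth is a fixed fraction of the center frequency.
    bandwidth_hz = f_center * (2 ** (1 / bands_per_octave) - 1)
    # Duration scales as 1 / bandwidth, so halving f_center doubles the length.
    return fs / bandwidth_hz

for f in (440.0, 220.0, 110.0, 55.0):
    print(f, round(approx_filter_length(44100, f, 12)))
```

Each octave down doubles the length, so the lowest bands dominate both the latency and the effective chunk size.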
>
> Latency in the Gaborator is discussed in more detail here:
>
>
> https://www.gaborator.com/gaborator-1.4/doc/realtime.html
>
> > The whole process is in some ways dual to the usual STFT process,
> > where we first window and then FFT. In the NSGT you first FFT and
> > then window, and then IFFT each band to get a Time-Frequency
> > representation.
>
> Yes.
>
> > For resynthesis you end up with a similar window overlap constraint
> > as in STFT, except now the windows are in the frequency domain. It's
> > a little more complicated because the window centers aren't
> > evenly-spaced, so creating COLA windows is complicated. There are
> > some fancier approaches to designing a set of synthesis windows that
> > are complementary (inverse) of the analysis windows, which is what
> > the frame-theory folks like that Austrian group seem to like to use.
>
> The Gaborator was inspired by the papers from that Austrian group and
> uses complementary resynthesis windows, or "duals" as frame theorists
> like to call them.  The analysis windows are Gaussian, and the dual
> windows used for resynthesis end up being slightly distorted
> Gaussians.
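
A simplified sketch of how such dual windows can be computed, in Python with NumPy. It divides each analysis window by the sum of the squared analysis windows and ignores the per-band normalization factors a full NSGT implementation would carry. The band layout, the widths, and the function names are assumptions for illustration, not the Gaborator's actual parameters or code.

```python
import numpy as np

def gaussian_windows(n_bins, centers, widths):
    """One Gaussian per band, sampled on FFT bins 0..n_bins-1 (one row per band)."""
    f = np.arange(n_bins)[None, :]
    c = np.asarray(centers, dtype=float)[:, None]
    w = np.asarray(widths, dtype=float)[:, None]
    return np.exp(-0.5 * ((f - c) / w) ** 2)

def dual_windows(g):
    """Divide each analysis window by the summed squared windows."""
    total = np.sum(g ** 2, axis=0)
    return g / total

# Hypothetical layout: 4 bands per octave, width proportional to center.
centers = 8 * 2.0 ** (np.arange(24) / 4)
widths = centers / 8
g = gaussian_windows(512, centers, widths)
d = dual_windows(g)

# Analysis window times dual window sums to one at every bin, which is
# what makes the transform invertible.
print(np.allclose(np.sum(g * d, axis=0), 1.0))
```

Where the summed squared windows are nearly flat, the duals are close to scaled copies of the Gaussians; the small ripple in that sum is what distorts them.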
>
> > One of the nice things about the NSGT is it lets you be really
> > flexible in your filterbank design while still giving you
> > invertibility.
>
> Agreed.
>
> In a later message, you wrote:
> > Whoops, just clicked through to the documentation and it looks like
> > this is the track you're on also. I'm curious if you have any
> > insight into the window-selection for the analysis and synthesis
> > process. It seems like the NSGT framework forces you to be a bit
> > smarter with windows than just sticking to COLA, but the dual frame
> > techniques should apply for regular STFT processing, right?
>
> I'm actually not that familiar with traditional STFTs and COLA, but as
> far as I can tell, the STFT is a special case of the NSGT and the same
> dual frame techniques should apply.
> --
> Andreas Gustafsson, g...@waxingwave.com
> _______________________________________________
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
>
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>

