Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-26 Thread Zhiguang Zhang
Here's a Fourier joke about Joseph Fourier that you might not like, i heard
he rode a bike!

On Thu, Jun 25, 2020, 6:59 PM Emanuel Landeholm 
wrote:

> Sorry for being a slowpoke! Is this an efficient implementation of STFT
> (short time fourier transform)?
>
> On Thu, Jun 25, 2020 at 8:49 AM STEFFAN DIEDRICHSEN 
> wrote:
>
>> I think Robert had his morning coffee after his reply …. ;-)
>>
>> Steffan
>>
>> On 24.06.2020|KW26, at 23:03, Zhiguang Eric Zhang 
>> wrote:
>>
>> it was Alan Wolfe's thread?
>>
>> i don't want to argue and/or discuss the intricacies of sampling theory,
>> but this is the DSP forum, no?  isn't this a place to discuss such
>> technical things?  even a plug-in?  i'm rather confused lol
>>
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-25 Thread Emanuel Landeholm
Sorry for being a slowpoke! Is this an efficient implementation of STFT
(short time fourier transform)?

On Thu, Jun 25, 2020 at 8:49 AM STEFFAN DIEDRICHSEN 
wrote:

> I think Robert had his morning coffee after his reply …. ;-)
>
> Steffan
>
> On 24.06.2020|KW26, at 23:03, Zhiguang Eric Zhang 
> wrote:
>
> it was Alan Wolfe's thread?
>
> i don't want to argue and/or discuss the intricacies of sampling theory,
> but this is the DSP forum, no?  isn't this a place to discuss such
> technical things?  even a plug-in?  i'm rather confused lol
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-25 Thread STEFFAN DIEDRICHSEN
I think Robert had his morning coffee after his reply …. ;-)

Steffan 

> On 24.06.2020|KW26, at 23:03, Zhiguang Eric Zhang  
> wrote:
> 
> it was Alan Wolfe's thread?
> 
> i don't want to argue and/or discuss the intricacies of sampling theory, but 
> this is the DSP forum, no?  isn't this a place to discuss such technical 
> things?  even a plug-in?  i'm rather confused lol

___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
yes i wholeheartedly agree.  ripple/ringing/Gibbs is part of the definition
of bandlimited filtering, from the LPF to the ADC/DAC to even contributing
to the sound of the piece of gear

On Wed, Jun 24, 2020 at 5:03 PM Greg Maxwell  wrote:

> On Wed, Jun 24, 2020 at 8:56 PM Zhiguang Zhang 
> wrote:
> > the Gibbs "nastiness' is ever present in both hardware and software
> implementations.  It's just there in the underlying physics of sampling
> theory, even in the analog domain it seems :)
>
> It's not really related to sampling.  A bandlimited analog continuous
> time signal has 'ringing'--  it's part of the definition of being band
> limited.  Sometimes in the sampled context the ringing is hidden
> between the samples, but will show up in the analog reconstruction or
> after correct digital processing that applies a fractional sample
> phase shift.
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
>
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Greg Maxwell
On Wed, Jun 24, 2020 at 8:56 PM Zhiguang Zhang  wrote:
> the Gibbs "nastiness' is ever present in both hardware and software 
> implementations.  It's just there in the underlying physics of sampling 
> theory, even in the analog domain it seems :)

It's not really related to sampling.  A bandlimited analog continuous
time signal has 'ringing'--  it's part of the definition of being band
limited.  Sometimes in the sampled context the ringing is hidden
between the samples, but will show up in the analog reconstruction or
after correct digital processing that applies a fractional sample
phase shift.
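
A quick numpy sketch of that last point, purely for illustration (the signal
length, the half-sample delay and the periodic/bandlimited treatment of the
step are my own choices, not anything stated above): the samples of a step
look clean, but an exact fractional-sample shift applied in the frequency
domain exposes the ripple that was hiding between the samples.

    import numpy as np

    N = 64
    x = np.zeros(N)
    x[N//2:] = 1.0                               # sampled step: every sample is exactly 0 or 1
    k = np.fft.fftfreq(N) * N                    # signed bin indices
    H = np.exp(-2j * np.pi * k * 0.5 / N)        # exact half-sample delay (periodic model)
    H[N//2] = 0.0                                # drop the Nyquist bin so the result stays real
    y = np.fft.ifft(np.fft.fft(x) * H).real
    print(x.min(), x.max())                      # 0.0 1.0 -- no visible ringing in the samples
    print(round(y.min(), 3), round(y.max(), 3))  # undershoot below 0 and overshoot above 1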
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
it was Alan Wolfe's thread?

i don't want to argue and/or discuss the intricacies of sampling theory,
but this is the DSP forum, no?  isn't this a place to discuss such
technical things?  even a plug-in?  i'm rather confused lol

On Wed, Jun 24, 2020 at 5:00 PM robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
> > On June 24, 2020 4:53 PM Zhiguang Zhang  wrote:
> >
> >
> > I don't think there's any issue - I just posted about the TrackSpacer
> plugin and the thread started up again. Actually what I've been trying to
> get across is that the Gibbs "nastiness' is ever present in both hardware
> and software implementations. It's just there in the underlying physics of
> sampling theory, even in the analog domain it seems :)
> >
>
> Gibbs phenomenon has nothing to do with sampling theory.  You only need to
> guarantee that the input is bandlimited to Fs/2.  you need not necessarily
> be applying a brickwall LPF before sampling.
>
> why is "Phase Vocoder" and "FIR" in the Subject: line of the thread?
>
>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread robert bristow-johnson



> On June 24, 2020 4:53 PM Zhiguang Zhang  wrote:
> 
> 
> I don't think there's any issue - I just posted about the TrackSpacer plugin 
> and the thread started up again. Actually what I've been trying to get across 
> is that the Gibbs "nastiness' is ever present in both hardware and software 
> implementations. It's just there in the underlying physics of sampling 
> theory, even in the analog domain it seems :)
> 

Gibbs phenomenon has nothing to do with sampling theory.  You only need to 
guarantee that the input is bandlimited to Fs/2.  you need not necessarily be 
applying a brickwall LPF before sampling.

why is "Phase Vocoder" and "FIR" in the Subject: line of the thread?


-- 

r b-j  r...@audioimagination.com 

"Imagination is more important than knowledge."
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Zhang
I don't think there's any issue - I just posted about the TrackSpacer
plugin and the thread started up again.  Actually what I've been trying to
get across is that the Gibbs "nastiness" is ever present in both hardware
and software implementations.  It's just there in the underlying physics of
sampling theory, even in the analog domain it seems :)

On Wed, Jun 24, 2020, 4:46 PM robert bristow-johnson <
r...@audioimagination.com> wrote:

>
> is this the same thing we were discussing in March?  wasn't that three
> months ago?
>
> what, exactly, is the issue?
>
> there *are* some things in common between OLA phase vocoder and OLA fast
> convolution.  in fact, if you're willing to make your fast convolution less
> fast than optimal, you can use a Hann or some other complementary window
> but you *still* have to zero-pad it to prevent circular aliasing in the
> time domain.  the length of the FFT, N, must still be at least as large as
> the non-zero length of the window, L, plus the length of the impulse
> response, M, minus 1.
>
>N ≥ L + M - 1
>
> the number of zero samples padded must be at least M-1 samples.
>
> the difference is, if a rectangular window is used for overlap-add fast
> convolution, the processing frame advances by L samples every frame.  but
> if a Hann window is used (or another complementary window which requires
> 50% overlap), then the frame advances only by L/2 samples, even though the
> burden of computation involved in the frame is the same.  but things will
> look nicer in the frequency domain with the Hann window than they will with
> the rectangular window (this Gibbs stuff).  but the effect of any nastiness
> is canceled if you're doing FIR fast convolution.  but if you're doing
> non-LTI stuff in the phase vocoder, then that friendly frequency-domain
> behavior is more salient.
>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
>
>
> > On June 24, 2020 3:49 PM Zhiguang Zhang  wrote:
> >
> >
> > Hi Russ,
> >
> >
> > Yes. In the previous reference, there is no example for overlap-add. A
> sine/cosine framework is a relatively simple one for OLA and fulfills the
> necessary requirements. In the case of audio coding, various filterbanks
> with different types of windows have been designed for 'perfect
> reconstruction', and even window switching. Please refer to Bosi for a more
> thorough treatment.
> >
> >
> > Best,
> > Eric Z
> >
> >
> > On Wed, Jun 24, 2020, 3:44 PM Russell Wedelich 
> wrote:
> > > Respectively Eric, I think you may be confusing two different use
> cases for windows. Your recent reference is referring to constructing FIR
> filters via the Windowing method of ideal brickwall filters. This is
> different from a frequency domain convolution implementation of an FIR
> filter (which may or may not explicitly apply a smooth window) which as far
> as I can tell is the origin of this part of the discussion.
> > >
> > > -Russ
> > >
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread robert bristow-johnson

is this the same thing we were discussing in March?  wasn't that three months 
ago?

what, exactly, is the issue?

there *are* some things in common between OLA phase vocoder and OLA fast 
convolution.  in fact, if you're willing to make your fast convolution less 
fast than optimal, you can use a Hann or some other complementary window but 
you *still* have to zero-pad it to prevent circular aliasing in the time 
domain.  the length of the FFT, N, must still be at least as large as the 
non-zero length of the window, L, plus the length of the impulse response, M, 
minus 1.

   N ≥ L + M - 1

the number of zero samples padded must be at least M-1 samples.

the difference is, if a rectangular window is used for overlap-add fast 
convolution, the processing frame advances by L samples every frame.  but if a 
Hann window is used (or another complementary window which requires 50% 
overlap), then the frame advances only by L/2 samples, even though the burden 
of computation involved in the frame is the same.  but things will look nicer 
in the frequency domain with the Hann window than they will with the 
rectangular window (this Gibbs stuff).  but the effect of any nastiness is 
canceled if you're doing FIR fast convolution.  but if you're doing non-LTI 
stuff in the phase vocoder, then that friendly frequency-domain behavior is 
more salient.
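
for anyone who wants to poke at this, here is a minimal numpy sketch of the
Hann-flavoured overlap-add fast convolution described above.  only the
constraints come from the text (window length L, hop L/2, FFT size
N >= L + M - 1, i.e. at least M-1 padded zeros); the function name, the
parameter values and the edge handling are my own assumptions.

    import numpy as np

    def ola_hann_fastconv(x, h, L=64):
        # overlap-add fast convolution with a periodic Hann analysis window
        # (exact COLA at hop L/2).  a sketch, not production code.
        M = len(h)
        N = 1
        while N < L + M - 1:                     # FFT size: N >= L + M - 1
            N *= 2
        w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(L) / L)
        R = L // 2                               # frame advance is L/2, not L
        xp = np.concatenate([np.zeros(R), x, np.zeros(L)])   # pad so every sample gets full window weight
        y = np.zeros(len(xp) + N)
        H = np.fft.rfft(h, N)
        for t in range(0, len(xp) - L + 1, R):
            y[t:t+N] += np.fft.irfft(np.fft.rfft(xp[t:t+L] * w, N) * H, N)
        return y[R : R + len(x) + M - 1]         # compensate the pre-pad delay

    x = np.random.randn(1000)
    h = np.random.randn(33)
    print(np.max(np.abs(ola_hann_fastconv(x, h) - np.convolve(x, h))))   # ~1e-13

the output matches the plain FIR convolution to rounding error, which is the
"nastiness is canceled if you're doing FIR fast convolution" point above.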

-- 

r b-j  r...@audioimagination.com 

"Imagination is more important than knowledge."


> On June 24, 2020 3:49 PM Zhiguang Zhang  wrote:
> 
> 
> Hi Russ,
> 
>   
> Yes. In the previous reference, there is no example for overlap-add. A 
> sine/cosine framework is a relatively simple one for OLA and fulfills the 
> necessary requirements. In the case of audio coding, various filterbanks with 
> different types of windows have been designed for 'perfect reconstruction', 
> and even window switching. Please refer to Bosi for a more thorough treatment.
> 
> 
> Best,
> Eric Z
> 
> 
> On Wed, Jun 24, 2020, 3:44 PM Russell Wedelich  wrote:
> > Respectively Eric, I think you may be confusing two different use cases for 
> > windows. Your recent reference is referring to constructing FIR filters via 
> > the Windowing method of ideal brickwall filters. This is different from a 
> > frequency domain convolution implementation of an FIR filter (which may or 
> > may not explicitly apply a smooth window) which as far as I can tell is the 
> > origin of this part of the discussion.
> > 
> > -Russ
> >
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
hi Greg,

yes, but taking circuit depth to mean circuit path, and assuming it is
related to window size, a smaller window almost certainly means more ripple
distortion

-ez

On Wed, Jun 24, 2020 at 3:57 PM Greg Maxwell  wrote:

> On Wed, Jun 24, 2020 at 7:46 PM Russell Wedelich 
> wrote:
>
>> Respectively Eric, I think you may be confusing two different use cases
>> for windows. Your recent reference is referring to constructing FIR filters
>> via the Windowing method of ideal brickwall filters. This is different from
>> a frequency domain convolution implementation of an FIR filter (which may
>> or may not explicitly apply a smooth window) which as far as I can tell is
>> the origin of this part of the discussion.
>>
>
> And for the convolution implementation of a FIR filter if you were to
> compare _correct_ implementations of each approach which had adequate(*)
> internal precision relative to the output, the results would be _bit
> identical_.
>
> (*) In practice, digital implementations of *both* time domain FIR and
> convolution typically lack enough internal precision such that their output
> is exact, and as a result they won't be bit identical. Though I wouldn't be
> surprised if, for a given internal precision, a WOLA implementation using
> FFTs wasn't *more* faithful to a infinite precision FIR than the same FIR
> implemented with limited precision, due to the smaller circuit depth for
> the frequency domain approach.
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Greg Maxwell
On Wed, Jun 24, 2020 at 7:46 PM Russell Wedelich  wrote:

> Respectively Eric, I think you may be confusing two different use cases
> for windows. Your recent reference is referring to constructing FIR filters
> via the Windowing method of ideal brickwall filters. This is different from
> a frequency domain convolution implementation of an FIR filter (which may
> or may not explicitly apply a smooth window) which as far as I can tell is
> the origin of this part of the discussion.
>

And for the convolution implementation of a FIR filter if you were to
compare _correct_ implementations of each approach which had adequate(*)
internal precision relative to the output, the results would be _bit
identical_.

(*) In practice, digital implementations of *both* time domain FIR and
convolution typically lack enough internal precision such that their output
is exact, and as a result they won't be bit identical. Though I wouldn't be
surprised if, for a given internal precision, a WOLA implementation using
FFTs wasn't *more* faithful to a infinite precision FIR than the same FIR
implemented with limited precision, due to the smaller circuit depth for
the frequency domain approach.
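
A tiny numeric illustration of this (my own sketch, sizes made up): a direct
time-domain FIR and a zero-padded FFT convolution compute the same filter,
and in float64 they differ only at the level of rounding, i.e. equal for
practical purposes but not bit identical.

    import numpy as np

    x = np.random.randn(4096)
    h = np.random.randn(129)
    direct = np.convolve(x, h)                                       # time-domain FIR
    N = len(x) + len(h) - 1
    fast = np.fft.irfft(np.fft.rfft(x, N) * np.fft.rfft(h, N), N)    # FFT convolution
    print(np.max(np.abs(direct - fast)))                             # tiny but nonzero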
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Zhang
Hi Russ,


Yes.  In the previous reference, there is no example for overlap-add.  A
sine/cosine framework is a relatively simple one for OLA and fulfills the
necessary requirements.  In the case of audio coding, various filterbanks
with different types of windows have been designed for 'perfect
reconstruction', and even window switching.  Please refer to Bosi for a
more thorough treatment.


Best,
Eric Z

On Wed, Jun 24, 2020, 3:44 PM Russell Wedelich  wrote:

> Respectively Eric, I think you may be confusing two different use cases
> for windows. Your recent reference is referring to constructing FIR filters
> via the Windowing method of ideal brickwall filters. This is different from
> a frequency domain convolution implementation of an FIR filter (which may
> or may not explicitly apply a smooth window) which as far as I can tell is
> the origin of this part of the discussion.
>
> -Russ
>
> On Wed, Jun 24, 2020 at 2:28 PM Zhiguang Eric Zhang 
> wrote:
>
>> not to beat a dead horse but you get more stats here:
>>
>> https://www.dspguide.com/ch16/1.htm
>>
>> (c) shows that the Blackman has a better *stopband attenuation*. To be
>> exact, the stopband attenuation for the Blackman is -74dB (∼0.02%), while
>> the Hamming is only -53dB (∼0.2%). Although it cannot be seen in these
>> graphs, the Blackman has a *passband ripple* of only about 0.02%, while
>> the Hamming is typically 0.2%. In general, the Blackman should be your
>> first choice; a slow roll-off is easier to handle than poor stopband
>> attenuation.
>>
>> On Wed, Jun 24, 2020 at 12:17 PM Zhiguang Eric Zhang 
>> wrote:
>>
>>> https://community.sw.siemens.com/s/article/the-gibbs-phenomenon
>>>
>>> "*Addendum #2: Analog to Digital Converters*
>>>
>>> Sometimes there is confusion about a Successive Approximation Register
>>> (SAR) versus Sigma-Delta analog to digital converters and Gibbs phenomenon.
>>> Many Sigma-Delta converters have sharp anti-aliasing filters which prevent
>>> alias errors. But these sharp filters are not inherent to Sigma-Delta
>>> converters, any type of filter can be used.
>>>
>>> Using a filter with a gradual roll-off with any analog to digital
>>> converter reduces or eliminates the Gibbs phenomenon. The effect of the
>>> filter should not be confused with the type of analog to digital converter.
>>> In fact, a lowpass filter can even be used after the acquisition on a
>>> digitized signal containing Gibbs to remove the phenomenon."
>>>
>>>
>>> seems like your LPF in your ADC should 'remove' this artifact to
>>> undetectable levels if you're sampling from an analog source, but in
>>> software, it just depends on your windowing
>>>
>>> On Wed, Jun 24, 2020 at 11:47 AM Zhiguang Eric Zhang 
>>> wrote:
>>>
 here:

 https://community.sw.siemens.com/s/article/the-gibbs-phenomenon

 "*The Gibbs Phenomenon*

 [image: User-added image]

 To describe a signal with a discontinuity in the time domain requires
 infinite frequency content. In practice, it is not possible to sample
 infinite frequency content. The truncation of frequency content causes a
 time domain ringing artifact on the signal, which is called the “Gibbs
 phenomenon”."



 in order to eliminate the ringing artifact altogether, you'd need a
 hell of an ADC, one that doesn't exist today (nor shall one ever exist to
 eliminate the artifact).  it is part sampling theory and there's no way
 around it.

 On Wed, Jun 24, 2020 at 11:45 AM Corey K  wrote:

> You don't have to sample the STFT that often. In fact block based FFT
> convolution uses non-overlapping blocks on the input (although the output
> windows do overlap). Anyway, I digress...
>
> On Wed., Jun. 24, 2020, 1:06 p.m. Zhiguang Eric Zhang, 
> wrote:
>
>> It's not just about zero-padding.  Say you could sample the signal
>> and window at, say, fs, but why the hell would you want to window at fs?
>> At any rate, if you look at the Hamming window, the ringing artifact is
>> rather negligible.
>>
>>
>> On Wed, Jun 24, 2020, 11:15 AM STEFFAN DIEDRICHSEN <
>> sdiedrich...@me.com> wrote:
>>
>>> Phew, thank you for confirming that! We use it in several products.
>>>
>>> Cheers,
>>>
>>> Steffan
>>>
>>> On 24.06.2020|KW26, at 17:07, Corey K  wrote:
>>>
>>> But the end result is that we can perform filtering using STFT
>>> filterbanks just fine, there are no artifacts.
>>>
>>>
>>> ___
>>> dupswapdrop: music-dsp mailing list
>>> music-dsp@music.columbia.edu
>>>
>>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>> 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Russell Wedelich
Respectfully, Eric, I think you may be confusing two different use cases for
windows. Your recent reference is referring to constructing FIR filters via
the Windowing method of ideal brickwall filters. This is different from a
frequency domain convolution implementation of an FIR filter (which may or
may not explicitly apply a smooth window) which as far as I can tell is the
origin of this part of the discussion.

-Russ

On Wed, Jun 24, 2020 at 2:28 PM Zhiguang Eric Zhang  wrote:

> not to beat a dead horse but you get more stats here:
>
> https://www.dspguide.com/ch16/1.htm
>
> (c) shows that the Blackman has a better *stopband attenuation*. To be
> exact, the stopband attenuation for the Blackman is -74dB (∼0.02%), while
> the Hamming is only -53dB (∼0.2%). Although it cannot be seen in these
> graphs, the Blackman has a *passband ripple* of only about 0.02%, while
> the Hamming is typically 0.2%. In general, the Blackman should be your
> first choice; a slow roll-off is easier to handle than poor stopband
> attenuation.
>
> On Wed, Jun 24, 2020 at 12:17 PM Zhiguang Eric Zhang 
> wrote:
>
>> https://community.sw.siemens.com/s/article/the-gibbs-phenomenon
>>
>> "*Addendum #2: Analog to Digital Converters*
>>
>> Sometimes there is confusion about a Successive Approximation Register
>> (SAR) versus Sigma-Delta analog to digital converters and Gibbs phenomenon.
>> Many Sigma-Delta converters have sharp anti-aliasing filters which prevent
>> alias errors. But these sharp filters are not inherent to Sigma-Delta
>> converters, any type of filter can be used.
>>
>> Using a filter with a gradual roll-off with any analog to digital
>> converter reduces or eliminates the Gibbs phenomenon. The effect of the
>> filter should not be confused with the type of analog to digital converter.
>> In fact, a lowpass filter can even be used after the acquisition on a
>> digitized signal containing Gibbs to remove the phenomenon."
>>
>>
>> seems like your LPF in your ADC should 'remove' this artifact to
>> undetectable levels if you're sampling from an analog source, but in
>> software, it just depends on your windowing
>>
>> On Wed, Jun 24, 2020 at 11:47 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> here:
>>>
>>> https://community.sw.siemens.com/s/article/the-gibbs-phenomenon
>>>
>>> "*The Gibbs Phenomenon*
>>>
>>> [image: User-added image]
>>>
>>> To describe a signal with a discontinuity in the time domain requires
>>> infinite frequency content. In practice, it is not possible to sample
>>> infinite frequency content. The truncation of frequency content causes a
>>> time domain ringing artifact on the signal, which is called the “Gibbs
>>> phenomenon”."
>>>
>>>
>>>
>>> in order to eliminate the ringing artifact altogether, you'd need a hell
>>> of an ADC, one that doesn't exist today (nor shall one ever exist to
>>> eliminate the artifact).  it is part sampling theory and there's no way
>>> around it.
>>>
>>> On Wed, Jun 24, 2020 at 11:45 AM Corey K  wrote:
>>>
 You don't have to sample the STFT that often. In fact block based FFT
 convolution uses non-overlapping blocks on the input (although the output
 windows do overlap). Anyway, I digress...

 On Wed., Jun. 24, 2020, 1:06 p.m. Zhiguang Eric Zhang, 
 wrote:

> It's not just about zero-padding.  Say you could sample the signal and
> window at, say, fs, but why the hell would you want to window at fs?  At
> any rate, if you look at the Hamming window, the ringing artifact is 
> rather
> negligible.
>
>
> On Wed, Jun 24, 2020, 11:15 AM STEFFAN DIEDRICHSEN <
> sdiedrich...@me.com> wrote:
>
>> Phew, thank you for confirming that! We use it in several products.
>>
>> Cheers,
>>
>> Steffan
>>
>> On 24.06.2020|KW26, at 17:07, Corey K  wrote:
>>
>> But the end result is that we can perform filtering using STFT
>> filterbanks just fine, there are no artifacts.
>>
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>>
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
> 

 ___
 dupswapdrop: music-dsp mailing list
 music-dsp@music.columbia.edu

 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread STEFFAN DIEDRICHSEN
Phew, thank you for confirming that! We use it in several products.

Cheers,

Steffan 

> On 24.06.2020|KW26, at 17:07, Corey K  wrote:
> 
> But the end result is that we can perform filtering using STFT filterbanks 
> just fine, there are no artifacts.

___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread STEFFAN DIEDRICHSEN
Here’s the beef from that paper:


(The reader should realize that an appropriate change must be made to the
analysis - i.e., padding the windowed input signal with a sufficient number
of zero valued samples - to prevent time aliasing when implementing the
analysis and synthesis operations with FFT's, which have length L. If a
modification P(e^jωk) has a time response which is effectively N0 points
long, the analysis length L must be at least N + N0 - 1, where the window
length is N.)

That’s more or less describing the limits of that approach by using the 
identity of spectral multiplication and time-domain convolution. For a 
convolution of N input samples with M filter samples, the result is L= N+M-1. 
So, if you use an FFT with size L, you can use M-1-L input samples. So you need 
to zero-pad to avoid artefacts. 
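
A short numeric check of that constraint (illustrative only, block and
filter sizes are made up): with an FFT the size of the block, the product of
spectra is a circular convolution and the filter tail wraps onto the head;
with an FFT of at least N + M - 1 points it matches the linear convolution.

    import numpy as np

    x = np.random.randn(256)                   # input block, N = 256
    h = np.random.randn(64)                    # filter, M = 64
    ref = np.convolve(x, h)                    # linear convolution, length N + M - 1

    bad = np.fft.irfft(np.fft.rfft(x, 256) * np.fft.rfft(h, 256), 256)
    print(np.max(np.abs(bad - ref[:256])))     # large: time-aliased

    L = 256 + 64 - 1
    good = np.fft.irfft(np.fft.rfft(x, L) * np.fft.rfft(h, L), L)
    print(np.max(np.abs(good - ref)))          # ~1e-13: no aliasing once L >= N + M - 1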

Best,

Steffan 

> On 24.06.2020|KW26, at 16:10, Corey K  wrote:
> 
> I think you're mistaken, unfortunately. Block FFT convolution has been around 
> for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his paper "A 
> Unified Approach to Short-Time Fourier Analysis" how you can perform FIR 
> filtering perfectly with the FFT, if COLA windows are used. See equation 
> 5.2.5 in that paper, and the analysis that precedes it. 
> 
> 
> 
> 
> 
> On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang  > wrote:
> that's not true.  with FFT/COLA you will necessarily have the Gibbs 
> phenomenon / ringing / ripple artifacts.  certain window types will minimize 
> this but you will get this phenomenon nonetheless.
> 
> On Wed, Jun 24, 2020 at 9:44 AM Corey K  > wrote:
> I see what you're getting at, I suppose. However, in the context of FIR 
> filtering I wouldn't refer to this as an artifact. Let's say you gave me an 
> FIR filter with N-taps and asked me to write a program to implement that 
> filter. I could implement this using a direct form structure (in the 
> time-domain), or with the FFT using OLA. Both would give the exact same 
> results down to numerical precision, with no "artifacts". That's why it 
> intrigued me when you said "of course it won't have the ripple artifacts 
> associated with FFT overlap windowing" when referring to software that does 
> filtering.
> 
> 
> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang  > wrote:
> ripple is just a known artifactual component of a windowing operation.  it's 
> also known as the Gibbs phenomenon
> 
> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html 
> 
> 
> i'm not referring to any equivalency between time/freq domain filtering
> 
> 
> On Wed, Jun 24, 2020 at 9:21 AM Corey K  > wrote:
> Not totally understanding you, unfortunately. But if what you are describing 
> is part of the normal filter response/ringing I guess I wouldn't refer to it 
> as "artifacts"? FIR filtering can be performed equivalently in the time or 
> frequency domain. Do you disagree with that statement? 
> 
> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang  > wrote:
> yes but any windowing operation is akin to taking a dirac delta function on X 
> number of samples and thus you will get ringing/ripple artifacts as a 
> necessary part of the filter response
> 
> On Wed, Jun 24, 2020 at 6:30 AM Corey K  > wrote:
> 
> of course it won't have the ripple artifacts associated with FFT overlap 
> windowing
> 
> What is the ripple artifact you are talking about? When using constant 
> overlap add (COLA) windows the STFT is a perfect reconstruction filterbank. 
> Likewise block FFT convolution can be used to implement any FIR filtering 
> operation. 
> 
> 
> 
> 
> 
> 
> cheers,
> -ez
> 
> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson  > wrote:
> Hello Spencer,
> 
> You wrote:
> > A while ago I read through some the literature [1] on implementing
> > an invertible CQT as a special case of the Nonstationary Gabor
> > Transform. It's implemented by the essentia library [2] among other
> > places probably.
> > 
> > The main idea is that you take the FFT of your whole signal, then
> > apply the filter bank in the frequency domain (just
> > multiplication). Then you IFFT each filtered signal, which gives you
> > the time-domain samples for each band of the filter bank. Each
> > frequency-domain filter has a different bandwidth, so your IFFT is a
> > different length for each one, which gives you the different sample
> > rates for each one.
> 
> That's the basic idea, but the Gaborator rounds up each of the
> per-band sample rates to the original sample rate divided by some
> power of two.  This means all the FFT sizes can be 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
not to beat a dead horse but you get more stats here:

https://www.dspguide.com/ch16/1.htm

(c) shows that the Blackman has a better *stopband attenuation*. To be
exact, the stopband attenuation for the Blackman is -74dB (∼0.02%), while
the Hamming is only -53dB (∼0.2%). Although it cannot be seen in these
graphs, the Blackman has a *passband ripple* of only about 0.02%, while the
Hamming is typically 0.2%. In general, the Blackman should be your first
choice; a slow roll-off is easier to handle than poor stopband attenuation.
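
Those figures are easy to reproduce with a windowed-sinc design, roughly
like this (a sketch with my own tap count, cutoff and stopband-measurement
point; the exact dB values depend on those choices):

    import numpy as np

    def windowed_sinc_lowpass(num_taps, fc, window_fn):
        # "window method" FIR design: ideal sinc, truncated and tapered
        n = np.arange(num_taps) - (num_taps - 1) / 2
        h = 2 * fc * np.sinc(2 * fc * n) * window_fn(num_taps)
        return h / h.sum()                     # unity gain at DC

    for name, win in (("Hamming", np.hamming), ("Blackman", np.blackman)):
        h = windowed_sinc_lowpass(101, 0.1, win)
        H = 20 * np.log10(np.abs(np.fft.rfft(h, 8192)) + 1e-300)
        f = np.linspace(0.0, 0.5, len(H))
        print(name, round(H[f > 0.15].max(), 1))   # worst stopband level in dB: Hamming in
                                                   # the -50s, Blackman below -70, matching
                                                   # the -53 / -74 dB figures quoted above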

On Wed, Jun 24, 2020 at 12:17 PM Zhiguang Eric Zhang  wrote:

> https://community.sw.siemens.com/s/article/the-gibbs-phenomenon
>
> "*Addendum #2: Analog to Digital Converters*
>
> Sometimes there is confusion about a Successive Approximation Register
> (SAR) versus Sigma-Delta analog to digital converters and Gibbs phenomenon.
> Many Sigma-Delta converters have sharp anti-aliasing filters which prevent
> alias errors. But these sharp filters are not inherent to Sigma-Delta
> converters, any type of filter can be used.
>
> Using a filter with a gradual roll-off with any analog to digital
> converter reduces or eliminates the Gibbs phenomenon. The effect of the
> filter should not be confused with the type of analog to digital converter.
> In fact, a lowpass filter can even be used after the acquisition on a
> digitized signal containing Gibbs to remove the phenomenon."
>
>
> seems like your LPF in your ADC should 'remove' this artifact to
> undetectable levels if you're sampling from an analog source, but in
> software, it just depends on your windowing
>
> On Wed, Jun 24, 2020 at 11:47 AM Zhiguang Eric Zhang 
> wrote:
>
>> here:
>>
>> https://community.sw.siemens.com/s/article/the-gibbs-phenomenon
>>
>> "*The Gibbs Phenomenon*
>>
>> [image: User-added image]
>>
>> To describe a signal with a discontinuity in the time domain requires
>> infinite frequency content. In practice, it is not possible to sample
>> infinite frequency content. The truncation of frequency content causes a
>> time domain ringing artifact on the signal, which is called the “Gibbs
>> phenomenon”."
>>
>>
>>
>> in order to eliminate the ringing artifact altogether, you'd need a hell
>> of an ADC, one that doesn't exist today (nor shall one ever exist to
>> eliminate the artifact).  it is part sampling theory and there's no way
>> around it.
>>
>> On Wed, Jun 24, 2020 at 11:45 AM Corey K  wrote:
>>
>>> You don't have to sample the STFT that often. In fact block based FFT
>>> convolution uses non-overlapping blocks on the input (although the output
>>> windows do overlap). Anyway, I digress...
>>>
>>> On Wed., Jun. 24, 2020, 1:06 p.m. Zhiguang Eric Zhang, 
>>> wrote:
>>>
 It's not just about zero-padding.  Say you could sample the signal and
 window at, say, fs, but why the hell would you want to window at fs?  At
 any rate, if you look at the Hamming window, the ringing artifact is rather
 negligible.


 On Wed, Jun 24, 2020, 11:15 AM STEFFAN DIEDRICHSEN 
 wrote:

> Phew, thank you for confirming that! We use it in several products.
>
> Cheers,
>
> Steffan
>
> On 24.06.2020|KW26, at 17:07, Corey K  wrote:
>
> But the end result is that we can perform filtering using STFT
> filterbanks just fine, there are no artifacts.
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
>
> https://lists.columbia.edu/mailman/listinfo/music-dsp

 ___
 dupswapdrop: music-dsp mailing list
 music-dsp@music.columbia.edu
 https://lists.columbia.edu/mailman/listinfo/music-dsp
 
>>>
>>> ___
>>> dupswapdrop: music-dsp mailing list
>>> music-dsp@music.columbia.edu
>>>
>>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
https://community.sw.siemens.com/s/article/the-gibbs-phenomenon

"*Addendum #2: Analog to Digital Converters*

Sometimes there is confusion about a Successive Approximation Register
(SAR) versus Sigma-Delta analog to digital converters and Gibbs phenomenon.
Many Sigma-Delta converters have sharp anti-aliasing filters which prevent
alias errors. But these sharp filters are not inherent to Sigma-Delta
converters, any type of filter can be used.

Using a filter with a gradual roll-off with any analog to digital converter
reduces or eliminates the Gibbs phenomenon. The effect of the filter should
not be confused with the type of analog to digital converter. In fact, a
lowpass filter can even be used after the acquisition on a digitized signal
containing Gibbs to remove the phenomenon."


seems like your LPF in your ADC should 'remove' this artifact to
undetectable levels if you're sampling from an analog source, but in
software, it just depends on your windowing

On Wed, Jun 24, 2020 at 11:47 AM Zhiguang Eric Zhang  wrote:

> here:
>
> https://community.sw.siemens.com/s/article/the-gibbs-phenomenon
>
> "*The Gibbs Phenomenon*
>
> [image: User-added image]
>
> To describe a signal with a discontinuity in the time domain requires
> infinite frequency content. In practice, it is not possible to sample
> infinite frequency content. The truncation of frequency content causes a
> time domain ringing artifact on the signal, which is called the “Gibbs
> phenomenon”."
>
>
>
> in order to eliminate the ringing artifact altogether, you'd need a hell
> of an ADC, one that doesn't exist today (nor shall one ever exist to
> eliminate the artifact).  it is part sampling theory and there's no way
> around it.
>
> On Wed, Jun 24, 2020 at 11:45 AM Corey K  wrote:
>
>> You don't have to sample the STFT that often. In fact block based FFT
>> convolution uses non-overlapping blocks on the input (although the output
>> windows do overlap). Anyway, I digress...
>>
>> On Wed., Jun. 24, 2020, 1:06 p.m. Zhiguang Eric Zhang, 
>> wrote:
>>
>>> It's not just about zero-padding.  Say you could sample the signal and
>>> window at, say, fs, but why the hell would you want to window at fs?  At
>>> any rate, if you look at the Hamming window, the ringing artifact is rather
>>> negligible.
>>>
>>>
>>> On Wed, Jun 24, 2020, 11:15 AM STEFFAN DIEDRICHSEN 
>>> wrote:
>>>
 Phew, thank you for confirming that! We use it in several products.

 Cheers,

 Steffan

 On 24.06.2020|KW26, at 17:07, Corey K  wrote:

 But the end result is that we can perform filtering using STFT
 filterbanks just fine, there are no artifacts.


 ___
 dupswapdrop: music-dsp mailing list
 music-dsp@music.columbia.edu

 https://lists.columbia.edu/mailman/listinfo/music-dsp
>>>
>>> ___
>>> dupswapdrop: music-dsp mailing list
>>> music-dsp@music.columbia.edu
>>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>> 
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>>
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
here:

https://community.sw.siemens.com/s/article/the-gibbs-phenomenon

"*The Gibbs Phenomenon*

[image: User-added image]

To describe a signal with a discontinuity in the time domain requires
infinite frequency content. In practice, it is not possible to sample
infinite frequency content. The truncation of frequency content causes a
time domain ringing artifact on the signal, which is called the “Gibbs
phenomenon”."



in order to eliminate the ringing artifact altogether, you'd need a hell of
an ADC, one that doesn't exist today (nor shall one ever exist to eliminate
the artifact).  it is part of sampling theory and there's no way around it.

On Wed, Jun 24, 2020 at 11:45 AM Corey K  wrote:

> You don't have to sample the STFT that often. In fact block based FFT
> convolution uses non-overlapping blocks on the input (although the output
> windows do overlap). Anyway, I digress...
>
> On Wed., Jun. 24, 2020, 1:06 p.m. Zhiguang Eric Zhang, 
> wrote:
>
>> It's not just about zero-padding.  Say you could sample the signal and
>> window at, say, fs, but why the hell would you want to window at fs?  At
>> any rate, if you look at the Hamming window, the ringing artifact is rather
>> negligible.
>>
>>
>> On Wed, Jun 24, 2020, 11:15 AM STEFFAN DIEDRICHSEN 
>> wrote:
>>
>>> Phew, thank you for confirming that! We use it in several products.
>>>
>>> Cheers,
>>>
>>> Steffan
>>>
>>> On 24.06.2020|KW26, at 17:07, Corey K  wrote:
>>>
>>> But the end result is that we can perform filtering using STFT
>>> filterbanks just fine, there are no artifacts.
>>>
>>>
>>> ___
>>> dupswapdrop: music-dsp mailing list
>>> music-dsp@music.columbia.edu
>>>
>>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>> 
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
>
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
You don't have to sample the STFT that often. In fact block based FFT
convolution uses non-overlapping blocks on the input (although the output
windows do overlap). Anyway, I digress...

On Wed., Jun. 24, 2020, 1:06 p.m. Zhiguang Eric Zhang, 
wrote:

> It's not just about zero-padding.  Say you could sample the signal and
> window at, say, fs, but why the hell would you want to window at fs?  At
> any rate, if you look at the Hamming window, the ringing artifact is rather
> negligible.
>
>
> On Wed, Jun 24, 2020, 11:15 AM STEFFAN DIEDRICHSEN 
> wrote:
>
>> Phew, thank you for confirming that! We use it in several products.
>>
>> Cheers,
>>
>> Steffan
>>
>> On 24.06.2020|KW26, at 17:07, Corey K  wrote:
>>
>> But the end result is that we can perform filtering using STFT
>> filterbanks just fine, there are no artifacts.
>>
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>>
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
It's not just about zero-padding.  Say you could sample the signal and
window at, say, fs, but why the hell would you want to window at fs?  At
any rate, if you look at the Hamming window, the ringing artifact is rather
negligible.


On Wed, Jun 24, 2020, 11:15 AM STEFFAN DIEDRICHSEN 
wrote:

> Phew, thank you for confirming that! We use it in several products.
>
> Cheers,
>
> Steffan
>
> On 24.06.2020|KW26, at 17:07, Corey K  wrote:
>
> But the end result is that we can perform filtering using STFT filterbanks
> just fine, there are no artifacts.
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
>
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
(when I say satisfy the left hand side, I mean make the sum of shifted
windows add up to a constant)
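
A quick check of that COLA property, for anyone following along (a sketch
using the periodic Hann, 0.5 - 0.5*cos(2*pi*n/N), which is my own choice of
window here; note the symmetric np.hanning(N) only satisfies the condition
approximately):

    import numpy as np

    N, R = 512, 256                            # window length, hop (50% overlap)
    w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(N) / N)
    total = np.zeros(8 * N)
    for t in range(0, len(total) - N + 1, R):
        total[t:t+N] += w                      # overlap-add the shifted windows
    mid = total[N:-N]                          # ignore the partially covered ends
    print(mid.min(), mid.max())                # both 1.0 to rounding: constant overlap-add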

On Wed, Jun 24, 2020 at 12:37 PM Corey K  wrote:

> Regarding e.q 4.5 it is easy to satisfy the left hand side of that
> equation exactly (which is all that is needed) -- any COLA window will do
> it.
>
> Steffan's point is critically important. The FFT has to be appropriately
> zero-padded so the convolution is linear rather than circular.
>
> But the end result is that we can perform filtering using STFT filterbanks
> just fine, there are no artifacts.
>
> On Wed, Jun 24, 2020 at 12:30 PM STEFFAN DIEDRICHSEN 
> wrote:
>
>> Here’s the beef from that paper:
>>
>>
>> (The reader should realize that an appropriate change
>>
>> must be made to the analysis - i.e., padding the windowed input signal
>> with a sufficient number of zero valued samples - to prevent time
>> aliasing when implementing the analysis and synthesis operations with
>> FFT's, which have length L. If a modification P(e^jωk) has a time
>> response which is effectively N0 points long, the analysis length L must
>> be at least N + N0 - 1, where the window length is N.)
>>
>> That's more or less describing the limits of that approach, using the
>> identity between spectral multiplication and time-domain convolution. A
>> convolution of N input samples with M filter samples yields L = N + M - 1
>> output samples, so an FFT of size L leaves room for only L - M + 1 input samples.
>> So you need to zero-pad to avoid artefacts.
>>
>> Best,
>>
>> Steffan
>>
>> On 24.06.2020|KW26, at 16:10, Corey K  wrote:
>>
>> I think you're mistaken, unfortunately. Block FFT convolution has been
>> around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
>> paper "A Unified Approach to Short-Time Fourier Analysis" how you can
>> perform FIR filtering perfectly with the FFT, if COLA windows are used. See
>> equation 5.2.5 in that paper, and the analysis that precedes it.
>>
>>
>>
>>
>>
>> On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> that's not true.  with FFT/COLA you will necessarily have the Gibbs
>>> phenomenon / ringing / ripple artifacts.  certain window types will
>>> minimize this but you will get this phenomenon nonetheless.
>>>
>>> On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:
>>>
 I see what you're getting at, I suppose. However, in the context of FIR
 filtering I wouldn't refer to this as an artifact. Let's say you gave me an
 FIR filter with N-taps and asked me to write a program to implement that
 filter. I could implement this using a direct form structure (in the
 time-domain), or with the FFT using OLA. Both would give the exact same
 results down to numerical precision, with no "artifacts". That's why it
 intrigued me when you said "of course it won't have the ripple artifacts
 associated with FFT overlap windowing" when referring to software that does
 filtering.


 On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
 wrote:

> ripple is just a known artifactual component of a windowing
> operation.  it's also known as the Gibbs phenomenon
>
> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
> 
>
> i'm not referring to any equivalency between time/freq domain filtering
>
>
> On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:
>
>> Not totally understanding you, unfortunately. But if what you are
>> describing is part of the normal filter response/ringing I guess I 
>> wouldn't
>> refer to it as "artifacts"? FIR filtering can be performed equivalently 
>> in
>> the time or frequency domain. Do you disagree with that statement?
>>
>> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> yes but any windowing operation is akin to taking a dirac delta
>>> function on X number of samples and thus you will get ringing/ripple
>>> artifacts as a necessary part of the filter response
>>>
>>> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>>>

 of course it won't have the ripple artifacts associated with FFT
> overlap windowing
>

 What is the ripple artifact you are talking about? When using
 constant overlap add (COLA) windows the STFT is a perfect 
 reconstruction
 filterbank. Likewise block FFT convolution can be used to implement 
 any FIR
 filtering operation.






> cheers,
> -ez
>
> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <
> g...@waxingwave.com> wrote:
>
>> Hello Spencer,
>>

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
Regarding eq. 4.5, it is easy to satisfy the left hand side of that equation
exactly (which is all that is needed) -- any COLA window will do it.

Steffan's point is critically important. The FFT has to be appropriately
zero-padded so the convolution is linear rather than circular.

But the end result is that we can perform filtering using STFT filterbanks
just fine, there are no artifacts.

On Wed, Jun 24, 2020 at 12:30 PM STEFFAN DIEDRICHSEN 
wrote:

> Here’s the beef from that paper:
>
>
> (The reader should realize that an appropriate change
>
> must be made to the analysis - i.e., padding the windowed input signal
> with a sufficient number of zero valued samples - to prevent time
> aliasing when implementing the analysis and synthesis operations with
> FFT's, which have length L. If a modification P(e^jωk) has a time
> response which is effectively N0 points long, the analysis length L must
> be at least N + N0 - 1, where the window length is N.)
>
> That's more or less describing the limits of that approach, using the
> identity between spectral multiplication and time-domain convolution. A
> convolution of N input samples with M filter samples yields L = N + M - 1
> output samples, so an FFT of size L leaves room for only L - M + 1 input samples.
> So you need to zero-pad to avoid artefacts.
>
> Best,
>
> Steffan
>
> On 24.06.2020|KW26, at 16:10, Corey K  wrote:
>
> I think you're mistaken, unfortunately. Block FFT convolution has been
> around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
> paper "A Unified Approach to Short-Time Fourier Analysis" how you can
> perform FIR filtering perfectly with the FFT, if COLA windows are used. See
> equation 5.2.5 in that paper, and the analysis that precedes it.
>
>
>
>
>
> On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang 
> wrote:
>
>> that's not true.  with FFT/COLA you will necessarily have the Gibbs
>> phenomenon / ringing / ripple artifacts.  certain window types will
>> minimize this but you will get this phenomenon nonetheless.
>>
>> On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:
>>
>>> I see what you're getting at, I suppose. However, in the context of FIR
>>> filtering I wouldn't refer to this as an artifact. Let's say you gave me an
>>> FIR filter with N-taps and asked me to write a program to implement that
>>> filter. I could implement this using a direct form structure (in the
>>> time-domain), or with the FFT using OLA. Both would give the exact same
>>> results down to numerical precision, with no "artifacts". That's why it
>>> intrigued me when you said "of course it won't have the ripple artifacts
>>> associated with FFT overlap windowing" when referring to software that does
>>> filtering.
>>>
>>>
>>> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
>>> wrote:
>>>
 ripple is just a known artifactual component of a windowing operation.
 it's also known as the Gibbs phenomenon

 http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
 

 i'm not referring to any equivalency between time/freq domain filtering


 On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:

> Not totally understanding you, unfortunately. But if what you are
> describing is part of the normal filter response/ringing I guess I 
> wouldn't
> refer to it as "artifacts"? FIR filtering can be performed equivalently in
> the time or frequency domain. Do you disagree with that statement?
>
> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
> wrote:
>
>> yes but any windowing operation is akin to taking a dirac delta
>> function on X number of samples and thus you will get ringing/ripple
>> artifacts as a necessary part of the filter response
>>
>> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>>
>>>
>>> of course it won't have the ripple artifacts associated with FFT
 overlap windowing

>>>
>>> What is the ripple artifact you are talking about? When using
>>> constant overlap add (COLA) windows the STFT is a perfect reconstruction
>>> filterbank. Likewise block FFT convolution can be used to implement any 
>>> FIR
>>> filtering operation.
>>>
>>>
>>>
>>>
>>>
>>>
 cheers,
 -ez

 On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <
 g...@waxingwave.com> wrote:

> Hello Spencer,
>
> You wrote:
> > A while ago I read through some the literature [1] on
> implementing
> > an invertible CQT as a special case of the Nonstationary Gabor
> > Transform. It's implemented by the essentia library [2] among
> other
> > places 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
"

The term Zmw(m-n)of (4.4)isseen to be the sum of the window shifted by
m samples.
By recognizing that the ex- pression&,w(m-n)issimplyasumofthevaluesofalow-
passwindow,itcanbeshown[8]thatifw(n)issampledata sufficiently dense rate,
then

w(m-n)= w(ejo) (4.5) m

independentofthewindowoffsetn,whereW(ejo)isthevalue of W(e'"), the
transform of the window,evaluated at zero frequency. Thus(4.4)becomes

Signal

xm(ejwS ejwP]

showing that the synthesis rule of (4.2) will lead to exact re-
construction of x ( n ) boyverlap-adding sections of the waveform.

The entire synthesis procedure depends on the sampling re- lation of
(4.5). This
relationshipisvalid to withinan aliasing error which can be made
negligiably smd for sufficiently high sampling rates of the window-i.e., as the
sampling rate of the short-time Fourier transform estimates increases, the
aliasing error decreases monotonically to zero."


so theoretically there is an error that decreases to zero if you sample and
window the signal at a sufficiently high rate.  this means that, at a
practical rate, the error will present itself
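
that reading of (4.5) is easy to check numerically.  with a generic
low-pass window that is not a COLA design (a truncated Gaussian here, my
own choice, as are the sizes), the overlap-added sum of shifted windows is
only approximately constant, and the ripple shrinks as the hop gets
smaller, i.e. as the short-time transform is sampled more densely in time:

    import numpy as np

    N = 512
    n = np.arange(N)
    w = np.exp(-0.5 * ((n - (N - 1) / 2) / (N / 8)) ** 2)   # truncated Gaussian window
    for R in (256, 128, 64):                                # hop: denser STFT sampling
        total = np.zeros(16 * N)
        for t in range(0, len(total) - N + 1, R):
            total[t:t+N] += w
        mid = total[2*N:-2*N]                               # steady-state region
        print(R, (mid.max() - mid.min()) / mid.mean())      # ripple drops sharply as R shrinks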

On Wed, Jun 24, 2020 at 10:49 AM Corey K  wrote:

> Ok, if Allen can't convince you, how about Julius Smith:
> https://ccrma.stanford.edu/~jos/sasp/FFT_Filter_Banks.html
> 
>  ?
>
>
>
> On Wed, Jun 24, 2020 at 12:13 PM Zhiguang Eric Zhang 
> wrote:
>
>> Thank you.  Yes it seems very theoretical and math heavy.  In practice
>> you will get this frequency response artifact no matter how small.  It
>> should factor into the math in some way, perhaps they are not looking at
>> the laplacian
>>
>> On Wed, Jun 24, 2020, 10:41 AM Corey K  wrote:
>>
>>> It's a classic paper. Google scholar shows it has been cited over 1000
>>> times. There's a link to it here here:
>>> https://jontalle.web.engr.illinois.edu/uploads/537/Papers/Public/AllenRabiner77-ProcIEEE.pdf
>>> 
>>>
>>>
>>> On Wed, Jun 24, 2020 at 11:56 AM Zhiguang Eric Zhang 
>>> wrote:
>>>
 unfortunately, i'm not familiar with that paper.  could you please
 attach it or provide a link for reference?  the Gibbs phenomenon is
 actually a very well-known and thoroughly characterized signal processing
 artifact that has been approached from a variety of angles as far as trying
 to find a solution.  it can be thought of as an unavoidable digital filter
 response of having to take X number of samples in one snapshot while
 capturing a finite instance in time (as you might know the Dirac delta is
 centered on DC)

 https://en.wikipedia.org/wiki/Ringing_artifacts
 

 On Wed, Jun 24, 2020 at 10:12 AM Corey K  wrote:

> I think you're mistaken, unfortunately. Block FFT convolution has been
> around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
> paper "A Unified Approach to Short-Time Fourier Analysis" how you can
> perform FIR filtering perfectly with the FFT, of COLA windows are used. 
> See
> equation 5.2.5 in that paper, and the analysis that precedes it.
>
>
>
>
>
> On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang 
> wrote:
>
>> that's not true.  with FFT/COLA you will necessarily have the Gibbs
>> phenomenon / ringing / ripple artifacts.  certain window types will
>> minimize this but you will get this phenomenon nonetheless.
>>
>> On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:
>>
>>> I see what you're getting at, I suppose. However, in the context of
>>> FIR filtering I wouldn't refer to this as an artifact. Let's say you 
>>> gave
>>> me an FIR filter with N-taps and asked me to write a program to 
>>> implement
>>> that filter. I could implement this using a direct form structure (in 
>>> the
>>> time-domain), or with the FFT using OLA. Both would give the exact same
>>> results down to numerical precision, with no "artifacts". That's why it
>>> intrigued me when you said "of course it won't have the ripple artifacts
>>> associated with FFT overlap windowing" when referring to software that 
>>> does
>>> filtering.
>>>
>>>
>>> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
>>> wrote:

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
Ok, if Allen can't convince you, how about Julius Smith:
https://ccrma.stanford.edu/~jos/sasp/FFT_Filter_Banks.html ?



On Wed, Jun 24, 2020 at 12:13 PM Zhiguang Eric Zhang  wrote:

> Thank you.  Yes it seems very theoretical and math heavy.  In practice you
> will get this frequency response artifact no matter how small.  It should
> factor into the math in some way, perhaps they are not looking at the
> laplacian
>
> On Wed, Jun 24, 2020, 10:41 AM Corey K  wrote:
>
>> It's a classic paper. Google scholar shows it has been cited over 1000
> >> times. There's a link to it here:
>> https://jontalle.web.engr.illinois.edu/uploads/537/Papers/Public/AllenRabiner77-ProcIEEE.pdf
>> 
>>
>>
>> On Wed, Jun 24, 2020 at 11:56 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> unfortunately, i'm not familiar with that paper.  could you please
>>> attach it or provide a link for reference?  the Gibbs phenomenon is
>>> actually a very well-known and thoroughly characterized signal processing
>>> artifact that has been approached from a variety of angles as far as trying
> >>> to find a solution.  it can be thought of as an unavoidable digital filter
>>> response of having to take X number of samples in one snapshot while
>>> capturing a finite instance in time (as you might know the Dirac delta is
>>> centered on DC)
>>>
>>> https://en.wikipedia.org/wiki/Ringing_artifacts
>>> 
>>>
>>> On Wed, Jun 24, 2020 at 10:12 AM Corey K  wrote:
>>>
 I think you're mistaken, unfortunately. Block FFT convolution has been
 around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
 paper "A Unified Approach to Short-Time Fourier Analysis" how you can
 perform FIR filtering perfectly with the FFT, if COLA windows are used. See
 equation 5.2.5 in that paper, and the analysis that precedes it.





 On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang 
 wrote:

> that's not true.  with FFT/COLA you will necessarily have the Gibbs
> phenomenon / ringing / ripple artifacts.  certain window types will
> minimize this but you will get this phenomenon nonetheless.
>
> On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:
>
>> I see what you're getting at, I suppose. However, in the context of
>> FIR filtering I wouldn't refer to this as an artifact. Let's say you gave
>> me an FIR filter with N-taps and asked me to write a program to implement
>> that filter. I could implement this using a direct form structure (in the
>> time-domain), or with the FFT using OLA. Both would give the exact same
>> results down to numerical precision, with no "artifacts". That's why it
>> intrigued me when you said "of course it won't have the ripple artifacts
>> associated with FFT overlap windowing" when referring to software that 
>> does
>> filtering.
>>
>>
>> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> ripple is just a known artifactual component of a windowing
>>> operation.  it's also known as the Gibbs phenomenon
>>>
>>> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
>>> 
>>>
>>> i'm not referring to any equivalency between time/freq domain
>>> filtering
>>>
>>>
>>> On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:
>>>
 Not totally understanding you, unfortunately. But if what you are
 describing is part of the normal filter response/ringing I guess I 
 wouldn't
 refer to it as "artifacts"? FIR filtering can be performed 
 equivalently in
 the time or frequency domain. Do you disagree with that statement?

 On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang <
 zez...@nyu.edu> wrote:

> yes but any windowing operation is akin to taking a dirac delta
> function on X number of samples and thus you will get ringing/ripple
> artifacts as a necessary part of the filter response
>
> On Wed, Jun 24, 2020 at 6:30 AM Corey K 
> wrote:
>
>>
>> of course it won't have the ripple artifacts associated with FFT
>>> overlap windowing

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
Thank you.  Yes it seems very theoretical and math heavy.  In practice you
will get this frequency response artifact no matter how small.  It should
factor into the math in some way, perhaps they are not looking at the
laplacian

On Wed, Jun 24, 2020, 10:41 AM Corey K  wrote:

> It's a classic paper. Google scholar shows it has been cited over 1000
> > times. There's a link to it here:
> https://jontalle.web.engr.illinois.edu/uploads/537/Papers/Public/AllenRabiner77-ProcIEEE.pdf
> 
>
>
> On Wed, Jun 24, 2020 at 11:56 AM Zhiguang Eric Zhang 
> wrote:
>
>> unfortunately, i'm not familiar with that paper.  could you please attach
>> it or provide a link for reference?  the Gibbs phenomenon is actually a
>> very well-known and thoroughly characterized signal processing artifact
>> that has been approached from a variety of angles as far as trying to find
> >> a solution.  it can be thought of as an unavoidable digital filter
>> response of having to take X number of samples in one snapshot while
>> capturing a finite instance in time (as you might know the Dirac delta is
>> centered on DC)
>>
>> https://en.wikipedia.org/wiki/Ringing_artifacts
>> 
>>
>> On Wed, Jun 24, 2020 at 10:12 AM Corey K  wrote:
>>
>>> I think you're mistaken, unfortunately. Block FFT convolution has been
>>> around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
>>> paper "A Unified Approach to Short-Time Fourier Analysis" how you can
> >>> perform FIR filtering perfectly with the FFT, if COLA windows are used. See
>>> equation 5.2.5 in that paper, and the analysis that precedes it.
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang 
>>> wrote:
>>>
 that's not true.  with FFT/COLA you will necessarily have the Gibbs
 phenomenon / ringing / ripple artifacts.  certain window types will
 minimize this but you will get this phenomenon nonetheless.

 On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:

> I see what you're getting at, I suppose. However, in the context of
> FIR filtering I wouldn't refer to this as an artifact. Let's say you gave
> me an FIR filter with N-taps and asked me to write a program to implement
> that filter. I could implement this using a direct form structure (in the
> time-domain), or with the FFT using OLA. Both would give the exact same
> results down to numerical precision, with no "artifacts". That's why it
> intrigued me when you said "of course it won't have the ripple artifacts
> associated with FFT overlap windowing" when referring to software that 
> does
> filtering.
>
>
> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
> wrote:
>
>> ripple is just a known artifactual component of a windowing
>> operation.  it's also known as the Gibbs phenomenon
>>
>> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
>> 
>>
>> i'm not referring to any equivalency between time/freq domain
>> filtering
>>
>>
>> On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:
>>
>>> Not totally understanding you, unfortunately. But if what you are
>>> describing is part of the normal filter response/ringing I guess I 
>>> wouldn't
>>> refer to it as "artifacts"? FIR filtering can be performed equivalently 
>>> in
>>> the time or frequency domain. Do you disagree with that statement?
>>>
>>> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
>>> wrote:
>>>
 yes but any windowing operation is akin to taking a dirac delta
 function on X number of samples and thus you will get ringing/ripple
 artifacts as a necessary part of the filter response

 On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:

>
> of course it won't have the ripple artifacts associated with FFT
>> overlap windowing
>>
>
> What is the ripple artifact you are talking about? When using
> constant overlap add (COLA) windows the STFT is a perfect 
> reconstruction
> filterbank. Likewise block FFT convolution can be used to implement 
> any FIR
> filtering operation.

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
It's a classic paper. Google scholar shows it has been cited over 1000
times. There's a link to it here:
https://jontalle.web.engr.illinois.edu/uploads/537/Papers/Public/AllenRabiner77-ProcIEEE.pdf


On Wed, Jun 24, 2020 at 11:56 AM Zhiguang Eric Zhang  wrote:

> unfortunately, i'm not familiar with that paper.  could you please attach
> it or provide a link for reference?  the Gibbs phenomenon is actually a
> very well-known and thoroughly characterized signal processing artifact
> that has been approached from a variety of angles as far as trying to find
> a solution.  it can be thought of as an unavoidable digital filter
> response of having to take X number of samples in one snapshot while
> capturing a finite instance in time (as you might know the Dirac delta is
> centered on DC)
>
> https://en.wikipedia.org/wiki/Ringing_artifacts
>
> On Wed, Jun 24, 2020 at 10:12 AM Corey K  wrote:
>
>> I think you're mistaken, unfortunately. Block FFT convolution has been
>> around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
>> paper "A Unified Approach to Short-Time Fourier Analysis" how you can
> >> perform FIR filtering perfectly with the FFT, if COLA windows are used. See
>> equation 5.2.5 in that paper, and the analysis that precedes it.
>>
>>
>>
>>
>>
>> On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> that's not true.  with FFT/COLA you will necessarily have the Gibbs
>>> phenomenon / ringing / ripple artifacts.  certain window types will
>>> minimize this but you will get this phenomenon nonetheless.
>>>
>>> On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:
>>>
 I see what you're getting at, I suppose. However, in the context of FIR
 filtering I wouldn't refer to this as an artifact. Let's say you gave me an
 FIR filter with N-taps and asked me to write a program to implement that
 filter. I could implement this using a direct form structure (in the
 time-domain), or with the FFT using OLA. Both would give the exact same
 results down to numerical precision, with no "artifacts". That's why it
 intrigued me when you said "of course it won't have the ripple artifacts
 associated with FFT overlap windowing" when referring to software that does
 filtering.


 On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
 wrote:

> ripple is just a known artifactual component of a windowing
> operation.  it's also known as the Gibbs phenomenon
>
> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
> 
>
> i'm not referring to any equivalency between time/freq domain filtering
>
>
> On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:
>
>> Not totally understanding you, unfortunately. But if what you are
>> describing is part of the normal filter response/ringing I guess I 
>> wouldn't
>> refer to it as "artifacts"? FIR filtering can be performed equivalently 
>> in
>> the time or frequency domain. Do you disagree with that statement?
>>
>> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> yes but any windowing operation is akin to taking a dirac delta
>>> function on X number of samples and thus you will get ringing/ripple
>>> artifacts as a necessary part of the filter response
>>>
>>> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>>>

 of course it won't have the ripple artifacts associated with FFT
> overlap windowing
>

 What is the ripple artifact you are talking about? When using
 constant overlap add (COLA) windows the STFT is a perfect 
 reconstruction
 filterbank. Likewise block FFT convolution can be used to implement 
 any FIR
 filtering operation.






> cheers,
> -ez
>
> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <
> g...@waxingwave.com> wrote:
>
>> Hello Spencer,
>>
>> You wrote:
>> > A while ago I read through some of the literature [1] on
>> implementing
>> > an invertible CQT as a special case of the Nonstationary Gabor
>> > Transform. It's implemented by the essentia library [2] among
>> other
>> > places probably.
>> >
>> > The main idea is that you take the FFT of your whole signal,
>> then
>> > apply the filter bank in the frequency domain (just
>> > multiplication). Then you IFFT each filtered signal, which
>> gives you
>> > the time-domain samples for each band of the filter bank. Each
>> > 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
unfortunately, i'm not familiar with that paper.  could you please attach
it or provide a link for reference?  the Gibbs phenomenon is actually a
very well-known and thoroughly characterized signal processing artifact
that has been approached from a variety of angles as far as trying to find
a solution.  it can be thought of as an unavoidable digital filter
response of having to take X number of samples in one snapshot while
capturing a finite instance in time (as you might know the Dirac delta is
centered on DC)

https://en.wikipedia.org/wiki/Ringing_artifacts

On Wed, Jun 24, 2020 at 10:12 AM Corey K  wrote:

> I think you're mistaken, unfortunately. Block FFT convolution has been
> around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
> paper "A Unified Approach to Short-Time Fourier Analysis" how you can
> > perform FIR filtering perfectly with the FFT, if COLA windows are used. See
> equation 5.2.5 in that paper, and the analysis that precedes it.
>
>
>
>
>
> On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang 
> wrote:
>
>> that's not true.  with FFT/COLA you will necessarily have the Gibbs
>> phenomenon / ringing / ripple artifacts.  certain window types will
>> minimize this but you will get this phenomenon nonetheless.
>>
>> On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:
>>
>>> I see what you're getting at, I suppose. However, in the context of FIR
>>> filtering I wouldn't refer to this as an artifact. Let's say you gave me an
>>> FIR filter with N-taps and asked me to write a program to implement that
>>> filter. I could implement this using a direct form structure (in the
>>> time-domain), or with the FFT using OLA. Both would give the exact same
>>> results down to numerical precision, with no "artifacts". That's why it
>>> intrigued me when you said "of course it won't have the ripple artifacts
>>> associated with FFT overlap windowing" when referring to software that does
>>> filtering.
>>>
>>>
>>> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
>>> wrote:
>>>
 ripple is just a known artifactual component of a windowing operation.
 it's also known as the Gibbs phenomenon

 http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
 

 i'm not referring to any equivalency between time/freq domain filtering


 On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:

> Not totally understanding you, unfortunately. But if what you are
> describing is part of the normal filter response/ringing I guess I 
> wouldn't
> refer to it as "artifacts"? FIR filtering can be performed equivalently in
> the time or frequency domain. Do you disagree with that statement?
>
> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
> wrote:
>
>> yes but any windowing operation is akin to taking a dirac delta
>> function on X number of samples and thus you will get ringing/ripple
>> artifacts as a necessary part of the filter response
>>
>> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>>
>>>
>>> of course it won't have the ripple artifacts associated with FFT
 overlap windowing

>>>
>>> What is the ripple artifact you are talking about? When using
>>> constant overlap add (COLA) windows the STFT is a perfect reconstruction
>>> filterbank. Likewise block FFT convolution can be used to implement any 
>>> FIR
>>> filtering operation.
>>>
>>>
>>>
>>>
>>>
>>>
 cheers,
 -ez

 On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <
 g...@waxingwave.com> wrote:

> Hello Spencer,
>
> You wrote:
> > A while ago I read through some of the literature [1] on
> implementing
> > an invertible CQT as a special case of the Nonstationary Gabor
> > Transform. It's implemented by the essentia library [2] among
> other
> > places probably.
> >
> > The main idea is that you take the FFT of your whole signal, then
> > apply the filter bank in the frequency domain (just
> > multiplication). Then you IFFT each filtered signal, which gives
> you
> > the time-domain samples for each band of the filter bank. Each
> > frequency-domain filter has a different bandwidth, so your IFFT
> is a
> > different length for each one, which gives you the different
> sample
> > rates for each one.
>
> That's the basic idea, but the Gaborator rounds up each of the
> per-band sample rates to the original sample rate divided by some
> power of two.  This means all the FFT sizes can be powers 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
I think you're mistaken, unfortunately. Block FFT convolution has been
around for 30+ years. In 1977 (43 years ago now), Jont Allen showed in his
paper "A Unified Approach to Short-Time Fourier Analysis" how you can
perform FIR filtering perfectly with the FFT, if COLA windows are used. See
equation 5.2.5 in that paper, and the analysis that precedes it.





On Wed, Jun 24, 2020 at 11:16 AM Zhiguang Eric Zhang  wrote:

> that's not true.  with FFT/COLA you will necessarily have the Gibbs
> phenomenon / ringing / ripple artifacts.  certain window types will
> minimize this but you will get this phenomenon nonetheless.
>
> On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:
>
>> I see what you're getting at, I suppose. However, in the context of FIR
>> filtering I wouldn't refer to this as an artifact. Let's say you gave me an
>> FIR filter with N-taps and asked me to write a program to implement that
>> filter. I could implement this using a direct form structure (in the
>> time-domain), or with the FFT using OLA. Both would give the exact same
>> results down to numerical precision, with no "artifacts". That's why it
>> intrigued me when you said "of course it won't have the ripple artifacts
>> associated with FFT overlap windowing" when referring to software that does
>> filtering.
>>
>>
>> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> ripple is just a known artifactual component of a windowing operation.
>>> it's also known as the Gibbs phenomenon
>>>
>>> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
>>> 
>>>
>>> i'm not referring to any equivalency between time/freq domain filtering
>>>
>>>
>>> On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:
>>>
 Not totally understanding you, unfortunately. But if what you are
 describing is part of the normal filter response/ringing I guess I wouldn't
 refer to it as "artifacts"? FIR filtering can be performed equivalently in
 the time or frequency domain. Do you disagree with that statement?

 On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
 wrote:

> yes but any windowing operation is akin to taking a dirac delta
> function on X number of samples and thus you will get ringing/ripple
> artifacts as a necessary part of the filter response
>
> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>
>>
>> of course it won't have the ripple artifacts associated with FFT
>>> overlap windowing
>>>
>>
>> What is the ripple artifact you are talking about? When using
>> constant overlap add (COLA) windows the STFT is a perfect reconstruction
>> filterbank. Likewise block FFT convolution can be used to implement any 
>> FIR
>> filtering operation.
>>
>>
>>
>>
>>
>>
>>> cheers,
>>> -ez
>>>
>>> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <
>>> g...@waxingwave.com> wrote:
>>>
 Hello Spencer,

 You wrote:
 > A while ago I read through some of the literature [1] on implementing
 > an invertible CQT as a special case of the Nonstationary Gabor
 > Transform. It's implemented by the essentia library [2] among
 other
 > places probably.
 >
 > The main idea is that you take the FFT of your whole signal, then
 > apply the filter bank in the frequency domain (just
 > multiplication). Then you IFFT each filtered signal, which gives
 you
 > the time-domain samples for each band of the filter bank. Each
 > frequency-domain filter has a different bandwidth, so your IFFT
 is a
 > different length for each one, which gives you the different
 sample
 > rates for each one.

 That's the basic idea, but the Gaborator rounds up each of the
 per-band sample rates to the original sample rate divided by some
 power of two.  This means all the FFT sizes can be powers of two,
 which tend to be faster than arbitrary sizes.  It also results in a
 nicely regular time-frequency sampling grid where many of the
 samples
 coincide in time, as shown in the second plot on this page:


 https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=

 Also, the Gaborator makes use of multirate processing where the
 signal
 is repeatedly decimated by 2 and the calculations for the lower
 octaves run at successively lower sample rates.  

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
that's not true.  with FFT/COLA you will necessarily have the Gibbs
phenomenon / ringing / ripple artifacts.  certain window types will
minimize this but you will get this phenomenon nonetheless.

On Wed, Jun 24, 2020 at 9:44 AM Corey K  wrote:

> I see what you're getting at, I suppose. However, in the context of FIR
> filtering I wouldn't refer to this as an artifact. Let's say you gave me an
> FIR filter with N-taps and asked me to write a program to implement that
> filter. I could implement this using a direct form structure (in the
> time-domain), or with the FFT using OLA. Both would give the exact same
> results down to numerical precision, with no "artifacts". That's why it
> intrigued me when you said "of course it won't have the ripple artifacts
> associated with FFT overlap windowing" when referring to software that does
> filtering.
>
>
> On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang 
> wrote:
>
>> ripple is just a known artifactual component of a windowing operation.
>> it's also known as the Gibbs phenomenon
>>
>> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
>> 
>>
>> i'm not referring to any equivalency between time/freq domain filtering
>>
>>
>> On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:
>>
>>> Not totally understanding you, unfortunately. But if what you are
>>> describing is part of the normal filter response/ringing I guess I wouldn't
>>> refer to it as "artifacts"? FIR filtering can be performed equivalently in
>>> the time or frequency domain. Do you disagree with that statement?
>>>
>>> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
>>> wrote:
>>>
 yes but any windowing operation is akin to taking a dirac delta
 function on X number of samples and thus you will get ringing/ripple
 artifacts as a necessary part of the filter response

 On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:

>
> of course it won't have the ripple artifacts associated with FFT
>> overlap windowing
>>
>
> What is the ripple artifact you are talking about? When using constant
> overlap add (COLA) windows the STFT is a perfect reconstruction 
> filterbank.
> Likewise block FFT convolution can be used to implement any FIR filtering
> operation.
>
>
>
>
>
>
>> cheers,
>> -ez
>>
>> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <
>> g...@waxingwave.com> wrote:
>>
>>> Hello Spencer,
>>>
>>> You wrote:
>>> > A while ago I read through some of the literature [1] on implementing
>>> > an invertible CQT as a special case of the Nonstationary Gabor
>>> > Transform. It's implemented by the essentia library [2] among other
>>> > places probably.
>>> >
>>> > The main idea is that you take the FFT of your whole signal, then
>>> > apply the filter bank in the frequency domain (just
>>> > multiplication). Then you IFFT each filtered signal, which gives
>>> you
>>> > the time-domain samples for each band of the filter bank. Each
>>> > frequency-domain filter has a different bandwidth, so your IFFT is
>>> a
>>> > different length for each one, which gives you the different sample
>>> > rates for each one.
>>>
>>> That's the basic idea, but the Gaborator rounds up each of the
>>> per-band sample rates to the original sample rate divided by some
>>> power of two.  This means all the FFT sizes can be powers of two,
>>> which tend to be faster than arbitrary sizes.  It also results in a
>>> nicely regular time-frequency sampling grid where many of the samples
>>> coincide in time, as shown in the second plot on this page:
>>>
>>>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=
>>>
>>> Also, the Gaborator makes use of multirate processing where the
>>> signal
>>> is repeatedly decimated by 2 and the calculations for the lower
>>> octaves run at successively lower sample rates.  These optimizations
>>> help the Gaborator achieve a performance of millions of samples per
>>> second per CPU core.
>>>
>>> > They also give an "online" version where you do
>>> > the processing in chunks, but really for this to work I think you'd
>>> > need large-ish chunks so the latency would be pretty bad.
>>>
>>> The Gaborator also works in chunks.  A typical chunk size might be
>>> 8192 samples, but thanks to the multirate processing, in the lowest
>>> frequency bands, each of those 8192 samples may 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
I see what you're getting at, I suppose. However, in the context of FIR
filtering I wouldn't refer to this as an artifact. Let's say you gave me an
FIR filter with N-taps and asked me to write a program to implement that
filter. I could implement this using a direct form structure (in the
time-domain), or with the FFT using OLA. Both would give the exact same
results down to numerical precision, with no "artifacts". That's why it
intrigued me when you said "of course it won't have the ripple artifacts
associated with FFT overlap windowing" when referring to software that does
filtering.
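
A small scipy sketch of that point (mine, not code from this thread; the
filter and signal are arbitrary): the same taps run as a direct-form FIR and
as block FFT overlap-add convolution agree to machine precision.

import numpy as np
from scipy.signal import firwin, lfilter, oaconvolve

rng = np.random.default_rng(0)
x = rng.standard_normal(48000)              # arbitrary test signal
h = firwin(255, 0.25)                       # arbitrary 255-tap lowpass FIR

y_direct = lfilter(h, 1.0, x)               # direct-form, time-domain FIR
y_ola = oaconvolve(x, h)[:len(x)]           # block FFT overlap-add convolution

print(np.max(np.abs(y_direct - y_ola)))     # on the order of 1e-13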


On Wed, Jun 24, 2020 at 10:59 AM Zhiguang Eric Zhang  wrote:

> ripple is just a known artifactual component of a windowing operation.
> it's also known as the Gibbs phenomenon
>
> http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html
>
> i'm not referring to any equivalency between time/freq domain filtering
>
>
> On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:
>
>> Not totally understanding you, unfortunately. But if what you are
>> describing is part of the normal filter response/ringing I guess I wouldn't
>> refer to it as "artifacts"? FIR filtering can be performed equivalently in
>> the time or frequency domain. Do you disagree with that statement?
>>
>> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
>> wrote:
>>
>>> yes but any windowing operation is akin to taking a dirac delta function
>>> on X number of samples and thus you will get ringing/ripple artifacts as a
>>> necessary part of the filter response
>>>
>>> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>>>

 of course it won't have the ripple artifacts associated with FFT
> overlap windowing
>

 What is the ripple artifact you are talking about? When using constant
 overlap add (COLA) windows the STFT is a perfect reconstruction filterbank.
 Likewise block FFT convolution can be used to implement any FIR filtering
 operation.






> cheers,
> -ez
>
> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson <
> g...@waxingwave.com> wrote:
>
>> Hello Spencer,
>>
>> You wrote:
>> > A while ago I read through some of the literature [1] on implementing
>> > an invertible CQT as a special case of the Nonstationary Gabor
>> > Transform. It's implemented by the essentia library [2] among other
>> > places probably.
>> >
>> > The main idea is that you take the FFT of your whole signal, then
>> > apply the filter bank in the frequency domain (just
>> > multiplication). Then you IFFT each filtered signal, which gives you
>> > the time-domain samples for each band of the filter bank. Each
>> > frequency-domain filter has a different bandwidth, so your IFFT is a
>> > different length for each one, which gives you the different sample
>> > rates for each one.
>>
>> That's the basic idea, but the Gaborator rounds up each of the
>> per-band sample rates to the original sample rate divided by some
>> power of two.  This means all the FFT sizes can be powers of two,
>> which tend to be faster than arbitrary sizes.  It also results in a
>> nicely regular time-frequency sampling grid where many of the samples
>> coincide in time, as shown in the second plot on this page:
>>
>>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=
>>
>> Also, the Gaborator makes use of multirate processing where the signal
>> is repeatedly decimated by 2 and the calculations for the lower
>> octaves run at successively lower sample rates.  These optimizations
>> help the Gaborator achieve a performance of millions of samples per
>> second per CPU core.
>>
>> > They also give an "online" version where you do
>> > the processing in chunks, but really for this to work I think you'd
>> > need large-ish chunks so the latency would be pretty bad.
>>
>> The Gaborator also works in chunks.  A typical chunk size might be
>> 8192 samples, but thanks to the multirate processing, in the lowest
>> frequency bands, each of those 8192 samples may represent the
>> low-frequency content of something like 1024 samples of the original
>> signal.  This gives an effective chunk size of some 8 million samples
>> without actually having to perform any FFTs that large.
>>
>> Latency is certainly high, but I would not say it is a consequence of
>> the chunk size as such.  Rather, both the high latency and the need
>> for a large (effective) chunk size are consequences of the lengths of
>> the band filter impulse responses, which get exponentially larger as
>> the constant-Q bands get narrower towards lower frequencies.
>>
>> Latency in the 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
ripple is just a known artifactual component of a windowing operation.
it's also known as the Gibbs phenomenon

http://matlab.izmiran.ru/help/toolbox/signal/filterd8.html

i'm not referring to any equivalency between time/freq domain filtering
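
for concreteness, a small numpy sketch (mine, not taken from the linked page;
tap count and cutoff are arbitrary) of that ripple: truncating the ideal sinc
impulse response with a rectangular window leaves roughly a 9% peak overshoot
in the passband (Gibbs), while a tapered window trades the ripple for a wider
transition band:

import numpy as np

N = 101                                   # number of taps (odd)
n = np.arange(N) - (N - 1) / 2
fc = 0.25                                 # cutoff as a fraction of the sample rate
ideal = 2 * fc * np.sinc(2 * fc * n)      # truncated ideal lowpass impulse response

for name, w in [("rectangular", np.ones(N)), ("hamming", np.hamming(N))]:
    h = ideal * w
    H = np.abs(np.fft.rfft(h, 8192))      # densely sampled frequency response
    print(f"{name:11s}  peak gain: {H.max():.4f}")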


On Wed, Jun 24, 2020 at 9:21 AM Corey K  wrote:

> Not totally understanding you, unfortunately. But if what you are
> describing is part of the normal filter response/ringing I guess I wouldn't
> refer to it as "artifacts"? FIR filtering can be performed equivalently in
> the time or frequency domain. Do you disagree with that statement?
>
> On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang 
> wrote:
>
>> yes but any windowing operation is akin to taking a dirac delta function
>> on X number of samples and thus you will get ringing/ripple artifacts as a
>> necessary part of the filter response
>>
>> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>>
>>>
>>> of course it won't have the ripple artifacts associated with FFT overlap
 windowing

>>>
>>> What is the ripple artifact you are talking about? When using constant
>>> overlap add (COLA) windows the STFT is a perfect reconstruction filterbank.
>>> Likewise block FFT convolution can be used to implement any FIR filtering
>>> operation.
>>>
>>>
>>>
>>>
>>>
>>>
 cheers,
 -ez

 On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson 
 wrote:

> Hello Spencer,
>
> You wrote:
> > A while ago I read through some of the literature [1] on implementing
> > an invertible CQT as a special case of the Nonstationary Gabor
> > Transform. It's implemented by the essentia library [2] among other
> > places probably.
> >
> > The main idea is that you take the FFT of your whole signal, then
> > apply the filter bank in the frequency domain (just
> > multiplication). Then you IFFT each filtered signal, which gives you
> > the time-domain samples for each band of the filter bank. Each
> > frequency-domain filter has a different bandwidth, so your IFFT is a
> > different length for each one, which gives you the different sample
> > rates for each one.
>
> That's the basic idea, but the Gaborator rounds up each of the
> per-band sample rates to the original sample rate divided by some
> power of two.  This means all the FFT sizes can be powers of two,
> which tend to be faster than arbitrary sizes.  It also results in a
> nicely regular time-frequency sampling grid where many of the samples
> coincide in time, as shown in the second plot on this page:
>
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=
>
> Also, the Gaborator makes use of multirate processing where the signal
> is repeatedly decimated by 2 and the calculations for the lower
> octaves run at successively lower sample rates.  These optimizations
> help the Gaborator achieve a performance of millions of samples per
> second per CPU core.
>
> > They also give an "online" version where you do
> > the processing in chunks, but really for this to work I think you'd
> > need large-ish chunks so the latency would be pretty bad.
>
> The Gaborator also works in chunks.  A typical chunk size might be
> 8192 samples, but thanks to the multirate processing, in the lowest
> frequency bands, each of those 8192 samples may represent the
> low-frequency content of something like 1024 samples of the original
> signal.  This gives an effective chunk size of some 8 million samples
> without actually having to perform any FFTs that large.
>
> Latency is certainly high, but I would not say it is a consequence of
> the chunk size as such.  Rather, both the high latency and the need
> for a large (effective) chunk size are consequences of the lengths of
> the band filter impulse responses, which get exponentially larger as
> the constant-Q bands get narrower towards lower frequencies.
>
> Latency in the Gaborator is discussed in more detail here:
>
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_realtime.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=uuRzi0taGcXI9Sq63G_xTTrCjaz9cu3ewu8jfzIUcVc=
>
> > The whole process is in some ways dual to the usual STFT process,
> > where we first window and then FFT. in the NSGT you first FFT and
> > then window, and then IFFT each band to get a Time-Frequency
> > representation.
>
> Yes.
>
> > For resynthesis you end up with a similar window overlap constraint
> > as in STFT, except now the windows are in the frequency domain. It's
> > a little more complicated because the 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
Not totally understanding you, unfortunately. But if what you are
describing is part of the normal filter response/ringing I guess I wouldn't
refer to it as "artifacts"? FIR filtering can be performed equivalently in
the time or frequency domain. Do you disagree with that statement?

On Wed, Jun 24, 2020 at 10:02 AM Zhiguang Eric Zhang  wrote:

> yes but any windowing operation is akin to taking a dirac delta function
> on X number of samples and thus you will get ringing/ripple artifacts as a
> necessary part of the filter response
>
> On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:
>
>>
>> of course it won't have the ripple artifacts associated with FFT overlap
>>> windowing
>>>
>>
>> What is the ripple artifact you are talking about? When using constant
>> overlap add (COLA) windows the STFT is a perfect reconstruction filterbank.
>> Likewise block FFT convolution can be used to implement any FIR filtering
>> operation.
>>
>>
>>
>>
>>
>>
>>> cheers,
>>> -ez
>>>
>>> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson 
>>> wrote:
>>>
 Hello Spencer,

 You wrote:
 > A while ago I read through some of the literature [1] on implementing
 > an invertible CQT as a special case of the Nonstationary Gabor
 > Transform. It's implemented by the essentia library [2] among other
 > places probably.
 >
 > The main idea is that you take the FFT of your whole signal, then
 > apply the filter bank in the frequency domain (just
 > multiplication). Then you IFFT each filtered signal, which gives you
 > the time-domain samples for each band of the filter bank. Each
 > frequency-domain filter has a different bandwidth, so your IFFT is a
 > different length for each one, which gives you the different sample
 > rates for each one.

 That's the basic idea, but the Gaborator rounds up each of the
 per-band sample rates to the original sample rate divided by some
 power of two.  This means all the FFT sizes can be powers of two,
 which tend to be faster than arbitrary sizes.  It also results in a
 nicely regular time-frequency sampling grid where many of the samples
 coincide in time, as shown in the second plot on this page:


 https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=

 Also, the Gaborator makes use of multirate processing where the signal
 is repeatedly decimated by 2 and the calculations for the lower
 octaves run at successively lower sample rates.  These optimizations
 help the Gaborator achieve a performance of millions of samples per
 second per CPU core.

 > They also give an "online" version where you do
 > the processing in chunks, but really for this to work I think you'd
 > need large-ish chunks so the latency would be pretty bad.

 The Gaborator also works in chunks.  A typical chunk size might be
 8192 samples, but thanks to the multirate processing, in the lowest
 frequency bands, each of those 8192 samples may represent the
 low-frequency content of something like 1024 samples of the original
 signal.  This gives an effective chunk size of some 8 million samples
 without actually having to perform any FFTs that large.

 Latency is certainly high, but I would not say it is a consequence of
 the chunk size as such.  Rather, both the high latency and the need
 for a large (effective) chunk size are consequences of the lengths of
 the band filter impulse responses, which get exponentially larger as
 the constant-Q bands get narrower towards lower frequencies.

 Latency in the Gaborator is discussed in more detail here:


 https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_realtime.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=uuRzi0taGcXI9Sq63G_xTTrCjaz9cu3ewu8jfzIUcVc=

 > The whole process is in some ways dual to the usual STFT process,
 > where we first window and then FFT. in the NSGT you first FFT and
 > then window, and then IFFT each band to get a Time-Frequency
 > representation.

 Yes.

 > For resynthesis you end up with a similar window overlap constraint
 > as in STFT, except now the windows are in the frequency domain. It's
 > a little more complicated because the window centers aren't
 > evenly-spaced, so creating COLA windows is complicated. There are
 > some fancier approaches to designing a set of synthesis windows that
 > are complementary (inverse) of the analysis windows, which is what
 > the frame-theory folks like that Austrian group seem to like to use.

 The Gaborator was inspired by the papers from that Austrian group and

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Zhiguang Eric Zhang
yes but any windowing operation is akin to taking a dirac delta function on
X number of samples and thus you will get ringing/ripple artifacts as a
necessary part of the filter response
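
here is a small numpy sketch of that (mine; tone frequency and sizes are
arbitrary): window a single non-bin-centred sine and look at how much energy
leaks far away from the tone for a rectangular versus a tapered window:

import numpy as np

N = 256
n = np.arange(N)
tone = np.sin(2 * np.pi * 10.37 * n / N)    # deliberately not bin-centred

for name, w in [("rectangular", np.ones(N)), ("hann", np.hanning(N))]:
    spec = np.abs(np.fft.rfft(tone * w))
    spec /= spec.max()
    leakage_db = 20 * np.log10(spec[40:].max())   # level well away from the tone
    print(f"{name:11s}  leakage floor beyond bin 40: {leakage_db:6.1f} dB")

the rectangular window's slowly decaying sinc sidelobes are the ripple in
question; the tapered window pushes them far down at the cost of a wider main
lobe.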

On Wed, Jun 24, 2020 at 6:30 AM Corey K  wrote:

>
> of course it won't have the ripple artifacts associated with FFT overlap
>> windowing
>>
>
> What is the ripple artifact you are talking about? When using constant
> overlap add (COLA) windows the STFT is a perfect reconstruction filterbank.
> Likewise block FFT convolution can be used to implement any FIR filtering
> operation.
>
>
>
>
>
>
>> cheers,
>> -ez
>>
>> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson 
>> wrote:
>>
>>> Hello Spencer,
>>>
>>> You wrote:
>>> > A while ago I read through some of the literature [1] on implementing
>>> > an invertible CQT as a special case of the Nonstationary Gabor
>>> > Transform. It's implemented by the essentia library [2] among other
>>> > places probably.
>>> >
>>> > The main idea is that you take the FFT of your whole signal, then
>>> > apply the filter bank in the frequency domain (just
>>> > multiplication). Then you IFFT each filtered signal, which gives you
>>> > the time-domain samples for each band of the filter bank. Each
>>> > frequency-domain filter has a different bandwidth, so your IFFT is a
>>> > different length for each one, which gives you the different sample
>>> > rates for each one.
>>>
>>> That's the basic idea, but the Gaborator rounds up each of the
>>> per-band sample rates to the original sample rate divided by some
>>> power of two.  This means all the FFT sizes can be powers of two,
>>> which tend to be faster than arbitrary sizes.  It also results in a
>>> nicely regular time-frequency sampling grid where many of the samples
>>> coincide in time, as shown in the second plot on this page:
>>>
>>>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=
>>>
>>> Also, the Gaborator makes use of multirate processing where the signal
>>> is repeatedly decimated by 2 and the calculations for the lower
>>> octaves run at successively lower sample rates.  These optimizations
>>> help the Gaborator achieve a performance of millions of samples per
>>> second per CPU core.
>>>
>>> > They also give an "online" version where you do
>>> > the processing in chunks, but really for this to work I think you'd
>>> > need large-ish chunks so the latency would be pretty bad.
>>>
>>> The Gaborator also works in chunks.  A typical chunk size might be
>>> 8192 samples, but thanks to the multirate processing, in the lowest
>>> frequency bands, each of those 8192 samples may represent the
>>> low-frequency content of something like 1024 samples of the original
>>> signal.  This gives an effective chunk size of some 8 million samples
>>> without actually having to perform any FFTs that large.
>>>
>>> Latency is certainly high, but I would not say it is a consequence of
>>> the chunk size as such.  Rather, both the high latency and the need
>>> for a large (effective) chunk size are consequences of the lengths of
>>> the band filter impulse responses, which get exponentially larger as
>>> the constant-Q bands get narrower towards lower frequencies.
>>>
>>> Latency in the Gaborator is discussed in more detail here:
>>>
>>>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_realtime.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=uuRzi0taGcXI9Sq63G_xTTrCjaz9cu3ewu8jfzIUcVc=
>>>
>>> > The whole process is in some ways dual to the usual STFT process,
>>> > where we first window and then FFT. in the NSGT you first FFT and
>>> > then window, and then IFFT each band to get a Time-Frequency
>>> > representation.
>>>
>>> Yes.
>>>
>>> > For resynthesis you end up with a similar window overlap constraint
>>> > as in STFT, except now the windows are in the frequency domain. It's
>>> > a little more complicated because the window centers aren't
>>> > evenly-spaced, so creating COLA windows is complicated. There are
>>> > some fancier approaches to designing a set of synthesis windows that
>>> > are complementary (inverse) of the analysis windows, which is what
>>> > the frame-theory folks like that Austrian group seem to like to use.
>>>
>>> The Gaborator was inspired by the papers from that Austrian group and
>>> uses complementary resynthesis windows, or "duals" as frame theorists
>>> like to call them.  The analysis windows are Gaussian, and the dual
>>> windows used for resynthesis end up being slightly distorted
>>> Gaussians.
>>>
>>> > One of the nice things about the NSGT is it lets you be really
>>> > flexible in your filterbank design while still giving you
>>> > invertibility.
>>>
>>> Agreed.
>>>
>>> In a later message, you wrote:
>>> > 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-24 Thread Corey K
> of course it won't have the ripple artifacts associated with FFT overlap
> windowing
>

What is the ripple artifact you are talking about? When using constant
overlap add (COLA) windows the STFT is a perfect reconstruction filterbank.
Likewise block FFT convolution can be used to implement any FIR filtering
operation.
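
A minimal scipy sketch of the perfect-reconstruction point (mine, not code
from this thread), using a Hann window at 75% overlap, which satisfies COLA:

import numpy as np
from scipy.signal import stft, istft

rng = np.random.default_rng(1)
x = rng.standard_normal(16384)              # arbitrary test signal

f, t, X = stft(x, window="hann", nperseg=1024, noverlap=768)
_, x_rec = istft(X, window="hann", nperseg=1024, noverlap=768)

m = min(len(x), len(x_rec))
print(np.max(np.abs(x[:m] - x_rec[:m])))    # on the order of 1e-15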






> cheers,
> -ez
>
> On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson 
> wrote:
>
>> Hello Spencer,
>>
>> You wrote:
>> > A while ago I read through some of the literature [1] on implementing
>> > an invertible CQT as a special case of the Nonstationary Gabor
>> > Transform. It's implemented by the essentia library [2] among other
>> > places probably.
>> >
>> > The main idea is that you take the FFT of your whole signal, then
>> > apply the filter bank in the frequency domain (just
>> > multiplication). Then you IFFT each filtered signal, which gives you
>> > the time-domain samples for each band of the filter bank. Each
>> > frequency-domain filter has a different bandwidth, so your IFFT is a
>> > different length for each one, which gives you the different sample
>> > rates for each one.
>>
>> That's the basic idea, but the Gaborator rounds up each of the
>> per-band sample rates to the original sample rate divided by some
>> power of two.  This means all the FFT sizes can be powers of two,
>> which tend to be faster than arbitrary sizes.  It also results in a
>> nicely regular time-frequency sampling grid where many of the samples
>> coincide in time, as shown in the second plot on this page:
>>
>>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=
>>
>> Also, the Gaborator makes use of multirate processing where the signal
>> is repeatedly decimated by 2 and the calculations for the lower
>> octaves run at successively lower sample rates.  These optimizations
>> help the Gaborator achieve a performance of millions of samples per
>> second per CPU core.
>>
>> > They also give an "online" version where you do
>> > the processing in chunks, but really for this to work I think you'd
>> > need large-ish chunks so the latency would be pretty bad.
>>
>> The Gaborator also works in chunks.  A typical chunk size might be
>> 8192 samples, but thanks to the multirate processing, in the lowest
>> frequency bands, each of those 8192 samples may represent the
>> low-frequency content of something like 1024 samples of the original
>> signal.  This gives an effective chunk size of some 8 million samples
>> without actually having to perform any FFTs that large.
>>
>> Latency is certainly high, but I would not say it is a consequence of
>> the chunk size as such.  Rather, both the high latency and the need
>> for a large (effective) chunk size are consequences of the lengths of
>> the band filter impulse responses, which get exponentially larger as
>> the constant-Q bands get narrower towards lower frequencies.
>>
>> Latency in the Gaborator is discussed in more detail here:
>>
>>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_realtime.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=uuRzi0taGcXI9Sq63G_xTTrCjaz9cu3ewu8jfzIUcVc=
>>
>> > The whole process is in some ways dual to the usual STFT process,
>> > where we first window and then FFT. in the NSGT you first FFT and
>> > then window, and then IFFT each band to get a Time-Frequency
>> > representation.
>>
>> Yes.
>>
>> > For resynthesis you end up with a similar window overlap constraint
>> > as in STFT, except now the windows are in the frequency domain. It's
>> > a little more complicated because the window centers aren't
>> > evenly-spaced, so creating COLA windows is complicated. There are
>> > some fancier approaches to designing a set of synthesis windows that
>> > are complementary (inverse) of the analysis windows, which is what
>> > the frame-theory folks like that Austrian group seem to like to use.
>>
>> The Gaborator was inspired by the papers from that Austrian group and
>> uses complementary resynthesis windows, or "duals" as frame theorists
>> like to call them.  The analysis windows are Gaussian, and the dual
>> windows used for resynthesis end up being slightly distorted
>> Gaussians.
>>
>> > One of the nice things about the NSGT is it lets you be really
>> > flexible in your filterbank design while still giving you
>> > invertibility.
>>
>> Agreed.
>>
>> In a later message, you wrote:
>> > Whoops, just clicked through to the documentation and it looks like
>> > this is the track you're on also. I'm curious if you have any
>> > insight into the window-selection for the analysis and synthesis
>> > process. It seems like the NSGT framework forces you to be a bit
>> > smarter with windows than just sticking to COLA, but the dual frame
>> 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-06-23 Thread Zhiguang Eric Zhang
hi again,


just wanted to chime in that this piece of software was released some time
ago and is the traditional FIR/IIR equivalent of what's being discussed
here, and is quite a breeze to use in the studio

https://www.wavesfactory.com/trackspacer/

of course it won't have the ripple artifacts associated with FFT overlap
windowing but i'm not sure how much delay there is or even what the phase
distortion sounds like


cheers,
-ez

On Mon, Apr 13, 2020 at 4:55 PM Andreas Gustafsson 
wrote:

> Hello Spencer,
>
> You wrote:
> > A while ago I read through some of the literature [1] on implementing
> > an invertible CQT as a special case of the Nonstationary Gabor
> > Transform. It's implemented by the essentia library [2] among other
> > places probably.
> >
> > The main idea is that you take the FFT of your whole signal, then
> > apply the filter bank in the frequency domain (just
> > multiplication). Then you IFFT each filtered signal, which gives you
> > the time-domain samples for each band of the filter bank. Each
> > frequency-domain filter has a different bandwidth, so your IFFT is a
> > different length for each one, which gives you the different sample
> > rates for each one.
>
> That's the basic idea, but the Gaborator rounds up each of the
> per-band sample rates to the original sample rate divided by some
> power of two.  This means all the FFT sizes can be powers of two,
> which tend to be faster than arbitrary sizes.  It also results in a
> nicely regular time-frequency sampling grid where many of the samples
> coincide in time, as shown in the second plot on this page:
>
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_overview.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=FG-ZGfFa09T-Y7nLajB8evbCy9WIADFrUqPwjz-LHow=
>
> Also, the Gaborator makes use of multirate processing where the signal
> is repeatedly decimated by 2 and the calculations for the lower
> octaves run at successively lower sample rates.  These optimizations
> help the Gaborator achieve a performance of millions of samples per
> second per CPU core.
>
> > They also give an "online" version where you do
> > the processing in chunks, but really for this to work I think you'd
> > need large-ish chunks so the latency would be pretty bad.
>
> The Gaborator also works in chunks.  A typical chunk size might be
> 8192 samples, but thanks to the multirate processing, in the lowest
> frequency bands, each of those 8192 samples may represent the
> low-frequency content of something like 1024 samples of the original
> signal.  This gives an effective chunk size of some 8 million samples
> without actually having to perform any FFTs that large.
>
> Latency is certainly high, but I would not say it is a consequence of
> the chunk size as such.  Rather, both the high latency and the need
> for a large (effective) chunk size are consequences of the lengths of
> the band filter impulse responses, which get exponentially larger as
> the constant-Q bands get narrower towards lower frequencies.
>
> Latency in the Gaborator is discussed in more detail here:
>
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.gaborator.com_gaborator-2D1.4_doc_realtime.html=DwICAg=slrrB7dE8n7gBJbeO0g-IQ=w_CiiFx8eb9uUtrPcg7_DA=4rIFY1X4fS1G8-882xM72jF9DvsY6-Z2ckeHxjPPfTY=uuRzi0taGcXI9Sq63G_xTTrCjaz9cu3ewu8jfzIUcVc=
>
> > The whole process is in some ways dual to the usual STFT process,
> > where we first window and then FFT. in the NSGT you first FFT and
> > then window, and then IFFT each band to get a Time-Frequency
> > representation.
>
> Yes.
>
> > For resynthesis you end up with a similar window overlap constraint
> > as in STFT, except now the windows are in the frequency domain. It's
> > a little more complicated because the window centers aren't
> > evenly-spaced, so creating COLA windows is complicated. There are
> > some fancier approaches to designing a set of synthesis windows that
> > are complementary (inverse) of the analysis windows, which is what
> > the frame-theory folks like that Austrian group seem to like to use.
>
> The Gaborator was inspired by the papers from that Austrian group and
> uses complementary resynthesis windows, or "duals" as frame theorists
> like to call them.  The analysis windows are Gaussian, and the dual
> windows used for resynthesis end up being slightly distorted
> Gaussians.
>
> > One of the nice things about the NSGT is it lets you be really
> > flexible in your filterbank design while still giving you
> > invertibility.
>
> Agreed.
>
> In a later message, you wrote:
> > Whoops, just clicked through to the documentation and it looks like
> > this is the track you're on also. I'm curious if you have any
> > insight into the window-selection for the analysis and synthesis
> > process. It seems like the NSGT framework forces you to be a bit
> > smarter with windows than just sticking to COLA, but the dual frame
> > 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-04-13 Thread Andreas Gustafsson
Hello Spencer,

You wrote:
> A while ago I read through some of the literature [1] on implementing
> an invertible CQT as a special case of the Nonstationary Gabor
> Transform. It's implemented by the essentia library [2] among other
> places probably.
> 
> The main idea is that you take the FFT of your whole signal, then
> apply the filter bank in the frequency domain (just
> multiplication). Then you IFFT each filtered signal, which gives you
> the time-domain samples for each band of the filter bank. Each
> frequency-domain filter has a different bandwidth, so your IFFT is a
> different length for each one, which gives you the different sample
> rates for each one.

That's the basic idea, but the Gaborator rounds up each of the
per-band sample rates to the original sample rate divided by some
power of two.  This means all the FFT sizes can be powers of two,
which tend to be faster than arbitrary sizes.  It also results in a
nicely regular time-frequency sampling grid where many of the samples
coincide in time, as shown in the second plot on this page:

  https://www.gaborator.com/gaborator-1.4/doc/overview.html

Also, the Gaborator makes use of multirate processing where the signal
is repeatedly decimated by 2 and the calculations for the lower
octaves run at successively lower sample rates.  These optimizations
help the Gaborator achieve a performance of millions of samples per
second per CPU core.
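
As a toy illustration of that rounding, here is a short Python/NumPy sketch;
the Q value and the oversampling margin are assumptions made for the example,
not the Gaborator's actual parameters:

    import numpy as np

    fs = 48000.0            # original sample rate
    bands_per_octave = 48   # as in the demos
    Q = 1.0 / (2.0 ** (1.0 / bands_per_octave) - 1.0)   # rough constant-Q value
    oversample = 8.0        # assumed margin above each band's bandwidth

    for fc in (20.0, 55.0, 220.0, 880.0, 3520.0, 14080.0):
        bw = fc / Q                                       # band bandwidth (Hz)
        needed = oversample * bw                          # rate this band "needs"
        k = max(0, int(np.floor(np.log2(fs / needed))))   # largest k with fs/2**k >= needed
        band_fs = fs / 2.0 ** k                           # rounded up to fs / 2**k
        print(f"fc = {fc:8.1f} Hz   decimate by 2**{k:<2d}   band rate = {band_fs:8.1f} Hz")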

> They also give an "online" version where you do
> the processing in chunks, but really for this to work I think you'd
> need large-ish chunks so the latency would be pretty bad.

The Gaborator also works in chunks.  A typical chunk size might be
8192 samples, but thanks to the multirate processing, in the lowest
frequency bands, each of those 8192 samples may represent the
low-frequency content of something like 1024 samples of the original
signal.  This gives an effective chunk size of some 8 million samples
without actually having to perform any FFTs that large.

Latency is certainly high, but I would not say it is a consequence of
the chunk size as such.  Rather, both the high latency and the need
for a large (effective) chunk size are consequences of the lengths of
the band filter impulse responses, which get exponentially larger as
the constant-Q bands get narrower towards lower frequencies.

Latency in the Gaborator is discussed in more detail here:

  https://www.gaborator.com/gaborator-1.4/doc/realtime.html
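
A back-of-the-envelope sketch of why the low bands dominate: for constant Q, a
band centred at fc has bandwidth fc/Q, so its (Gaussian) impulse response lasts
on the order of Q/fc seconds.  The Q value and the "6 sigma" cutoff below are
assumptions for the example, not Gaborator constants:

    import numpy as np

    fs = 48000.0
    Q = 70.0   # roughly what 48 bands per octave implies
    for fc in (10000.0, 1000.0, 100.0, 20.0):
        bw = fc / Q                            # band bandwidth (Hz)
        sigma_t = 1.0 / (2.0 * np.pi * bw)     # time-domain std dev of a Gaussian band filter
        length = 6.0 * sigma_t                 # "effective" impulse response length (s)
        print(f"fc = {fc:8.1f} Hz   ~IR length = {1000.0 * length:9.1f} ms   ({int(length * fs)} samples)")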

> The whole process is in some ways dual to the usual STFT process,
> where we first window and then FFT. in the NSGT you first FFT and
> then window, and then IFFT each band to get a Time-Frequency
> representation.

Yes.

> For resynthesis you end up with a similar window overlap constraint
> as in STFT, except now the windows are in the frequency domain. It's
> a little more complicated because the window centers aren't
> evenly-spaced, so creating COLA windows is complicated. There are
> some fancier approaches to designing a set of synthesis windows that
> are complementary (inverse) of the analysis windows, which is what
> the frame-theory folks like that Austrian group seem to like to use.

The Gaborator was inspired by the papers from that Austrian group and
uses complementary resynthesis windows, or "duals" as frame theorists
like to call them.  The analysis windows are Gaussian, and the dual
windows used for resynthesis end up being slightly distorted
Gaussians.

> One of the nice things about the NSGT is it lets you be really
> flexible in your filterbank design while still giving you
> invertibility.

Agreed.

In a later message, you wrote:
> Whoops, just clicked through to the documentation and it looks like
> this is the track you're on also. I'm curious if you have any
> insight into the window-selection for the analysis and synthesis
> process. It seems like the NSGT framework forces you to be a bit
> smarter with windows than just sticking to COLA, but the dual frame
> techniques should apply for regular STFT processing, right?

I'm actually not that familiar with traditional STFTs and COLA, but as
far as I can tell, the STFT is a special case of the NSGT and the same
dual frame techniques should apply.
-- 
Andreas Gustafsson, g...@waxingwave.com
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-04-13 Thread Spencer Russell



On Mon, Apr 13, 2020, at 1:36 PM, Spencer Russell wrote:
> 
> Andreas - is this the general approach you use for Gaborator?
> 

Whoops, just clicked through to the documentation and it looks like this is the 
track you're on also. I'm curious if you have any insight into the 
window-selection for the analysis and synthesis process. It seems like the NSGT 
framework forces you to be a bit smarter with windows than just sticking to 
COLA, but the dual frame techniques should apply for regular STFT processing, 
right?
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-04-13 Thread Spencer Russell
On Fri, Mar 20, 2020, at 4:58 PM, Andreas Gustafsson wrote:
> robert bristow-johnson wrote:
> > but i would be excited to see a good
> > implementation of constant Q filterbank that is very close to
> > perfect reconstruction if the modification in the frequency domain
> > is null. 
> 
> Isn't this pretty much what my Gaborator library (gaborator.com) does?
> It performs constant Q analysis using Gaussian windows, and resynthesis
> that reconstructs the original signal to within about -115 dB using
> single precision floats.

A while ago I read through some of the literature [1] on implementing an 
invertible CQT as a special case of the Nonstationary Gabor Transform. It's 
implemented by the essentia library [2] among other places probably.

The main idea is that you take the FFT of your whole signal, then apply the 
filter bank in the frequency domain (just multiplication). Then you IFFT each 
filtered signal, which gives you the time-domain samples for each band of the 
filter bank. Each frequency-domain filter has a different bandwidth, so your 
IFFT is a different length for each one, which gives you the different sample 
rates for each one. They also give an "online" version where you do the 
processing in chunks, but really for this to work I think you'd need large-ish 
chunks so the latency would be pretty bad.
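
A minimal Python/NumPy sketch of that analysis step; the band placement, window
shapes and normalisation here are assumptions made for the example, not
essentia's or the Gaborator's actual choices:

    import numpy as np

    def nsgt_analysis(x, band_slices, band_windows):
        """x: input signal; band_slices: one slice of FFT bins per band;
        band_windows: one window array per band, matching its slice length."""
        X = np.fft.fft(x)
        coeffs = []
        for sl, g in zip(band_slices, band_windows):
            Y = X[sl] * g                    # apply the band filter in the frequency domain
            coeffs.append(np.fft.ifft(Y))    # shorter IFFT: band samples at a lower rate
        return coeffs

    # toy example: three bands of different widths on a 1024-sample signal
    x = np.random.randn(1024)
    slices = [slice(2, 10), slice(10, 42), slice(42, 170)]
    windows = [np.hanning(sl.stop - sl.start) for sl in slices]
    bands = nsgt_analysis(x, slices, windows)
    print([len(b) for b in bands])           # each band keeps as many samples as FFT bins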

The whole process is in some ways dual to the usual STFT process, where we 
first window and then FFT. In the NSGT you first FFT and then window, and then 
IFFT each band to get a Time-Frequency representation.

For resynthesis you end up with a similar window overlap constraint as in STFT, 
except now the windows are in the frequency domain. It's a little more 
complicated because the window centers aren't evenly-spaced, so creating COLA 
windows is complicated. There are some fancier approaches to designing a set of 
synthesis windows that are complementary (inverse) of the analysis windows, 
which is what the frame-theory folks like that Austrian group seem to like to 
use.
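
In the simplest case, where each band keeps all of the FFT bins rather than a
decimated subset, the dual windows have a closed form: with analysis
coefficients c_n = ifft(X * g_n), resynthesis with h_n = conj(g_n) / sum_k |g_k|^2
returns X exactly, because sum_n g_n * h_n = 1 wherever the windows cover the
spectrum.  Subsampling each band (as real NSGT/CQT implementations do) changes
the frame operator, so the sketch below only illustrates why the duals come out
as distorted versions of the analysis windows:

    import numpy as np

    L = 256
    rng = np.random.default_rng(0)
    x = rng.standard_normal(L)
    X = np.fft.fft(x)

    # a bank of overlapping Gaussian windows on the FFT bins (placement assumed)
    bins = np.arange(L)
    centers = np.arange(0, L, 8)
    g = np.array([np.exp(-0.5 * ((bins - c) / 6.0) ** 2) for c in centers])

    S = np.sum(np.abs(g) ** 2, axis=0)   # diagonal frame operator (must stay > 0)
    h = np.conj(g) / S                   # dual windows: slightly distorted Gaussians

    coeffs = [np.fft.ifft(X * gn) for gn in g]                       # analysis
    X_rec = sum(np.fft.fft(cn) * hn for cn, hn in zip(coeffs, h))    # resynthesis
    print(np.max(np.abs(np.fft.ifft(X_rec) - x)))                    # ~1e-15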

One of the nice things about the NSGT is it lets you be really flexible in your 
filterbank design while still giving you invertibility.

Andreas - is this the general approach you use for Gaborator?


[1]: Balazs, P., Dörfler, M., Jaillet, F., Holighaus, N., & Velasco, G. (2011). 
Theory, implementation and applications of nonstationary Gabor frames. Journal 
of Computational and Applied Mathematics, 236(6), 1481–1496.
[2]: https://mtg.github.io/essentia-labs/news/2019/02/07/invertible-constant-q/
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-29 Thread zhiguang zhang
hi music-dsp,

just a disclosure that I worked on this whilst studying for my master's
degree at NYU, and was also a summer intern at Eventide.  incidentally, one
of the founders at Eventide, John Agnello, has a patent that is similar to
what is being discussed here.

https://patents.justia.com/patent/5228093


best wishes,
z eric zhang

PS - does anyone know if Dan Gillespie from Columbia is on this list?

On Sat, Mar 21, 2020 at 3:55 AM Andreas Gustafsson 
wrote:

> robert bristow-johnson wrote:
> > i've also fiddled with Gaussian windows and STFT around the turn of
> > the century.  i like that the Fourier Transform of a Gaussian is
> > another Gaussian, so each frequency component will generate a
> > Gaussian pulse in the frequency domain.
>
> Yes.  The Gaussian also has many other nice properties such as being
> free of side lobes, having faster than exponential fall-off, and being
> both separable and circularly symmetric in the 2-D case.
>
> > how many of these do you have per octave, Andreas?  looks like it
> > could be 24 or 48.
>
> The demos use 48 frequency bands per octave, but the underlying
> library can handle any integer number of bands per octave from 6
> up to several hundred.
>
> > does this make the pixel density along the t-axis be the same for
> > higher octaves as it is for lower?  because for constant-Q, you can
> > have more pixels per second for the high pitched bins.  but drawing
> > that would be a little weird.
>
> There's a distinction between spectrogram coefficients (filter bank
> output samples) and display pixels.  The density of the coefficients
> along the time axis is indeed higher in the higher octaves.
>
> Converting the coefficients to display pixels involves taking their
> magnitudes and then resampling the magnitudes to the density of pixels
> per time unit implied by the display zoom factor.  This means
> different octaves get resampled by different factors; typically an
> octave somewhere in the middle ends up with a one-to-one
> correspondence between coefficients and pixels, while higher octaves
> are decimated and lower octaves are interpolated.
>
> > which is sorta the wavelet thing.
>
> Yes, that is one way of looking at it.
>
> > Andreas, for each pixel, what parameters do you have?  like an
> > amplitude and phase, or do you have more data such as frequency
> > sweep rate or amplitude ramp rate?  Using a Gaussian window, you can
> > extract all that data out of the windowed samples that are used for
> > the pixel.
>
> Again distinguishing between coefficients and pixels, the coefficients
> of each frequency band are just complex quadrature samples of the
> band's signal downmixed to baseband (using radio terminology), so to
> determine the sweep rate or amplitude ramp rate you have to look at
> more than one coefficient.
>
> In the displayed pixels, the only parameter is the magnitude; the
> phase has been discarded (and therefore the original signal can only
> be reconstructed from the coefficients, not from the pixels).  The
> coloring in the demo is made by constructing a separate spectrogram
> for each stereo channel or track, tinting them differently and adding
> them together in RGB space.
> --
> Andreas Gustafsson, g...@waxingwave.com
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-21 Thread Andreas Gustafsson
robert bristow-johnson wrote:
> i've also fiddled with Gaussian windows and STFT around the turn of
> the century.  i like that the Fourier Transform of a Gaussian is
> another Gaussian, so each frequency component will generate a
> Gaussian pulse in the frequency domain.

Yes.  The Gaussian also has many other nice properties such as being
free of side lobes, having faster than exponential fall-off, and being
both separable and circularly symmetric in the 2-D case.

> how many of these do you have per octave, Andreas?  looks like it
> could be 24 or 48.

The demos use 48 frequency bands per octave, but the underlying
library can handle any integer number of bands per octave from 6
up to several hundred.

> does this make the pixel density along the t-axis be the same for
> higher octaves as it is for lower?  because for constant-Q, you can
> have more pixels per second for the high pitched bins.  but drawing
> that would be a little weird.

There's a distinction between spectrogram coefficients (filter bank
output samples) and display pixels.  The density of the coefficients
along the time axis is indeed higher in the higher octaves.

Converting the coefficients to display pixels involves taking their
magnitudes and then resampling the magnitudes to the density of pixels
per time unit implied by the display zoom factor.  This means
different octaves get resampled by different factors; typically an
octave somewhere in the middle ends up with a one-to-one
correspondence between coefficients and pixels, while higher octaves
are decimated and lower octaves are interpolated.
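
A small sketch of that per-octave resampling (my own illustration, not the
Gaborator's code); linear interpolation is an arbitrary choice here, and a real
display path would low-pass before decimating the high octaves:

    import numpy as np

    def octave_to_pixels(mags, coeff_rate, pixel_rate, n_pixels):
        """mags: one octave's coefficient magnitudes at coeff_rate (per second);
        returns n_pixels values resampled onto the display's pixel grid."""
        t_coeff = np.arange(len(mags)) / coeff_rate
        t_pixel = np.arange(n_pixels) / pixel_rate
        return np.interp(t_pixel, t_coeff, mags)

    # a high octave at 4000 coefficients/s and a low one at 250 coefficients/s,
    # both mapped onto a 1000 pixels-per-second display grid
    hi = octave_to_pixels(np.abs(np.random.randn(4000)), 4000.0, 1000.0, 1000)
    lo = octave_to_pixels(np.abs(np.random.randn(250)), 250.0, 1000.0, 1000)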

> which is sorta the wavelet thing.

Yes, that is one way of looking at it.

> Andreas, for each pixel, what parameters do you have?  like an
> amplitude and phase, or do you have more data such as frequency
> sweep rate or amplitude ramp rate?  Using a Gaussian window, you can
> extract all that data out of the windowed samples that are used for
> the pixel.

Again distinguishing between coefficients and pixels, the coefficients
of each frequency band are just complex quadrature samples of the
band's signal downmixed to baseband (using radio terminology), so to
determine the sweep rate or amplitude ramp rate you have to look at
more than one coefficient.
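
For instance, a rough sketch (mine, not the Gaborator's) of how more than one
coefficient gives a sweep rate: the phase difference between successive
baseband coefficients estimates the frequency offset from the band centre, and
the magnitude ratio gives an amplitude ramp:

    import numpy as np

    def band_trajectory(c, band_fs):
        """c: one band's complex baseband coefficients at rate band_fs;
        returns per-step frequency offset (Hz) and amplitude ratio."""
        dphi = np.angle(c[1:] * np.conj(c[:-1]))     # wrapped phase increments
        freq_offset = dphi * band_fs / (2 * np.pi)   # Hz relative to the band centre
        amp_ratio = np.abs(c[1:]) / np.maximum(np.abs(c[:-1]), 1e-12)
        return freq_offset, amp_ratio

    # toy check: a tone 3 Hz above the band centre, coefficients at 100 Hz
    band_fs = 100.0
    m = np.arange(50)
    c = np.exp(2j * np.pi * 3.0 * m / band_fs)
    f_off, _ = band_trajectory(c, band_fs)
    print(f_off[:3])    # ~3.0 Hz each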

In the displayed pixels, the only parameter is the magnitude; the
phase has been discarded (and therefore the original signal can only
be reconstructed from the coefficients, not from the pixels).  The
coloring in the demo is made by constructing a separate spectrogram
for each stereo channel or track, tinting them differently and adding
them together in RGB space.
-- 
Andreas Gustafsson, g...@waxingwave.com
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread robert bristow-johnson



> On March 20, 2020 4:58 PM Andreas Gustafsson  wrote:
> 
>  
> robert bristow-johnson wrote:
> > anyway, while i have done this sliding Hann window before, i haven't
> > done it for a sliding DFT.  but i would be excited to see a good
> > implementation of constant Q filterbank that is very close to
> > perfect reconstruction if the modification in the frequency domain
> > is null. one could make a Hann windowed DTFT evaluated at a finite
> > number of arbitrary frequencies.  i just wonder if a sliding Hann
> > window would be best.  but, using a truncated cosine as the impulse
> > response of the TIIR, whatever shape of the window would have to be
> > a sum of truncated cosines (plus the constant term).  but you could
> > make a nice frequency analyzer of log-spaced, constant-Q,
> > filterbanks with a bank of truncated IIRs and pre-multiplying the
> > input to each filter by e^(-j omega n).  making them add up to a
> > wire is a harder problem.
> 
> Isn't this pretty much what my Gaborator library (gaborator.com) does?
> It performs constant Q analysis using Gaussian windows, and resynthesis
> that reconstructs the original signal to within about -115 dB using
> single precision floats.

wow, i really like your spectrogram display.  quite elegant.

i've also fiddled with Gaussian windows and STFT around the turn of the 
century.  i like that the Fourier Transform of a Gaussian is another Gaussian, 
so each frequency component will generate a Gaussian pulse in the frequency 
domain.  how many of these do you have per octave, Andreas?  looks like it 
could be 24 or 48. 

but that's really an impressive spectrogram demo.  (and the Subliminal track is 
interesting music.)

> 
> It's not exactly "sliding" since the output samples of the filters are
> not at the original sample rate but decimated by powers of two
> depending on the bandwidth of each filter,

does this make the pixel density along the t-axis be the same for higher 
octaves as it is for lower?  because for constant-Q, you can have more pixels 
per second for the high pitched bins.  but drawing that would be a little weird.

> but that could be seen as a
> feature since it means any frequency-domain modifications can run more
> efficiently as they have fewer samples to process.

which is sorta the wavelet thing.  higher pitched wavelets are shorter.  but 
that's not the case with the normal STFT.

>  For example, if
> you are modifying a frequency band with a center frequency of 50 Hz
> and a bandwidth of 10 Hz, there is little point in running that
> modification at a full 44.1 or 48 kHz sample rate.

Andreas, for each pixel, what parameters do you have?  like an amplitude and 
phase, or do you have more data such as frequency sweep rate or amplitude ramp 
rate?  Using a Gaussian window, you can extract all that data out of the 
windowed samples that are used for the pixel.


--
 
r b-j  r...@audioimagination.com
 
"Imagination is more important than knowledge."
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread Andreas Gustafsson
robert bristow-johnson wrote:
> anyway, while i have done this sliding Hann window before, i haven't
> done it for a sliding DFT.  but i would be excited to see a good
> implementation of constant Q filterbank that is very close to
> perfect reconstruction if the modification in the frequency domain
> is null. one could make a Hann windowed DTFT evaluated at a finite
> number of arbitrary frequencies.  i just wonder if a sliding Hann
> window would be best.  but, using a truncated cosine as the impulse
> response of the TIIR, whatever shape of the window would have to be
> a sum of truncated cosines (plus the constant term).  but you could
> make a nice frequency analyzer of log-spaced, constant-Q,
> filterbanks with a bank of truncated IIRs and pre-multiplying the
> input to each filter by e^(-j omega n).  making them add up to a
> wire is a harder problem.

Isn't this pretty much what my Gaborator library (gaborator.com) does?
It performs constant Q analysis using Gaussian windows, and resynthesis
that reconstructs the original signal to within about -115 dB using
single precision floats.

It's not exactly "sliding" since the output samples of the filters are
not at the original sample rate but decimated by powers of two
depending on the bandwidth of each filter, but that could be seen as a
feature since it means any frequency-domain modifications can run more
efficiently as they have fewer samples to process.  For example, if
you are modifying a frequency band with a center frequency of 50 Hz
and a bandwidth of 10 Hz, there is little point in running that
modification at a full 44.1 or 48 kHz sample rate.
-- 
Andreas Gustafsson, g...@waxingwave.com
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread zhiguang zhang
So RBJ thinks by himself and drinks by himself from '98 until.  Let them
know it's real son if it's really real, understandable, self-explainable

On Fri, Mar 20, 2020, 3:47 PM robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
> > On March 20, 2020 2:45 PM STEFFAN DIEDRICHSEN 
> wrote:
> >
> >
> > Actually, you can do a  window size per bin and an arbitrary spacing of
> the frequencies and create a “true” constant Q SDFT. Somehow, it reminds me
> on the modal synthesis stuff, which can be used to create weird processing.
> >
>
> yeah. there would be no Cooley-Tukey goin' on here. essentially, it's a
> bank of resonant filters.  perhaps one could come up with complementary
> spectral envelopes in the log(f) scale that would add to 1.  but i dunno if
> you could get such a sliding window using truncated IIR filters, so the
> Fourier Transform of it looks like one of these complementary spectral
> envelopes.  if it doesn't, adding the results of these will not get you
> perfect reconstruction.
>
> > Regarding Corona: I’m doing home office with a view on my garden.
> Vermont is not the worst place to holed in, right?  How is the supply with
> TP?
>
> before the crisis onset, we had two 12-roll packages, one hasn't been
> opened.  i'm not too uncomfortable about provisions.  i haven't yet seen
> the store yet, but i might this afternoon (that'll be sobering).  i need to
> buy some booze.
>
> i live at the mouth of a river into Lake Champlain (44.527966 lat,
> -73.270829 long).  it's 2-dimensional, but i have a 200 acre natural
> waterpark in my back yard.  with the snow cover in the watershed melting,
> i'll be canoeing in maybe 24 hours or less (and remain within the city
> limits of the most populous city in the state.)
>
> but i am pretty concerned about the present thing and i was around for
> cold war, Cuban missile crisis, JFK assassination, 1968 (two
> assassinations, burning streets, Chicago DNC), U.S. president forced out of
> office mid-term, 911 and this is worrisome.  i should have converted my
> TIAA-CREF retirement investment from stock to bonds last January.  There's
> not that much in there, but it looks like the curves of the markets.
>
>
> >
> > Steffan
> >
> > PS.: Did somebody not see the formulas in my post? They were embedded
> pdfs made in Grapher.
> >
>
> i edited it out.
>
> anyway, while i have done this sliding Hann window before, i haven't done
> it for a sliding DFT.  but i would be excited to see a good implementation
> of constant Q filterbank that is very close to perfect reconstruction if
> the modification in the frequency domain is null.  one could make a Hann
> windowed DTFT evaluated at a finite number of arbitrary frequencies.  i
> just wonder if a sliding Hann window would be best.  but, using a truncated
> cosine as the impulse response of the TIIR, whatever shape of the window
> would have to be a sum of truncated cosines (plus the constant term).  but
> you could make a nice frequency analyzer of log-spaced, constant-Q,
> filterbanks with a bank of truncated IIRs and pre-multiplying the input to
> each filter by e^(-j omega n).  making them add up to a wire is a harder
> problem.
>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread robert bristow-johnson


> On March 20, 2020 2:45 PM STEFFAN DIEDRICHSEN  wrote:
> 
>  
> Actually, you can do a  window size per bin and an arbitrary spacing of the 
> frequencies and create a “true” constant Q SDFT. Somehow, it reminds me on 
> the modal synthesis stuff, which can be used to create weird processing. 
> 

yeah. there would be no Cooley-Tukey goin' on here. essentially, it's a bank of 
resonant filters.  perhaps one could come up with complementary spectral 
envelopes in the log(f) scale that would add to 1.  but i dunno if you could 
get such a sliding window using truncated IIR filters, so the Fourier Transform 
of it looks like one of these complementary spectral envelopes.  if it doesn't, 
adding the results of these will not get you perfect reconstruction.

> Regarding Corona: I’m doing home office with a view on my garden. Vermont is 
> not the worst place to holed in, right?  How is the supply with TP?

before the crisis onset, we had two 12-roll packages, one hasn't been opened.  
i'm not too uncomfortable about provisions.  i haven't seen the store yet, 
but i might this afternoon (that'll be sobering).  i need to buy some booze.

i live at the mouth of a river into Lake Champlain (44.527966 lat, -73.270829 
long).  it's 2-dimensional, but i have a 200 acre natural waterpark in my back 
yard.  with the snow cover in the watershed melting, i'll be canoeing in maybe 
24 hours or less (and remain within the city limits of the most populous city 
in the state.)

but i am pretty concerned about the present thing and i was around for cold 
war, Cuban missile crisis, JFK assassination, 1968 (two assassinations, burning 
streets, Chicago DNC), U.S. president forced out of office mid-term, 911 and 
this is worrisome.  i should have converted my TIAA-CREF retirement investment 
from stock to bonds last January.  There's not that much in there, but it looks 
like the curves of the markets.


> 
> Steffan
> 
> PS.: Did somebody not see the formulas in my post? They were embedded pdfs 
> made in Grapher. 
> 

i edited it out.

anyway, while i have done this sliding Hann window before, i haven't done it 
for a sliding DFT.  but i would be excited to see a good implementation of 
constant Q filterbank that is very close to perfect reconstruction if the 
modification in the frequency domain is null.  one could make a Hann windowed 
DTFT evaluated at a finite number of arbitrary frequencies.  i just wonder if a 
sliding Hann window would be best.  but, using a truncated cosine as the 
impulse response of the TIIR, whatever shape of the window would have to be a 
sum of truncated cosines (plus the constant term).  but you could make a nice 
frequency analyzer of log-spaced, constant-Q, filterbanks with a bank of 
truncated IIRs and pre-multiplying the input to each filter by e^(-j omega n).  
making them add up to a wire is a harder problem.

--
 
r b-j  r...@audioimagination.com
 
"Imagination is more important than knowledge."
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread STEFFAN DIEDRICHSEN
Actually, you can do a window size per bin and an arbitrary spacing of the 
frequencies and create a “true” constant Q SDFT. Somehow, it reminds me of the 
modal synthesis stuff, which can be used to create weird processing. 

Regarding Corona: I’m working from home with a view of my garden. Vermont is 
not the worst place to be holed up in, right?  How is the supply of TP?

Steffan

PS.: Did somebody not see the formulas in my post? They were embedded pdfs made 
in Grapher. 


> On 20.03.2020|KW12, at 19:26, robert bristow-johnson 
>  wrote:
> 
> 
> so the "implicit" sliding rectangular window has just as much mathematical 
> meaning as if it were explicit.
> 
> as Steffan points out, this implicit sliding rectangular window used in the 
> sliding DFT is essentially the same implicit sliding rectangular window used 
> in the efficient method of computing moving average over a contiguous 
> interval.
> 
> the efficient moving average filter is a specific example of Truncated IIR 
> filters (remember that topic?).  you can use the same theory to design a 
> sliding Hann window that is *efficient* (O(1) instead of O(N)), where you 
> need not do sample-by-sample multiplication over the entire interval.  so the 
> sliding DFT would be the same multiply by e^{-j 2 pi n k/N} (where k is the 
> bin number) followed by a sliding weighted sum where the weighting function 
> is a Hann window instead of a rectangular window.
> 
> hope y'all are doing okay under this Coronavirus thing.  i am holed up in 
> Vermont.
> 
> --
> 
> r b-j  r...@audioimagination.com
> 
> "Imagination is more important than knowledge."
> 
> 
>> On March 20, 2020 4:46 AM STEFFAN DIEDRICHSEN  wrote:
>> 
>> Hello Richard,
>> 
>> Sure the window has a meaning. The window is pulled into the integration and 
>> exists there as its differentiated form.If you rewrite formula [1] of your 
>> paper: 
>> 
>> 
> 
> ...
>> It’s a lot more to process, but hey, we all have Mac Pros, don’t we?
>> 
>> 
>> 
>>> On 19.03.2020|KW12, at 19:11, Richard Dobson < rich...@rwdobson.com> wrote:
>>> 
>>> 
>>> So the rectangular window is at best implicit - I'm not sure it even has 
>>> any meaning in this situation.
>> 

___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread robert bristow-johnson

so the "implicit" sliding rectangular window has just as much mathematical 
meaning as if it were explicit.

as Steffan points out, this implicit sliding rectangular window used in the 
sliding DFT is essentially the same implicit sliding rectangular window used in 
the efficient method of computing moving average over a contiguous interval.

the efficient moving average filter is a specific example of Truncated IIR 
filters (remember that topic?).  you can use the same theory to design a 
sliding Hann window that is *efficient* (O(1) instead of O(N)), where you need 
not do sample-by-sample multiplication over the entire interval.  so the 
sliding DFT would be the same multiply by e^{-j 2 pi n k/N} (where k is the bin 
number) followed by a sliding weighted sum where the weighting function is a 
Hann window instead of a rectangular window.
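
For reference, a minimal Python/NumPy sketch of the rectangular-window case
being described, the classic Lyons/Jacobsen-style sliding DFT where each bin is
updated per sample by a comb term plus a complex rotation (this is the textbook
form, not the SPV papers' exact formulation):

    import numpy as np

    def sliding_dft(x, N, bins):
        """After every input sample, return the length-N rectangular-window DFT
        at the requested bins (the window is zero-padded before the signal starts)."""
        w = np.exp(2j * np.pi * np.asarray(bins) / N)   # per-bin rotation e^{j 2 pi k/N}
        X = np.zeros(len(bins), dtype=complex)
        xpad = np.concatenate([np.zeros(N), x])
        out = []
        for n in range(N, len(xpad)):
            X = (X + xpad[n] - xpad[n - N]) * w         # comb + rotation, O(1) per bin
            out.append(X)
        return np.array(out)

    # sanity check against a direct DFT over the final N samples
    x = np.random.randn(256)
    N, bins = 64, [3, 10, 17]
    S = sliding_dft(x, N, bins)
    ref = [np.sum(x[-N:] * np.exp(-2j * np.pi * k * np.arange(N) / N)) for k in bins]
    print(np.max(np.abs(S[-1] - np.array(ref))))        # ~1e-12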

hope y'all are doing okay under this Coronavirus thing.  i am holed up in 
Vermont.

--
 
r b-j  r...@audioimagination.com
 
"Imagination is more important than knowledge."


> On March 20, 2020 4:46 AM STEFFAN DIEDRICHSEN  wrote:
> 
> Hello Richard,
>  
> Sure the window has a meaning. The window is pulled into the integration and 
> exists there as its differentiated form.If you rewrite formula [1] of your 
> paper: 
> 
>  

...
> It’s a lot more to process, but hey, we all have Mac Pros, don’t we?
>  
> 
>  
> > On 19.03.2020|KW12, at 19:11, Richard Dobson < rich...@rwdobson.com> wrote:
> >  
> >  
> > So the rectangular window is at best implicit - I'm not sure it even has 
> > any meaning in this situation.
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread john
I thought I would join in a little. I was the PI for the short time we were 
funded to research this area.  Richard was the initial idea man and did much of 
the work.  Russell Bradford did the complex mathematics.  In total there were 5 
papers on this topic (BibTeX entries below).


The basic sliding DFT was implemented in C and also adapted for use in Csound, 
where it still exists.  Russell produced the GPU version described in the last 
paper, again as a standalone program.  Since then, as our work was not funded 
beyond the initial 12 months, my colleagues at the National University of 
Ireland, Maynooth have taken an interest and implemented a GPU version as an 
experimental module in Csound, where it continues to be distributed.


When our funding dried up we had plans to implement the Constant Q DFT on a 
GPU.  Our calculations indicated that the commodity GPU we used would allow 
that version in real time with some headroom, and I am still disappointed that 
that project never happened.  I never had a GPU myself so I was relying on 
Richard with a budget for this.


@InProceedings{JPF84,
  author =   {Russell Bradford and Richard Dobson and John ffitch},
  title ={{Sliding is Smoother than Jumping}},
  booktitle ={ICMC 2005 free sound},
  pages ={287--290},
  year = {2005},
  editor =   {{SuviSoft Oy Ltd, Tampere, Finland}},
  organization = {Escola Superior de M\'usica de Catalunya},
  note = 
{\url{http://www.cs.bath.ac.uk/~jpff/PAPERS/BradfordDobsonffitch05.pdf}},

  pure = {yes}
}

@InProceedings{JPF92,
  author =   {Russell Bradford and Richard Dobson and John ffitch},
  title ={The Sliding Phase Vocoder},
  booktitle ={Proceedings of the 2007 International Computer Music 
Conference},

  pages ={449--452},
  year = 2007,
  editor =   {Suvisoft~Oy~Ltd},
  volume =   {II},
  month ={August},
  publisher ={ICMA and Re:New},
  note = {ISBN 0-9713192-5-1},
  annote = {\url{http://cs.bath.ac.uk/jpff/PAPERS/spv-icmc2007.pdf}},
  pure = {yes}
}

@InProceedings{JPF95,
  author =   {John ffitch and Richard Dobson and Russell Bradford},
  title ={{Sliding DFT for Fun and Musical Profit}},
  booktitle ={6th International Linux Audio Conference},
  pages ={118--124},
  year = {2008},
  editor =   {Frank Barknecht and Martin Rumori},
  address =  {Kunsthochscule f\"ur Medien K\"oln},
  month ={March},
  organization = {LAC2008},
  publisher ={Tribun EU, Gorkeho 41, Bruno 602 00},
  note = {ISBN 978-80-7399-362-7},
  annote = {\url{http://lac.linuxaudio.org/2008/download/papers/10.pdf}},
  pure = {yes}
}

@InProceedings{JPF98,
  author =   {Russell Bradford and Richard Dobson and John ffitch},
  title ={{Sliding with a Constant $Q$}},
  booktitle ={Proc. of the Int. Conf. on Digital Audio Effects (DAFx-08)},
  pages ={363--369},
  year = {2008},
  address =  {Espoo, Finland},
  month ={Sep 1-4},
  organization = {DAFx08},
  note = {ISBN 978-951-22-9517-3},
  pure = {yes}
}

@InProceedings{JPF109,
  author =   {Russell Bradford and John ffitch and Richard Dobson},
  title ={{Real-time Sliding Phase Vocoder using a Commodity GPU}},
  booktitle ={Proceedings of ICMC2011},
  pages ={587--590},
  year = {2011},
  series =   {ICMC},
  month ={August},
  organization = {University of Huddersfield and ICMA},
  note = {ISBN 978-0-9845274-0-3},
  pure = {yes}
}

As well as the Pure repository at the University of Bath I have copies of all 
these papers and in some cases the presentation slides and audio examples if 
anyone wants a copy.


On Thu, 19 Mar 2020, Richard Dobson wrote:

In  my original C programs it was all implemented in double precision f/p, 
and the results were pretty clean (but we never assessed it formally at the 
time), but as the computational burden was substantial on a standard PC, 
there was no way to run them in real time to perform a   soak test.


However, we received some advanced (at the time) highly parallel accelerator 
cards from a Bristol company "Clearspeed" which did offer the opportunity to 
perform real-time oscillator bank synthesis (by making a rudimentary VST 
synth). For example, to generate band-limited square and sawtooth waves. With 
single precision, and real-time generation it did not take long at all (I ran 
it one time for 20mins, monitoring on an oscilloscope) for phases to degrade 
and thus the waveform shape degraded. Conversely, with double precision 
(which those cards fully supported, most unusually for the time), I was able 
to leave it running for some hours, with no visible degradation of the 
waveform or audible increase in noise.


It doesn't fully answer your question, but I hope it offers some indication 
of the potential of the process.


Later on, colleagues at Bath University got the SPV 

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-20 Thread STEFFAN DIEDRICHSEN
Hello Richard,

Sure the window has a meaning. The window is pulled into the integration and 
exists there as its differentiated form. If you rewrite formula [1] of your 
paper (given here in plain text; the original post attached it as a PDF made 
in Grapher):

  F_{t+1}(n) = F_t(n) + (-1)*f(t) + (+1)*f(t+N)

you have the 1 and -1 of the discretely differentiated rectangular window of 
height 1. So you’re able to use other windows, you just need to differentiate 
them, and the equation becomes this:

  [formula attached as PastedGraphic-3.pdf, not reproduced in plain text]

with dW_i being the discretely differentiated window function. It’s a lot more 
to process, but hey, we all have Mac Pros, don’t we?

Best,

Steffan

> On 19.03.2020|KW12, at 19:11, Richard Dobson wrote:
> 
> So the rectangular window is at best implicit - I'm not sure it even has 
> any meaning in this situation.

___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-19 Thread Eric Brombaugh

Thanks for the clarification.

Sorry for the confusion re twiddle factors - I meant the per-bin complex 
rotations, so I believe we're on the same page.


It's good to know that you found single precision floating pt to be 
insufficient for long-term stability. The low-resolution fixed point 
SDFT I built wasn't designed to run for more than a few tens of 
milliseconds before being reset, but I did see some error build-up over 
that period so it's not surprising that high resolution is required for 
long run times. It might be interesting to play with this in an FPGA 
context, so it's good to set the expectations properly at the outset.


Eric

On 3/19/20 11:11 AM, Richard Dobson wrote:

(caveat - 13 years since I worked on this)

This is a real single-sample update Sliding DFT, not a block-based 
method. The sample comes in and is used to perform a complex rotation to 
each bin, followed by the frequency-domain convolution. There are no 
twiddle factors as such. So the rectangular window is at best implicit - 
I'm not sure it even has any meaning in this situation. The approach 
from the outset was for the goal of real-time processing - i.e. 
potentially for hours non-stop. We found (in the Clearspeed project) that 
single-precision floats would not support that; I don't know whether 
anything less than double precision is required - those were the only 
choices available.


It's "embarrassingly parallel" as an algorithm, so very suited to 
dedicated massively parallel hardware. I know FPGAs are pretty powerful 
these days so might well do the job (but some transformations are pretty 
cpu-intensive too!). The Bath Uni team said they were using a 
"mid-range" graphic card (on a Linux workstation).


Richard Dobson

On 19/03/2020 17:45, Eric Brombaugh wrote:

Wow - interesting discussion.

I've implemented a real-time SDFT on an FPGA for use in carrier 
acquisition of communications signals. It was surprisingly easy to do 
and didn't require particularly massive resources, although FPGAs 
naturally facilitate a degree of low-level parallelism that you can't 
easily achieve in CPU-based systems.


Based on this it might be feasible to build the SPV on a modest FPGA 
rather than resorting to GPUs or specialized parallel CPU systems. The 
main stumbling block that I see was your use of double-precision 
floating point. If that level of accuracy is really necessary then a 
higher end FPGA would be needed as most mid-range devices are geared 
more for fixed point or single-precision floating point.


I was a bit confused by the ICMC paper when it came to windowing. The 
SDFT structure I'm used to seeing (as discussed in the Lyons/Jacobsen 
article you referenced) involves a rectangular window applied prior to 
the twiddle calculations using a comb-filter structure. Is this window 
replaced by your frequency domain convolutions, or are the 
cosine-based windows applied in addition to the rectangular one?


Eric


___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-19 Thread Richard Dobson

(caveat - 13 years since I worked on this)

This is a real single-sample update Sliding DFT, not a block-based 
method. The sample comes in and is used to perform a complex rotation to 
each bin, followed by the frequency-domain convolution. There are no 
twiddle factors as such. So the rectangular window is at best implicit - 
I'm not sure it even has any meaning in this situation. The approach 
from the outset was for the goal of real-time processing - i.e. 
potentially for hours non-stop. We found (in the Clearspeed project) that 
single-precision floats would not support that; I don't know whether 
anything less than double precision is required - those were the only 
choices available.
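
To make the windowing question concrete, here is a hedged sketch of how a
cosine window can be applied purely by frequency-domain convolution: a periodic
Hann window is 0.5 - 0.25 e^{j 2 pi n/N} - 0.25 e^{-j 2 pi n/N}, so the
Hann-windowed bins are a 3-point circular convolution of the rectangular-window
bins with (-1/4, 1/2, -1/4). This is the standard identity; the SPV papers'
exact formulation may differ:

    import numpy as np

    N = 64
    x = np.random.randn(N)
    X = np.fft.fft(x)                                  # rectangular-window bins

    # frequency-domain convolution with (-1/4, 1/2, -1/4), circular over bins
    X_hann_fd = -0.25 * np.roll(X, 1) + 0.5 * X - 0.25 * np.roll(X, -1)

    # time-domain reference using the periodic Hann window
    w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(N) / N)
    X_hann_td = np.fft.fft(x * w)

    print(np.max(np.abs(X_hann_fd - X_hann_td)))       # ~1e-14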


It's "embarrassingly parallel" as an algorithm, so very suited to 
dedicated massively parallel hardware. I know FPGAs are pretty powerful 
these days so might well do the job (but some transformations are pretty 
cpu-intensive too!). The Bath Uni team said they were using a 
"mid-range" graphic card (on a Linux workstation).


Richard Dobson

On 19/03/2020 17:45, Eric Brombaugh wrote:

Wow - interesting discussion.

I've implemented a real-time SDFT on an FPGA for use in carrier 
acquisition of communications signals. It was surprisingly easy to do 
and didn't require particularly massive resources, although FPGAs 
naturally facilitate a degree of low-level parallelism that you can't 
easily achieve in CPU-based systems.


Based on this it might be feasible to build the SPV on a modest FPGA 
rather than resorting to GPUs or specialized parallel CPU systems. The 
main stumbling block that I see was your use of double-precision 
floating point. If that level of accuracy is really necessary then a 
higher end FPGA would be needed as most mid-range devices are geared 
more for fixed point or single-precision floating point.


I was a bit confused by the ICMC paper when it came to windowing. The 
SDFT structure I'm used to seeing (as discussed in the Lyons/Jacobsen 
article you referenced) involves a rectangular window applied prior to 
the twiddle calculations using a comb-filter structure. Is this window 
replaced by your frequency domain convolutions, or are the cosine-based 
windows applied in addition to the rectangular one?


Eric


___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-19 Thread Eric Brombaugh

Wow - interesting discussion.

I've implemented a real-time SDFT on an FPGA for use in carrier 
acquisition of communications signals. It was surprisingly easy to do 
and didn't require particularly massive resources, although FPGAs 
naturally facilitate a degree of low-level parallelism that you can't 
easily achieve in CPU-based systems.


Based on this it might be feasible to build the SPV on a modest FPGA 
rather than resorting to GPUs or specialized parallel CPU systems. The 
main stumbling block that I see was your use of double-precision 
floating point. If that level of accuracy is really necessary then a 
higher end FPGA would be needed as most mid-range devices are geared 
more for fixed point or single-precision floating point.


I was a bit confused by the ICMC paper when it came to windowing. The 
SDFT structure I'm used to seeing (as discussed in the Lyons/Jacobsen 
article you referenced) involves a rectangular window applied prior to 
the twiddle calculations using a comb-filter structure. Is this window 
replaced by your frequency domain convolutions, or are the cosine-based 
windows applied in addition to the rectangular one?


Eric

On 3/19/20 10:23 AM, Richard Dobson wrote:
In  my original C programs it was all implemented in double precision 
f/p, and the results were pretty clean (but we never assessed it 
formally at the time), but as the computational burden was substantial 
on a standard PC, there was no way to run them in real time to perform a 
   soak test.


However, we received some advanced (at the time) highly parallel 
accelerator cards from a Bristol company "Clearspeed" which did offer the 
opportunity to perform real-time oscillator bank synthesis (by making a 
rudimentary VST synth). For example, to generate band-limited square and 
sawtooth waves. With single precision, and real-time generation it did 
not take long at all (I ran it one time for 20mins, monitoring on an 
oscilloscope) for phases to degrade and thus the waveform shape 
degraded. Conversely, with double precision (which those cards fully 
supported, most unusually for the time), I was able to leave it running 
for some hours, with no visible degradation of the waveform or audible 
increase in noise.


It doesn't fully answer your question, but I hope it offers some 
indication of the potential of the process.


Later on, colleagues at Bath University got the SPV fully running in 
real time on Nvidia GPU cards programmed using CUDA, fed with real-time 
audio input, and this was presented (I think) at either ICMC or DaFX. If 
John Fitch is following this, he will be able to give more details. GPUs 
are definitely the way to go for SPV in real time. I estimated 
(back-of-an-envelope-style) demands of the order of 50GFlops. Of course 
there remain many unanswered questions!


Richard Dobson

On 19/03/2020 16:18, Ethan Duni wrote:



On Tue, Mar 10, 2020 at 1:05 PM Richard Dobson wrote:



    Our ICMC paper can be found here, along with a few beguiling sound
    examples:

    http://dream.cs.bath.ac.uk/SDFT/


So this is pretty cool stuff. I can't say I've digested the whole idea 
yet, but I had a couple of obvious questions.


In particular, the analyzer is defined by a recursive formula, and I 
gather that the synthesizer effectively becomes an oscillator bank. 
So, are special numerical techniques required to implement this, in 
order to avoid the build-up of round-off noise over time?


Ethan

___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp


___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-19 Thread Richard Dobson

sorry for the repeats - don't know how that happened!
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



Re: [music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-19 Thread Richard Dobson
In  my original C programs it was all implemented in double precision 
f/p, and the results were pretty clean (but we never assessed it 
formally at the time), but as the computational burden was substantial 
on a standard PC, there was no way to run them in real time to perform a 
  soak test.


However, we received some advanced (at the time) highly parallel 
accelerator cards from a Bristol company "Clearspeed" which did offer the 
opportunity to perform real-time oscillator bank synthesis (by making a 
rudimentary VST synth). For example, to generate band-limited square and 
sawtooth waves. With single precision, and real-time generation it did 
not take long at all (I ran it one time for 20mins, monitoring on an 
oscilloscope) for phases to degrade and thus the waveform shape 
degraded. Conversely, with double precision (which those cards fully 
supported, most unusually for the time), I was able to leave it running 
for some hours, with no visible degradation of the waveform or audible 
increase in noise.
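
A tiny numerical illustration of the same effect (my own sketch, not the
Clearspeed code): a recursive oscillator is just a phasor multiplied by e^{jw}
once per sample, so rounding error compounds multiplicatively, which shows up
quickly in single precision and hardly at all in double:

    import numpy as np

    fs = 48000
    w = 2.0 * np.pi * 440.0 / fs
    steps = 20 * fs                        # 20 seconds of samples, kept short for the sketch

    for dtype in (np.complex64, np.complex128):
        rot = np.full(steps, np.exp(1j * w), dtype=dtype)
        z = np.cumprod(rot)                # z[n] = phasor after n+1 recursive multiplies
        amp_drift = abs(abs(z[-1]) - 1.0)
        phase_err = np.angle(z[-1] * np.exp(-1j * w * steps))   # error vs. the exact phase
        print(f"{np.dtype(dtype).name}: amplitude drift {amp_drift:.2e}, phase error {phase_err:.2e} rad")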


It doesn't fully answer your question, but I hope it offers some 
indication of the potential of the process.


Later on, colleagues at Bath University got the SPV fully running in 
real time on Nvidia GPU cards programmed using CUDA, fed with real-time 
audio input, and this was presented (I think) at either ICMC or DaFX. If 
John Fitch is following this, he will be able to give more details. GPUs 
are definitely the way to go for SPV in real time. I estimated 
(back-of-an-envelope-style) demands of the order of 50GFlops. Of course 
there remain many unanswered questions!


Richard Dobson

On 19/03/2020 16:18, Ethan Duni wrote:



On Tue, Mar 10, 2020 at 1:05 PM Richard Dobson wrote:



Our ICMC paper can be found here, along with a few beguiling sound
examples:

http://dream.cs.bath.ac.uk/SDFT/


So this is pretty cool stuff. I can't say I've digested the whole idea 
yet, but I had a couple of obvious questions.


In particular, the analyzer is defined by a recursive formula, and I 
gather that the synthesizer effectively becomes an oscillator bank. So, 
are special numerical techniques required to implement this, in order to 
avoid the build-up of round-off noise over time?


Ethan

___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp


___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp