It is certainly possible to combine STFT with fast convolution in various ways. 
But doing so imposes significant overhead and tightly constrains the overall 
design. 

For example, this approach:

> On Mar 9, 2020, at 7:16 AM, Spencer Russell <s...@media.mit.edu> wrote:
> 
> 
> if you have a KxN STFT (K frequency components and N frames) then 
> zero-padding each frame by K-1 should still eliminate any time-aliasing even 
> if your filter has hard edges in the frequency domain, right?

Right, but if you are using length K FFT and zero-padding by K-1, then the hop 
size is 1 sample and there are no windows. 

This is just applying the raw IDFT of the response as an FIR, which is not 
appropriate for something estimated in a windowed filterbank domain. Deriving 
an equivalent FIR from, say, an estimated noise reduction mask is not trivial.
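To make the equivalence concrete, here is a minimal NumPy sketch of that degenerate case: a length-K FFT where each frame holds one input sample plus K-1 zeros, a hop of 1, and overlap-add of the results. The function name, the test signal, and the hard-edged mask H are illustrative assumptions, not anything from the original thread.

```python
import numpy as np

def stft_filter_hop1(x, H):
    """Length-K FFT per frame: one input sample plus K-1 zeros, hop of 1."""
    K = len(H)
    y = np.zeros(len(x) + K - 1, dtype=complex)
    for n, sample in enumerate(x):
        frame = np.zeros(K)
        frame[0] = sample                    # 1 sample of data, K-1 zeros
        # Multiply by the response in the frequency domain, then overlap-add.
        y[n:n + K] += np.fft.ifft(np.fft.fft(frame) * H)
    return y

K = 16
# Hypothetical hard-edged mask: pass bins 0..3 and their mirror images.
H = np.zeros(K)
H[:4] = 1.0
H[-3:] = 1.0

rng = np.random.default_rng(0)
x = rng.standard_normal(100)

y_stft = stft_filter_hop1(x, H)
h = np.fft.ifft(H)                           # raw IDFT of the response
y_fir = np.convolve(x, h)                    # plain FIR filtering with h
```

Under these assumptions `y_stft` and `y_fir` agree to rounding error, so all of the frame machinery reduces to a single fixed FIR filter whose taps are the unwindowed IDFT of the mask.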

> 
> I understand the role of time-domain windowing in STFT processing to be 
> mostly:
> 1. Reduce frequency-domain ripple (side-lobes in each band)

Right, this is the “analysis” aspect, where the window controls the spectral 
characteristics (frequency selectivity, bandwidth, leakage, etc.)
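As a rough illustration of the analysis side, the sketch below compares the peak side-lobe level of a rectangular window with that of a Hann window. The helper name `peak_sidelobe_db`, the window length, and the FFT size are assumptions chosen for the example.

```python
import numpy as np

def peak_sidelobe_db(window, n_fft=8192):
    """Highest side-lobe level in dB relative to the main-lobe peak."""
    W = np.abs(np.fft.rfft(window, n_fft))
    W_db = 20 * np.log10(np.maximum(W, 1e-12) / W.max())
    # Walk down the main lobe until the response stops decreasing
    # (the first null), then take the maximum of everything beyond it.
    i = 1
    while i < len(W_db) - 1 and W_db[i + 1] < W_db[i]:
        i += 1
    return W_db[i:].max()

K = 64
rect_psl = peak_sidelobe_db(np.ones(K))      # no window (rectangular)
hann_psl = peak_sidelobe_db(np.hanning(K))   # Hann window
```

The rectangular window comes out at roughly -13 dB and the Hann window at roughly -31 dB, which is the leakage trade-off being described: lower side-lobes in exchange for a wider main lobe.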

> 2. Provide a sort of cross-fade from frame-to-frame to smooth out framing 
> effects

And that is the “synthesis” aspect, where the window controls the 
characteristics of the artifacts introduced by processing. Note that “framing 
effects” are by definition time-variant: this is a form of aliasing.
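The time-variance can be seen in a deliberately crude sketch: non-overlapping rectangular frames, no zero-padding, and a fixed mask applied per frame. This is an assumed worst case picked for clarity, not a recommended design, and the names are hypothetical. Delaying the input by one sample does not simply delay the output by one sample.

```python
import numpy as np

def block_filter(x, H):
    """Apply H frame by frame: hop = K, rectangular frames, no padding."""
    K = len(H)
    y = np.zeros(len(x))
    for start in range(0, len(x) - K + 1, K):
        X = np.fft.fft(x[start:start + K])
        # H is conjugate-symmetric here, so the result is real up to rounding.
        y[start:start + K] = np.fft.ifft(X * H).real
    return y

K = 16
# Hypothetical hard-edged mask: pass bins 0..3 and their mirror images.
H = np.zeros(K)
H[:4] = 1.0
H[-3:] = 1.0

x1 = np.zeros(4 * K)
x1[5] = 1.0                  # impulse at sample 5
x2 = np.roll(x1, 1)          # the same impulse, one sample later

y1 = block_filter(x1, H)
y2 = block_filter(x2, H)

# For a time-invariant system y2 would equal y1 delayed by one sample.
is_time_invariant = np.allclose(y2, np.roll(y1, 1))
```

Here `is_time_invariant` is False: the filter's response wraps around inside each frame, so the output depends on where the input falls relative to the frame boundaries.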

Ethan
_______________________________________________
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp
