It is certainly possible to combine STFT with fast convolution in various ways. But doing so imposes significant overhead costs and constrains the overall design in strong ways.
For example, this approach:

> On Mar 9, 2020, at 7:16 AM, Spencer Russell <s...@media.mit.edu> wrote:
>
> if you have a KxN STFT (K frequency components and N frames) then
> zero-padding each frame by K-1 should still eliminate any time-aliasing even
> if your filter has hard edges in the frequency domain, right?

Right, but if you are using a length-K FFT and zero-padding by K-1, then each frame holds only one input sample, so the hop size is 1 sample and there are no windows. This is just applying the raw IDFT of the response as an FIR, which is not appropriate for something estimated in a windowed filterbank domain. Deriving an equivalent FIR from, say, an estimated noise reduction mask is not trivial.

> I understand the role of time-domain windowing in STFT processing to be
> mostly:
> 1. Reduce frequency-domain ripple (side-lobes in each band)

Right, this is the “analysis” aspect, where the window controls the spectral characteristics (frequency selectivity, bandwidth, leakage, etc.)

> 2. Provide a sort of cross-fade from frame-to-frame to smooth out framing
> effects

And that is the “synthesis” aspect, where the window controls the characteristics of the artifacts introduced by processing. Note that “framing effects” are by definition time-variant: this is a form of aliasing.

Ethan
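
P.S. To make the hop-size point concrete, here is a minimal NumPy sketch, not a recommended design. It assumes a fixed (time-invariant) hard-edged response H on K bins; the function name ola_fast_convolution, the value K = 16, and the test signal are all made up for illustration. With an FFT of length K and K-1 zeros of padding, each frame carries one input sample, and the overlap-add result matches direct FIR filtering with the raw IDFT of H. With no padding (frame length K), the per-frame circular convolution wraps around in time.

```python
import numpy as np

def ola_fast_convolution(x, H, block_len):
    """Overlap-add filtering with FFT length len(H) (hypothetical helper).

    Each block of block_len input samples is zero-padded to K = len(H),
    multiplied by H in the frequency domain, and overlap-added at its
    original position. Time aliasing is avoided only if
    block_len + (filter length) - 1 <= K.
    """
    K = len(H)
    n_blocks = -(-len(x) // block_len)  # ceiling division
    y = np.zeros(n_blocks * block_len + K)
    for b in range(n_blocks):
        start = b * block_len
        chunk = x[start:start + block_len]
        frame = np.zeros(K)
        frame[:len(chunk)] = chunk  # remaining samples are the zero padding
        y[start:start + K] += np.fft.ifft(np.fft.fft(frame) * H).real
    return y[:len(x) + K - 1]

K = 16  # assumed number of frequency components
# Hard-edged (brick-wall) lowpass, conjugate-symmetric so that h is real.
H = np.zeros(K)
H[:4] = 1.0
H[-3:] = 1.0
h = np.fft.ifft(H).real  # raw IDFT of the response: a K-tap FIR

x = np.random.default_rng(0).standard_normal(64)

y_fir = np.convolve(x, h)                    # direct FIR filtering
y_hop1 = ola_fast_convolution(x, H, 1)       # zero-padded by K-1, hop 1
y_aliased = ola_fast_convolution(x, H, K)    # no padding, hop K

print("max |hop-1 OLA - FIR|    :", np.max(np.abs(y_hop1 - y_fir)))
print("max |unpadded OLA - FIR| :", np.max(np.abs(y_aliased - y_fir)))
```

The hop-1 case agrees with np.convolve to rounding error, which is the sense in which it is "just an FIR". Nothing here addresses a time-varying mask estimated in a windowed filterbank domain, which is the harder case.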