Re: [music-dsp] compensation for window sizes Re: FFT for realtime synthesis?

Scott Cotton Sun, 04 Nov 2018 14:50:45 -0800

On Sun, 4 Nov 2018 at 22:50, gm <g...@voxangelica.net> wrote:

>
> Thanks for the links.
> At the moment it's still too reverberant even with multiresolution.
>
> Probably Gabor windows are part of the solution, I still have to look at
> these papers closely.
> I will try those, but probably they only make sense with more multires
> bands than I have now.
>
> Also I want to get the synthesis stage at 2048 FFT size.
>
> How I do it now:
>
> analysis
>  at 4096 FFT size, log 2 multiresolution (7 bands), 16 overlaps (Hann
> Windows)
> but peak tracking and phase tracking at 4096 FFT size without
> multiresolution for better tracking
> it's a peak if it was a peak candidate in the following frame, in the same
> or adjacent bins
> it's a peak candidate if it's larger than adjacent bins and larger than a
> noise threshold
> it's a transient if a peak track starts (or actually stops, regarded
> backwards).
>
> synthesis
> sines and noise synthesis:
> if it's not at a tracked peak, it's noise, and synthesised with random
> amplitudes but updated phases
>
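
For what it's worth, here is how I read the candidate/peak rule quoted above, as a minimal sketch only. It assumes "the frame after the frame" means the following frame (since the tracking runs backwards) and that "adjacent bins" means the same bin or its direct neighbours. The function names and the threshold value are made up for illustration, and it uses plain Python lists to stay dependency-free.

```python
def peak_candidates(mags, noise_threshold):
    """Bins that are local maxima and exceed the noise threshold."""
    return {
        k for k in range(1, len(mags) - 1)
        if mags[k] > noise_threshold
        and mags[k] > mags[k - 1]
        and mags[k] > mags[k + 1]
    }


def confirmed_peaks(candidates, next_candidates):
    """Keep candidates that have a candidate in the same or an adjacent
    bin of the following frame (assumed reading of the rule)."""
    return {
        k for k in candidates
        if any(j in next_candidates for j in (k - 1, k, k + 1))
    }


# Two toy magnitude frames: the peak at bin 2 moves to bin 3 in the next
# frame, the peak at bin 5 disappears.
mags_now = [0.0, 0.1, 0.9, 0.2, 0.05, 0.6, 0.1, 0.0]
mags_next = [0.0, 0.1, 0.3, 0.8, 0.1, 0.05, 0.04, 0.0]

cands_now = peak_candidates(mags_now, 0.25)
cands_next = peak_candidates(mags_next, 0.25)
peaks = confirmed_peaks(cands_now, cands_next)
print(sorted(cands_now), sorted(peaks))
```

Under this reading, a candidate with no continuation in the following frame (bin 5 here) is where a backwards track would stop, which is what the description treats as a transient.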






> transients:
> if it's transient, use original phase, accumulated phase otherwise
> do transients only once when they are crossed
>

Note that in polyphonic sources, a transient may apply to only one of the
sources.  If you define a transient as a slice of time, say a percussive
onset such as a guitar pluck, other strings or instruments may be sustaining
at the same time, and the transient part may then sound weird w.r.t. the
continuity of the other parts.


>
> freqshift:
> updated phases for timestretch and freq. shift or transients and frequency
> shift
> shift amplitude bins for frequency shift
>
> formants:
> filter original spectrum on log2 scale with increasingly long filters
> (should be done offline)
> divide by filtered spectrum, multiply by shifted filtered spectrum
>
>
I don't understand what you're doing with formants; I would be interested to
learn more.



>
> Here is how it sounds. All examples are time-stretched (and/or shifted)
> too far, to better hear its limits
> https://soundcloud.com/traumlos_kalt/transcoder-096-2/s-3bqkl
> Still reverberant; transients are better but still lacking. Otherwise I am
> halfway content
>
> Gabor windows, I will try these, further suggestions?
>

PhaVoRIT <https://hci.rwth-aachen.de/materials/publications/karrer2006a.pdf>
gives a nice overview.  There is some other tool I looked into once which did
quite well and had binary downloads for a standalone tool.  It was part of a
thesis at a UK or US east coast music school, I think, but I've lost the
reference.  I think it was from about 10-15 years ago.  Maybe that will ring
a bell for someone else?

Also, the Rubber Band Library, last I looked, does some sort of interpolation
of the windows, relating the input window size to the output size used in
resynthesis.  I think after interpolation it might just apply the window
inverse, but I'm not sure.



> Especially for the transient detection, because at the moment it comes in
> too early
> for the lower bands, due to the temporal dilution.
>

Maybe you can do transient processing as a pre-processing step and recombine
after the harmonic parts are treated: heuristically subtract the transient
part of the signal (detected, for example, by spectral flux) and subtract out
the noisy, flatter part of the spectrum.
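
As a rough illustration of the spectral flux detection mentioned above, here is a minimal sketch. It takes a list of magnitude spectra (one list per STFT frame) and flags frames whose half-wave rectified flux is well above the average. The function names, the `ratio` parameter and the global-mean threshold are assumptions for illustration; a real detector would more likely use a local, adaptive threshold.

```python
def spectral_flux(frames):
    """Half-wave rectified spectral flux between consecutive magnitude
    frames.  The first frame has no predecessor and gets a flux of 0."""
    flux = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        # Only count bins whose magnitude increased (onsets, not decays).
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


def transient_frames(frames, ratio=2.0):
    """Indices of frames whose flux exceeds `ratio` times the mean flux.
    The global mean is a crude stand-in for an adaptive threshold."""
    flux = spectral_flux(frames)
    mean = sum(flux) / len(flux)
    return [i for i, f in enumerate(flux) if f > ratio * mean]


# Toy example: a broadband jump in level at frame 2.
frames = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [3.0, 3.0, 3.0, 3.0],
    [3.0, 3.0, 3.0, 3.0],
]
print(spectral_flux(frames))
print(transient_frames(frames))
```

Frames flagged this way could be the ones routed to the separate transient path before the harmonic part is stretched.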




>
> On 04.11.2018 at 22:07, Scott Cotton wrote:
>
> The following may help
>
> https://www.dsprelated.com/freebooks/sasp/STFT_COLA_Decomposition.html
>
> http://lonce.org/Publications/publications/2007_RealtimeSignalReconstruction.pdf
> https://hal.archives-ouvertes.fr/hal-00453178/document
>
> librosa uses the below (If you have access)
> [1]D. W. Griffin and J. S. Lim, “Signal estimation from modified
> short-time Fourier transform,” IEEE Trans. ASSP, vol.32, no.2, pp.236–243,
> Apr. 1984.
>
> The biggest thing to note however is that if you modify the spectra in an
> STFT (or by taking in "grains") then the modified sequence of spectra no
> longer necessarily corresponds to any signal, and so some sort of estimation
> is used.  If you're doing power of 2 sizes and corresponding subband
> decomposition, you'd apply that to each band.  The Gabor frame stuff looks
> like it has multiple time-frequency tilings as well.
>
> I believe reconstructing with windowing is one of the hardest parts of
> frequency domain processing to do well.  It is still being researched, and
> it is one big reason why phase vocoders aren't, in my opinion, a solved
> problem.
>
> Scott
>
>
>
>
>
>
>
> On Sun, 4 Nov 2018 at 19:55, gm <g...@voxangelica.net> wrote:
>
>>
>>
>> On 04.11.2018 at 17:00, gm wrote:
>> >
>> > ok, I have now tried a crude and quick multiresolution FFT analysis on a
>> > log 2 basis
>>
>>
>> I halve the window size (Hann) for every FFT.
>>
>> To compensate for the smaller window, I multiply by the factor by which it
>> is smaller, that is 2, 4, 8, ...
>>
>> But it appears that there is noticeably more energy now in the higher
>> bands with the smaller windows.
>>
>> How do I compensate for the window properly?
>> _______________________________________________
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>>
>
> --
> Scott Cotton
> http://www.iri-labs.com
>
>
>
>
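
On the original question quoted above (compensating for the halved Hann windows), one possible explanation, offered as a sketch rather than a diagnosis of your code: the peak magnitude of a stationary sinusoid scales with the sum of the window samples, so multiplying by 2, 4, 8 is right for sinusoids. Noise-like content scales with the square root of the sum of the squared window samples, so the same factors overshoot by about 3 dB per halving. The helper names below are made up for illustration, and plain Python is used to stay dependency-free.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def amplitude_gain(w):
    """Gain seen by the peak of a stationary sinusoid: sum of the window."""
    return sum(w)


def noise_gain(w):
    """Gain seen by the RMS magnitude of white noise:
    square root of the summed squared window samples."""
    return math.sqrt(sum(x * x for x in w))


def compensation_factors(base_size, num_bands):
    """For each halved window, return (size, factor for sinusoids,
    factor for noise), both relative to the base window."""
    ref = hann(base_size)
    rows = []
    for band in range(num_bands):
        size = base_size >> band  # halve the window for every band
        w = hann(size)
        rows.append((size,
                     amplitude_gain(ref) / amplitude_gain(w),
                     noise_gain(ref) / noise_gain(w)))
    return rows


for size, sine_factor, noise_factor in compensation_factors(4096, 4):
    print(size, round(sine_factor, 3), round(noise_factor, 3))
```

The sinusoid factors double with every halving while the noise factors only grow by sqrt(2), so in a sines-plus-noise scheme the tracked peaks and the noise residual may each want their own factor.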



-- 
Scott Cotton
http://www.iri-labs.com

