Florian Schmidt wrote:
> I've only been thinking about how this is done for very short periods
> of time. My naive approach to timestretching would be to transform the
> signal into the frequency domain [either by windowed Fourier or by
> wavelet transform] and then retransform, but with a changed time base.
> Actually, I rather think of it as synthesizing the timestretched
> material from the frequency information.
>
> I'm sure I miss something [I haven't actually looked at the math,
> especially for the retransform with a changed time base, though I do
> have some Fourier/wavelet transform knowledge].

Well, kind of. The idea of the phase vocoder, which more or less describes what you said, is to decompose each time-domain frame into N frequency bins and to suppose that there is only one underlying stationary sinusoid in each frequency channel. If that holds, you unwrap the phase to get the frequency of the sinusoid, and you resynthesize it with a longer or shorter time frame.
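In case the bookkeeping helps, here is a minimal sketch of that scheme in Python/NumPy, under the usual simplifications (the function name and parameters are mine, and overlap-add gain normalization is left out):

import numpy as np

def timestretch(x, stretch=2.0, win_size=1024, hop=256):
    """Phase-vocoder time stretch: one stationary sinusoid assumed per bin."""
    win = np.hanning(win_size)
    syn_hop = int(hop * stretch)           # resynthesis uses a longer/shorter hop
    n_bins = win_size // 2 + 1
    bin_freq = 2 * np.pi * np.arange(n_bins) / win_size  # nominal rad/sample per bin

    n_frames = (len(x) - win_size) // hop
    out = np.zeros(n_frames * syn_hop + win_size)
    prev_phase = np.zeros(n_bins)
    acc_phase = np.zeros(n_bins)

    for i in range(n_frames):
        spec = np.fft.rfft(x[i * hop : i * hop + win_size] * win)
        mag, phase = np.abs(spec), np.angle(spec)

        # Phase unwrapping: the deviation of the measured phase increment
        # from the bin's nominal increment gives the true channel frequency.
        delta = phase - prev_phase - bin_freq * hop
        delta = np.mod(delta + np.pi, 2 * np.pi) - np.pi  # wrap to [-pi, pi)
        true_freq = bin_freq + delta / hop
        prev_phase = phase

        # Resynthesize the same frequencies on the new time base.
        acc_phase += true_freq * syn_hop
        frame = np.fft.irfft(mag * np.exp(1j * acc_phase), n=win_size) * win
        out[i * syn_hop : i * syn_hop + win_size] += frame

    return out

With stretch=2.0 the output lasts twice as long at (ideally) the same pitch: the magnitudes are kept and only the phases are advanced on the new hop.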
The problem is that this demands short windows (for the one-stationary-sinusoid-per-channel hypothesis to be valid), which means very poor frequency resolution. Basically, you have to make a trade-off between time resolution and frequency resolution (nothing new here ;)). So the idea is to adapt the window size to the content of the signal, which means being able to detect the transients (which are better stretched with small windows)...
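For the transient detection part, one common choice (my example, not necessarily what Bonada uses) is spectral flux: flag frames where the summed increase in spectral magnitude jumps well above its local average, then stretch those regions with short windows and the rest with long ones. A rough sketch:

import numpy as np

def transient_frames(x, win_size=1024, hop=256, threshold=2.0):
    """Flag frames whose spectral flux jumps above the local average."""
    win = np.hanning(win_size)
    prev_mag = np.zeros(win_size // 2 + 1)
    flux = []
    for start in range(0, len(x) - win_size, hop):
        mag = np.abs(np.fft.rfft(x[start:start + win_size] * win))
        # Spectral flux: total positive magnitude change since the last frame.
        flux.append(np.sum(np.maximum(mag - prev_mag, 0.0)))
        prev_mag = mag
    flux = np.asarray(flux)
    local_avg = np.convolve(flux, np.ones(9) / 9, mode="same")
    return flux > threshold * (local_avg + 1e-12)  # True where a transient starts

The threshold and the 9-frame averaging length are arbitrary illustration values, to be tuned to the material.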
J. Bonada wrote his PhD thesis on this subject, if you are interested:
http://www.kug.ac.at/iem/lehre/arbeiten/hammer.pdf
cheers,
David
