Wen Xue wrote:
Maybe a beginner's question here:
when pitch-synchronized OLA is used to modify speech pitch, do we
resample the original signal or not?
From 1980s speech coding I recall that the formants of a signal could
be determined by some form of FFT analysis, and I suppose that, as in
other applications, you can overlap and average the results, depending
on how the bin size works out for that particular signal.
However, there's a conceptual difference between two approaches. One is
to know or measure the length of one waveform period, or of N periods
(presuming there is a single, undisturbed waveform), and to make a
harmonic analysis over exactly that length. The other is to take a
fixed FFT interval length and do a general frequency analysis, without
arranging for the fundamental to fall on the lowest nonzero frequency
bin of the FFT. A middle road is to take an arbitrary-length FFT (not
uncommon in modern accelerated libs) and live with the rounding error
you'll get, which depends on the number of measured (and partially
averaged) periods, their frequency, and the sample frequency. This
rounding error can be considerable, which for speech coding may be fine.
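
To put a number on that rounding error, here is a small sketch. The
values (a 110 Hz fundamental, 8 kHz sampling, a 256-point FFT) and the
function name are only illustrative assumptions, not anything from the
original question.

```python
def fft_bin_mismatch(f0, fs, n_fft):
    """Return (period in samples, exact bin position of f0, error in Hz
    made by assigning f0 to the nearest FFT bin)."""
    period_samples = fs / f0          # usually not an integer
    bin_width = fs / n_fft            # frequency spacing of the FFT bins
    exact_bin = f0 / bin_width        # where f0 really sits, in bins
    error_hz = round(exact_bin) * bin_width - f0
    return period_samples, exact_bin, error_hz

# Assumed example values: 110 Hz voice pitch, 8 kHz sampling, 256-point FFT
period, exact_bin, err = fft_bin_mismatch(f0=110.0, fs=8000.0, n_fft=256)
print(period, exact_bin, err)   # the nearest bin is 15 Hz away from f0
```

With these numbers the fundamental sits about halfway between two bins,
which is roughly the worst case for a fixed-length analysis.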
You could also do an actual resampling of the signal, based on sampling
theory, which entails having taken proper equidistant impulse samples
with your analog-to-digital converter, and then applying a shorter or
longer windowed version of the sinc (sin(x)/x) function, with the
proper math and signal flow in place.
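
A minimal sketch of such windowed-sinc resampling, assuming NumPy. The
function name, the Hann window and the default kernel half-width of 16
taps are my own choices for illustration; a production resampler would
use a longer, better-designed kernel and widen it when downsampling.

```python
import numpy as np

def sinc_resample(x, ratio, half_width=16):
    """Resample x by `ratio` (new rate / old rate) with a Hann-windowed sinc."""
    x = np.asarray(x, dtype=float)
    n_out = int(np.floor(len(x) * ratio))
    cutoff = min(1.0, ratio)        # lower the cutoff when downsampling
    t = np.arange(n_out) / ratio    # output instants, in input-sample units
    base = np.floor(t).astype(int)
    y = np.zeros(n_out)
    for k in range(-half_width, half_width + 1):
        idx = base + k              # input sample used by this tap
        d = t - idx                 # distance from that sample to the output instant
        valid = (idx >= 0) & (idx < len(x))
        window = 0.5 * (1.0 + np.cos(np.pi * d / (half_width + 1)))
        kernel = cutoff * np.sinc(cutoff * d) * window
        y[valid] += x[idx[valid]] * kernel[valid]
    return y

# Usage: bring a 440 Hz sine from 8 kHz up to 12 kHz
fs = 8000.0
x = np.sin(2 * np.pi * 440.0 * np.arange(800) / fs)
y = sinc_resample(x, 1.5)
```

Samples beyond the ends of the input are treated as zero, so the first
and last few output samples are less accurate than the interior.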
If you did actual resampling, and you made sure that the resampled rate
is higher, or that harmonics are absent or filtered out to prevent
aliasing, you could try to match your averaging interval (N=1 or N>1
full wave shapes, assuming a single periodic wave with no musical
chords or atonal components) with the length in samples of the waveform
you're analyzing.
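
The effect of matching the interval to the waveform can be sketched as
follows, assuming NumPy. To keep it short, the "resampled" signal is
simply synthesized at the matched rate, which is what ideal band-limited
resampling of a periodic signal would give; the helper names and the
numbers (110 Hz, three harmonics, 64 samples per period) are
assumptions for illustration.

```python
import numpy as np

def harmonic_signal(f0, fs, n, amps=(1.0, 0.5, 0.25)):
    """Sum of sine harmonics of f0, sampled at fs for n samples."""
    t = np.arange(n) / fs
    return sum(a * np.sin(2 * np.pi * (k + 1) * f0 * t)
               for k, a in enumerate(amps))

def leakage(x, f0, fs, n_harm=3):
    """Fraction of spectral energy outside the bins nearest the harmonics."""
    power = np.abs(np.fft.rfft(x)) ** 2
    bins = [int(round((k + 1) * f0 * len(x) / fs)) for k in range(n_harm)]
    return 1.0 - power[bins].sum() / power.sum()

f0, n = 110.0, 256
fs_fixed = 8000.0        # frame holds 3.52 periods: harmonics fall between bins
fs_matched = f0 * 64     # 64 samples per period: frame holds exactly 4 periods

unmatched = leakage(harmonic_signal(f0, fs_fixed, n), f0, fs_fixed)
matched = leakage(harmonic_signal(f0, fs_matched, n), f0, fs_matched)
print(unmatched, matched)
```

In the matched case the energy sits in the harmonic bins up to
floating-point error, while in the fixed-rate case a large share of it
is smeared over neighbouring bins.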
Presuming you have sufficient spectral components in a general FFT,
inverse-FFTing the waveform at a different fundamental frequency is
probably going to give you a hard time if you want any real accuracy.
Serious filtering could rid you of the transients that will mess up
your FFT results, but the results are probably going to be relatively
crude and have little to do with resampling in the EE sense, though
they may suffice for speech coding on phones and the like.
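
For the idealized case only, here is a sketch of rebuilding a waveform
at a different fundamental from its harmonic amplitudes and phases,
assuming NumPy. It assumes the input is exactly one period of a
stationary waveform; the function name and test values are hypothetical.
Real speech, with transients and an imprecisely known period, will not
behave this cleanly.

```python
import numpy as np

def resynthesize_period(period, new_f0, fs, n_out):
    """Rebuild a periodic waveform at new_f0 from the harmonics of one period."""
    p = len(period)
    c = np.fft.rfft(period) / p             # coefficient k = harmonic k of the period
    t = np.arange(n_out) / fs
    y = np.full(n_out, c[0].real)           # DC term
    for k in range(1, len(c)):
        if k * new_f0 >= fs / 2:            # drop harmonics that would alias
            break
        # the Nyquist coefficient of an even-length period counts once, others twice
        scale = 1.0 if (p % 2 == 0 and k == p // 2) else 2.0
        y += scale * np.abs(c[k]) * np.cos(2 * np.pi * k * new_f0 * t
                                           + np.angle(c[k]))
    return y

# One 64-sample period at 8 kHz (125 Hz), moved up an octave to 250 Hz
n = np.arange(64)
period = np.sin(2 * np.pi * n / 64) + 0.3 * np.cos(2 * np.pi * 3 * n / 64)
shifted = resynthesize_period(period, 250.0, 8000.0, 128)
```

Note that this moves every harmonic amplitude along with the
fundamental, so the spectral envelope (the formants) shifts too, just
as it would with plain resampling.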
Theo V.
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp