Wen Xue wrote:
Maybe a beginner's question here:

when pitch-synchronized OLA is used to modify speech pitch, do we
resample the original signal or not?


From 80s speech coding I recall that the formants of a signal could be determined by some form of FFT, and I suppose that, just as in other applications, you can overlap and average the results, depending on how the bin size works with that particular signal.

However, there's a conceptual difference between two approaches. One is to know or measure the length of the waveform, or of N waveforms (presuming there is a single, undisturbed waveform), and to make a harmonic analysis over exactly that length. The other is to take a fixed FFT interval length and do a general frequency analysis, in which the fundamental is not necessarily the lowest frequency of the FFT analysis. Alternatively, you can take an arbitrary-length FFT (not uncommon in modern accelerated libs) and live with the rounding error you'll get, which depends on the number of measured (and partially averaged) waves, their frequency, and the sample frequency. This rounding can be considerable, which for speech coding may be fine.
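To put a number on that rounding, here is a minimal sketch in Python. The function name and the example values (220 Hz at an 8 kHz sample rate) are my own assumptions for illustration. It rounds N periods to a whole number of samples and reports which fundamental that interval actually corresponds to.

```python
import math

def period_rounding_error(f0, fs, n_periods=1):
    """Fit n_periods of a tone at f0 (Hz) into a whole number of samples
    at sample rate fs (Hz). Returns the rounded interval length, the
    fundamental that length implies, and the pitch error in cents."""
    exact_length = n_periods * fs / f0       # generally not an integer
    length = round(exact_length)             # what an FFT interval must use
    implied_f0 = n_periods * fs / length     # fundamental the interval "sees"
    cents = 1200.0 * math.log2(implied_f0 / f0)
    return length, implied_f0, cents

# One period of 220 Hz at 8 kHz is 36.36 samples, rounded to 36.
print(period_rounding_error(220.0, 8000.0, n_periods=1))
# Eleven periods happen to fit exactly into 400 samples.
print(period_rounding_error(220.0, 8000.0, n_periods=11))
```

With a single period the error is a noticeable fraction of a semitone; averaging over more periods shrinks it, and for lucky combinations of frequency and sample rate it vanishes.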

You could also do an actual re-sampling of the signal, based on sampling theory, which entails taking proper equidistant impulse samples with your analog-to-digital converter, and using a shorter or longer windowed version of the sinc (sin(x)/x) function, with the proper math and signal flow in place.
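As a rough illustration of what windowed-sinc re-sampling looks like, here is a plain Python sketch. The Hann window, the default of 16 zero crossings per side, and the function names are assumptions chosen for the example; a real implementation would use precomputed filter tables and be far faster.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def resample(signal, ratio, half_width=16):
    """Resample by ratio = new_rate / old_rate using a Hann-windowed sinc.

    half_width is the number of sinc zero crossings kept on each side.
    When downsampling (ratio < 1) the cutoff is lowered and the kernel
    stretched accordingly, to limit aliasing."""
    n_out = int(len(signal) * ratio)
    cutoff = min(1.0, ratio)
    support = half_width / cutoff            # kernel half-length in input samples
    out = []
    for m in range(n_out):
        t = m / ratio                        # output instant, in input samples
        k_lo = int(math.floor(t - support)) + 1
        k_hi = int(math.floor(t + support))
        acc = 0.0
        for k in range(k_lo, k_hi + 1):
            if 0 <= k < len(signal):         # treat samples outside as zero
                d = t - k
                window = 0.5 * (1.0 + math.cos(math.pi * d / support))
                acc += signal[k] * cutoff * sinc(cutoff * d) * window
        out.append(acc)
    return out

# Example: a 100 Hz sine sampled at 8 kHz, re-sampled to 12 kHz.
fs = 8000.0
x = [math.sin(2 * math.pi * 100.0 * n / fs) for n in range(400)]
y = resample(x, 1.5)
```

Away from the ends of the buffer, where the kernel runs out of input, the output should follow the same sine evaluated at the new rate.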

If you did actual re-sampling, and made sure the re-sampled frequency is higher, or that harmonics were absent or filtered out to prevent aliasing, you could try to match your averaging interval (N=1 or N>1 full wave shapes, assuming a single wave, with no musical chords or atonal components) with the length in samples of the waveform you're analyzing.

Presuming you have sufficient spectral components in a general FFT, inverse-FFTing the waveform at a different fundamental frequency is probably going to give you a hard time if you want to be at all accurate. Serious filtering could rid you of the transients that will mess up your FFT results, but the results are probably going to be relatively crude and have little to do with re-sampling in EE terms, though they may suffice for speech coding on phones or so.
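For the simplest possible case, a single clean period, the idea of analyzing the harmonics and rebuilding them at another fundamental can be sketched as follows. This is only an illustration under strong assumptions (exactly one period in the buffer, no transients, no noise); the function names are made up for the example.

```python
import cmath
import math

def harmonics_of_period(period):
    """Plain DFT of exactly one period: bin k is the k-th harmonic.
    The Nyquist bin (for even lengths) is left out for simplicity."""
    n = len(period)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(period))
        for k in range((n - 1) // 2 + 1)
    ]

def resynthesize(coeffs, n_period, f_new, fs, n_out):
    """Rebuild the waveform with the same harmonic amplitudes and phases,
    but with harmonic k placed at k * f_new. Harmonics that would land at
    or above fs / 2 are dropped to avoid aliasing."""
    out = []
    for i in range(n_out):
        theta = 2 * math.pi * f_new * i / fs
        y = coeffs[0].real / n_period        # DC term
        for k in range(1, len(coeffs)):
            if k * f_new >= fs / 2:
                break
            y += (2.0 / n_period) * (coeffs[k] * cmath.exp(1j * k * theta)).real
        out.append(y)
    return out

# One 40-sample period (200 Hz at 8 kHz), rebuilt at 250 Hz.
fs = 8000.0
period = [math.sin(2 * math.pi * i / 40) for i in range(40)]
coeffs = harmonics_of_period(period)
shifted = resynthesize(coeffs, len(period), 250.0, fs, 64)
```

Note that each harmonic keeps its amplitude while moving in frequency, so the spectral envelope (and with it any formants) shifts along with the pitch, which is one reason the result is crude for speech.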


Theo V.

--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp
