On 22.05.2018 at 14:11, Theo Verelst wrote:
> fundamentally limited by the length of the sinc (-like) "perfect"
> resample kernel, and the required delay for accurate re-sampling might
> be considerable!
This can be limited by increasing the sampling rate, which reduces the
coarseness, but it leads to a large number of taps. I have made some
progress with this recurring issue in digital signal processing by
replacing DSPs with FPGAs running at high speeds.
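To put rough numbers on the tap count versus delay trade-off, here is a
minimal sketch in Python. It assumes a Kaiser-windowed sinc low-pass and
uses Kaiser's rule of thumb for the filter length; the function names,
the 2 kHz transition width and the 96 dB attenuation are illustrative
assumptions, not values from this thread.

```python
import math

def kaiser_tap_estimate(fs_hz, transition_hz, atten_db):
    """Rough tap count for a Kaiser-windowed sinc low-pass (Kaiser's rule of thumb)."""
    # Transition width expressed in radians per sample.
    delta_omega = 2.0 * math.pi * transition_hz / fs_hz
    return math.ceil((atten_db - 7.95) / (2.285 * delta_omega)) + 1

def linear_phase_delay_s(num_taps, fs_hz):
    """Group delay of a symmetric (linear-phase) FIR filter, in seconds."""
    return (num_taps - 1) / (2.0 * fs_hz)

# Hypothetical comparison: same transition width in Hz at two sampling rates.
for fs in (48_000, 192_000):
    taps = kaiser_tap_estimate(fs, transition_hz=2_000, atten_db=96)
    delay_ms = 1e3 * linear_phase_delay_s(taps, fs)
    print(fs, "Hz:", taps, "taps,", round(delay_ms, 3), "ms delay")
```

With the transition width held fixed in Hz, the estimated tap count
grows roughly in proportion to the sampling rate, while the delay in
seconds stays about the same.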
> getting a huge transient at the end, or you do some sort of
> smoothing.
This will always be an issue that is hard to avoid. Smoothing between
wavelets that are processed in parallel (pipelined) is essential, and
the shorter the fragments are, the fewer artifacts one will get. Doing
this in the frequency domain will require e.g. a tight FFT with high
overlap.
We have similar issues when processing radar reflections or time of
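One common way to smooth between fragments is a windowed overlap-add
crossfade. The following is only a sketch under stated assumptions: a
periodic Hann window, 50% overlap and an even frame length, with a
hypothetical `process` callback standing in for whatever is done to
each fragment. It is not a description of any particular pitch shifter.

```python
import math

def hann_periodic(n):
    """Periodic Hann window; copies shifted by n/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def overlap_add(signal, frame_len, process=lambda frame: frame):
    """Window each fragment, process it, and crossfade via 50% overlap-add.

    Assumes frame_len is even. `process` is a hypothetical per-fragment
    operation that must return a list of the same length.
    """
    hop = frame_len // 2
    window = hann_periodic(frame_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frame = process(frame)
        for i, value in enumerate(frame):
            out[start + i] += value
    return out

# With the identity process, the interior of a constant signal is
# reconstructed; only the first and last half frame are faded.
result = overlap_add([1.0] * 64, frame_len=16)
print(all(abs(v - 1.0) < 1e-12 for v in result[8:56]))
```

Shorter frames mean each crossfade covers less time, and a higher
overlap than 50% needs a different window normalization than the one
assumed here.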
> Also, for programs meant for the human voice, there might be issues
> because those programs might do estimations of the voice parameters in
> order to change pitch
With the voice it is even trickier, since the formant shaping differs
across frequencies. One reason is that more than one "equalizer" is
involved. Just shifting the whole track to a higher frequency will
disregard this.
Moreover, the tuning is not even correct:
Very skilled singers sing the dark vowels "A", "O" and "U" at a slightly
higher pitch than "I" and "E". This is because the dark vowels appear
lower even when sung at the right pitch. This offset is not static,
though, but related to the music: a very low "A" is overpitched a bit
more than a higher "A".
All this cannot be handled by a static pitch shift.
And there is much more...
dupswapdrop: music-dsp mailing list