Dear Peter,

There are numerous (academic) sources which cite phase vocoding as a
"solved problem" when used in conjunction with transient detection and
phase locking.  I don't entirely agree with that assessment.

Phase vocoders often have limitations around the following:
1. integer vs real/float stretch factors
2. total amount of stretch (some phase vocoders deteriorate a lot as the
stretch factor approaches 2 or 1/2).  Given 1) above, this can also mean
that no stretch factors are in fact useful.
3. synthesis power/volume is often only a rough approximation
4. windowing conditions often don't readily scale to different overlap
factors, constraining the available quality/cost tradeoffs.
5. real-time dynamic pv can introduce additional artefacts at points where
the stretch factor changes.
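
To make point 4 a bit more concrete, here is a minimal Python sketch (my
own illustration, not taken from any particular pv) that measures how far
the overlap-added window sum is from constant for a given hop.  The
function names, the periodic Hann window and the sizes 1024/512/256/768
are all assumptions picked for the example.

```python
import math

def hann(n):
    """Periodic Hann window of length n (assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def overlap_add_ripple(window, hop, power=1):
    """Relative ripple (max - min) / mean of the steady-state sum of
    window**power shifted by multiples of hop.  0.0 means the
    constant-overlap-add condition holds for this window/hop pair."""
    n = len(window)
    # In steady state the sum is periodic in the hop, so one period suffices.
    sums = [sum(window[i] ** power for i in range(offset, n, hop))
            for offset in range(hop)]
    mean = sum(sums) / hop
    return (max(sums) - min(sums)) / mean

w = hann(1024)
for hop in (512, 256, 768):
    # power=1: window applied once; power=2: same window at analysis and synthesis
    print(hop, overlap_add_ripple(w, hop), overlap_add_ripple(w, hop, power=2))
```

With the window applied once, hops of 512 and 256 give essentially zero
ripple and 768 does not.  With the same window applied at both analysis
and synthesis (power=2), hop 512 no longer sums to a constant while hop
256 still does.  In a time stretch the synthesis hop differs from the
analysis hop, so the condition has to be rechecked at the synthesis hop.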

I'm sure existing pv authors have more to say.  There are also related
sines + noise decompositions, and a lot of academic reading material.
Many pv authors report that getting it to work is much harder than
textbooks lead one to believe.  Often there are problems associated with
computing/estimating the principal value of a phase difference for
frequency estimation.
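
As a rough illustration of the principal-value step, here is a small
Python sketch of the usual textbook estimator: wrap the deviation from the
expected per-hop phase advance into [-pi, pi) and convert it to a
frequency offset from the bin centre.  The names princarg,
estimate_frequency and phases, and the numbers used (44100 Hz, 1024-point
frames, a 1000 Hz test tone), are assumptions for the example; the phases
are simulated directly rather than taken from an FFT.

```python
import math

TWO_PI = 2.0 * math.pi

def princarg(phi):
    """Wrap a phase in radians to its principal value in [-pi, pi)."""
    return (phi + math.pi) % TWO_PI - math.pi

def estimate_frequency(phase_prev, phase_curr, bin_index, fft_size, hop, sample_rate):
    """Estimate the frequency in Hz of the sinusoid in one bin from the
    phases of that bin in two frames that are hop samples apart."""
    omega_bin = TWO_PI * bin_index / fft_size   # bin centre, rad/sample
    expected = omega_bin * hop                  # advance of a bin-centred sine
    deviation = princarg(phase_curr - phase_prev - expected)
    return (omega_bin + deviation / hop) * sample_rate / TWO_PI

# Simulated (wrapped) bin phases of a stationary 1000 Hz sine.
sample_rate, fft_size, true_hz = 44100.0, 1024, 1000.0

def phases(hop, start=0.3):
    """Phases of the tone at two frames hop samples apart (hypothetical helper)."""
    return princarg(start), princarg(start + TWO_PI * true_hz * hop / sample_rate)

p0, p1 = phases(256)
print(estimate_frequency(p0, p1, 23, fft_size, 256, sample_rate))  # nearest bin
p0, p1 = phases(512)
print(estimate_frequency(p0, p1, 25, fft_size, 512, sample_rate))  # two bins off
```

The first call recovers the tone's frequency.  In the second, the true
deviation exceeds pi, so the wrap picks the wrong branch and the estimate
lands sample_rate / hop (about 86 Hz) away.  That ambiguity is one place
where things go wrong in practice: the estimate is only trustworthy when
the bin is close enough to the sinusoid, or the hop small enough, for the
deviation to stay inside the principal range.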

I've not yet made a good pv myself, except perhaps using too
computationally costly algos, so take this with a grain of salt.  I'd be
interested in hearing others' ideas around current pv limitations and
quality too.

Scott

On Sun, 28 Oct 2018 at 11:22, Peter P. <peterpar...@fastmail.com> wrote:

> Dear Scott,
>
> * Scott Cotton <w...@iri-labs.com> [2018-10-28 10:49]:
> > I don't know if you're "doing it the right way", however, pitch shift by
> > bin shifting has the following problems:
> >
> > - edge effects (using windowing can help)
> > - pitch shift up puts some frequencies above nyquist limit, they need to
> >   be elided
> > - the quantised pitch shift is only an approximation of a continuous
> >   pitch shift because the sinc shaped realisation of a pure sine wave in
> >   the quantised frequency domain can occur at different distances from
> >   the bin centers for different sine waves, shifting bins doesn't do
> >   this and thus isn't 100% faithful.
> >
> > From the sound clip, I'd guess that you might have some other problems
> > related to normalising the synthesis volume/power
> >
> > The best quality commonly used pitch shift comes from a phase vocoder
> > TSM: stretch the time and then resample (or vice versa) so that the
> > duration of input equals that of output.  Phase vocoders however vary a
> > lot in the quality of sound they produce, some are even as bad or worse
> > than the example you provided.
>
> Thank you for this nice explanation, I wonder if you could even add a
> few more lines to it regarding the quality of phase vocoders. Your text
> ended when it was getting even more exciting. :)
>
> Thanks!
> P
> _______________________________________________
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
>

-- 
Scott Cotton
http://www.iri-labs.com
