On Mon, 29 Oct 2018 at 20:09, gm <g...@voxangelica.net> wrote:

> That's understood.
>
> What is not completely understood by me is the technique in the paper, and
> the very much related technique from the book.
> How can this apply to arbitrary signals when it relies on sinusoids
> separated by several bins?
>
For music, it is a fairly close approximation during the "sustain" part of
notes.


>
> Also it seems I don't understand where the artefacts in my pitch shift come
> from.
> They seem to have to do with phases, but I don't understand how exactly.
>
> What is understood is that the neighbouring bins of a sinusoidal peak
> have a phase -pi apart.
>
> I don't see the effect of this though; they rotate in the same direction,
> at the same speed.
>
> But why is there no artefact of this kind when the signal is only
> stretched,
> but not shifted?
>

I think the big-picture answer to this last question is: time stretch is
(or at least can be) done by a continuous factor, while bin shifting is
inherently quantised.  Moreover, the quantised nature of the bins isn't
really the same as, say, breaking a continuous interval down into a set of
contiguous equal-sized intervals, because of the sinc-shaped appearance of
frequencies in the quantised domain and because of edge effects.  In theory
one could find a set of sines whose projection onto the bins reproduces the
frequency-domain picture you get, but that would be computationally quite
expensive compared to bin shifting (and is normally approximated by peak
finding).
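
To make the sinc-shaped spreading concrete, here's a quick, untested numpy
sketch (the FFT size and the bin numbers are just illustrative choices of
mine, not anything from your setup):

import numpy as np

N = 1024                                       # FFT size (arbitrary)
n = np.arange(N)

on_bin  = np.sin(2 * np.pi * 100.0 * n / N)    # lands exactly on bin 100
off_bin = np.sin(2 * np.pi * 100.5 * n / N)    # lands halfway between bins

# Magnitudes around bin 100: the on-bin sine is confined to one bin, while
# the off-bin sine smears with a sinc-like shape over its neighbours.
print(np.round(np.abs(np.fft.rfft(on_bin))[97:104], 1))
print(np.round(np.abs(np.fft.rfft(off_bin))[97:104], 1))

That off-bin smearing is why an integer shift of bin contents is only an
approximation of shifting the underlying frequencies.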

If, when you are shifting bins, each bin ends up with the value of some
single bin from the original, then the phase could be close.  But then you
might be missing some input data in your transform.  On the other hand, if
the target value of a bin combines several input bins, then the phase will
likely be messed up most of the time, because you're encoding two or more
distinct phase values into one.
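
In code the two cases look roughly like this (an untested sketch over a
single frame of FFT bins; the function names are mine, not from any
library):

import numpy as np

def shift_bins_integer(X, k):
    # Each target bin takes the value of exactly one source bin,
    # so every bin still carries a single, coherent phase.
    Y = np.zeros_like(X)
    if k >= 0:
        Y[k:] = X[:len(X) - k]
    else:
        Y[:k] = X[-k:]
    return Y

def shift_bins_fractional(X, f):
    # Each target bin is a weighted sum of two source bins.  The
    # magnitudes interpolate plausibly, but the two contributions
    # generally have unrelated phases, so summing them is exactly the
    # "two distinct phase values encoded into one" problem above.
    k = int(np.floor(f))
    a = f - k
    return (1.0 - a) * shift_bins_integer(X, k) + a * shift_bins_integer(X, k + 1)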

Otherwise, for more precise treatment of phase than wrap/unwrap, see this
chapter <http://sethares.engr.wisc.edu/vocoders/Transforms.pdf> pages
118-120.
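
For context, the plain wrap/unwrap bookkeeping that the chapter refines is
roughly the following (hop size, FFT size, and the function names are
placeholders of mine):

import numpy as np

N = 2048          # FFT size (assumed)
H = 512           # analysis hop in samples (assumed)

def princarg(phi):
    # Wrap a phase difference into [-pi, pi).
    return np.mod(phi + np.pi, 2.0 * np.pi) - np.pi

def bin_frequencies(phase_prev, phase_curr, sr):
    # Estimate each bin's "true" frequency in Hz from the phase advance
    # between two analysis frames one hop apart.
    k = np.arange(len(phase_curr))
    expected = 2.0 * np.pi * H * k / N                # nominal advance of bin k
    deviation = princarg(phase_curr - phase_prev - expected)
    return (expected + deviation) * sr / (2.0 * np.pi * H)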

I myself also found it difficult to wrap my head around why TSM gives a
better pitch shift than shifting directly in the frequency domain; the
above is what I arrived at, and I find it convincing.  Frequency-domain
pitch shift, I think, can only be as good as TSM-based pitch shift if the
frequency domain is treated continuously rather than in bins.  If you do
that, then you no longer have an FFT, and things get costly and complicated
fast.
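
Schematically, the TSM route to pitch shifting is just "stretch, then
resample" -- something like this sketch, where time_stretch stands for
whatever phase-vocoder TSM you already have, and the resampler is scipy's
(my choice here, nothing prescribed above):

from fractions import Fraction
from scipy.signal import resample_poly

def pitch_shift_via_tsm(x, semitones, time_stretch):
    ratio = 2.0 ** (semitones / 12.0)        # desired pitch ratio
    stretched = time_stretch(x, ratio)       # TSM: output is 'ratio' times longer
    # Resampling the stretched signal back to the original duration
    # transposes it by 'ratio' while keeping the original length.
    frac = Fraction(ratio).limit_denominator(1000)
    return resample_poly(stretched, frac.denominator, frac.numerator)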

Hope that helps
Scott

> Am 29.10.2018 um 19:50 schrieb Scott Cotton:
>
>
>
> On Mon, 29 Oct 2018 at 19:12, gm <g...@voxangelica.net> wrote:
>
>>
>>
>> Am 29.10.2018 um 05:43 schrieb Ethan Duni:
>> > You should have a search for papers by Jean Laroche and Mark Dolson,
>> > such as "About This Phasiness Business" for some good information on
>> > phase vocoder processing. They address time scale modification mostly
>> > in that specific paper, but many of the insights apply in general, and
>> > you will find references to other applications.
>> >
>> > Ethan
>>
>> I think the technique from the paper only applies to monophonic
>> harmonic input -?
>> It picks amplitude peaks and reconstructs the phase on the bins around
>> them depending on the synthetic phase and the neighbouring input phase.
>> I don't really see what it should do exactly tbh, but the criterion for
>> a peak is that it is larger than its four neighbouring bins, so this
>> doesn't apply to arbitrary signals, I think.
>>
>> I also tried Miller Puckette's phase locking mentioned in his book The
>> Theory and Technique of Electronic Music and also mentioned in the
>> paper, but a) I don't hear any difference and b) I don't see how and
>> why it should work.
>>
>> From the structure displayed in the book, he adds two neighbouring
>> complex-valued bins, multiplied. That is, he multiplies their real and
>> imaginary parts respectively and adds that to the values of the bin
>> (Fig 9.18, p. 293). Unfortunately this is not explained in detail.
>>
>> I don't see what that would do other than adding a tiny perturbation to
>> isolated peaks and a somewhat larger one to the neighbouring bins of
>> peaks? I don't see how this should lock the phases of neighbouring bins?
>>
>> And again, this doesn't apply to arbitrary signals?
>>
>
> No phase vocoder applies to arbitrary signals.  PVs work for polyphonic
> input where
> -  the change in frequency over time is slow; specifically, slow enough
> that the frequency calculation step over one hop can estimate "the"
> frequency over the corresponding time slice.
> -  there is at most one sinusoidal "component" per bin (this is
> unrealistic for many reasons), meaning the time slice and the FFT need to
> be large enough to distinguish components.
>
> Note the above can't handle, for example, onsets for most musical
> instruments.
>
> Nonetheless, the errors when these conditions do not hold are such that
> some are able to make decent-sounding TSM/pitch-change phase vocoders for
> a wider variety of input.
>
> If you put a chirp into a PV and change the rate of frequency change of
> the chirp, you'll hear the slow-frequency-change problem. Before you hear
> it, you'll see it in the form of little steps in the waveform.
>
> Scott
>
>
>
>
>
>
>
> --
> Scott Cotton
> http://www.iri-labs.com
>
>
>
>
>



-- 
Scott Cotton
http://www.iri-labs.com