> I can't understand the difference between SOLA, PSOLA and WSOLA.
I'll attempt a partial answer:
I think PSOLA and WSOLA are clearly distinct.
PSOLA involves identifying a time-varying pitch (fundamental frequency)
track for the input, segmenting the input signal into (possibly
overlapping) windowed grains which are synchronous to this fundamental
frequency (e.g. grains that are centered on glottal pulses), and then
altering the rate at which the grains are assembled in the output stream.
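
To make that concrete, here is a rough sketch of the grain-reassembly
step only. It assumes the pitch marks have already been found (that is
the hard part and is not shown); the function name psola() and the
parameters marks, stretch and pitch are made up for illustration, not
taken from any library.

```python
import math

def psola(x, marks, stretch=1.0, pitch=1.0):
    """Re-space pitch-synchronous grains. Illustrative sketch only.

    x       -- list of input samples
    marks   -- sorted sample indices of pitch marks (assumed given, >= 2)
    stretch -- output duration / input duration
    pitch   -- output pitch / input pitch
    """
    out = [0.0] * (int(len(x) * stretch) + 1)
    t = float(marks[0])  # time of the next output pitch mark
    j = 0                # index of the input mark currently used
    while t < len(out):
        # Pick the input mark nearest to the matching input time.
        src_time = t / stretch
        while (j + 1 < len(marks)
               and abs(marks[j + 1] - src_time) <= abs(marks[j] - src_time)):
            j += 1
        m = marks[j]
        # Local period: distance to a neighbouring mark.
        period = marks[j + 1] - m if j + 1 < len(marks) else m - marks[j - 1]
        centre = int(round(t))
        # Overlap-add a Hann-windowed grain two periods wide.
        for i in range(-period, period + 1):
            src, dst = m + i, centre + i
            if 0 <= src < len(x) and 0 <= dst < len(out):
                w = 0.5 + 0.5 * math.cos(math.pi * i / period)
                out[dst] += w * x[src]
        # Output marks are spaced by the (scaled) local period.
        t += period / pitch
    return out
```

With pitch=1.0 the grains are laid down one period apart, so the pitch
is kept and grains get repeated (stretch > 1) or skipped (stretch < 1).
With stretch=1.0 and pitch != 1.0 the same grains are packed closer
together or further apart instead.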
WSOLA involves breaking the signal into grains using some method (e.g.
constant duration grains), then concatenating input grains to the output
stream with relative phase adjusted according to two criteria: (1) on
average, the input must be consumed at a rate that maintains the
timescaling factor; (2) the source material should be mixed (with
windowing) into the output stream in a way that minimizes local error
over the crossfade region (i.e. to minimize phase cancellation). If
the signal is strongly periodic, and the parameters are just right, this
will keep the period of the source waveform fairly well, but I think it
lacks sub-sample-accurate phase alignment. You can add enhancements
such as trying to avoid mixing the same transient into the output stream
more than once.
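
A minimal sketch of the two criteria, under some assumptions of mine:
constant-duration grains, a Hann window with 50% overlap, and plain
cross-correlation against the "natural continuation" of the previous
grain as the similarity measure. The name wsola() and the parameters
grain and search are hypothetical, and none of the enhancements
(transient handling etc.) are included.

```python
import math

def wsola(x, stretch, grain=256, search=64):
    """Time-stretch x by `stretch` (2.0 = twice as long). Sketch only."""
    if len(x) < grain + search:
        raise ValueError("input too short for these parameters")
    hop_out = grain // 2          # fixed output hop (50% overlap)
    hop_in = hop_out / stretch    # average input hop: criterion (1)
    # Periodic Hann window; copies at 50% overlap sum to 1.
    win = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / grain)
           for i in range(grain)]
    n_frames = max(1, int((len(x) - grain - search) / hop_in))
    out = [0.0] * ((n_frames - 1) * hop_out + grain)
    prev = 0                      # input position of the previous grain
    for k in range(n_frames):
        nominal = int(round(k * hop_in))
        pos = 0
        if k > 0:
            # What would follow the previous grain if nothing were changed.
            target = prev + hop_out
            lo = max(0, nominal - search)
            hi = min(len(x) - grain, nominal + search)
            best = -float("inf")
            # Criterion (2): pick the offset near the nominal position
            # that best matches the target over the crossfade region.
            for cand in range(lo, hi + 1):
                score = sum(x[cand + i] * x[target + i]
                            for i in range(hop_out))
                if score > best:
                    best, pos = score, cand
        # Windowed overlap-add of the chosen grain.
        for i in range(grain):
            out[k * hop_out + i] += win[i] * x[pos + i]
        prev = pos
    return out
```

Note that the search is over whole-sample offsets only, which is where
the lack of sub-sample alignment comes from. The brute-force search is
also slow; it is written this way just to keep the logic visible.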
Not sure what SOLA is.
dupswapdrop: music-dsp mailing list