Re: [music-dsp] A theory of optimal splicing of audio in the time domain.

Olli Niemitalo Wed, 13 Jul 2011 06:29:53 -0700

On Sat, Jul 9, 2011 at 10:53 PM, robert bristow-johnson
<r...@audioimagination.com> wrote:
> On Dec 7, 2010, at 5:27 AM, Olli Niemitalo wrote:
>
> > [I] chose that the ratio a(t)/a(-t) [...] should be preserved
>
> by "preserved", do you mean constant over all t?


Constant over all r.

> what is the fundamental reason for preserving a(t)/a(-t) ?

I'm thinking outside your application of automatically finding splice
points. Think of crossfades between clips in a multi-track sample
editor. For a crossfade in which one signal is faded in with a volume
envelope that is the time-reverse of the envelope with which the other
signal is faded out, a(t)/a(-t) describes in what proportions the two
signals are mixed at each t. The fundamental reason, then, is that I
think the ratio is a rather good description of the shape of the fade
to a user, as it describes how the second signal swallows the first
over time. The user might choose one "shape" for a particular
crossfade. Then, depending on the correlation between the superimposed
signals, an appropriate symmetrical volume envelope could be applied to
the mixed signal to ensure that there is no peak or dip in the contour
of the mixed signal. Because the envelope is symmetrical, applying it
"preserves" a(t)/a(-t). It can also be incorporated directly into a(t).
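
A minimal Python sketch of that idea, assuming NumPy, a linear base
fade as the user-chosen "shape", unit-power signals, and a known
correlation coefficient r > -1. The function names are hypothetical
and only for illustration. It divides the base fade by a symmetrical
envelope e(t) so that the mixed power stays flat, and checks that
a(t)/a(-t) is unchanged:

```python
import numpy as np

def mix_power_gain(a_in, a_out, r):
    """Power gain of a_in*x + a_out*y for two unit-power signals x, y
    whose correlation coefficient is r."""
    return a_in**2 + a_out**2 + 2.0 * r * a_in * a_out

def shaped_fade(t, r):
    """Return (a(t), a(-t), e(t)) on t in [-1, 1], for -1 < r <= 1."""
    base_in = 0.5 * (1.0 + t)    # assumed user-chosen shape: linear fade-in
    base_out = 0.5 * (1.0 - t)   # its time-reverse, i.e. base_in at -t
    # Symmetrical envelope e(t) = e(-t) that flattens the mixed power.
    e = 1.0 / np.sqrt(mix_power_gain(base_in, base_out, r))
    return base_in * e, base_out * e, e

t = np.linspace(-1.0, 1.0, 101)
r = 0.3                           # example correlation coefficient
a_in, a_out, e = shaped_fade(t, r)

inner = slice(1, -1)              # skip the endpoints, where one gain is 0
ratio_before = (1.0 + t[inner]) / (1.0 - t[inner])
ratio_after = a_in[inner] / a_out[inner]
print(np.allclose(ratio_before, ratio_after))             # ratio preserved
print(np.allclose(mix_power_gain(a_in, a_out, r), 1.0))   # flat power contour
```

The envelope e(t) depends on r, but because it multiplies a(t) and
a(-t) by the same factor at each t, it cancels in the ratio.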

All that is not so far off from the application you describe.

> but i don't think it is necessary to deal with lags where Rxx(tau) < 0.  why
> splice a waveform to another part of the same waveform that has opposite
> polarity?  that would create an even bigger glitch.

Splicing at quiet regions with negative correlation can give a smaller
glitch than splicing at louder regions with positive correlation. This
applies particularly to rhythmic material like drum loops, where the
time lag between the splice points is constrained, and it may make
most sense to look for quiet spots. However, if it's already so quiet
in there, I don't know how much it matters what you use for a
cross-fade.

Apart from "it's so quiet it doesn't matter", I can think of one other
objection against using cross-fades tailored for r < 0: For example,
let's imagine that our signal is white noise generated from a Gaussian
distribution, and we are dealing with given splice points for which
Rxx(tau) < 0 (slightly). Now, while the samples of the signal were
generated independently, there is "by accident" a bit of negative
correlation in the instantiation of the noise, between those splice
points. Knowing all this, shouldn't we simply use a constant-power
fade, rather than a fade tailored for r < 0, because random deviations
in noise power are to be expected, and only a constant-power fade will
produce noise that is statistically identical to the original? I would
imagine that noise with long-time non-zero autocorrelation (all the
way across the splice points) is a very rare occurrence. Then again,
do we really know all this, or even that we are dealing with noise?
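
To illustrate the "by accident" part, here is a rough Python sketch,
assuming NumPy. The segment length, trial count and seed are arbitrary
choices, and correlation_coefficient is a hypothetical helper. It
measures the zero-lag correlation coefficient between pairs of
independently generated Gaussian noise segments. The measured r is
never exactly zero; it scatters around zero with a spread of roughly
1/sqrt(n), and is negative about half of the time:

```python
import numpy as np

def correlation_coefficient(x, y):
    """Zero-lag normalized cross-correlation of two equal-length segments."""
    return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

rng = np.random.default_rng(0)   # arbitrary seed, for repeatability
n = 4096                         # samples per segment (assumed)
trials = 200                     # number of independent segment pairs

r_values = np.array([
    correlation_coefficient(rng.standard_normal(n), rng.standard_normal(n))
    for _ in range(trials)
])

spread = r_values.std()                      # expected near 1/sqrt(n)
fraction_negative = np.mean(r_values < 0.0)  # expected near one half
print(spread, 1.0 / np.sqrt(n), fraction_negative)
```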

I should note that Rxx(tau) < 0 does not imply opposite polarity, in
the fullest sense of the adjective. Two equal sinusoids that have
phases 91 degrees apart have a correlation coefficient of cos(91
degrees), which is about -0.017.
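
A quick numerical check in Python, assuming NumPy and a correlation
coefficient measured over exactly one full period (sinusoid_correlation
is a hypothetical helper name). For two sinusoids of equal frequency
and amplitude, the coefficient equals the cosine of the phase
difference:

```python
import numpy as np

def sinusoid_correlation(phase_deg, n=3600):
    """Correlation coefficient of two equal sinusoids whose phases differ
    by phase_deg degrees, measured over one full period of n samples."""
    t = np.arange(n) / n
    x = np.sin(2.0 * np.pi * t)
    y = np.sin(2.0 * np.pi * t + np.deg2rad(phase_deg))
    return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

r_91 = sinusoid_correlation(91.0)
print(r_91, np.cos(np.deg2rad(91.0)))   # both about -0.0175
```

So a slightly negative r only says the signals are a little past
quadrature, not that one is an inverted copy of the other (which would
give r = -1).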

RBJ, I'd like to return the favor and let you know that I have great
respect for you in these matters (and absolutely no disrespect in any
others :-) ). Hey, I wonder if you also missed my other post in the
parent thread? You can search for
AANLkTim=eM_kgPeibOqFGEr2FdKyL5uCCB_wJhz1Vne

-olli
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp
