Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-12-02 Thread Stefan Stenzel
Now I wonder, am I the only one to calculate ACF using FFT?

Regarding seamless loops, I found that quantizing frequencies
to integer numbers of periods in the loop works extremely well.

Regards,
Stefan
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-11-29 Thread Element Green
On Thu, Nov 25, 2010 at 9:33 PM, robert bristow-johnson
r...@audioimagination.com wrote:

 depending on how big your window is, i think a better term for this is
 *cross-correlation* not autocorrelation.  it's a single stream of audio so
 in a sense of the word, it *is* autocorrelation, but what i normally think
 of, with that semantic is something where the lag is no bigger or not much
 bigger than the analysis window of either loop-end region of the audio and
 the loop-begin.

 if the loop points are separated by a much longer time (number of samples)
 than the size (in samples) of the two slices of audio being correlated, it's
 really cross-correlation.  and you might find poor correlation given all
 lags that you're looking at.  in fact, doing cross-correlation from one part
 of the tone or sound to another part that has a rapid change in amplitude
 envelope might fool your correlation into thinking there is a good match
 when there really isn't (because the amplitude is increasing, then the
 cross-correlation increases, but not necessarily because of a good match).

 so, instead of either cross or autocorrelation, you might want to consider
 AMDF between the loop end and potential candidates to loop back to.  instead
 of looking for a maximum, you're looking for a minimum and a very low
 minimum means a good match (or a bad match during a very low signal level).

Looking at the equation here for AMDF:
http://mi.eng.cam.ac.uk/~ajr/SpeechAnalysis/node72.html

It seems like the algorithm I came up with independently is very
similar.  The absolute value of the difference of the sample points is
taken as with AMDF.  Prior to summing the values together though, I'm
multiplying by the window I described before (with a peak in the
center where the loop point is), giving samples closer to the loop
point more weight.

In practice this seems to work quite well and I'm going to leave it as
is for now.  It seems reasonably fast and straight forward.


 find good loop points, then crossfade.

 another thing about cross fading is that there is something you can do to
 adapt a little to better or poor loop points.  if the loop points (and the
 window surrounding them) match well, then you're doing a crossfade between
 coherent audio and a constant voltage crossfade is indicated (when the
 crossfade is half done, both the fade out and fade in envelopes are at 50%).
  if the loop points are not well matched (but it's the best loop points your
 correlation function can find), then you want to do a crossfade that is
 closer to a constant power crossfade where both fade in and fade out
 envelopes are at 70.7% at the midpoint of the crossfade.  there is a way to
 define the optimal crossfade function for any correlation between 0 (when
 it's like crossfading white noise to white noise) to 100% (like crossfading
 a perfectly periodic waveform to a similarly appearing portion of the
 waveform at loop start).

 does any of this make any sense?


I'm not sure I'm following you.  From what I can understand it sounds
like you are saying that the degree to which the two loop point signal
windows match could be used to select different cross fade envelope
curves, for a better perceptual cross fade.  I hadn't given this much
thought and just assumed a linear cross fade (0-100%) would be the way
to do it (that is from a limited DSP background mind you).  I am
intrigued by this idea though.  Any tips on how to generate the
envelope functions and what sort of equation could be used for
selecting the optimal envelope based on the signal correlation?

 can i ask what the application is? (i may have missed it, but i'll look at
 earlier posts.)  if it's looping for sound/instrument samples, this is an
 analysis thing that is not real-time and we can consider finding the best
 loop-begin points for a large variety of possible loop-end points.  then
 pick the pair that looks  best, given whatever your measure of good is.  but
 in a (time-domain) real-time pitch shifter, having so many choices may not
 be available to you.  you might find yourself in a situation where your
 loop-end is pretty well defined, you have to find a place to splice to and
 take the best that you can get from that.


Its a sample/instrument editor, so its all non-realtime.

 --

 r b-j                  ...@audioimagination.com

 Imagination is more important than knowledge.


Thanks for the helpful info!
Element Green
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-11-26 Thread robert bristow-johnson


On Nov 26, 2010, at 2:21 AM, Ross Bencina wrote:


robert bristow-johnson wrote:
you can have a periodic (or quasi-periodic) signal with absolutely  
no energy at harmonic #1 (what i would call the fundamental), and  
as long  as it has energy in most other odd harmonics, the  
autocorrelation  function will work just as well.  there will still  
be peaks at lags  are at integer multiples of the apparent period.


The way I see it, the ACF will be quasi-periodic at the rate of the  
lowest present harmonic.


but that's not the case.  if x(t) is periodic, then Rx(tau) (as  
calculated with an infinitely large window) is periodic with the same  
period.


a pretty robust definition of the autocorrelation of x(t) might be:

   +T
 Rx(tau) =  lim{   1/(2T) integral{ x(t)*x(t+tau) dt}   }
   T-inf  -T

now, we don't have infinitely large windows to calculate Rx() with, so  
as the lag, tau, increases the overlap of the windows is smaller


let
 v(t) =  x(t)*w(t) (w(t) is a very wide window function.)

then
 Rx(tau)  ~=   Rv(tau)

  +inf
 Rv(tau)   =  integral{ v(t)*v(t+tau) dt}
  -inf

so, while x(t) is periodic, v(t) is not and there is an envelope on  
the apparent autocorrelation Rx(tau) which is:


  +inf
 Rw(tau)   =  integral{ w(t)*w(t+tau) dt}
  -inf

if P is the period of x(t), then

 x(t+P) = x(t)for all t

and
  +inf
 Rv(tau)   =  integral{ x(t)*w(t)*x(t+tau)*w(t+tau) dt}
  -inf

  +inf
 Rv(P) =  integral{ x(t)*w(t)*x(t+P)*w(t+P) dt}
  -inf

  +inf
   =  integral{ (x(t)^2)*w(t)*w(t+P) dt}
  -inf
but
  +inf
  Rv(0)=  integral{ (x(t)^2)*(w(t)^2) dt}
  -inf

Rv(P) is reduced from Rv(0) because w(t)*w(t+P) is smaller than w(t)^2  
because w(t) and w(t+P) do not overlap as much as w(t) overlaps on top  
of itself.  i think, for some windows (like the rectangular window),  
you can show something like


 Rv(n*P)  =  Rx(n*P)*Rw(n*P)

which is true only at the multiples of the period, P.

underneath that envelope, you will see peaks at multiples of the  
period of x(t), whether there is 1st harmonic in there or not.


now there are other ways of computing the autocorrelation (or  
something that looks like it) that do not have this envelope resulting  
from windowing, so the autocorrelation peaks are as large as Rx(0) if  
x(t) is perfectly periodic.  the method shown with Rv(tau) is what you  
would get if you were using the inverse FT of |v(t)|^2.


This doesn't tell you anything about what the fundamental is. The  
peaks may be there, but how do you know which one(s) are related to  
the fundamental?


there *is* the octave problem.  you can have a tone of 360 Hz (with  
all of the harmonics), but mathematically, it is also a tone of 180 Hz  
(where all the odd harmonics have zero energy) or 120 Hz or 90 Hz  
(with a lot of missing harmonics).  the autocorrelation function will  
peak at those periods also.  so you look for the peak of sufficient  
height (so, unfortunately, there is some kind of thresholding needed)  
that has the smallest lag, which would be the one at 1/360 sec


--

r b-j  r...@audioimagination.com

Imagination is more important than knowledge.




--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-11-25 Thread Ross Bencina

Element Green wrote:

I'm the author of a SoundFont instrument editing application called
Swami (http://swami.sourceforge.net).  A while back an interested
developer added a loop finding algorithm which I integrated into the
application.  This feature is supposed to generate a list of start/end
loop points which are optimal for seamless loops.



What kind of loops are you talking about here? loops in pitched musical 
instrument samples or drum/percussive rhythm loops?




The original algorithm was based on autocorrelation.  There were many
bugs in the implementation and I was having trouble understanding how
it functioned, so I wrote a new algorithm which currently does not use
autocorrelation.  The new algorithm seems to come up with good
candidates for seamless loops, but is slower than the old algorithm
by a factor of 5, but at least it does not suffer from bugs, which
often resulted in unpredictable results.



Assumng you're talking about musical instrument samples then autocorrelation 
(or AMDF) is one way to do it.


You need to get clear about whether your only goal is seamless or if you 
want to preserve the pitch of the original accurately too -- that's pretty 
important for musical instrument samples :-) and will be more relevant for 
short loops. In that case you may be better off with a good spectral 
fundamental frequency estimator like this one to choose the loop length:

http://ems.music.uiuc.edu/beaucham/papers/JASA.04.94.pdf

That could be combined with autocorrelation/AMDF and/or crossfading to 
choose where to place the loop.


Note that there are a number of different ways you can perform the crossfade 
if you choose to do that (linear, equal power, I remember my Ensoniq EPS 
even had some weird ensemble  bow-tie crossfade options).


For longish loops on musical/tonal/synth sounds that are spectrally dynamic 
(ie they have lots of phasing or lfo stuff going on) you could also use 
spectral descriptors to find good match points (you probably want to combine 
this with autocorrelation/AMDF to get both the time domain and spectral 
aspects correct).


A while ago James Chandler Jr gave a really good description of an off-line 
time stretching algorithm, part of which did looping for pitched sounds --  
you might want to have a look in the list archives that.. it's probably of 
interest to you. From memory one of the points he made was that one of the 
loop points within a few samples either side of the one automatically chosen 
might actually sound better.. so you could offer a few options to the user 
+/- 1 or two samples.





I have limited knowledge in the area of DSP, so I thought I would seek
advice on this list to determine if the new algorithm makes sense, if
anyone has any ideas on ways to optimize it or if there are better
ways of tackling this task.


I think AMDF is a reasonable approach for a simple algorithm. You could take 
it a lot further using the techniques above -- not sure they're that useful 
though.




First off, is autocorrelation even a solution for this?  That is what
the old algorithm used and it seemed to me that sample points closer
to the loop start or end should be given higher priority in the
quality calculation, which was not done in that case.


You could limit the autocorrelation to a window around the candidate loop 
point to maximise waveform similarity in that region.


Weighting at the loop point might help if you're not doing any crossfading, 
but using a larger, evenly weighted window makes it more likely that your 
algorithm will choose a good global match based on the longer term evolution 
of the sound (including it's pitch and spectral evolution).



Ross.









--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-11-25 Thread Element Green
Hello Didier,

On Wed, Nov 24, 2010 at 9:51 PM, Didier Dambrin di...@skynet.be wrote:
 IMHO finding loop points is the wrong problem to solve, it's better to
 make something loop instead, as (ideally) you're only gonna find the least
 bad loop points, nothing guarantees that there's anything loopable.
 I would use crossfading, and possibly autocorellation to auto-select the
 part to repeat  crossfade (to avoid a volume dip ( timbre change) due to
 phasing).
 I would also reject too small looping sections, as a click-free loop is
 one thing but a loop that doesn't sound repeating is another thing.
 Even if you wanna find loop points, IMHO it's still better to find them not
 caring about noticable clicks, and then do a little crossfade. Unless you
 really can't touch your source sample.


I like this approach.  Cross fading was on my list of features to add,
which could be performed after a good loop candidate is found.  I
still think its a good idea to have a loop candidate finding
algorithm, as you mentioned.  I can see that autocorrelation would
probably work much better if a cross fade was performed afterwards to
remove the possible click.

While reading your reply it occurred to me that a cross fade
audition toggle button would be nice, which automatically cross
fades the played back audio sample with the currently defined loop,
but is temporary and does not actually change the audio sample until
the user applies the cross fade.

Element
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-11-25 Thread Element Green
On Thu, Nov 25, 2010 at 2:14 AM, Johannes Kroll jkr...@lavabit.com wrote:
 On Thu, 25 Nov 2010 06:51:03 +0100
 Didier Dambrin di...@skynet.be wrote:

 IMHO finding loop points is the wrong problem to solve, it's better to
 make something loop instead, as (ideally) you're only gonna find the least
 bad loop points, nothing guarantees that there's anything loopable.
 I would use crossfading, and possibly autocorellation to auto-select the
 part to repeat  crossfade (to avoid a volume dip ( timbre change) due to
 phasing).
 I would also reject too small looping sections, as a click-free loop is
 one thing but a loop that doesn't sound repeating is another thing.
 Even if you wanna find loop points, IMHO it's still better to find them not
 caring about noticable clicks, and then do a little crossfade. Unless you
 really can't touch your source sample.

 IMHO, if the goal would be to make something loop automatically
 (without user input) then you would be right. But if you have a user,
 probably a musician with a trained ear, who says I want it to loop
 roughly at this spot, now find me some good loop points nearby, then
 you shouldn't modify the sample. In this case the OP's algorhithm
 sounds fine (given that it finds good loop points, which I didn't
 test). It would be even better to give the user the few best loop
 points to select from. Crossfading can still be done afterwards if
 they want.


I also agree with this approach.  It may be that it would be best to
have 2 different algorithms.  One which is used when the user just
wants to find a loop within a rather large portion of an instrument
sample, for which autocorrelation seems to be a good option.  The
second algorithm would be something like what is currently implemented
and which you clarified would be good for making smaller adjustments
and for which a cross fade may not be necessary, if a quality loop is
already possible.  A cross fade could of course still be performed for
ether option.

Element
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-11-25 Thread Element Green
On Thu, Nov 25, 2010 at 4:05 AM, Sebastian Stober sto...@ovgu.de wrote:
 Hi Joshua,

 you might be interested in this blog post:
 http://runningwithdata.tumblr.com/post/597154309/earworm-capsule
 about a graph-based approach.

 Best regards,
 Sebastian


Does indeed sound like an interesting project.  From a quick glance it
seems more based on looping full music pieces rather than individual
instrument audio samples though.  Albeit it probably does use some
techniques also applicable to this task.

Best regards,
Element
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] Algorithms for finding seamless loops in audio

2010-11-24 Thread Alan Wolfe
Agreed here (:

in 2d graphics and skeletal animation, making tileable 2d art and
seamless blends are basically the same problems.

in both areas they MAKE the things seamless instead of trying to find
how they could be seamless.

in 2d graphics this comes up via texturing (probably obvious), and in
skeletal animation, it is literally in the form that Didier talks
about; they literally cross fade animation weights from an old
animation to a new animation to make a seamless transition.

Of course, even with these seamless techniques, you can still notice
issues like in 2d textures there might be specific features that
really stand out in a texture and you can easily see it repeating.  In
3d animation, even though a blend may be seamless it can still look
wrong.

I only bring these parallels up because there is something in 2d
graphics called wang tiling which can make some really organic
looking tileable textures.

i think the same technique could apply to audio (and even skeletal
animation) but how you would apply the idea, i'm not sure 100% hehe.

my 2 monopoly dollars! :P

On Wed, Nov 24, 2010 at 9:51 PM, Didier Dambrin di...@skynet.be wrote:
 IMHO finding loop points is the wrong problem to solve, it's better to
 make something loop instead, as (ideally) you're only gonna find the least
 bad loop points, nothing guarantees that there's anything loopable.
 I would use crossfading, and possibly autocorellation to auto-select the
 part to repeat  crossfade (to avoid a volume dip ( timbre change) due to
 phasing).
 I would also reject too small looping sections, as a click-free loop is
 one thing but a loop that doesn't sound repeating is another thing.
 Even if you wanna find loop points, IMHO it's still better to find them not
 caring about noticable clicks, and then do a little crossfade. Unless you
 really can't touch your source sample.




 Hello music-dsp list,

 I'm the author of a SoundFont instrument editing application called
 Swami (http://swami.sourceforge.net).  A while back an interested
 developer added a loop finding algorithm which I integrated into the
 application.  This feature is supposed to generate a list of start/end
 loop points which are optimal for seamless loops.

 The original algorithm was based on autocorrelation.  There were many
 bugs in the implementation and I was having trouble understanding how
 it functioned, so I wrote a new algorithm which currently does not use
 autocorrelation.  The new algorithm seems to come up with good
 candidates for seamless loops, but is slower than the old algorithm
 by a factor of 5, but at least it does not suffer from bugs, which
 often resulted in unpredictable results.

 I have limited knowledge in the area of DSP, so I thought I would seek
 advice on this list to determine if the new algorithm makes sense, if
 anyone has any ideas on ways to optimize it or if there are better
 ways of tackling this task.

 First off, is autocorrelation even a solution for this?  That is what
 the old algorithm used and it seemed to me that sample points closer
 to the loop start or end should be given higher priority in the
 quality calculation, which was not done in that case.

 Inputs to the algorithm:
 float sample_data[]: Audio data array of floating point samples
 normalized to -1.0 to 1.0
 int analysis_size: Size of analysis window (number of points compared
 around loop points, the loop point is in the center of the window).
 int half_analysis_size = analysis_size / 2 which is the center point
 of the window (only really center on odd analysis_size values)
 int win1start: Offset in sample_data[] of 1st search window (for loop
 start point).
 int win1size: Size of 1st search window (for loop start point).
 int win2start: Offset in sample_data[] of 2nd search window (for loop
 end point).
 int win2size: Size of 2nd search window (for loop end point).


 Description of the new algorithm:
 A multiplication window array of floats (lets call it
 analysis_window[]) is created which is analysis_size in length and
 contains a peak value in the center of the window, each point away
 from the center is half the value of its closer neighbor and all
 values in the window add up to 0.5.  0.5 was chosen because the
 maximum difference between two sample points is 2 (1 - -1 = 2), so
 this results in a maximum quality value of 1.0 (worst quality).

 The two search windows are exhaustively compared with two loops, one
 embedded in the other.  For each loop start/end candidate a quality
 factor is calculated.  The quality value is calculated from the sum of
 the absolute differences of the sample points surrounding the loop
 points (analysis window size) multiplied individually by values in the
 analysis_window[] array.

 C code:

  /* Calculate fraction divisor */
  for (i = 0, fract = 0, pow2 = 1; i = half_window; i++, pow2 *= 2)
  {
   fract += pow2;
   if (i  half_window) fract += pow2;
  }

  /* Even windows are asymetrical, subtract 1 */
  if