Re: [music-dsp] Algorithms for finding seamless loops in audio
Now I wonder, am I the only one to calculate ACF using FFT? Regarding seamless loops, I found that quantizing frequencies to integer numbers of periods in the loop works extremely well. Regards, Stefan -- dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links http://music.columbia.edu/cmc/music-dsp http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] Algorithms for finding seamless loops in audio
On Thu, Nov 25, 2010 at 9:33 PM, robert bristow-johnson r...@audioimagination.com wrote: depending on how big your window is, i think a better term for this is *cross-correlation* not autocorrelation. it's a single stream of audio so in a sense of the word, it *is* autocorrelation, but what i normally think of, with that semantic is something where the lag is no bigger or not much bigger than the analysis window of either loop-end region of the audio and the loop-begin. if the loop points are separated by a much longer time (number of samples) than the size (in samples) of the two slices of audio being correlated, it's really cross-correlation. and you might find poor correlation given all lags that you're looking at. in fact, doing cross-correlation from one part of the tone or sound to another part that has a rapid change in amplitude envelope might fool your correlation into thinking there is a good match when there really isn't (because the amplitude is increasing, then the cross-correlation increases, but not necessarily because of a good match). so, instead of either cross or autocorrelation, you might want to consider AMDF between the loop end and potential candidates to loop back to. instead of looking for a maximum, you're looking for a minimum and a very low minimum means a good match (or a bad match during a very low signal level). Looking at the equation here for AMDF: http://mi.eng.cam.ac.uk/~ajr/SpeechAnalysis/node72.html It seems like the algorithm I came up with independently is very similar. The absolute value of the difference of the sample points is taken as with AMDF. Prior to summing the values together though, I'm multiplying by the window I described before (with a peak in the center where the loop point is), giving samples closer to the loop point more weight. In practice this seems to work quite well and I'm going to leave it as is for now. It seems reasonably fast and straight forward. find good loop points, then crossfade. another thing about cross fading is that there is something you can do to adapt a little to better or poor loop points. if the loop points (and the window surrounding them) match well, then you're doing a crossfade between coherent audio and a constant voltage crossfade is indicated (when the crossfade is half done, both the fade out and fade in envelopes are at 50%). if the loop points are not well matched (but it's the best loop points your correlation function can find), then you want to do a crossfade that is closer to a constant power crossfade where both fade in and fade out envelopes are at 70.7% at the midpoint of the crossfade. there is a way to define the optimal crossfade function for any correlation between 0 (when it's like crossfading white noise to white noise) to 100% (like crossfading a perfectly periodic waveform to a similarly appearing portion of the waveform at loop start). does any of this make any sense? I'm not sure I'm following you. From what I can understand it sounds like you are saying that the degree to which the two loop point signal windows match could be used to select different cross fade envelope curves, for a better perceptual cross fade. I hadn't given this much thought and just assumed a linear cross fade (0-100%) would be the way to do it (that is from a limited DSP background mind you). I am intrigued by this idea though. Any tips on how to generate the envelope functions and what sort of equation could be used for selecting the optimal envelope based on the signal correlation? can i ask what the application is? (i may have missed it, but i'll look at earlier posts.) if it's looping for sound/instrument samples, this is an analysis thing that is not real-time and we can consider finding the best loop-begin points for a large variety of possible loop-end points. then pick the pair that looks best, given whatever your measure of good is. but in a (time-domain) real-time pitch shifter, having so many choices may not be available to you. you might find yourself in a situation where your loop-end is pretty well defined, you have to find a place to splice to and take the best that you can get from that. Its a sample/instrument editor, so its all non-realtime. -- r b-j ...@audioimagination.com Imagination is more important than knowledge. Thanks for the helpful info! Element Green -- dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links http://music.columbia.edu/cmc/music-dsp http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] Algorithms for finding seamless loops in audio
On Nov 26, 2010, at 2:21 AM, Ross Bencina wrote: robert bristow-johnson wrote: you can have a periodic (or quasi-periodic) signal with absolutely no energy at harmonic #1 (what i would call the fundamental), and as long as it has energy in most other odd harmonics, the autocorrelation function will work just as well. there will still be peaks at lags are at integer multiples of the apparent period. The way I see it, the ACF will be quasi-periodic at the rate of the lowest present harmonic. but that's not the case. if x(t) is periodic, then Rx(tau) (as calculated with an infinitely large window) is periodic with the same period. a pretty robust definition of the autocorrelation of x(t) might be: +T Rx(tau) = lim{ 1/(2T) integral{ x(t)*x(t+tau) dt} } T-inf -T now, we don't have infinitely large windows to calculate Rx() with, so as the lag, tau, increases the overlap of the windows is smaller let v(t) = x(t)*w(t) (w(t) is a very wide window function.) then Rx(tau) ~= Rv(tau) +inf Rv(tau) = integral{ v(t)*v(t+tau) dt} -inf so, while x(t) is periodic, v(t) is not and there is an envelope on the apparent autocorrelation Rx(tau) which is: +inf Rw(tau) = integral{ w(t)*w(t+tau) dt} -inf if P is the period of x(t), then x(t+P) = x(t)for all t and +inf Rv(tau) = integral{ x(t)*w(t)*x(t+tau)*w(t+tau) dt} -inf +inf Rv(P) = integral{ x(t)*w(t)*x(t+P)*w(t+P) dt} -inf +inf = integral{ (x(t)^2)*w(t)*w(t+P) dt} -inf but +inf Rv(0)= integral{ (x(t)^2)*(w(t)^2) dt} -inf Rv(P) is reduced from Rv(0) because w(t)*w(t+P) is smaller than w(t)^2 because w(t) and w(t+P) do not overlap as much as w(t) overlaps on top of itself. i think, for some windows (like the rectangular window), you can show something like Rv(n*P) = Rx(n*P)*Rw(n*P) which is true only at the multiples of the period, P. underneath that envelope, you will see peaks at multiples of the period of x(t), whether there is 1st harmonic in there or not. now there are other ways of computing the autocorrelation (or something that looks like it) that do not have this envelope resulting from windowing, so the autocorrelation peaks are as large as Rx(0) if x(t) is perfectly periodic. the method shown with Rv(tau) is what you would get if you were using the inverse FT of |v(t)|^2. This doesn't tell you anything about what the fundamental is. The peaks may be there, but how do you know which one(s) are related to the fundamental? there *is* the octave problem. you can have a tone of 360 Hz (with all of the harmonics), but mathematically, it is also a tone of 180 Hz (where all the odd harmonics have zero energy) or 120 Hz or 90 Hz (with a lot of missing harmonics). the autocorrelation function will peak at those periods also. so you look for the peak of sufficient height (so, unfortunately, there is some kind of thresholding needed) that has the smallest lag, which would be the one at 1/360 sec -- r b-j r...@audioimagination.com Imagination is more important than knowledge. -- dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links http://music.columbia.edu/cmc/music-dsp http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] Algorithms for finding seamless loops in audio
Element Green wrote: I'm the author of a SoundFont instrument editing application called Swami (http://swami.sourceforge.net). A while back an interested developer added a loop finding algorithm which I integrated into the application. This feature is supposed to generate a list of start/end loop points which are optimal for seamless loops. What kind of loops are you talking about here? loops in pitched musical instrument samples or drum/percussive rhythm loops? The original algorithm was based on autocorrelation. There were many bugs in the implementation and I was having trouble understanding how it functioned, so I wrote a new algorithm which currently does not use autocorrelation. The new algorithm seems to come up with good candidates for seamless loops, but is slower than the old algorithm by a factor of 5, but at least it does not suffer from bugs, which often resulted in unpredictable results. Assumng you're talking about musical instrument samples then autocorrelation (or AMDF) is one way to do it. You need to get clear about whether your only goal is seamless or if you want to preserve the pitch of the original accurately too -- that's pretty important for musical instrument samples :-) and will be more relevant for short loops. In that case you may be better off with a good spectral fundamental frequency estimator like this one to choose the loop length: http://ems.music.uiuc.edu/beaucham/papers/JASA.04.94.pdf That could be combined with autocorrelation/AMDF and/or crossfading to choose where to place the loop. Note that there are a number of different ways you can perform the crossfade if you choose to do that (linear, equal power, I remember my Ensoniq EPS even had some weird ensemble bow-tie crossfade options). For longish loops on musical/tonal/synth sounds that are spectrally dynamic (ie they have lots of phasing or lfo stuff going on) you could also use spectral descriptors to find good match points (you probably want to combine this with autocorrelation/AMDF to get both the time domain and spectral aspects correct). A while ago James Chandler Jr gave a really good description of an off-line time stretching algorithm, part of which did looping for pitched sounds -- you might want to have a look in the list archives that.. it's probably of interest to you. From memory one of the points he made was that one of the loop points within a few samples either side of the one automatically chosen might actually sound better.. so you could offer a few options to the user +/- 1 or two samples. I have limited knowledge in the area of DSP, so I thought I would seek advice on this list to determine if the new algorithm makes sense, if anyone has any ideas on ways to optimize it or if there are better ways of tackling this task. I think AMDF is a reasonable approach for a simple algorithm. You could take it a lot further using the techniques above -- not sure they're that useful though. First off, is autocorrelation even a solution for this? That is what the old algorithm used and it seemed to me that sample points closer to the loop start or end should be given higher priority in the quality calculation, which was not done in that case. You could limit the autocorrelation to a window around the candidate loop point to maximise waveform similarity in that region. Weighting at the loop point might help if you're not doing any crossfading, but using a larger, evenly weighted window makes it more likely that your algorithm will choose a good global match based on the longer term evolution of the sound (including it's pitch and spectral evolution). Ross. -- dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links http://music.columbia.edu/cmc/music-dsp http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] Algorithms for finding seamless loops in audio
Hello Didier, On Wed, Nov 24, 2010 at 9:51 PM, Didier Dambrin di...@skynet.be wrote: IMHO finding loop points is the wrong problem to solve, it's better to make something loop instead, as (ideally) you're only gonna find the least bad loop points, nothing guarantees that there's anything loopable. I would use crossfading, and possibly autocorellation to auto-select the part to repeat crossfade (to avoid a volume dip ( timbre change) due to phasing). I would also reject too small looping sections, as a click-free loop is one thing but a loop that doesn't sound repeating is another thing. Even if you wanna find loop points, IMHO it's still better to find them not caring about noticable clicks, and then do a little crossfade. Unless you really can't touch your source sample. I like this approach. Cross fading was on my list of features to add, which could be performed after a good loop candidate is found. I still think its a good idea to have a loop candidate finding algorithm, as you mentioned. I can see that autocorrelation would probably work much better if a cross fade was performed afterwards to remove the possible click. While reading your reply it occurred to me that a cross fade audition toggle button would be nice, which automatically cross fades the played back audio sample with the currently defined loop, but is temporary and does not actually change the audio sample until the user applies the cross fade. Element -- dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links http://music.columbia.edu/cmc/music-dsp http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] Algorithms for finding seamless loops in audio
On Thu, Nov 25, 2010 at 2:14 AM, Johannes Kroll jkr...@lavabit.com wrote: On Thu, 25 Nov 2010 06:51:03 +0100 Didier Dambrin di...@skynet.be wrote: IMHO finding loop points is the wrong problem to solve, it's better to make something loop instead, as (ideally) you're only gonna find the least bad loop points, nothing guarantees that there's anything loopable. I would use crossfading, and possibly autocorellation to auto-select the part to repeat crossfade (to avoid a volume dip ( timbre change) due to phasing). I would also reject too small looping sections, as a click-free loop is one thing but a loop that doesn't sound repeating is another thing. Even if you wanna find loop points, IMHO it's still better to find them not caring about noticable clicks, and then do a little crossfade. Unless you really can't touch your source sample. IMHO, if the goal would be to make something loop automatically (without user input) then you would be right. But if you have a user, probably a musician with a trained ear, who says I want it to loop roughly at this spot, now find me some good loop points nearby, then you shouldn't modify the sample. In this case the OP's algorhithm sounds fine (given that it finds good loop points, which I didn't test). It would be even better to give the user the few best loop points to select from. Crossfading can still be done afterwards if they want. I also agree with this approach. It may be that it would be best to have 2 different algorithms. One which is used when the user just wants to find a loop within a rather large portion of an instrument sample, for which autocorrelation seems to be a good option. The second algorithm would be something like what is currently implemented and which you clarified would be good for making smaller adjustments and for which a cross fade may not be necessary, if a quality loop is already possible. A cross fade could of course still be performed for ether option. Element -- dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links http://music.columbia.edu/cmc/music-dsp http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] Algorithms for finding seamless loops in audio
On Thu, Nov 25, 2010 at 4:05 AM, Sebastian Stober sto...@ovgu.de wrote: Hi Joshua, you might be interested in this blog post: http://runningwithdata.tumblr.com/post/597154309/earworm-capsule about a graph-based approach. Best regards, Sebastian Does indeed sound like an interesting project. From a quick glance it seems more based on looping full music pieces rather than individual instrument audio samples though. Albeit it probably does use some techniques also applicable to this task. Best regards, Element -- dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links http://music.columbia.edu/cmc/music-dsp http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] Algorithms for finding seamless loops in audio
Agreed here (: in 2d graphics and skeletal animation, making tileable 2d art and seamless blends are basically the same problems. in both areas they MAKE the things seamless instead of trying to find how they could be seamless. in 2d graphics this comes up via texturing (probably obvious), and in skeletal animation, it is literally in the form that Didier talks about; they literally cross fade animation weights from an old animation to a new animation to make a seamless transition. Of course, even with these seamless techniques, you can still notice issues like in 2d textures there might be specific features that really stand out in a texture and you can easily see it repeating. In 3d animation, even though a blend may be seamless it can still look wrong. I only bring these parallels up because there is something in 2d graphics called wang tiling which can make some really organic looking tileable textures. i think the same technique could apply to audio (and even skeletal animation) but how you would apply the idea, i'm not sure 100% hehe. my 2 monopoly dollars! :P On Wed, Nov 24, 2010 at 9:51 PM, Didier Dambrin di...@skynet.be wrote: IMHO finding loop points is the wrong problem to solve, it's better to make something loop instead, as (ideally) you're only gonna find the least bad loop points, nothing guarantees that there's anything loopable. I would use crossfading, and possibly autocorellation to auto-select the part to repeat crossfade (to avoid a volume dip ( timbre change) due to phasing). I would also reject too small looping sections, as a click-free loop is one thing but a loop that doesn't sound repeating is another thing. Even if you wanna find loop points, IMHO it's still better to find them not caring about noticable clicks, and then do a little crossfade. Unless you really can't touch your source sample. Hello music-dsp list, I'm the author of a SoundFont instrument editing application called Swami (http://swami.sourceforge.net). A while back an interested developer added a loop finding algorithm which I integrated into the application. This feature is supposed to generate a list of start/end loop points which are optimal for seamless loops. The original algorithm was based on autocorrelation. There were many bugs in the implementation and I was having trouble understanding how it functioned, so I wrote a new algorithm which currently does not use autocorrelation. The new algorithm seems to come up with good candidates for seamless loops, but is slower than the old algorithm by a factor of 5, but at least it does not suffer from bugs, which often resulted in unpredictable results. I have limited knowledge in the area of DSP, so I thought I would seek advice on this list to determine if the new algorithm makes sense, if anyone has any ideas on ways to optimize it or if there are better ways of tackling this task. First off, is autocorrelation even a solution for this? That is what the old algorithm used and it seemed to me that sample points closer to the loop start or end should be given higher priority in the quality calculation, which was not done in that case. Inputs to the algorithm: float sample_data[]: Audio data array of floating point samples normalized to -1.0 to 1.0 int analysis_size: Size of analysis window (number of points compared around loop points, the loop point is in the center of the window). int half_analysis_size = analysis_size / 2 which is the center point of the window (only really center on odd analysis_size values) int win1start: Offset in sample_data[] of 1st search window (for loop start point). int win1size: Size of 1st search window (for loop start point). int win2start: Offset in sample_data[] of 2nd search window (for loop end point). int win2size: Size of 2nd search window (for loop end point). Description of the new algorithm: A multiplication window array of floats (lets call it analysis_window[]) is created which is analysis_size in length and contains a peak value in the center of the window, each point away from the center is half the value of its closer neighbor and all values in the window add up to 0.5. 0.5 was chosen because the maximum difference between two sample points is 2 (1 - -1 = 2), so this results in a maximum quality value of 1.0 (worst quality). The two search windows are exhaustively compared with two loops, one embedded in the other. For each loop start/end candidate a quality factor is calculated. The quality value is calculated from the sum of the absolute differences of the sample points surrounding the loop points (analysis window size) multiplied individually by values in the analysis_window[] array. C code: /* Calculate fraction divisor */ for (i = 0, fract = 0, pow2 = 1; i = half_window; i++, pow2 *= 2) { fract += pow2; if (i half_window) fract += pow2; } /* Even windows are asymetrical, subtract 1 */ if