Re: [Flashcoders] Question about approximate vowel detection in AS3
I would say there are about 5-7 mouth shapes you could distribute through your animation that would give the impression that the avatar is saying the right words. Plus, if your animation is fluid (meaning it doesn't look like the avatar is straining to say the words), it probably won't be noticeable if it mouths the wrong word from time to time.

JAT

Karl

On Jun 4, 2010, at 12:25 PM, Eric E. Dolecki wrote:

> I was able to match a single "a", although even with a straight "a" there
> can be some subtle variation. So I mapped variations that come close, and I
> don't need to match every value in the complete waveform over time... every
> couple together, or even the first value with a buffer, comes pretty close.
> This is with a known, unchanging vocal waveform, so I doubt this would be
> very useful outside of this current system, which is a bummer. I think it's
> time for me to retire this code and move on. Oh well...
>
> Eric
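Karl's suggestion of a handful of mouth shapes could be held in a small lookup table. The sketch below is TypeScript rather than AS3 (the syntax is close, so it should port with few changes). The six shape names and the vowel-to-shape pairs are made up for illustration; nothing in the thread specifies them.

```typescript
// Hypothetical set of six mouth shapes; the names are illustrative only.
type MouthShape = "rest" | "open" | "wide" | "round" | "narrow" | "closed";

// Assumed vowel-to-shape table, not taken from the thread.
const SHAPE_FOR_VOWEL = new Map<string, MouthShape>([
  ["a", "open"],
  ["e", "wide"],
  ["i", "narrow"],
  ["o", "round"],
  ["u", "round"],
]);

// Unknown sounds fall back to "rest", so a missed detection
// still leaves the avatar in a neutral pose.
function shapeFor(sound: string): MouthShape {
  return SHAPE_FOR_VOWEL.get(sound.toLowerCase()) ?? "rest";
}
```

Falling back to a neutral shape fits Karl's point that an occasional wrong mouth position is unlikely to be noticed as long as the motion stays fluid.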
Re: [Flashcoders] Question about approximate vowel detection in AS3
I was able to match a single "a", although even with a straight "a" there can be some subtle variation. So I mapped variations that come close, and I don't need to match every value in the complete waveform over time... every couple together, or even the first value with a buffer, comes pretty close. This is with a known, unchanging vocal waveform, so I doubt this would be very useful outside of this current system, which is a bummer. I think it's time for me to retire this code and move on. Oh well...

Eric

On Fri, Jun 4, 2010 at 9:28 AM, Eric E. Dolecki wrote:

> I can get waveforms... but say "a" takes 1 second to speak. I get different
> waveforms over that 1 second... so I'm not matching against a single
> waveform, but many waveforms in succession. This seems like a tricky thing
> to match against.
>
> What might be a good approach to matching values over a certain amount of
> time? Is AS3 fast enough to sync quickly enough? I imagine it would need to
> check for all vowels every frame, matching values in waveforms over a
> certain amount of time.
>
> Eric
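One way to read Eric's description (check only some of the values and allow a buffer around each one) is sketched below in TypeScript, which should port to AS3 with few changes. `roughlyMatches`, `tolerance` and `stride` are hypothetical names, and the numbers you pass in would need tuning against the actual synthesized voice.

```typescript
// Returns true when every `stride`-th sample of `live` lies within
// `tolerance` of the sample at the same position in `reference`.
// Arrays of different length, or empty arrays, never match.
function roughlyMatches(
  live: number[],
  reference: number[],
  tolerance: number,
  stride: number = 2,
): boolean {
  if (live.length === 0 || live.length !== reference.length) return false;
  const step = Math.max(1, Math.floor(stride));
  for (let i = 0; i < live.length; i += step) {
    if (Math.abs(live[i] - reference[i]) > tolerance) return false;
  }
  return true;
}
```

As Eric notes, this kind of check only holds up because the voice is generated and the waveform does not change between runs.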
Re: [Flashcoders] Question about approximate vowel detection in AS3
I can get waveforms... but say "a" takes 1 second to speak. I get different waveforms over that 1 second... so I'm not matching against a single waveform, but many waveforms in succession. This seems like a tricky thing to match against.

What might be a good approach to matching values over a certain amount of time? Is AS3 fast enough to sync quickly enough? I imagine it would need to check for all vowels every frame, matching values in waveforms over a certain amount of time.

Eric

On Fri, Jun 4, 2010 at 8:56 AM, Eric E. Dolecki wrote:

> I've started implementing some code this morning in the hope of matching
> the vowel "a". Of course there are several intonations for this depending
> on the word it's located in, but if I can get a match on a naked "a" I may
> be on to something. Like you said, I have a higher chance of success since
> the voice is software generated and not from random people's speech
> patterns.
>
> If I don't get something today I'm going to bail on the engine in the hope
> of finding something useful some other time. This isn't a critical feature
> for me, as I have the jaw moving with precision and the effect comes
> across. Mouth shapes would be the icing on the cake.
>
> Eric
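Eric's question about matching "many waveforms in succession" could be approached with a rolling window. Keep the last few frames, and on every new frame compare that window against each stored vowel sequence. The TypeScript below is only a sketch under assumptions: each frame has already been reduced to a short array of numbers, all frames have the same length, and `makeMatcher`, `VowelTemplate` and the threshold are invented names and values.

```typescript
type WaveFrame = number[]; // one reduced waveform snapshot per frame
type VowelTemplate = { name: string; frames: WaveFrame[] };

// Mean absolute difference between two frame sequences of equal shape.
function sequenceDistance(a: WaveFrame[], b: WaveFrame[]): number {
  let sum = 0;
  let count = 0;
  for (let f = 0; f < a.length; f++) {
    for (let i = 0; i < a[f].length; i++) {
      sum += Math.abs(a[f][i] - b[f][i]);
      count++;
    }
  }
  return count === 0 ? Infinity : sum / count;
}

// Returns a function to call once per frame with the newest snapshot.
// It answers with the closest template name, or null if nothing is
// within `threshold`.
function makeMatcher(
  templates: VowelTemplate[],
  threshold: number,
): (frame: WaveFrame) => string | null {
  let longest = 0;
  for (const t of templates) longest = Math.max(longest, t.frames.length);
  const recent: WaveFrame[] = [];

  return function push(frame: WaveFrame): string | null {
    recent.push(frame);
    // Keep only as much history as the longest template needs.
    if (recent.length > longest) recent.shift();
    let best: string | null = null;
    let bestDistance = threshold;
    for (const t of templates) {
      const n = t.frames.length;
      if (n === 0 || recent.length < n) continue;
      const d = sequenceDistance(recent.slice(recent.length - n), t.frames);
      if (d <= bestDistance) {
        best = t.name;
        bestDistance = d;
      }
    }
    return best;
  };
}
```

The work per frame is roughly templates x frames per template x values per frame. Keeping each of those small is the main lever on whether the check fits inside a frame.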
Re: [Flashcoders] Question about approximate vowel detection in AS3
I've started implementing some code this morning in the hope of matching the vowel "a". Of course there are several intonations for this depending on the word it's located in, but if I can get a match on a naked "a" I may be on to something. Like you said, I have a higher chance of success since the voice is software generated and not from random people's speech patterns.

If I don't get something today I'm going to bail on the engine in the hope of finding something useful some other time. This isn't a critical feature for me, as I have the jaw moving with precision and the effect comes across. Mouth shapes would be the icing on the cake.

Eric

On Fri, Jun 4, 2010 at 8:34 AM, Karim Beyrouti wrote:

> Yeh - not sure this will help.
>
> However, a (very talented) colleague of mine worked on simple speech
> recognition software for mobile. It was built to recognise about 20
> commands with a 90% success rate.
>
> His approach (in my simplistic terms) was:
>
> 1) Get recordings / audio samples of the commands (in your case vowels; it
>    should be easier as it's generated, so you won't have to compare against
>    too many different intonations).
> 2) Create / store a graph of the audio commands (this used FFTs to abstract
>    and simplify the pattern of the commands; the result was a square
>    voiceprint graph).
> 3) The stored patterns/voiceprints were then compared against the user's
>    voice recording.
>
> The trickiest part of this whole business was the Fast Fourier Transforms.
> These things get very complicated, and confuse the life out of me. Anyway,
> hopefully this will help you; it seems like it might be the best approach.
> If you do crack it, you will end up with a simple voice recognition system,
> which would be a brilliant and useful bit of code to have...
>
> Hope this was of any use.
>
> - karim
Re: [Flashcoders] Question about approximate vowel detection in AS3
Yeh - not sure this will help.

However, a (very talented) colleague of mine worked on simple speech recognition software for mobile. It was built to recognise about 20 commands with a 90% success rate.

His approach (in my simplistic terms) was:

1) Get recordings / audio samples of the commands (in your case vowels; it should be easier as it's generated, so you won't have to compare against too many different intonations).
2) Create / store a graph of the audio commands (this used FFTs to abstract and simplify the pattern of the commands; the result was a square voiceprint graph).
3) The stored patterns/voiceprints were then compared against the user's voice recording.

The trickiest part of this whole business was the Fast Fourier Transforms. These things get very complicated, and confuse the life out of me. Anyway, hopefully this will help you; it seems like it might be the best approach. If you do crack it, you will end up with a simple voice recognition system, which would be a brilliant and useful bit of code to have...

Hope this was of any use.

- karim

On 4 Jun 2010, at 01:23, Karl DeSaulniers wrote:

> I would try using that to figure out a way of mapping the sounds and then
> translate that to your project. Are you able to see the waveforms in
> Soundbooth? I haven't used it. If so, can you run your cursor over the
> waveform at any point to get the readings? It might be a little trivial,
> but it may yield a pattern that you can utilize.
>
> JAT
>
> Karl
>
> Sent from losPhone
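Karim's three steps (record samples, reduce each to a spectrum-based print, compare stored prints against the live one) can be sketched roughly as follows. This is TypeScript, not AS3, and it is not his colleague's code. It uses a naive DFT in place of a real FFT to stay short, and `voiceprint`, `closestPrint` and the choice of 8 bands are assumptions. In Flash Player, `SoundMixer.computeSpectrum()` with its FFT mode switched on should supply spectrum data for the playing audio, so the DFT part would likely not be needed there.

```typescript
// Magnitude spectrum via a naive O(n^2) DFT (a stand-in for a real FFT).
function magnitudeSpectrum(samples: number[]): number[] {
  const n = samples.length;
  const half = Math.floor(n / 2);
  const mags: number[] = [];
  for (let k = 0; k < half; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += samples[t] * Math.cos(angle);
      im -= samples[t] * Math.sin(angle);
    }
    mags.push(Math.sqrt(re * re + im * im));
  }
  return mags;
}

// Step 2: boil the spectrum down to a few band totals, scaled to sum
// to 1 so that loudness does not affect the comparison.
function voiceprint(samples: number[], bands: number = 8): number[] {
  const mags = magnitudeSpectrum(samples);
  const result: number[] = [];
  for (let b = 0; b < bands; b++) result.push(0);
  for (let k = 0; k < mags.length; k++) {
    const band = Math.min(bands - 1, Math.floor((k * bands) / mags.length));
    result[band] += mags[k];
  }
  let total = 0;
  for (const v of result) total += v;
  return total === 0 ? result : result.map((v) => v / total);
}

type StoredPrint = { name: string; print: number[] };

// Step 3: the stored print with the smallest squared distance wins.
function closestPrint(live: number[], stored: StoredPrint[]): string | null {
  let best: string | null = null;
  let bestDistance = Infinity;
  for (const candidate of stored) {
    let d = 0;
    for (let i = 0; i < candidate.print.length; i++) {
      const diff = candidate.print[i] - live[i];
      d += diff * diff;
    }
    if (d < bestDistance) {
      best = candidate.name;
      bestDistance = d;
    }
  }
  return best;
}
```

A real system would also want a rejection threshold so that silence or consonants do not always snap to the nearest vowel.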
Re: [Flashcoders] Question about approximate vowel detection in AS3
I would try using that to figure out a way of mapping the sounds and then translate that to your project. Are you able to see the waveforms in Soundbooth? I haven't used it. If so, can you run your cursor over the waveform at any point to get the readings? It might be a little trivial, but it may yield a pattern that you can utilize.

JAT

Karl

Sent from losPhone

On Jun 3, 2010, at 6:18 PM, "Eric E. Dolecki" wrote:

> SoundBooth

___
Flashcoders mailing list
Flashcoders@chattyfig.figleaf.com
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
Re: [Flashcoders] Question about approximate vowel detection in AS3
Dude, whether you know it or not, you come off being pretty arrogant with your comments. Don't worry, I won't be ending up on the dailywtf anytime soon.

On Thu, Jun 3, 2010 at 6:36 PM, Henrik Andersson wrote:

> Before you start reinventing the square wheel, at least do some research on
> how people are doing it.
>
> I did not learn enough from it personally, but I can tell that it is a good
> book:
> http://www.dspguide.com/pdfbook.htm
>
> Read it and then do the matching algorithm. This way you will avoid making
> a solution that deserves to end up on thedailywtf.com

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
SoundBooth

On Thu, Jun 3, 2010 at 6:39 PM, Karl DeSaulniers wrote:

> Do you have SoundEdit? Or the like?
>
> Karl
>
> On Jun 3, 2010, at 5:09 PM, Eric E. Dolecki wrote:
>
> I think I might make waveform bitmaps and then try and compare against the
> current waveform (block EQ) - and if it's a close match, then fire off
> specific vowel events. If that works, I could do consonants too. If this
> works, I'll do jumping jacks and shots of Jack.
>
> So how would I compare two bitmaps to see if a waveform (
>
> On Thu, Jun 3, 2010 at 5:18 PM, Karl DeSaulniers wrote:
>
> If you need any of these files or can't find them, lmk and I can send off
> list.
>
> Best,
>
> Karl
>
> On Jun 3, 2010, at 3:37 PM, Karl DeSaulniers wrote:
>
> Don't know if this will help, but have you looked into WaveAnalyzer.as or
> Flash MX - Audio: Sound completion event? (The source files for this can be
> found in the Flash MX/Samples folder.) They both let you control the sound.
> I am thinking this will point you in a good direction. It's AS2 though.
>
> HTH,
>
> Karl
>
> On Jun 3, 2010, at 2:42 PM, Eric E. Dolecki wrote:
>
> Ya - I have the data for both things, but they extend over time and are
> difficult to compare. It's the boiling down of the signatures into something
> simple, and being able to read the playing audio looking for the match (or
> near match). I thought about using bitmap data and trying to match up
> waveforms, etc., but I don't know enough about it to pull that off. It seems
> like a hack in a way, but if it worked, who cares I suppose.
>
> On Thu, Jun 3, 2010 at 3:31 PM, Juan Pablo Califano
> <califa010.flashcod...@gmail.com> wrote:
>
>> I'm not Henrik, but I've done some lip-synch stuff for Disney. We did it
>> pretty much the way Eric described--we just used amplitude. It's not as
>> accurate as Disney would demand on a film, but it's ok in the kids' game
>> market.
>
> I see, amplitudes could be just good enough for some stuff.
>
> Although the "speed" and the intensity of the speech could give misleading
> results, I think. I'm under the impression that you should somehow try to
> compare the shape of the waves (somehow simplify your input to some value
> or set of values that are easier to compare, possibly in a "time window")
> and compare it in some meaningful way to precalculated samples to find a
> matching pattern. That's the part I have no clue about!
>
> Cheers
> Juan Pablo Califano
>
> 2010/6/3 Kerry Thompson
>
> Juan Pablo Califano wrote:
>
>> Wow. That was really uncalled for.
>
> That was my reaction, too. I didn't see Eric as complaining--just asking.
> Maybe Henrik was just having a bad day.
>
>> For me, the hard part, which you seem to imply is rather simple here, is
>> *matching* the input audio against said profiles. Admittedly, I don't know
>> anything about digital signal processing and audio programming in general,
>> but "matching" sounds a bit vague. Perhaps you could enlighten us, if you
>> feel like it.
>
> I'm not Henrik, but I've done some lip-synch stuff for Disney. We did it
> pretty much the way Eric described--we just used amplitude. It's not as
> accurate as Disney would demand on a film, but it's ok in the kids' game
> market.
>
> Doing something more accurate would probably involve at least 6 mouth
> positions, and if you're doing it in real time, you'd have to do a reverse
> FFT. It can be done--there was a really good commercial lip-synch program
> that generated ActionScript to control mouth positions. I don't know if
> it's still around--that was 5 years ago, and it was pretty expensive (about
> $2,500 for one seat, I think). It may even have been a Director Xtra that
> worked with a Flash Sprite, but let's not talk about Director :-P
>
> Cordially,
>
> Kerry Thompson

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
Before you start reinventing the square wheel, at least do some research on how people are doing it. I did not learn enough from it personally, but I can tell that it is a good book: http://www.dspguide.com/pdfbook.htm Read it and then do the matching algorithm. This way you will avoid making a solution that deserves to end up on thedailywtf.com ___ Flashcoders mailing list Flashcoders@chattyfig.figleaf.com http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
Re: [Flashcoders] Question about approximate vowel detection in AS3
Do you have SoundEdit? Or the like?

Karl

On Jun 3, 2010, at 5:09 PM, Eric E. Dolecki wrote:
> I think I might make waveform bitmaps and then try and compare against the current waveform (block EQ) - and if it's a close match, then fire off specific vowel events. If that works, I could do consonants too. If this works, I'll do jumping jacks and shots of Jack.
>
> So how would I compare two bitmaps to see if a waveform (
Re: [Flashcoders] Question about approximate vowel detection in AS3
I think I might make waveform bitmaps and then try and compare against the current waveform (block EQ) - and if it's a close match, then fire off specific vowel events. If that works, I could do consonants too. If this works, I'll do jumping jacks and shots of Jack.

So how would I compare two bitmaps to see if a waveform (

On Thu, Jun 3, 2010 at 5:18 PM, Karl DeSaulniers wrote:
> If you need any of these files or can't find them, lmk and I can send off list.
>
> Best,
>
> Karl
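Editorial sketch of the "compare against the current waveform (block EQ)" idea above. It compares the band values directly instead of rendering them to bitmaps first, and it uses cosine similarity, which is a technique swapped in here and not something proposed in the thread. It is written in Python only so it can be run standalone (the thread targets AS3); the template values, the threshold and the function names are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length band vectors."""
    if len(a) != len(b):
        raise ValueError("band vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # silence matches nothing
    return dot / (norm_a * norm_b)


def best_vowel(current_bands, templates, threshold=0.9):
    """Return the best-matching vowel name, or None if nothing is close."""
    best_name, best_score = None, threshold
    for name, bands in templates.items():
        score = cosine_similarity(current_bands, bands)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical four-band templates; real ones would be captured from the voice.
TEMPLATES = {"a": [0.1, 0.8, 0.6, 0.1], "o": [0.7, 0.5, 0.1, 0.0]}
```

Cosine similarity ignores overall loudness, so the same vowel spoken louder should still score close to its template. Whether four bands are enough to tell vowels apart is untested here.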
Re: [Flashcoders] Question about approximate vowel detection in AS3
If you need any of these files or can't find them, lmk and I can send off list.

Best,

Karl

On Jun 3, 2010, at 3:37 PM, Karl DeSaulniers wrote:
> Don't know if this will help, but have you looked into WaveAnalyzer.as or Flash MX - Audio: Sound completion event (The source files for this can be found in the Flash MX/Samples folder.) They both let you control the sound. I am thinking this will point you in a good direction. It's AS2 though.
>
> HTH,
>
> Karl
Re: [Flashcoders] Question about approximate vowel detection in AS3
Don't know if this will help, but have you looked into WaveAnalyzer.as or Flash MX - Audio: Sound completion event (The source files for this can be found in the Flash MX/Samples folder.) They both let you control the sound. I am thinking this will point you in a good direction. It's AS2 though.

HTH,

Karl

On Jun 3, 2010, at 2:42 PM, Eric E. Dolecki wrote:
> Ya - I have the data for both things, but they extend over time and are difficult to compare. It's the boiling down the signatures into something simple and being able to read the playing audio looking for the match (or near match). I thought about using bitmap data and trying to match up waveforms, etc. but I don't know enough about it to pull that off. It seems like a hack in a way, but if it worked, who cares I suppose.
Re: [Flashcoders] Question about approximate vowel detection in AS3
Jason Merrill wrote:
> You're probably thinking of the Flash-based SitePal: http://www.sitepal.com/ ?

It could have been. I honestly don't remember--it was at least 5 years ago. We considered using the software, but the studio head vetoed it as too expensive, especially since we already had an Xtra, written in C++, to measure amplitude. Basically, the higher the amplitude, the more open the mouth was. I think we only used 3-4 mouth positions, and it was good enough for Disney. It was a series of games based on Disney Channel properties (or cartoons, as they are known in the real world). I don't watch much Disney Channel, but I suspect their lip synch isn't up to the same standards as their movies.

Cordially,

Kerry Thompson
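Editorial sketch of the amplitude approach described above ("the higher the amplitude, the more open the mouth"), with four mouth positions. It is written in Python only so it can be run standalone (the thread targets AS3 and the original was a C++ Xtra). The thresholds, the shape names and the smoothing step are assumptions added for illustration, not details from the message.

```python
# Hypothetical shape names and thresholds; tune them against real audio.
MOUTH_SHAPES = ["closed", "slightly_open", "open", "wide_open"]
THRESHOLDS = [0.05, 0.25, 0.6]  # upper bounds for the first three shapes


def mouth_shape_for_amplitude(amplitude):
    """Map a normalized amplitude (0.0 to 1.0) to one of four mouth shapes."""
    level = max(0.0, min(1.0, amplitude))  # clamp out-of-range input
    for shape, upper in zip(MOUTH_SHAPES, THRESHOLDS):
        if level < upper:
            return shape
    return MOUTH_SHAPES[-1]


def smooth(amplitudes, alpha=0.5):
    """Exponential smoothing so the jaw does not flicker between frames."""
    smoothed, prev = [], 0.0
    for a in amplitudes:
        prev = alpha * a + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed
```

A caller would typically smooth the per-frame amplitudes first and then map each smoothed value to a shape, for example `[mouth_shape_for_amplitude(a) for a in smooth(readings)]`.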
Re: [Flashcoders] Question about approximate vowel detection in AS3
Ya - I have the data for both things, but they extend over time and are difficult to compare. It's the boiling down the signatures into something simple and being able to read the playing audio looking for the match (or near match). I thought about using bitmap data and trying to match up waveforms, etc. but I don't know enough about it to pull that off. It seems like a hack in a way, but if it worked, who cares I suppose.

On Thu, Jun 3, 2010 at 3:31 PM, Juan Pablo Califano < califa010.flashcod...@gmail.com> wrote:
> > I'm not Henrik, but I've done some lip-synch stuff for Disney. We did it pretty much the way Eric described--we just used amplitude. It's not as accurate as Disney would demand on a film, but it's ok in the kids' game market.
>
> I see, amplitudes could be just good enough for some stuff.
>
> Although the "speed" and the intensity of the speech could give misleading results, I think. I'm under the impression that you should somehow try to compare the shape of the waves (somehow simplify your input to some value or set of values that are easier to compare, possibly in a "time window") and compare it in some meaningful way to precalculated samples to find a matching pattern. That's the part I have no clue about!
>
> Cheers
> Juan Pablo Califano

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
> I'm not Henrik, but I've done some lip-synch stuff for Disney. We did it pretty much the way Eric described--we just used amplitude. It's not as accurate as Disney would demand on a film, but it's ok in the kids' game market.

I see, amplitudes could be just good enough for some stuff.

Although the "speed" and the intensity of the speech could give misleading results, I think. I'm under the impression that you should somehow try to compare the shape of the waves (somehow simplify your input to some value or set of values that are easier to compare, possibly in a "time window") and compare it in some meaningful way to precalculated samples to find a matching pattern. That's the part I have no clue about!

Cheers
Juan Pablo Califano

2010/6/3 Kerry Thompson
> Juan Pablo Califano wrote:
> > Wow. That was really uncalled for.
>
> That was my reaction, too. I didn't see Eric as complaining--just asking. Maybe Henrik was just having a bad day.
>
> > For me, the hard part, which you seem to imply is rather simple here, is *matching* the input audio against said profiles. Admittedly, I don't know anything about digital signal processing and audio programming in general, but "matching" sounds a bit vague. Perhaps you could enlighten us, if you feel like it.
>
> I'm not Henrik, but I've done some lip-synch stuff for Disney. We did it pretty much the way Eric described--we just used amplitude. It's not as accurate as Disney would demand on a film, but it's ok in the kids' game market.
>
> Doing something more accurate would probably involve at least 6 mouth positions, and if you're doing it in real time, you'd have to do a reverse FFT. It can be done--there was a really good commercial lip-synch program that generated Action Script to control mouth positions. I don't know if it's still around--that was 5 years ago, and it was pretty expensive (about $2,500 for one seat, I think). It may even have been a Director Xtra that worked with a Flash Sprite, but let's not talk about Director :-P
>
> Cordially,
>
> Kerry Thompson
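Editorial sketch of the "time window" idea in the reply above: boil a window of readings down to a few averaged values, scale out the loudness, and pick the nearest precalculated sample. It is written in Python only so it can be run standalone (the thread targets AS3). The bucket count, the peak normalization and the Euclidean distance are illustrative choices, not something the thread settles on, and all names are hypothetical.

```python
def reduce_window(values, buckets=8):
    """Average a window of readings down to `buckets` values."""
    n = len(values)
    if n < buckets:
        raise ValueError("window is shorter than the number of buckets")
    reduced = []
    for i in range(buckets):
        # Integer arithmetic keeps the slice boundaries exact.
        chunk = values[i * n // buckets:(i + 1) * n // buckets]
        reduced.append(sum(chunk) / len(chunk))
    return reduced


def normalize(values):
    """Scale so the peak is 1.0, which removes overall loudness."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)


def closest_sample(window, samples, buckets=8):
    """Return (name, distance) of the stored sample nearest to the window."""
    feature = normalize(reduce_window(window, buckets))
    best = None
    for name, stored in samples.items():
        distance = sum((f - s) ** 2 for f, s in zip(feature, stored)) ** 0.5
        if best is None or distance < best[1]:
            best = (name, distance)
    return best
```

Because every window is reduced to the same number of buckets, a vowel spoken faster or slower still yields a feature of the same length. That only partly addresses the "speed" concern, since the shape within the window can still stretch unevenly.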
RE: [Flashcoders] Question about approximate vowel detection in AS3
> there was a really good commercial lip-synch program that generated Action Script to control mouth positions. I don't know if it's still around--that was 5 years ago, and it was pretty expensive (about $2,500 for one seat, I think). It may even have been a Director Xtra that worked with a Flash Sprite

You're probably thinking of the Flash-based SitePal: http://www.sitepal.com/ ?

We had licenses for a while - they dropped it after a while as we discovered the audience thought they were annoying and ongoing license fees too ridiculous. They have gotten better - visually at least, but having to keep paying to use technology like this is kinda stupid I think.

Jason Merrill
Instructional Technology Architect
Bank of America Global Learning
RE: [Flashcoders] Question about approximate vowel detection in AS3
> I meant no ill will. I just meant that there is a better solution than this idea.

Ah, so that's what you meant by "stop complaining". I see, that makes perfect sense now. ;)

Jason Merrill
Re: [Flashcoders] Question about approximate vowel detection in AS3
Juan Pablo Califano wrote:
> Wow. That was really uncalled for.

That was my reaction, too. I didn't see Eric as complaining--just asking. Maybe Henrik was just having a bad day.

> For me, the hard part, which you seem to imply is rather simple here, is *matching* the input audio against said profiles. Admittedly, I don't know anything about digital signal processing and audio programming in general, but "matching" sounds a bit vague. Perhaps you could enlighten us, if you feel like it.

I'm not Henrik, but I've done some lip-synch stuff for Disney. We did it pretty much the way Eric described--we just used amplitude. It's not as accurate as Disney would demand on a film, but it's ok in the kids' game market.

Doing something more accurate would probably involve at least 6 mouth positions, and if you're doing it in real time, you'd have to do a reverse FFT. It can be done--there was a really good commercial lip-synch program that generated Action Script to control mouth positions. I don't know if it's still around--that was 5 years ago, and it was pretty expensive (about $2,500 for one seat, I think). It may even have been a Director Xtra that worked with a Flash Sprite, but let's not talk about Director :-P

Cordially,

Kerry Thompson
Re: [Flashcoders] Question about approximate vowel detection in AS3
Juan Pablo Califano wrote:
> Wow. That was really uncalled for.

I meant no ill will. I just meant that there is a better solution than this idea.

> For me, the hard part, which you seem to imply is rather simple here, is *matching* the input audio against said profiles. Admittedly, I don't know anything about digital signal processing and audio programming in general, but "matching" sounds a bit vague. Perhaps you could enlighten us, if you feel like it.

Since you asked, I actually don't know how to do it myself. I have honestly studied the subject, but I still don't know how to do it.
Re: [Flashcoders] Question about approximate vowel detection in AS3
I'm abandoning the whole vowel recognition unless I can find something someone else has done to base my implementation on. I've burnt too much time on it for something that won't give a whole lot of bang for the buck. It's a very complex problem (for me anyway).

Eric

On Thu, Jun 3, 2010 at 2:37 PM, Juan Pablo Califano < califa010.flashcod...@gmail.com> wrote:
> Wow. That was really uncalled for.
>
> Anyway, if you can pre-generate samples for all vowels for all samples, I can't see why comparing them to the speech generated by the same system would be any harder than comparing it to a number of collected profiles.
>
> > You really just need to collect profiles to match against. Record people saying stuff and match the recordings with the live data. When they match, you know what the vocal is saying.
>
> For me, the hard part, which you seem to imply is rather simple here, is *matching* the input audio against said profiles. Admittedly, I don't know anything about digital signal processing and audio programming in general, but "matching" sounds a bit vague. Perhaps you could enlighten us, if you feel like it.
>
> Cheers
> Juan Pablo Califano

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
Wow. That was really uncalled for.

Anyway, if you can pre-generate samples for all vowels for all samples, I can't see why comparing them to the speech generated by the same system would be any harder than comparing it to a number of collected profiles.

> You really just need to collect profiles to match against. Record people saying stuff and match the recordings with the live data. When they match, you know what the vocal is saying.

For me, the hard part, which you seem to imply is rather simple here, is *matching* the input audio against said profiles. Admittedly, I don't know anything about digital signal processing and audio programming in general, but "matching" sounds a bit vague. Perhaps you could enlighten us, if you feel like it.

Cheers
Juan Pablo Califano

2010/6/3 Henrik Andersson
> Eric E. Dolecki wrote:
> > It's using dynamic text to speech, so I wouldn't be able to use cue points reliably.
>
> Use dynamic cuepoints and stop complaining. If it can generate voice, it can tell you what kinds of voice it put where. It is far more exact than trying to reverse the incredibly lossy transformation that the synthesis is.
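Editorial sketch of the "dynamic cuepoints" suggestion quoted above: if the speech engine can report which sound starts when, the mouth shape becomes a timeline lookup instead of an audio-analysis problem. Whether the text-to-speech system used in this thread exposes such timings is an assumption; the timeline data, the phoneme labels and the shape names below are all hypothetical. Written in Python only so it can be run standalone (the thread targets AS3).

```python
import bisect

# Hypothetical engine output: (start time in ms, phoneme label).
PHONEME_TIMELINE = [
    (0, "sil"), (120, "h"), (180, "e"), (320, "l"), (400, "o"), (650, "sil"),
]

# Hypothetical phoneme-to-mouth-shape table.
MOUTH_FOR_PHONEME = {
    "sil": "closed", "h": "open", "e": "wide", "l": "tongue_up", "o": "round",
}


def mouth_at(position_ms, timeline=PHONEME_TIMELINE):
    """Return the mouth shape for the phoneme active at `position_ms`."""
    starts = [start for start, _ in timeline]
    index = bisect.bisect_right(starts, position_ms) - 1
    if index < 0:
        return "closed"  # before the first cue point
    phoneme = timeline[index][1]
    return MOUTH_FOR_PHONEME.get(phoneme, "closed")
```

On each frame the caller would pass the current playback position, for example `mouth_at(200)`, and switch the mouth graphic to the returned shape.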
Re: [Flashcoders] Question about approximate vowel detection in AS3
Have a look at this stuff, don't know if there is source, but it might be a start...

http://www.allflashwebsite.com/article/real-time-lip-sync-in-flash

On 03/06/2010 15:03, Eric E. Dolecki wrote:

I've tried running software voice vowels through the system and I am able to create "signatures" for the vowels that's somewhat accurate (depending on how it's influenced in a word or if it's standalone). I've run them several times and my values always seem to match (which is good). I end up with a very long stream of numbers for a signature because of the enter frame. I am wondering what the best way to compare the currents to over a period of time to match known values might be. What's a fast/best lookup means to check against?

For instance, a spoken "A" for me looks like this:

speech loaded
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.4167095571756363
1.840158924460411
1.840158924460411
2.3130274564027786
2.7141911536455154
2.7141911536455154
5.49285389482975
8.781380131840706
9.142853170633316
9.142853170633316
... TONS more data...

On Thu, Jun 3, 2010 at 8:23 AM, Eric E. Dolecki wrote:

I don't think that's enough. Has anyone seen pitch detection in AS3 yet (no microphone source)? That might be enough but I'm not sure.

On Thu, Jun 3, 2010 at 5:55 AM, Karl DeSaulniers wrote:

You could try matching say a lowered jaw with low octaves and a cheeky jaw with high octaves.

JAT

Karl

On Jun 2, 2010, at 3:20 PM, Eric E. Dolecki wrote:

This is a software voice, so nailing down vowels should be easier. However you mention matching recordings with the live data. What is being matched? Some kind of pattern I suppose. What form would the pattern take? How long of a sample should be checked continuously, etc.? It's a big topic. I understand your concept of how to do it, but I don't have the technical expertise or foundation to implement the idea yet.

Eric

On Wed, Jun 2, 2010 at 4:13 PM, Henrik Andersson wrote:

Eric E. Dolecki wrote:

I have a face that uses computeSpectrum in order to sync a mouth with dynamic vocal-only MP3s... it works, but works much like a robot mouth. The jaw animates by certain amounts based on volume. I am trying to somehow get vowel approximations so that I can fire off some events to update the mouth UI. Does anyone have any kind of algo that can somehow get close enough readings from audio to detect vowels? Anything I can do besides random to adjust the mouth shape will go miles in making my face look more realistic.

You really just need to collect profiles to match against. Record people saying stuff and match the recordings with the live data. When they match, you know what the vocal is saying.
RE: [Flashcoders] Question about approximate vowel detection in AS3
>>My most humble apologies go out to Henrik and anyone else who felt that I was
>>complaining about something. Which I wasn't.

I don't think you need to apologize. I didn't think you were complaining at all; you were just stating your view of how you see this technique working with your project.

Jason Merrill
Instructional Technology Architect
Bank of America Global Learning
Re: [Flashcoders] Question about approximate vowel detection in AS3
My most humble apologies go out to Henrik and anyone else who felt that I was complaining about something. Which I wasn't.

On Thu, Jun 3, 2010 at 10:02 AM, Henrik Andersson wrote:
> Eric E. Dolecki wrote:
>> It's using dynamic text to speech, so I wouldn't be able to use cue points
>> reliably.
>
> Use dynamic cuepoints and stop complaining. If it can generate voice, it
> can tell you what kinds of voice it put where. It is far more exact than
> trying to reverse the incredibly lossy transformation that the synthesis is.

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
I've tried running software voice vowels through the system and I am able to create "signatures" for the vowels that are somewhat accurate (depending on how a vowel is influenced in a word or if it's standalone). I've run them several times and my values always seem to match (which is good).

I end up with a very long stream of numbers for a signature because it is sampled on every enter frame. I am wondering what the best way might be to compare the current values over a period of time against known values. What's a fast lookup method to check against?

For instance, a spoken "A" for me looks like this:

speech loaded
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.4167095571756363
1.840158924460411
1.840158924460411
2.3130274564027786
2.7141911536455154
2.7141911536455154
5.49285389482975
8.781380131840706
9.142853170633316
9.142853170633316
... TONS more data...

On Thu, Jun 3, 2010 at 8:23 AM, Eric E. Dolecki wrote:
> I don't think that's enough. Has anyone seen pitch detection in AS3 yet (no
> microphone source)? That might be enough but I'm not sure.

--
http://ericd.net
Interactive design and development
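One way to approach the lookup question above is to shrink each per-frame stream to a short fixed-length vector and compare it with stored templates under a tolerance. The following is only a minimal sketch, written in Python for brevity rather than AS3. The function names, the `size` and `tolerance` values, and the "o" template are hypothetical; only the "a" values come from the post.

```python
from typing import Dict, List, Optional


def resample(values: List[float], size: int) -> List[float]:
    """Reduce a per-frame stream to `size` points by averaging equal chunks."""
    if not values:
        raise ValueError("empty stream")
    n = len(values)
    out = []
    for i in range(size):
        start = i * n // size
        end = max(start + 1, (i + 1) * n // size)
        chunk = values[start:end]
        out.append(sum(chunk) / len(chunk))
    return out


def mean_abs_diff(a: List[float], b: List[float]) -> float:
    """Average absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def best_match(stream: List[float], templates: Dict[str, List[float]],
               size: int = 8, tolerance: float = 0.5) -> Optional[str]:
    """Return the closest template name, or None if nothing is within tolerance."""
    probe = resample(stream, size)
    best_name, best_score = None, tolerance
    for name, template in templates.items():
        score = mean_abs_diff(probe, resample(template, size))
        if score <= best_score:
            best_name, best_score = name, score
    return best_name


# "a" uses the values quoted in the post; "o" is invented for illustration.
TEMPLATES = {
    "a": [0.16304096207022667, 0.16304096207022667, 0.16304096207022667,
          0.16304096207022667, 0.4167095571756363, 1.840158924460411,
          1.840158924460411, 2.3130274564027786, 2.7141911536455154,
          2.7141911536455154, 5.49285389482975, 8.781380131840706,
          9.142853170633316, 9.142853170633316],
    "o": [9.0, 8.0, 6.0, 4.0, 2.0, 1.0, 0.5, 0.2],
}
```

Because every stream is reduced to the same length first, templates and live data do not need the same number of frames, and the comparison cost per vowel stays constant per check.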
Re: [Flashcoders] Question about approximate vowel detection in AS3
Eric E. Dolecki wrote:
> It's using dynamic text to speech, so I wouldn't be able to use cue points
> reliably.

Use dynamic cuepoints and stop complaining. If it can generate voice, it can tell you what kinds of voice it put where. It is far more exact than trying to reverse the incredibly lossy transformation that the synthesis is.
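The dynamic cue point idea can be sketched as a timed schedule that is looked up by playback position. This assumes the text-to-speech engine can report (start time, phoneme) pairs, which the thread does not confirm. The phoneme labels, shape names, and function names below are hypothetical, and the sketch is in Python rather than AS3.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical phoneme-to-mouth-shape table; the labels are illustrative only.
MOUTH_SHAPES = {"AA": "open", "IY": "wide", "UW": "round", "M": "closed"}


def build_cues(events: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Sort (start_ms, phoneme) events and convert them to (start_ms, shape)."""
    return sorted((start, MOUTH_SHAPES.get(ph, "rest")) for start, ph in events)


def shape_at(cues: List[Tuple[int, str]], position_ms: int) -> str:
    """Return the mouth shape active at the given playback position."""
    starts = [start for start, _ in cues]
    index = bisect_right(starts, position_ms) - 1
    return cues[index][1] if index >= 0 else "rest"
```

On each frame the player would call `shape_at` with the current playback position, so the mouth follows what the synthesizer intended to say and no audio analysis is needed.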
Re: [Flashcoders] Question about approximate vowel detection in AS3
I don't think that's enough. Has anyone seen pitch detection in AS3 yet (no microphone source)? That might be enough, but I'm not sure.

On Thu, Jun 3, 2010 at 5:55 AM, Karl DeSaulniers wrote:
> You could try matching, say, a lowered jaw with low octaves and a cheeky jaw
> with high octaves.
> JAT
>
> Karl

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
It's using dynamic text to speech, so I wouldn't be able to use cue points reliably.

On Thu, Jun 3, 2010 at 4:09 AM, Glen Pike wrote:
> If your mp3's are pre-recorded rather than people recording them
> dynamically, could you use cue points?

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
You could try matching, say, a lowered jaw with low octaves and a cheeky jaw with high octaves.
JAT

Karl

On Jun 2, 2010, at 3:20 PM, Eric E. Dolecki wrote:
> This is a software voice, so nailing down vowels should be easier. However,
> you mention matching recordings with the live data. What is being matched?
> Some kind of pattern, I suppose. What form would the pattern take? How long
> of a sample should be checked continuously, etc.?
>
> It's a big topic. I understand your concept of how to do it, but I don't
> have the technical expertise or foundation to implement the idea yet.
>
> Eric

Karl DeSaulniers
Design Drumm
http://designdrumm.com
Re: [Flashcoders] Question about approximate vowel detection in AS3
If your MP3s are pre-recorded rather than people recording them dynamically, could you use cue points?

On 02/06/2010 20:57, Eric E. Dolecki wrote:
> I have a face that uses computeSpectrum in order to sync a mouth with
> dynamic vocal-only MP3s... it works, but works much like a robot mouth. The
> jaw animates by certain amounts based on volume.
>
> I am trying to somehow get vowel approximations so that I can fire off some
> events to update the mouth UI. Does anyone have any kind of algo that can
> somehow get close enough readings from audio to detect vowels? Anything I
> can do besides random to adjust the mouth shape will go miles in making my
> face look more realistic.
>
> Thanks for any insights.
>
> Eric
Re: [Flashcoders] Question about approximate vowel detection in AS3
This is a software voice, so nailing down vowels should be easier. However, you mention matching recordings with the live data. What is being matched? Some kind of pattern, I suppose. What form would the pattern take? How long of a sample should be checked continuously, etc.?

It's a big topic. I understand your concept of how to do it, but I don't have the technical expertise or foundation to implement the idea yet.

Eric

On Wed, Jun 2, 2010 at 4:13 PM, Henrik Andersson wrote:
> You really just need to collect profiles to match against. Record people
> saying stuff and match the recordings with the live data. When they match,
> you know what the vocal is saying.

--
http://ericd.net
Interactive design and development
Re: [Flashcoders] Question about approximate vowel detection in AS3
Eric E. Dolecki wrote:
> I have a face that uses computeSpectrum in order to sync a mouth with
> dynamic vocal-only MP3s... it works, but works much like a robot mouth. The
> jaw animates by certain amounts based on volume.
>
> I am trying to somehow get vowel approximations so that I can fire off some
> events to update the mouth UI. Does anyone have any kind of algo that can
> somehow get close enough readings from audio to detect vowels? Anything I
> can do besides random to adjust the mouth shape will go miles in making my
> face look more realistic.

You really just need to collect profiles to match against. Record people saying stuff and match the recordings with the live data. When they match, you know what the vocal is saying.
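As a rough illustration of what "collect profiles and match" could mean for a single frame, the sketch below averages a spectrum into a few bands, normalizes away loudness, and picks the nearest stored profile. It is written in Python rather than AS3, and it assumes the spectrum arrives as a plain list of magnitudes. The band count, the distance threshold, and the profile values are all made up for illustration.

```python
import math
from typing import Dict, List, Optional


def band_energies(spectrum: List[float], bands: int) -> List[float]:
    """Average the spectrum magnitudes into `bands` equal-width bands."""
    size = len(spectrum) // bands
    if size == 0:
        raise ValueError("spectrum is shorter than the number of bands")
    return [sum(abs(v) for v in spectrum[i * size:(i + 1) * size]) / size
            for i in range(bands)]


def normalize(vec: List[float]) -> List[float]:
    """Scale to unit length so that loudness does not affect the match."""
    length = math.sqrt(sum(v * v for v in vec))
    return [v / length for v in vec] if length > 0 else list(vec)


def classify(spectrum: List[float], profiles: Dict[str, List[float]],
             bands: int = 4, max_distance: float = 0.35) -> Optional[str]:
    """Return the nearest profile name, or None if none is close enough."""
    probe = normalize(band_energies(spectrum, bands))
    best_name, best_dist = None, max_distance
    for name, profile in profiles.items():
        dist = math.dist(probe, normalize(profile))
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


# Hypothetical 4-band vowel profiles; real ones would come from recordings.
PROFILES = {"a": [0.2, 0.9, 0.4, 0.1], "i": [0.8, 0.1, 0.1, 0.6]}
```

A single frame is rarely decisive, so a practical version would probably require the same vowel to win for several consecutive frames before changing the mouth shape.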
[Flashcoders] Question about approximate vowel detection in AS3
I have a face that uses computeSpectrum in order to sync a mouth with dynamic vocal-only MP3s... it works, but works much like a robot mouth. The jaw animates by certain amounts based on volume.

I am trying to somehow get vowel approximations so that I can fire off some events to update the mouth UI. Does anyone have any kind of algo that can somehow get close enough readings from audio to detect vowels? Anything I can do besides random to adjust the mouth shape will go miles in making my face look more realistic.

Thanks for any insights.

Eric
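For reference, the volume-driven jaw described above might look something like the following. This is a guess at the general technique, not Eric's actual code: it assumes samples in the range -1 to 1 and uses RMS level with simple smoothing. The function and parameter names are hypothetical, and the sketch is in Python rather than AS3.

```python
import math
from typing import List


def rms(samples: List[float]) -> float:
    """Root-mean-square level of one frame of samples (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def jaw_opening(samples: List[float], previous: float,
                max_open: float = 1.0, smoothing: float = 0.5) -> float:
    """Move the jaw part of the way from its previous opening toward the level."""
    target = min(1.0, rms(samples)) * max_open
    return previous + (target - previous) * smoothing
```

Calling `jaw_opening` once per frame with the previous result gives a jaw that follows loudness without snapping, which matches the "robot mouth" behavior the post wants to improve on with vowel shapes.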