I've tried running software-voice vowels through the system, and I'm able to create "signatures" for the vowels that are somewhat accurate (depending on whether the vowel is influenced by surrounding sounds in a word or stands alone). I've run them several times and my values always seem to match (which is good). I end up with a very long stream of numbers for each signature because I'm sampling on every enterFrame. I'm wondering what the best way is to compare the current values over a period of time against the known values. What's a fast/reliable lookup method to check against?
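One simple approach (just a sketch, shown here in JavaScript rather than AS3, and the vowel names and template values are made up): resample each signature down to a fixed length, then pick the stored vowel template with the smallest Euclidean distance. The function names (`resample`, `matchVowel`) are my own; a real setup would probably also want amplitude normalization, or dynamic time warping if the timing varies a lot between takes.

```javascript
// Resample an arbitrary-length array of readings down to `size` points
// so signatures of different durations become comparable.
function resample(values, size) {
  const out = [];
  for (let i = 0; i < size; i++) {
    out.push(values[Math.floor(i * values.length / size)]);
  }
  return out;
}

// Squared Euclidean distance between two equal-length vectors.
function distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Return the name of the closest stored template (nearest-neighbor lookup).
function matchVowel(live, templates, size) {
  const probe = resample(live, size);
  let best = null;
  let bestDist = Infinity;
  for (const name in templates) {
    const d = distance(probe, resample(templates[name], size));
    if (d < bestDist) {
      bestDist = d;
      best = name;
    }
  }
  return best;
}
```

With only a handful of vowel templates, a linear scan like this is plenty fast to run on every frame; the resampling also keeps the comparison cost fixed no matter how long the live stream gets.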
For instance, a spoken "A" for me looks like this:

speech loaded
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.4167095571756363
1.840158924460411
1.840158924460411
2.3130274564027786
2.7141911536455154
2.7141911536455154
5.49285389482975
8.781380131840706
9.142853170633316
9.142853170633316
... TONS more data...

On Thu, Jun 3, 2010 at 8:23 AM, Eric E. Dolecki <edole...@gmail.com> wrote:

> I don't think that's enough. Has anyone seen pitch detection in AS3 yet (no
> microphone source)? That might be enough but I'm not sure.
>
> On Thu, Jun 3, 2010 at 5:55 AM, Karl DeSaulniers <k...@designdrumm.com> wrote:
>
>> You could try matching say a lowered jaw with low octaves and a cheeky jaw
>> with high octaves.
>> JAT
>>
>> Karl
>>
>> On Jun 2, 2010, at 3:20 PM, Eric E. Dolecki wrote:
>>
>>> This is a software voice, so nailing down vowels should be easier. However
>>> you mention matching recordings with the live data. What is being matched?
>>> Some kind of pattern I suppose. What form would the pattern take? How long
>>> of a sample should be checked continuously, etc.?
>>>
>>> It's a big topic. I understand your concept of how to do it, but I don't
>>> have the technical expertise or foundation to implement the idea yet.
>>>
>>> Eric
>>>
>>> On Wed, Jun 2, 2010 at 4:13 PM, Henrik Andersson <he...@henke37.cjb.net> wrote:
>>>
>>>> Eric E. Dolecki wrote:
>>>>
>>>>> I have a face that uses computeSpectrum in order to sync a mouth with
>>>>> dynamic vocal-only MP3s... it works, but works much like a robot mouth.
>>>>> The jaw animates by certain amounts based on volume.
>>>>>
>>>>> I am trying to somehow get vowel approximations so that I can fire off
>>>>> some events to update the mouth UI. Does anyone have any kind of algo
>>>>> that can somehow get close enough readings from audio to detect vowels?
>>>>> Anything I can do besides random to adjust the mouth shape will go miles
>>>>> in making my face look more realistic.
>>>>
>>>> You really just need to collect profiles to match against. Record people
>>>> saying stuff and match the recordings with the live data. When they match,
>>>> you know what the vocal is saying.
>>
>> Karl DeSaulniers
>> Design Drumm
>> http://designdrumm.com

--
http://ericd.net
Interactive design and development

_______________________________________________
Flashcoders mailing list
Flashcoders@chattyfig.figleaf.com
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders