I've tried running software-voice vowels through the system, and I'm able to
create "signatures" for the vowels that are somewhat accurate (depending on
whether the vowel is influenced by the surrounding word or spoken standalone).
I've run them several times and my values always seem to match (which is
good). I end up with a very long stream of numbers for a signature because
I'm sampling on every ENTER_FRAME. I'm wondering what the best way is to
compare the current values over a period of time against the known values.
What's a fast lookup method to check against?

For instance, a spoken "A" for me looks like this:

speech loaded
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.4167095571756363
1.840158924460411
1.840158924460411
2.3130274564027786
2.7141911536455154
2.7141911536455154
5.49285389482975
8.781380131840706
9.142853170633316
9.142853170633316
... TONS more data...
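One way to tame a stream like that (a rough sketch in plain JavaScript, since AS3 shares ECMAScript syntax; the names makeSignature and closestVowel are my own invention, not from any library): downsample the per-frame values into a short, fixed-length, peak-normalized signature, then pick whichever stored vowel template has the smallest Euclidean distance. Normalizing keeps overall volume from dominating the match, and a handful of bins keeps the lookup cheap enough to run every few frames.

```javascript
// Downsample an arbitrary-length stream into `bins` averaged values,
// then normalize by the peak so overall volume doesn't dominate.
function makeSignature(stream, bins) {
  var sig = [];
  var step = stream.length / bins;
  for (var i = 0; i < bins; i++) {
    var start = Math.floor(i * step);
    var end = Math.max(start + 1, Math.floor((i + 1) * step));
    var sum = 0;
    for (var j = start; j < end; j++) sum += stream[j];
    sig.push(sum / (end - start));
  }
  var peak = Math.max.apply(null, sig) || 1;
  return sig.map(function (v) { return v / peak; });
}

// Compare a live signature against each stored vowel template;
// the smallest squared Euclidean distance wins.
function closestVowel(liveSig, templates) {
  var best = null, bestDist = Infinity;
  for (var name in templates) {
    var t = templates[name], d = 0;
    for (var i = 0; i < t.length; i++) {
      var diff = liveSig[i] - t[i];
      d += diff * diff;
    }
    if (d < bestDist) { bestDist = d; best = name; }
  }
  return best;
}
```

To use it live, you'd keep a rolling Array of the last N readings, rebuild the signature each ENTER_FRAME (or every few frames), and fire the mouth-shape event whenever closestVowel changes. Just a sketch under those assumptions; formant-based matching would be more robust but also more work.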




On Thu, Jun 3, 2010 at 8:23 AM, Eric E. Dolecki <edole...@gmail.com> wrote:

> I don't think that's enough. Has anyone seen pitch detection in AS3 yet (no
> microphone source)? That might be enough but I'm not sure.
>
>
> On Thu, Jun 3, 2010 at 5:55 AM, Karl DeSaulniers <k...@designdrumm.com>wrote:
>
>> You could try matching say a lowered jaw with low octaves and a cheeky jaw
>> with high octaves.
>> JAT
>>
>>
>> Karl
>>
>>
>> On Jun 2, 2010, at 3:20 PM, Eric E. Dolecki wrote:
>>
>>  This is a software voice, so nailing down vowels should be easier.
>>> However
>>> you mention matching recordings with the live data. What is being
>>> matched?
>>> Some kind of pattern I suppose. What form would the pattern take? How
>>> long
>>> of a sample should be checked continuously, etc.?
>>>
>>> It's a big topic. I understand your concept of how to do it, but I don't
>>> have the technical expertise or foundation to implement the idea yet.
>>>
>>> Eric
>>>
>>>
>>> On Wed, Jun 2, 2010 at 4:13 PM, Henrik Andersson <he...@henke37.cjb.net
>>> >wrote:
>>>
>>>  Eric E. Dolecki wrote:
>>>>
>>>>  I have a face that uses computeSpectrum in order to sync a mouth with
>>>>> dynamic vocal-only MP3s... it works, but works much like a robot mouth.
>>>>> The
>>>>> jaw animates by certain amounts based on volume.
>>>>>
>>>>> I am trying to somehow get vowel approximations so that I can fire off
>>>>> some
>>>>> events to update the mouth UI. Does anyone have any kind of algo that
>>>>> can
>>>>> somehow get close enough readings from audio to detect vowels? Anything
>>>>> I
>>>>> can do besides random to adjust the mouth shape will go miles in making
>>>>> my
>>>>> face look more realistic.
>>>>>
>>>>>
>>>>>  You really just need to collect profiles to match against. Record
>>>> people
>>>> saying stuff and match the recordings with the live data. When they
>>>> match,
>>>> you know what the vocal is saying.
>>>> _______________________________________________
>>>> Flashcoders mailing list
>>>> Flashcoders@chattyfig.figleaf.com
>>>> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
>>>>
>>>>
>>>
>>>
>>> --
>>> http://ericd.net
>>> Interactive design and development
>>>
>>
>> Karl DeSaulniers
>> Design Drumm
>> http://designdrumm.com
>>
>>
>>
>
>
>
>



-- 
http://ericd.net
Interactive design and development
