Wow. That was really uncalled for. Anyway, if you can pre-generate samples for all the vowels, I can't see why comparing them to the speech generated by the same system would be any harder than comparing it to a number of collected profiles.
>>> You really just need to collect profiles to match against. Record people
>>> saying stuff and match the recordings with the live data. When they match,
>>> you know what the vocal is saying.

For me, the hard part, which you seem to imply is rather simple here, is *matching* the input audio against said profiles. Admittedly, I don't know anything about digital signal processing and audio programming in general, but "matching" sounds a bit vague. Perhaps you could enlighten us, if you feel like.

Cheers
Juan Pablo Califano

2010/6/3 Henrik Andersson <[email protected]>

> Eric E. Dolecki wrote:
>
>> It's using dynamic text to speech, so I wouldn't be able to use cue points
>> reliably.
>>
> Use dynamic cuepoints and stop complaining. If it can generate voice, it
> can tell you what kinds of voice it put where. It is far more exact than
> trying to reverse the incredibly lossy transformation that the synthesis is.
>
> _______________________________________________
> Flashcoders mailing list
> [email protected]
> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
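To make the "matching" question concrete, here is a minimal, illustrative sketch (not anyone's actual implementation): compare a frame of incoming audio against stored templates by normalized cross-correlation and pick the most similar one. All names and the toy sine-wave "profiles" below are made up for illustration; a real system would compare spectral features such as formants or MFCCs rather than raw samples, which is exactly where the hard part lives.

```python
import math

def normalize(frame):
    """Return a zero-mean, unit-energy copy of a frame of samples."""
    mean = sum(frame) / len(frame)
    centered = [x - mean for x in frame]
    energy = math.sqrt(sum(x * x for x in centered)) or 1.0
    return [x / energy for x in centered]

def similarity(a, b):
    """Normalized cross-correlation in [-1, 1] for equal-length frames."""
    na, nb = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(na, nb))

def best_match(frame, profiles):
    """Label of the stored template most similar to the incoming frame."""
    return max(profiles, key=lambda label: similarity(frame, profiles[label]))

# Toy "profiles": two synthetic waveforms standing in for vowel recordings.
profiles = {
    "ah": [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)],
    "ee": [math.sin(2 * math.pi * 11 * t / 64) for t in range(64)],
}
# "Live" input: a scaled, offset version of the second template.
live = [0.8 * math.sin(2 * math.pi * 11 * t / 64) + 0.05 for t in range(64)]
print(best_match(live, profiles))
```

Because normalization removes gain and DC offset, the scaled copy still correlates near 1.0 with its template, while the other (orthogonal) sine correlates near 0 — but real speech varies in pitch, timing, and speaker, which is why raw-sample correlation breaks down and spectral matching is needed.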

