Steve, why didn't you say so in the first place?? ;P Thanks for the info - juju
On Thu, Aug 27, 2009 at 1:41 PM, Steven Sacks <flash...@stevensacks.net> wrote:

> This is how you record sound:
> http://www.getmicrophone.com/?p=69
>
> If you're asking how to convert sound waves into text, dude, what?  Do
> you realize how challenging speech recognition is?  Wait, why am I asking
> you this?  If you did, you wouldn't be asking people on a Flash list how to
> do it, as if it's some piece of code somebody can copy and paste or a few
> links that will tell you the secret formula.
>
> Most speech to text programs are based on hidden Markov models.  In
> speech recognition, the hidden Markov model would output a sequence of
> n-dimensional real-valued vectors (with n being a small integer, such as
> 10), outputting one of these every 10 milliseconds.  The vectors would
> consist of cepstral coefficients, which are obtained by taking a Fourier
> transform of a short time window of speech and decorrelating the spectrum
> using a cosine transform, then taking the first (most significant)
> coefficients.  The hidden Markov model will tend to have in each state a
> statistical distribution that is a mixture of diagonal covariance Gaussians
> which will give a likelihood for each observed vector.  Each word, or (for
> more general speech recognition systems) each phoneme, will have a
> different output distribution; a hidden Markov model for a sequence of words
> or phonemes is made by concatenating the individual trained hidden Markov
> models for the separate words and phonemes.
>
> There you have it.  That's a high level overview of speech to text.  Do you
> understand anything in that paragraph?  Probably not.
>
> Unless you're willing to study and put in the time to figure out how to do
> this, you're not going to figure it out.  Nobody is going to point you in
> the right direction because this is a very niche knowledge area and none of
> these people are on Flashcoders.  They're at universities working on their
> doctorates or working for the military or government, or some private
> company and they're not sharing this information.  This is the stuff patents
> are made of.
>
> So either give up now (because what you want is some easy solution and
> there isn't one) or start doing real research, learn some serious calculus,
> become an expert on sound, speech, waveforms, and then figure out how to
> port all of this into Flash, which, in all likelihood, lacks the performance
> to actually achieve this.
>
> You'll probably have to do it on the server, passing the sound to the
> server as an mp3 file, and then pass the text back.  That's the only thing I
> can think of that would possibly be able to do this.
>
> Prove me wrong.  If you pull this off, you could probably build an entire
> company around your technology.
>
> _______________________________________________
> Flashcoders mailing list
> Flashcoders@chattyfig.figleaf.com
> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
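
For anyone who wants to see the front end from the quoted overview in code, here is a rough sketch in plain Python (standard library only). It is not taken from any real recognizer. It turns one short frame into 10 cepstral coefficients, then scores a vector against a mixture of diagonal-covariance Gaussians, which is the per-state likelihood the overview mentions. The function names, the 8 kHz sample rate, the Hamming window, and the 440 Hz test tone are all assumptions made for illustration. The logarithm taken before the cosine transform is the usual step in cepstral analysis, even though the overview does not spell it out.

```python
import cmath
import math


def cepstral_coefficients(frame, n_coeffs=10):
    """Return the first n_coeffs cepstral coefficients of one short frame."""
    size = len(frame)
    # Hamming window (an assumption) to reduce leakage at the frame edges.
    windowed = [
        s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (size - 1)))
        for i, s in enumerate(frame)
    ]
    # Naive O(n^2) DFT; keep only the non-redundant half of the spectrum.
    half = size // 2 + 1
    log_spectrum = []
    for k in range(half):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / size)
            for i, x in enumerate(windowed)
        )
        # A small floor avoids log(0) on silent frames.
        log_spectrum.append(math.log(abs(acc) + 1e-10))
    # DCT-II decorrelates the log spectrum; keep the lowest-order terms.
    return [
        sum(
            v * math.cos(math.pi * q * (m + 0.5) / half)
            for m, v in enumerate(log_spectrum)
        )
        for q in range(n_coeffs)
    ]


def gmm_log_likelihood(vector, weights, means, variances):
    """Log-likelihood of a vector under a diagonal-covariance Gaussian mixture."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        # Diagonal covariance: each dimension contributes independently.
        for x, m, v in zip(vector, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp keeps the mixture sum numerically stable.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


if __name__ == "__main__":
    rate = 8000  # assumed sample rate in Hz
    # One 20 ms frame of a synthetic 440 Hz tone standing in for speech.
    frame = [math.sin(2 * math.pi * 440 * t / rate) for t in range(160)]
    features = cepstral_coefficients(frame)
    # A made-up two-component mixture, only to show the call shape.
    weights = [0.5, 0.5]
    means = [[0.0] * 10, [1.0] * 10]
    variances = [[1.0] * 10, [2.0] * 10]
    print(len(features), gmm_log_likelihood(features, weights, means, variances))
```

A real system would compute one such vector every 10 milliseconds over overlapping frames, use an FFT rather than the naive transform, and train the mixture parameters from data. The hidden Markov model decoding that sits on top of these likelihoods is not shown here.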