Re: [Flashcoders] RE: Flash speech-to-text

juju Thu, 27 Aug 2009 01:35:08 -0700

Steve, why didn't you say so in the first place?? ;P

Thanks for the info - juju


On Thu, Aug 27, 2009 at 1:41 PM, Steven Sacks <flash...@stevensacks.net> wrote:

> This is how you record sound:
> http://www.getmicrophone.com/?p=69
>
> If you're asking how to convert sound waves into text, dude, what?  Do
> you realize how challenging speech recognition is?  Wait, why am I asking
> you this?  If you did, you wouldn't be asking people on a Flash list how to
> do it, as if it's some piece of code somebody can copy and paste or a few
> links that will tell you the secret formula.
>
> Most speech-to-text programs are based on hidden Markov models. In
> speech recognition, the hidden Markov model would output a sequence of
> n-dimensional real-valued vectors (with n being a small integer, such as
> 10), outputting one of these every 10 milliseconds. The vectors would
> consist of cepstral coefficients, which are obtained by taking a Fourier
> transform of a short time window of speech and decorrelating the spectrum
> using a cosine transform, then taking the first (most significant)
> coefficients. The hidden Markov model will tend to have in each state a
> statistical distribution that is a mixture of diagonal covariance Gaussians
> which will give a likelihood for each observed vector. Each word, or (for
> more general speech recognition systems) each phoneme, will have a
> different output distribution; a hidden Markov model for a sequence of words
> or phonemes is made by concatenating the individual trained hidden Markov
> models for the separate words and phonemes.
>
> There you have it. That's a high level overview of speech to text. Do you
> understand anything in that paragraph?  Probably not.
>
> Unless you're willing to study and put in the time to figure out how to do
> this, you're not going to figure it out.  Nobody is going to point you in
> the right direction because this is a very niche knowledge area and none of
> these people are on Flashcoders.  They're at universities working on their
> doctorates or working for the military or government, or some private
> company and they're not sharing this information.  This is the stuff patents
> are made of.
>
> So either give up now (because what you want is some easy solution and
> there isn't one) or start doing real research, learn some serious calculus,
> become an expert on sound, speech, and waveforms, and then figure out how to
> port all of this into Flash, which, in all likelihood, lacks the performance
> to actually achieve this.
>
> You'll probably have to do it on the server, passing the sound to the
> server as an mp3 file, and then pass the text back. That's the only thing I
> can think of that would possibly be able to do this.
>
> Prove me wrong.  If you pull this off, you could probably build an entire
> company around your technology.
>
> _______________________________________________
> Flashcoders mailing list
> Flashcoders@chattyfig.figleaf.com
> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
>
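For anyone following along, the feature extraction and scoring steps in the quoted overview can be sketched roughly like this. It is only a minimal Python illustration, not how any real recognizer is built. The function names are made up for this example, the 8 kHz sample rate and 25 ms window are assumptions, and taking the log of the spectrum before the cosine transform is the usual cepstrum step even though the overview does not spell it out. The 10 ms step and the 10 coefficients come from the overview. There is no mel filterbank, no training, and no actual hidden Markov model decoding here.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (bins 0..N/2)."""
    n = len(frame)
    # Hamming window to reduce leakage at the frame edges
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def cepstral_coefficients(frame, num_coeffs=10):
    """Log power spectrum, then a DCT-II; keep the first num_coeffs values."""
    log_spec = [math.log(p + 1e-12) for p in power_spectrum(frame)]
    m = len(log_spec)
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / m)
                for j, v in enumerate(log_spec))
            for c in range(num_coeffs)]

def gmm_log_likelihood(vector, mixture):
    """Log-likelihood of one vector under a mixture of diagonal-covariance
    Gaussians. mixture is a list of (weight, means, variances) tuples."""
    logs = []
    for weight, means, variances in mixture:
        lp = math.log(weight)
        for x, mu, var in zip(vector, means, variances):
            lp += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        logs.append(lp)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

SAMPLE_RATE = 8000  # assumed; not stated in the thread
FRAME_LEN = 200     # 25 ms window at 8 kHz (assumed)
HOP = 80            # one vector every 10 ms, as in the overview

# A 440 Hz test tone stands in for recorded speech.
signal = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(800)]
features = [cepstral_coefficients(signal[s:s + FRAME_LEN])
            for s in range(0, len(signal) - FRAME_LEN + 1, HOP)]

# A hand-built two-component "state" model centred near the first frame.
# A real system would estimate these parameters from training data.
model = [(0.5, features[0], [1.0] * 10),
         (0.5, [x + 1.0 for x in features[0]], [1.0] * 10)]
near = gmm_log_likelihood(features[0], model)
far = gmm_log_likelihood([x + 50.0 for x in features[0]], model)
print(len(features), "frames;", "near > far:", near > far)
```

Each frame becomes one 10-dimensional vector, and each model state scores those vectors. Stringing states together into word or phoneme models and searching over them is the hard part, and none of that is shown here.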
