Will it be possible to use it for voice dialing? You said the vocabulary is 5-10 words. Will that be enough? It would be cool to be able to say "Message to Jane" to open the SMS dialog, or "Call to Jane" to place a call, presuming Jane is a hot chick ;-) Is that possible?
On 6/22/08, saurabh gupta <[EMAIL PROTECTED]> wrote:
> Hello everyone,
>
> This is the status update of the GSoC project, Speech Recognition
> facility in Openmoko. This week, most of the time was devoted to writing
> code and optimizing the existing code. I have written many subroutines,
> such as the forward-backward procedure, LPC and cepstral analysis of
> speech signals in frames, the Viterbi algorithm, and a training
> algorithm using the segmental K-means method. All the source code
> compiles successfully with the GNU C compiler.
>
> Various optimizations have been made to the code to make it suitable for
> the ARM 16/32-bit processor running at 266 or 400 MHz maximum. The whole
> code is written using fixed-point arithmetic. I used some external
> libraries for some subroutines and converted them to fixed-point
> arithmetic.
>
> Another optimization was choosing the segmental K-means procedure for
> training the HMM models rather than the Baum-Welch algorithm, which
> requires more processing since it accounts for all possible hidden state
> sequences for a given observation sequence. The segmental K-means
> method, on the other hand, uses the Viterbi algorithm to find the single
> best state sequence and then iterates, re-estimating and training the
> HMM model. The segmental K-means method has been shown to give good
> results with faster processing than Baum-Welch.
>
> The other optimization concerns the probability density function. As
> this project aims for a small recognition vocabulary (around 5 to 10
> words), vector quantization will be used instead of a continuous
> observation sequence. Vector quantization is faster and yields good
> results for applications on small embedded devices. The vector
> quantization source code is almost finished. Soon after that, the actual
> testing of the speech recognition code will be done on the collected
> speech samples.
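To make the decoding step described above a bit more concrete, here is a minimal sketch of Viterbi decoding over vector-quantized observations using integer arithmetic only. It is not taken from the svn repository. The names (DiscreteHmm, viterbi, demo_model), the model sizes, and the cost values are all hypothetical, and the sketch assumes probabilities are stored as non-negative integer costs (scaled negative log-probabilities), so products become additions and the best path is the one with the lowest total cost.

```c
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define N_STATES  3                 /* HMM states per word model */
#define N_SYMBOLS 4                 /* VQ codebook size */
#define MAX_T     16                /* maximum number of frames */
#define COST_INF  (INT32_MAX / 2)   /* stands in for -log(0) */

/* Assumption: every probability p is stored as a non-negative integer
 * cost, e.g. round(-ln(p) * scale). Lower cost means more likely. */
typedef struct {
    int32_t init[N_STATES];
    int32_t trans[N_STATES][N_STATES];
    int32_t emit[N_STATES][N_SYMBOLS];
} DiscreteHmm;

/* Add two costs, keeping "impossible" saturated at COST_INF. */
static int32_t add_cost(int32_t a, int32_t b)
{
    if (a >= COST_INF || b >= COST_INF)
        return COST_INF;
    return a + b;
}

/* Finds the lowest-cost state sequence for the VQ symbols obs[0..T-1].
 * Writes the sequence to path[0..T-1] and returns its total cost.
 * Returns COST_INF if T is out of range. Symbols must be < N_SYMBOLS. */
int32_t viterbi(const DiscreteHmm *m, const uint8_t *obs, int T,
                uint8_t *path)
{
    int32_t delta[MAX_T][N_STATES];  /* best cost ending in state j at t */
    uint8_t psi[MAX_T][N_STATES];    /* best predecessor of state j at t */
    int32_t best;
    uint8_t state;
    int t, i, j;

    if (T <= 0 || T > MAX_T)
        return COST_INF;

    /* Initialization */
    for (j = 0; j < N_STATES; j++) {
        delta[0][j] = add_cost(m->init[j], m->emit[j][obs[0]]);
        psi[0][j] = 0;
    }

    /* Recursion: keep only the cheapest predecessor for each state */
    for (t = 1; t < T; t++) {
        for (j = 0; j < N_STATES; j++) {
            uint8_t arg = 0;
            best = COST_INF;
            for (i = 0; i < N_STATES; i++) {
                int32_t c = add_cost(delta[t - 1][i], m->trans[i][j]);
                if (c < best) {
                    best = c;
                    arg = (uint8_t)i;
                }
            }
            delta[t][j] = add_cost(best, m->emit[j][obs[t]]);
            psi[t][j] = arg;
        }
    }

    /* Termination: pick the cheapest final state */
    best = delta[T - 1][0];
    state = 0;
    for (j = 1; j < N_STATES; j++) {
        if (delta[T - 1][j] < best) {
            best = delta[T - 1][j];
            state = (uint8_t)j;
        }
    }

    /* Backtracking */
    path[T - 1] = state;
    for (t = T - 1; t > 0; t--) {
        state = psi[t][state];
        path[t - 1] = state;
    }
    return best;
}

/* Hypothetical left-to-right model with made-up costs. */
static const DiscreteHmm demo_model = {
    { 0, COST_INF, COST_INF },
    { { 100, 200, COST_INF },
      { COST_INF, 100, 200 },
      { COST_INF, COST_INF, 0 } },
    { { 10, 500, 500, 500 },
      { 500, 10, 500, 500 },
      { 500, 500, 10, 500 } }
};
```

With one such model per vocabulary word, recognition would amount to running viterbi() on the same symbol sequence against each model and picking the word whose model returns the lowest cost. Segmental K-means training could reuse the same routine to assign frames to states before re-estimating the costs.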
> I have uploaded all documents (Design Document version 0.2) and source
> code to the Openmoko svn repository
> (https://svn.projects.openmoko.org/svnroot/speech/). Any comments and
> suggestions will be highly appreciated.
>
> http://saurabh1403.wordpress.com/
>
> Regards,
> --
> Saurabh Gupta
> Electronics and Communication Engg.
> NSIT, New Delhi

_______________________________________________
Openmoko community mailing list
[email protected]
http://lists.openmoko.org/mailman/listinfo/community

