Hello,

On Mon, Jun 23, 2008 at 1:05 AM, <[EMAIL PROTECTED]> wrote:
> Will it be possible to use it with voice dialing?
> You said vocabulary is 5-10. Will it be enough?
> It would be cool if I would have a possibility to say: "Message to
> Jane" to open sms dialog or "Call to Jane" to call, presuming Jane is
> a hot chick ;-)
> Is it possible?
>

Yes, it is of course possible. But it requires speech recognition for
connected words, which needs level-building algorithms and proper noise
handling, along with grammar learning for the machine. This project has
great scope and can be extended almost without limit. However, in the
short duration of a GSoC project, I don't think it will be possible to
incorporate these advanced features.

The initial aim is to provide an API in which the user can store his/her
own words individually and connect a particular activity with each word.
Upon detection of a word, the API call corresponding to that word's
activity will be invoked. I have included these points in my Design
Document, along with the scope for advanced models using speech
recognition. I think that once the individual word recognition
application is built, the advanced features can be added by building on
this application in a newer one.

> On 6/22/08, saurabh gupta <[EMAIL PROTECTED]> wrote:
> > Hello everyone,
> >
> > This is the status update of the GSoC project, Speech Recognition
> > facility in Openmoko. This week, much of the time was devoted to
> > writing code and optimizing the existing code. I have written many
> > subroutines, such as the forward-backward procedure, LPC and
> > cepstral analysis of speech signals in frames, the Viterbi
> > algorithm, and a training algorithm using the segmental K-means
> > method. All the source code compiles successfully with the GNU C
> > compiler.
> >
> > Various optimizations have been made in the code to make it suitable
> > for an ARM 16/32-bit processor running at 266 or 400 MHz maximum.
> > The whole code is written using fixed-point arithmetic. I used some
> > external libraries for some subroutines and converted them to
> > fixed-point arithmetic.
> >
> > Another optimization was choosing the segmental K-means procedure
> > for training the HMM models rather than the Baum-Welch algorithm,
> > which requires more processing since it accounts for all possible
> > hidden state sequences for a given observation sequence. The
> > segmental K-means method, on the other hand, uses the Viterbi
> > algorithm to find the single best state sequence and then iterates
> > to re-estimate and train the HMM model. It has been shown to give
> > good results with faster processing than Baum-Welch.
> >
> > A further optimization concerns the probability density function. As
> > this project aims for a small recognition vocabulary (around 5 or 10
> > words), vector quantization will be used instead of a continuous
> > observation sequence. The vector quantization procedure is faster
> > and yields good results for applications on small embedded devices.
> > The vector quantization source code is almost finished. Soon after
> > that, the actual testing of the speech recognition code will be done
> > on the collected speech samples.
> >
> > I have uploaded all documents (Design Document version-0.2) and
> > source code to the svn repository of Openmoko (
> > https://svn.projects.openmoko.org/svnroot/speech/). Any comments and
> > suggestions will be highly appreciated.
> >
> > http://saurabh1403.wordpress.com/
> >
> > Regards....
> > --
> > Saurabh Gupta
> > Electronics and Communication Engg.
> > NSIT, New Delhi
> >
> > _______________________________________________
> > Openmoko community mailing list
> > community@lists.openmoko.org
> > http://lists.openmoko.org/mailman/listinfo/community
>

--
Saurabh Gupta
Electronics and Communication Engg.
NSIT, New Delhi
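
P.S. To make the word-to-activity idea above a bit more concrete, here
is a rough C sketch of how such a binding could look. It is only an
illustration under my own assumptions, not the code in the svn
repository: the names register_word, dispatch_word, word_action_fn and
count_trigger are all hypothetical, and the recognizer is assumed to
hand back the label of the best-matching word model as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS    10   /* small vocabulary, around 5 to 10 words */
#define MAX_WORD_LEN 32   /* assumed limit on a word label, incl. NUL */

/* Hypothetical callback type: the activity bound to a word. */
typedef void (*word_action_fn)(void *user_data);

struct word_entry {
    char name[MAX_WORD_LEN];  /* label of the user's stored word */
    word_action_fn action;    /* activity to run on detection */
    void *user_data;          /* opaque argument passed to the action */
};

static struct word_entry registry[MAX_WORDS];
static size_t registry_count = 0;

/* Bind an action to a word, or rebind it if the word is already known.
 * Returns 0 on success, -1 if the label is too long or the table is full. */
int register_word(const char *name, word_action_fn action, void *user_data)
{
    assert(name != NULL && action != NULL);
    if (strlen(name) >= MAX_WORD_LEN)
        return -1;
    for (size_t i = 0; i < registry_count; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            registry[i].action = action;
            registry[i].user_data = user_data;
            return 0;
        }
    }
    if (registry_count == MAX_WORDS)
        return -1;
    strcpy(registry[registry_count].name, name);  /* length checked above */
    registry[registry_count].action = action;
    registry[registry_count].user_data = user_data;
    registry_count++;
    return 0;
}

/* Meant to be called with the label of the detected word.
 * Returns 0 if a bound action was run, -1 if the word is unknown. */
int dispatch_word(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < registry_count; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            registry[i].action(registry[i].user_data);
            return 0;
        }
    }
    return -1;
}

/* Example action for illustration only: counts how often it is triggered. */
void count_trigger(void *user_data)
{
    (*(int *)user_data)++;
}
```

With a counter bound via register_word("call", count_trigger, &calls), a
later dispatch_word("call") runs the action once, while an unregistered
word such as "sms" simply returns -1. A linear search is enough here
because the vocabulary is so small.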