On Mon, Jun 16, 2008 at 6:04 AM, Dan Staley <[EMAIL PROTECTED]> wrote:
> I actually just interfaced with the Sphinx project at one of the
> research positions I hold.  It is actually a very well written interface
> (for the most part...there were a few things poorly documented and/or
> implemented).  But anyway, I found the Java version of the project (Sphinx
> 4 http://cmusphinx.sourceforge.net/sphinx4/ ) to be pretty easy to
> build/interface with.

It's great, Dan, that you got the Sphinx packages working. I tried them but
ran into some errors. Lately, however, I have been concentrating on
understanding some of their libraries and on writing my own optimized code.
I will definitely ping you if I need any help.

>
>
> The benefit of using the HMMs and models and methods that Sphinx
> implements is that anyone in their programs should be able to specify a
> grammar (similar to a simplified regex) that they want to be recognized
> and then the interpreter should be able to be user independent...meaning
> anyone can speak the phrase into the phone and get the desired output.
> Speech training wouldn't be required.  I found that once you set it up
> correctly, the Sphinx engine is very powerful, and usually identifies
> the spoken words no matter who says them (we found it even seemed to
> work decently well with a variety of different accents).

This is good, and in fact I will also try to implement this in the model.
I will obtain the HMM models of the words by training them on different
speakers. I have covered this in my design document.
Thanks in advance...

>
> -Dan Staley
>
> On Sun, 2008-06-15 at 19:07 -0400, Ajit Natarajan wrote:
> > Hello,
> >
> > I know nothing about speech recognition, so if the following won't work,
> > please let me know (gently :) ).
> >
> > I understand that there is a project called Sphinx at CMU which attempts
> > speech recognition.  It seems pretty complex.  I couldn't get it to work
> > on my Linux desktop.  I'm not sure if it would work on an FR since it
> > may need a lot of CPU horsepower and memory.
> >
> > I see a speech project on the OM projects page.  To me, it seems like
> > the project is attempting command recognition, e.g., voice dialing.
> > However, it would be great if the FR can function as a rudimentary
> > dictation machine, i.e., allow the user to speak and convert to text.
> >
> > Perhaps the following may work.
> >
> > 1. Ask the user to speak some standard words.  Record the speech and
> >    establish the mapping from the words to the corresponding speech.
> >    It may even be good to maintain separate databases for different
> >    purposes, e.g., one for UNIX command lines, one for emails, and a
> >    third for technical documents.
> >
> > 2. The speech recognizer then functions similar to a keyboard in that it
> >    converts speech to text which it then enters into the application
> >    that has focus.
> >
> > 3. The user must speak word by word.  The speech recognizer finds the
> >    closest match for the speech by checking against the recordings made
> >    in step 1 (and step 4).  The user may need to set the database from
> >    which the match must be made.
> >
> > 4. If there is no close match, or if the user is unhappy with the
> >    selection made in step 3, the user can type in the correct word.  A
> >    new record can be added to the appropriate database.
> >
> > The process may be frustrating for the user at first, but over time, the
> > speech recognition should become better and better.
> >
> > The separate databases may be needed, for example, because the word
> > period should usually translate to the symbol `.' except when writing
> > about time periods when it should translate to the word `period'.
> >
> > I do not know what the storage requirements would be to maintain this
> > database.  I do not know if the closest match algorithm in step 3 is
> > even possible.  But if we could get a good dictation engine, that would
> > be a killer app, in my opinion.  No more typing!  No more carpal tunnel
> > injuries.
> > No more having to worry about small on-screen keyboards that
> > challenge finger typing.
> >
> > Thanks.
> >
> > Ajit
> >
>
> _______________________________________________
> Openmoko community mailing list
> community@lists.openmoko.org
> http://lists.openmoko.org/mailman/listinfo/community
>

--
Saurabh Gupta
Electronics and Communication Engg.
NSIT, New Delhi
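
For what it is worth, the four steps Ajit lists in the quoted message
(per-purpose databases, closest match, correction by typing) could be sketched
roughly as below. This is only an illustration under several assumptions that
are not from the thread. Each recording is assumed to be already converted
into a list of feature frames (tuples of floats). The "closest match" is
computed with dynamic time warping, which is my own choice of technique here.
The names TemplateRecognizer, add_record and recognize, and the threshold
value, are all hypothetical. It is not how Sphinx works (Sphinx uses HMMs).

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature frames."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a with the first j of b
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


class TemplateRecognizer:
    """Hypothetical word-by-word recognizer with one template list per database."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed cut-off for "no close match"
        self.databases = {}         # database name -> list of (frames, text)

    def add_record(self, database, frames, text):
        # Steps 1 and 4: store a recording together with the text it stands for.
        self.databases.setdefault(database, []).append((frames, text))

    def recognize(self, database, frames):
        # Step 3: closest stored recording in the chosen database, or None.
        best_text, best_dist = None, float("inf")
        for template, text in self.databases.get(database, []):
            dist = dtw_distance(frames, template)
            if dist < best_dist:
                best_text, best_dist = text, dist
        return best_text if best_dist <= self.threshold else None
```

With this, the same spoken word "period" can be stored as `.' in an "email"
database and as `period' in a "technical" one, and recognize() returning None
would be the cue to ask the user to type the word and call add_record().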