On Mon, Jun 30, 2008 at 3:14 AM, Asheesh Laroia <[EMAIL PROTECTED]> wrote:
> On Mon, 30 Jun 2008, saurabh gupta wrote:
>
> > You have identified the correct and justified problem in training. I
> > thought to handle it in this way. Whenever a user runs this
> > application, the GUI for speech recognition will ask it to go in
> > training or recognition mode. In training mode, after uttering a
> > word, the GUI will again ask the user to utter the same word again
> > and so on. The user will have to feed the training word three times
> > (I have assumed that constant to be three) to fully create a word in
> > the vocabulary. If the user terminates the application or mishandles
> > it before three sequences, the application will not save the word.
>
> What do you mean mishandles?

By "mishandling" I meant that the user did not train the word fully,
i.e. did not supply all three utterances during training.

> > However there is no easy way to detect the mishandling since if the
> > user neither terminates the application nor speaks training word
> > again, application can pick the louder noise thinking it as the
> > training word and wrong result will be produced. This is always a
> > bigger problem in speech related applications since environment
> > noise as well as end point detection is quite difficult in real
> > world scenario.
>
> You are speaking of the "training mode", which I agree is important.
>
> I am instead talking about making the normal use mode a training mode,
> in a way, to non-intrusively improve accuracy.
>
> At least, that's my guess - I think it would be worthwhile to run some
> experiments to see if it's really true! But if you can explain to me
> why this idea is invalid from the start than maybe we can skip the
> experiments. (-;

Correct me if I have misunderstood what you meant. If the idea is to
use the normal mode as a training mode, then I see a problem with it.
Suppose a user trains a word, e.g. "hello", insufficiently. Because the
HMM for that word is then poor, there is a good chance that the
application recognizes a wrong or mispronounced word as "hello". If it
now uses this new utterance to improve the previously trained model for
"hello", the model ends up trained on the wrong word, since the
utterance that was recognized was not correct in the first place.

This can be solved by making it a manual procedure: when the
application recognizes a word, it asks the user whether the result was
correct. Only if it was correct does the application use the utterance
to improve the previous, not fully trained model. But again this will
require a lot of memory to store the word, and a lot of processing.

Also, this application implements vector quantization, so a codebook
has to be prepared for each word during training. The best way to
prepare a proper codebook is to have enough training vectors and to use
all of them together when creating it.

> -- Asheesh.

--
Saurabh Gupta
Electronics and Communication Engg.
NSIT, New Delhi, India
I blog here: http://saurabh1403-blog.blogspot.com/
_______________________________________________
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community
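
To illustrate the two points in the reply above (only a user-confirmed utterance is used to improve a word's model, and the codebook is rebuilt from all training vectors together), here is a minimal sketch. It is not the application's actual code. It assumes each utterance has already been turned into a list of fixed-length feature vectors, and it uses plain k-means as the codebook training method, which the message does not specify. All function and parameter names are hypothetical.

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: List[List[float]], v: Vector) -> int:
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def train_codebook(vectors: List[Vector], k: int,
                   iterations: int = 20) -> List[List[float]]:
    """Plain k-means over the pooled training vectors (an assumed method)."""
    if len(vectors) < k:
        raise ValueError("need at least k training vectors")
    # Deterministic start: take k vectors spread evenly through the pool.
    step = len(vectors) / k
    codebook = [list(vectors[int(i * step)]) for i in range(k)]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                codebook[i] = [sum(col) / len(members)
                               for col in zip(*members)]
    return codebook


def add_confirmed_utterance(pool: List[Vector], utterance: List[Vector],
                            confirmed: bool,
                            k: int) -> Optional[List[List[float]]]:
    """Retrain from the whole pool only if the user confirmed the word.

    An unconfirmed utterance is discarded, so a misrecognized word can
    never contaminate the stored training data.
    """
    if not confirmed:
        return None
    pool.extend(utterance)
    return train_codebook(pool, k)
```

Note that the pool of vectors for every word has to be kept around so the codebook can be rebuilt, which is the memory and processing cost mentioned in the reply.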