Built Speech To Text Engine with basic functionality

Suman Saurabh Tue, 22 Jul 2014 10:51:23 -0700

Hi Rupert,

I have built the Speech To Text Engine module, with these functionalities:


1 ) Support for multiple language model files; the default language is
*"en"*. The client can pass the language of the model files it needs, and
the matching model will be loaded.
2 ) Transcripts extracted from the sound file are fed back to the
ContentItem as a plain text Blob.
3 ) Generated JUnit test cases for both SphinxModelProvider and the
SpeechToTextEngine service. All test cases pass.
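
To make the language fallback in point 1 concrete, here is a minimal sketch
of the lookup behaviour. It is not the actual SphinxModelProvider code: the
class name LanguageModelLookup, its methods, and the file names used with it
are hypothetical and only illustrate "use the requested language if a model
is registered, otherwise fall back to en".

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of SphinxModelProvider: maps a language
// code to a model file name and falls back to the default language.
class LanguageModelLookup {
    static final String DEFAULT_LANGUAGE = "en";

    private final Map<String, String> modelFiles = new HashMap<>();

    // Register the model file to use for a language code such as "en".
    public void register(String language, String modelFile) {
        modelFiles.put(language.toLowerCase(Locale.ROOT), modelFile);
    }

    // Return the model file for the requested language, or the model of
    // the default language when the language is null or unknown.
    // Returns null only if no default model has been registered either.
    public String resolve(String language) {
        if (language != null) {
            String model = modelFiles.get(language.toLowerCase(Locale.ROOT));
            if (model != null) {
                return model;
            }
        }
        return modelFiles.get(DEFAULT_LANGUAGE);
    }
}
```

With "en" and "de" registered, resolve("de") returns the German model, while
resolve("fr") and resolve(null) both return the model registered for "en".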

Things yet to be done:

1 ) Providing support for loading several named models for the same
language.

*Problem* : I do not know how the client will pass the names of custom
model files, and in particular how multiple custom model file names should
be passed. Please give me a hint on how to implement this.

2 ) Providing support for loading models from a particular bundle.
*Problem:* If a client wants to pass the bundle name of the model files,
how will it pass it, and how can I access the name of the bundle? Is there
any existing service that implements this? Please help me with it.

3 ) In previous discussions it was said that the SpeechToText Engine
service would not just provide the extracted text, but also annotations
about the time when this text was spoken in the parsed audio/video file. A
possible use case would be a client that wants to *highlight the text*
currently spoken in an audio/video file.

*Problem* : I am not clear about how the time stamp annotations should be
provided and how the client will use them to highlight the text.
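
To make the highlighting use case concrete, here is a minimal sketch of what
a client could do once each word carries a start and end time. Everything in
it is an assumption for illustration: the class TimedTranscript, its methods,
and the millisecond offsets are hypothetical and are neither Sphinx nor
Stanbol API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side model: words with start/end offsets in
// milliseconds, plus a lookup of the word spoken at a playback position.
class TimedTranscript {
    private static final class TimedWord {
        final String text;
        final long startMs;
        final long endMs;

        TimedWord(String text, long startMs, long endMs) {
            this.text = text;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    private final List<TimedWord> words = new ArrayList<>();

    // Append a word spoken in the interval [startMs, endMs).
    public void add(String text, long startMs, long endMs) {
        words.add(new TimedWord(text, startMs, endMs));
    }

    // Index of the word spoken at positionMs, or -1 during silence.
    public int indexAt(long positionMs) {
        for (int i = 0; i < words.size(); i++) {
            TimedWord w = words.get(i);
            if (positionMs >= w.startMs && positionMs < w.endMs) {
                return i;
            }
        }
        return -1;
    }

    // Render the transcript with the currently spoken word wrapped in '*'.
    public String highlight(long positionMs) {
        int active = indexAt(positionMs);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            String text = words.get(i).text;
            sb.append(i == active ? "*" + text + "*" : text);
        }
        return sb.toString();
    }
}
```

A player would call highlight(...) with its current playback position on
each update; the open question is how the engine should expose the start and
end times as annotations on the ContentItem.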

Regards,
Suman Saurabh

PS: If anything I wrote is unclear, please let me know.

[1] https://github.com/sumansaurabh/Sphinx-Model
[2] https://github.com/sumansaurabh/SphinxModelProvider
[3] https://github.com/sumansaurabh/SpeechToTextEngine
