Hi Marco.

SpeechRTC was my first attempt with the platform. In early 2013 I did not
have enough knowledge of Gecko internals, and B2G itself was at a very early
stage (in the very beginning, Steven Lee needed to send me patches to make
getUserMedia (gUM) work properly), so the fastest path was to capture the
audio and stream it to a server. The great part is that Opus is pretty
efficient, and Node.js plus a speech server wrapping pocketsphinx made the
whole round trip really fast.

But I knew that was not ideal for command-and-control / grammar-based
recognition, so I started to research a direct port of pocketsphinx using
emscripten. It did work, but three reasons made me move to a full C++
version:

1) the whole Speech API frontend in Gecko was ready to roll, only waiting
for a backend, and this, as we know, was built in C++;

2) my tests ran very well, but on the Peak [2], for example, it performed
slower than on low-end devices running Android [3];

3) with emscripten, loading the model during the decoder's creation on each
reload ended up very slow, and I couldn't figure out how to keep the decoder
instance alive between tabs and reloads, while in C++ this happens only
once, thanks to Gecko's architecture.

On Oct 31, 2014 12:27 AM, "Marco Chen" <mc...@mozilla.com> wrote:

> Hi Andre,
>
> It is nice work, and I look forward to voice recognition on B2G.
>
> Besides this final result, I am also interested in the reasons you
> migrated from SpeechRTC -> emscripten -> Web Speech API.
> Could you also share what factors triggered these transitions? That
> could be a lesson learned for us.
>
> e.g.: SpeechRTC -> voice recognition can't be performed locally.
>       emscripten -> performance issue? or license issue? or ?
>
> Thanks,
> Sincerely yours.
>
> ------------------------------
> *From: *"Andre Natal" <ana...@gmail.com>
> *To: *dev-platform@lists.mozilla.org, "Sandip Kamat" <ska...@mozilla.com>,
> "Olli.Pettay" <opet...@mozilla.com>
> *Sent: *Friday, October 31, 2014 7:18:06 AM
> *Subject: *Intent to ship: Web Speech API - Speech Recognition with
> Pocketsphinx
>
> I've been researching speech recognition in Firefox for two years. First
> SpeechRTC, then emscripten, and now Web Speech API with CMU pocketsphinx
> [1] embedded in Gecko's C++ layer, a project that I was lucky enough to
> develop for Google Summer of Code with the mentoring of Olli Pettay,
> Guilherme Gonçalves, Steven Lee, Randell Jesup plus others, and under the
> management of Sandip Kamat.
>
> The implementation already works in B2G, Fennec and all FF desktop
> versions, and the first language supported will be English. The API and
> implementation conform to the W3C standard [2]. The preference to
> enable it is: media.webspeech.service.default = pocketsphinx
>
> The patches required to achieve this are:
>
>  - Import pocketsphinx sources in Gecko. Bug 1051146 [3]
>  - Embed English models. Bug 1065911 [4]
>  - Change SpeechGrammarList to store grammars inside SpeechGrammar objects.
> Bug 1088336 [5]
>  - Creation of a SpeechRecognitionService for Pocketsphinx. Bug 1051148 [6]
>
>
> There are also other important features that we don't have patches for yet:
>  - Relax the VAD strategy to be less strict and avoid stopping in the
> middle of speech when speaking low-volume phonemes [7]
>  - Integrate or develop a grapheme-to-phoneme algorithm for real-time
> generation when compiling grammars [8]
>  - Include and build models for other languages [9]
>  - Continuous and word-spotting recognition [10]
>
> The WIP repo is here [11], and this Air Mozilla video [12] plus this wiki
> [13] have more detailed info.
>
> In this comment [14] you can see the CPU usage on a Flame while
> recognition is happening.
>
> I look forward to hearing your comments.
>
> Thanks,
>
> Andre Natal
>
> [1] http://cmusphinx.sourceforge.net/
> [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
> [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146
> [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911
> [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336
> [6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148
> [7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604
> [8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554
> [9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and
> https://bugzilla.mozilla.org/show_bug.cgi?id=1051607
> [10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896
> [11] https://github.com/andrenatal/gecko-dev
> [12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/
> (jump to 12:00)
> [13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
> [14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
>
