On 5/17/10 4:05 PM, Bjorn Bringert wrote:
> Back in December there was a discussion about web APIs for speech
> recognition and synthesis that saw a decent amount of interest
> (http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-December/thread.html#24281).
> Based on that discussion, we would like to propose a simple API for
> speech recognition, using a new <input type="speech"> element. An
> informal spec of the new API, along with some sample apps and use
> cases, can be found at:
> http://docs.google.com/Doc?docid=0AaYxrITemjbxZGNmZzc5cHpfM2Ryajc5Zmhx&hl=en.
> It would be very helpful if you could take a look and share your
> comments. Our next steps will be to implement the current design, get
> some feedback from web developers, continue to tweak, and seek
> standardization as soon as it looks mature enough and/or other vendors
> become interested in implementing it.
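
(For context, basic usage under the draft looks roughly like this; the
grammar attribute and the speechchange event name are my reading of the
doc, so the exact names may be off:)

   <!-- Speak into the field; per the draft (as I read it), the top
        recognition result lands in .value and a speechchange event
        fires. -->
   <form action="/search">
     <input type="speech" name="q" grammar="builtin:search"
            onspeechchange="this.form.submit()">
   </form>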
After a quick read I like the proposal, in general.
A few comments, though.
- What should happen if, for example, a modal dialog like alert()
  blocks the page while recognition is in progress (see the first
  sketch after this list)?
  What happens to the events fired during that time?
  Or should recognition stop?
- What exactly are the grammars builtin:dictation and builtin:search?
  Especially the latter is not at all clear to me (see the grammar
  sketch after this list).
- When does recognitionState change? Before which events?
- It is not quite clear how SRGS (the W3C Speech Recognition Grammar
  Specification) works with <input type="speech">; the grammar sketch
  below shows what I am picturing.
- I believe there is no need for
  DOMImplementation.hasFeature("SpeechInput", "1.0"); the usual
  object-detection idiom (sketched below) would do.
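
On the first bullet, the scenario I have in mind, sketched (again
assuming the draft's speechchange event):

   var field = document.querySelector('input[type=speech]');
   field.addEventListener('speechchange', function () {
     // Assume .value holds the top recognition result.
     console.log('heard: ' + field.value);
   }, false);

   // The user has started recognition; now script blocks the
   // event loop with a modal dialog:
   alert('blocking dialog');
   // Are speechchange events that become ready while the dialog is
   // up queued and fired afterwards, silently dropped, or does the
   // dialog abort recognition? The draft should say.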
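
On the grammar bullets, this is what I picture; whether builtin:search
means a web-search-style language model, and whether the attribute also
accepts an SRGS URL, is exactly what the draft should spell out:

   <!-- The two built-in grammars named in the draft: -->
   <input type="speech" name="notes" grammar="builtin:dictation">
   <input type="speech" name="q" grammar="builtin:search">

   <!-- Hypothetical: an author-supplied SRGS (.grxml) grammar.
        How rule references, weights etc. would interact with the
        element is not defined, as far as I can tell. -->
   <input type="speech" name="topping"
          grammar="http://example.com/grammars/toppings.grxml">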
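
And instead of hasFeature, the usual object-detection idiom should be
enough, e.g.:

   // Detection sketch, assuming unsupported UAs fall back to
   // type=text the way they do for other unknown input types:
   var probe = document.createElement('input');
   probe.setAttribute('type', 'speech');
   var speechSupported = (probe.type === 'speech');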
And I think we really need to define something for TTS.
Not every web developer has a server for text -> <audio>.
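Today the only option is round-tripping through a server, along these
lines (the /tts endpoint here is made up); a client-side API would
remove that dependency:

   // Hypothetical workaround: a server endpoint that returns
   // synthesized audio for the given text.
   var audio = new Audio('/tts?text=' + encodeURIComponent('Hello world'));
   audio.play();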
-Olli