Re: Testing Expectations (was: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization)
On Thu, 12 Jan 2012 17:06:34 +0100, Doug Schepers schep...@w3.org wrote: As such, the creation of tests should not be left to CR... there should be a plan in place (e.g. a person, and a loose policy, like as we implement, we'll make tests and contribute them to the WG), and a person responsible for collecting and maintaining the tests (i.e. making sure that the tests are adapted to meet the changing spec). Indeed, we agreed recently that this would be something we require in any work the group takes on... So rather than warm fuzzies, we're really after a warm body to take responsibility for driving it. cheers -- Charles 'chaals' McCathieNevile Opera Software, Standards Group je parle français -- hablo español -- jeg kan litt norsk http://my.opera.com/chaals Try Opera: http://www.opera.com
Testing Expectations (was: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization)
Hi, folks- On 1/11/12 9:40 AM, Arthur Barstow wrote: On 1/10/12 11:25 AM, ext Glen Shires wrote: Per #4 Testing commitment(s): can you elaborate on what you would like to see at this point? At this point, I think a "warm fuzzy" like "if/when the spec advances to Candidate Recommendation, we will contribute to a test suite that is sufficient to exit the CR" would be useful. I agree with this general sentiment, but I'd like to offer a different priority for tests. Technically, the W3C Process does not require a test suite, but pragmatically, it's the best way to indicate conformance and implementability, and to promote interoperability. Modern expectations (i.e. in the last 4-6 years) about the specification process include early prototyping and implementation feedback well before the CR phase, with real-world webapps depending upon these early implementations, so we need interoperability fairly early on. The infrastructure and methodology for creating and maintaining tests (at W3C and in implementation projects) has improved dramatically in the last few years as well, so it's easier to do. As such, the creation of tests should not be left to CR... there should be a plan in place (e.g. a person, and a loose policy, like "as we implement, we'll make tests and contribute them to the WG"), and a person responsible for collecting and maintaining the tests (i.e. making sure that the tests are adapted to meet the changing spec). Regards- -Doug
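Doug's "make tests as we implement" point can be made concrete. The sketch below shows the flavor of an early conformance check a group might collect; the `SpeechRecognition` name, its attributes, and their defaults are placeholders standing in for whatever the spec eventually defines, not the proposal's actual IDL.

```javascript
// Stub standing in for a browser-provided recognizer. The interface name
// and attribute defaults are illustrative placeholders, not the exact IDL
// from the proposal under discussion.
function SpeechRecognition() {
  this.lang = "";          // recognition language; empty means UA default
  this.continuous = false; // one-shot recognition by default
}

// An early conformance-style check of the kind a WG test suite collects:
// constructing a recognizer succeeds and the defaults match the (assumed)
// spec text.
function checkRecognizerDefaults() {
  var r = new SpeechRecognition();
  if (typeof r.lang !== "string") return "fail: lang";
  if (r.continuous !== false) return "fail: continuous";
  return "pass";
}
```

Checks like this can be written incrementally as each feature lands and contributed to the group, which is essentially the "plan in place" Doug describes.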
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Olli Pettay olli.pet...@helsinki.fi, 2012-01-09 18:12 +0200: It doesn't matter too much to me in which group the API will be developed (except that I'm against doing it in HTML WG). WebApps is reasonably good place (if there won't be any IP issues.) Starting the work in a Community Group is another option to consider. A really good option, actually. It's certainly the quickest way to get it started and to get a W3C draft actually published, and the route that would entail the least amount of unnecessary process overhead. The work could later be graduated to, e.g., the WebApps WG if/when needed. --Mike -- Michael[tm] Smith http://people.w3.org/mike/+
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
It doesn't matter too much to me in which group the API will be developed (except that I'm against doing it in the HTML WG). WebApps is a reasonably good place (if there won't be any IP issues). Starting the work in a Community Group is another option to consider. A really good option, actually. It's certainly the quickest way to get it started and to get a W3C draft actually published, and the route that would entail the least amount of unnecessary process overhead. The work could later be graduated to, e.g., the WebApps WG if/when needed. The Community Groups [1] page says they are for "anyone to socialize their ideas for the Web at the W3C for possible future standardization". The HTML Speech Incubator Group has done a considerable amount of work and the final report [2] is quite detailed, with requirements, use cases and API proposals. Since we are interested in transitioning to the standards track now, working with the relevant WGs seems more appropriate than forming a new Community Group. [1] http://www.w3.org/community/about/#cg [2] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Satish S sat...@google.com, 2012-01-11 10:04 +: The Community Groups [1] page says they are for "anyone to socialize their ideas for the Web at the W3C for possible future standardization". I don't think that page adequately describes the potential value of the Community Group option. A CG can be used for much more than just socializing ideas for some hope of standardization someday. The HTML Speech Incubator Group has done a considerable amount of work and the final report [2] is quite detailed with requirements, use cases and API proposals. Since we are interested in transitioning to the standards track now, working with the relevant WGs seems more appropriate than forming a new Community Group. I can understand you seeing it that way, but I hope you can also understand me saying that I'm not at all sure it's more appropriate for this work. I think everybody could agree that the point is not just to produce a spec that is nominally on the W3C standards track. Having something on the W3C standards track doesn't necessarily do anything magical to ensure that anybody actually implements it. I think what we all want is for Web-platform technologies to actually get implemented across multiple browsers, interoperably -- preferably sooner rather than later. Starting from the WG option is not absolutely always the best way to cause that to happen. It is almost certainly not the best way to ensure it will get done more quickly. You can start up a CG and have the work formally going on within that CG in a matter of days, literally. In contrast, getting it going formally as a deliverable within a WG requires a matter of months. Among the things that are valuable about formal deliverables in WGs is that they get you RF commitments from participants in the WG.
But one thing that I think not everybody understands about CGs is that they also get you RF commitments from participants in the CG; everybody in the CG has to agree to the terms of the W3C Community Contributor License Agreement - http://www.w3.org/community/about/agreements/cla/ Excerpt: I agree to license my Essential Claims under the W3C CLA RF Licensing Requirements. This requirement includes Essential Claims that I own Anyway, despite what it may seem like from what I've said above, I'm not trying to do a hard sell here. It's up to you all what you choose to do. But I would like to help make sure you're making a fully informed decision based on what the actual benefits and costs of the different options are. --Mike [1] http://www.w3.org/community/about/#cg [2] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ -- Michael[tm] Smith http://people.w3.org/mike/+
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Hi Michael, Thanks for the info! On Wed, Jan 11, 2012 at 11:36 AM, Michael[tm] Smith m...@w3.org wrote: Satish S sat...@google.com, 2012-01-11 10:04 +: The Community Groups [1] page says they are for "anyone to socialize their ideas for the Web at the W3C for possible future standardization". I don't think that page adequately describes the potential value of the Community Group option. A CG can be used for much more than just socializing ideas for some hope of standardization someday. The HTML Speech Incubator Group has done a considerable amount of work and the final report [2] is quite detailed with requirements, use cases and API proposals. Since we are interested in transitioning to the standards track now, working with the relevant WGs seems more appropriate than forming a new Community Group. I can understand you seeing it that way, but I hope you can also understand me saying that I'm not at all sure it's more appropriate for this work. I think everybody could agree that the point is not just to produce a spec that is nominally on the W3C standards track. Having something on the W3C standards track doesn't necessarily do anything magical to ensure that anybody actually implements it. We have strong interest from Mozilla and Google to implement. Would this not be sufficient to have this API designed in this group? Thanks, Andrei I think what we all want is for Web-platform technologies to actually get implemented across multiple browsers, interoperably -- preferably sooner rather than later. Starting from the WG option is not absolutely always the best way to cause that to happen. It is almost certainly not the best way to ensure it will get done more quickly. You can start up a CG and have the work formally going on within that CG in a matter of days, literally. In contrast, getting it going formally as a deliverable within a WG requires a matter of months.
Among the things that are valuable about formal deliverables in WGs is that they get you RF commitments from participants in the WG. But one thing that I think not everybody understands about CGs is that they also get you RF commitments from participants in the CG; everybody in the CG has to agree to the terms of the W3C Community Contributor License Agreement - http://www.w3.org/community/about/agreements/cla/ Excerpt: I agree to license my Essential Claims under the W3C CLA RF Licensing Requirements. This requirement includes Essential Claims that I own Anyway, despite what it may seem like from what I've said above, I'm not trying to do a hard sell here. It's up to you all what you choose to do. But I would like to help make sure you're making a fully informed decision based on what the actual benefits and costs of the different options are. --Mike [1] http://www.w3.org/community/about/#cg [2] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ -- Michael[tm] Smith http://people.w3.org/mike/+
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Michael[tm] Smith m...@w3.org, 2012-01-11 20:36 +0900: Satish S sat...@google.com, 2012-01-11 10:04 +: The Community Groups [1] page says they are for "anyone to socialize their ideas for the Web at the W3C for possible future standardization". I don't think that page adequately describes the potential value of the Community Group option. A CG can be used for much more than just socializing ideas for some hope of standardization someday. The HTML Speech Incubator Group has done a considerable amount of work and the final report [2] is quite detailed with requirements, use cases and API proposals. Since we are interested in transitioning to the standards track now, working with the relevant WGs seems more appropriate than forming a new Community Group. Another data point to consider: we have a precedent of a CG that's already far along with work on a spec that has multiple implementations: the Web Media Text Tracks CG, which is working on the WebVTT format for text tracks (captions, subtitles, etc.) for HTML video: http://www.w3.org/community/texttracks/ They're well beyond the stage of documenting use cases and requirements and providing proposals; they already have a complete spec: http://dev.w3.org/html5/webvtt/ And the WebVTT spec is already implemented in IE10 and partially in WebKit, with active implementation work continuing - http://msdn.microsoft.com/en-us/library/hh673566.aspx#WebVTT https://bugs.webkit.org/showdependencytree.cgi?id=43668&hide_resolved=1 That CG was started only a little over 3 months ago. So it is in fact possible for a CG to be producing work that's actually already getting actively implemented in current browsers. --Mike -- Michael[tm] Smith http://people.w3.org/mike/+
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
On Wed, 11 Jan 2012 22:36:28 +1100, Michael[tm] Smith m...@w3.org wrote: Satish S sat...@google.com, 2012-01-11 10:04 +: The Community Groups [1] page says they are for "anyone to socialize their ideas for the Web at the W3C for possible future standardization". I don't think that page adequately describes the potential value of the Community Group option. A CG can be used for much more than just socializing ideas for some hope of standardization someday. The HTML Speech Incubator Group has done a considerable amount of work and the final report [2] is quite detailed with requirements, use cases and API proposals. Since we are interested in transitioning to the standards track now, working with the relevant WGs seems more appropriate than forming a new Community Group. I can understand you seeing it that way, but I hope you can also understand me saying that I'm not at all sure it's more appropriate for this work. And I hope you all understand me saying that I think it is indeed more appropriate to move it to a formal working group, for reasons explained below... I think everybody could agree that the point is not just to produce a spec that is nominally on the W3C standards track. Having something on the W3C standards track doesn't necessarily do anything magical to ensure that anybody actually implements it. Indeed. But the same goes for a community group. Implementation commitment doesn't come from people writing a spec. I think what we all want is for Web-platform technologies to actually get implemented across multiple browsers, interoperably -- preferably sooner rather than later. Starting from the WG option is not absolutely always the best way to cause that to happen. It is almost certainly not the best way to ensure it will get done more quickly. Actually, I don't think that what kind of group the work happens in is relevant one way or another to how fast it gets implemented - and not very relevant to the rate of developing the spec.
You can start up a CG and have the work formally going on within that CG in a matter of days, literally. In contrast, getting it going formally as a deliverable within a WG requires a matter of months. In the general case this is true. But *starting* work is easy - as Mike said above the goal is to get stuff interoperably implemented, in other words, *finished*. And the startup time only has an impact on the finish time in very trivial cases. Among the things that are valuable about formal deliverables in WGs is that they get you RF commitments from participants in the WG. But one thing that I think not everybody understands about CGs is that they also get you RF commitments from participants in the CG; everybody in the CG has to agree to the terms of the W3C Community Contributor License Agreement - http://www.w3.org/community/about/agreements/cla/ Excerpt: I agree to license my Essential Claims under the W3C CLA RF Licensing Requirements. This requirement includes Essential Claims that I own There are important differences in what WGs and CGs offer, and each has both advantages and disadvantages in terms of the overall level of protection offered. A fair criticism of the process applied to HTML5 is that the editor claims to accept input from the working group, plus the WHAT-WG (whose participants have made no commitment on patents at all) plus anything he reads in email, blogs, the side of milk cartons, etc. There is a theoretical risk that he will read something placed in front of him by someone who has avoided joining the WG (and therefore makes no patent commitment) and introduce it into the spec not knowing it carries a patent liability. I think that in practice this is unlikely to be a real problem for HTML - but that doesn't mean it is unlikely to be a real problem for any Web technology. 
In particular, I think that the work being proposed here would benefit from being in a real working group - either the Voice WG or the Web Apps WG seem like sensible candidate groups, a priori. Web Apps has the benefit that we are in the middle of the rechartering process, so adding deliverables is as painless now as it can ever be (and the truth is that this doesn't mean trivial - broad patent licensing doesn't always come without some effort, which is why it is considered valuable). Anyway, despite what it may seem like from what I've said above, I'm not trying to do a hard sell here. It's up to you all what you choose to do. But I would like to help make sure you're making a fully informed decision based on what the actual benefits and costs of the different options are. Indeed. cheers Chaals -- Charles 'chaals' McCathieNevile Opera Software, Standards Group je parle français -- hablo español -- jeg kan litt norsk http://my.opera.com/chaals Try Opera: http://www.opera.com
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
On 1/10/12 11:25 AM, ext Glen Shires wrote: Per #4 Testing commitment(s): can you elaborate on what you would like to see at this point? At this point, I think a "warm fuzzy" like "if/when the spec advances to Candidate Recommendation, we will contribute to a test suite that is sufficient to exit the CR" would be useful. Also, what is the next step? WRT the API you proposed, I think we have enough preliminary feedback for me to start a CfC to add the API to WebApps charter. My only concern is the open question (at least to me) re the markup part. It seems like it would be useful to review the proposed API and markup together. However, a CfC for the markup can be done separately (provided sufficient interest/commitment is expressed). If I don't see any objection from Chaals or Doug, today or tomorrow I'll start a CfC for the API proposal. -AB
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Per #4 Testing commitment(s): can you elaborate on what you would like to see at this point? At this point, I think a "warm fuzzy" like "if/when the spec advances to Candidate Recommendation, we will contribute to a test suite that is sufficient to exit the CR" would be useful. Yes, we will contribute to a test suite that is sufficient for the Candidate Recommendation. Also, what is the next step? WRT the API you proposed, I think we have enough preliminary feedback for me to start a CfC to add the API to WebApps charter. My only concern is the open question (at least to me) re the markup part. It seems like it would be useful to review the proposed API and markup together. However, a CfC for the markup can be done separately (provided sufficient interest/commitment is expressed). In the spirit of starting with the basics and iterating, we did not include markup in the proposed API. Markup support also renders cleanly as a layer on top of the JS API with few additions, so as you suggest, if there is sufficient interest/commitment a separate CfC could be done.
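Satish's claim that markup renders cleanly as a layer on top of the JS API can be sketched. Everything below is hypothetical: `bindRecognition`, the mock element, and the mock recognizer are illustrative stand-ins rather than anything from the proposal or the XG's reco element definition; in a browser the recognizer would come from the platform and the element from the DOM.

```javascript
// Hypothetical sketch of a reco-style markup layer over a JS recognition
// API: a binding helper wires a recognizer's result to a form field, which
// is essentially what automatic value binding in markup would do.
function bindRecognition(inputEl, recognizer) {
  recognizer.onresult = function (transcript) {
    inputEl.value = transcript; // fill the bound field with the transcript
  };
  recognizer.start();
}

// Mock recognizer that synchronously "hears" a fixed phrase on start();
// a real one would capture audio and fire onresult asynchronously.
var mockRecognizer = {
  onresult: null,
  start: function () { this.onresult("hello world"); }
};

var field = { value: "" }; // stand-in for an <input> element
bindRecognition(field, mockRecognizer);
// field.value is now "hello world"
```

The point of the sketch is only that the markup layer adds binding logic, not new speech capability, which is why it can be chartered separately from the JS API.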
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Art, Per #2 Editor commitment(s): we confirm that Bjorn Bringert, Satish Sampath and Glen Shires volunteer as editors. If others would like to help, we welcome them. Per #4 Testing commitment(s): can you elaborate on what you would like to see at this point? Also, what is the next step? On Mon, Jan 9, 2012 at 8:12 AM, Olli Pettay olli.pet...@helsinki.fi wrote: On 01/09/2012 04:59 PM, Arthur Barstow wrote: Hi All, As I indicated in [1], WebApps already has a relatively large number of specs in progress and the group has agreed to add some new specs. As such, to review any new charter addition proposals, I think we need at least the following: 1. Relatively clear scope of the feature(s). (This information should be detailed enough for WG members with relevant IP to be able to make an IP assessment.) 2. Editor commitment(s) 3. Implementation commitments from at least two WG members Is this really a requirement nowadays? Is there, for example, a commitment to implement the File System API? http://dev.w3.org/2009/dap/file-system/file-dir-sys.html But anyway, I'm interested to implement the speech API, and as far as I know, also other people involved with Mozilla have shown interest. 4. Testing commitment(s) Re the APIs in this thread - I think Glen's API proposal [2] adequately addresses #1 above and his previous responses imply support for #2 but it would be good for Glen, et al. to confirm. Re #3, other than Google, I don't believe any other implementor has voiced their support for WebApps adding these APIs. As such, I think we need additional input on implementation support (e.g. Apple, Microsoft, Mozilla, Opera, etc.). It doesn't matter too much to me in which group the API will be developed (except that I'm against doing it in the HTML WG). WebApps is a reasonably good place (if there won't be any IP issues.) -Olli Re the markup question - WebApps does have some precedent for defining markup (e.g.
XBL2, Widget XML config). I don't have a strong opinion on whether or not WebApps should include the type of markup in the XG Report. I think the next step here is for WG members to submit comments on this question. In particular, proponents of including markup in WebApps' charter should respond to #1-4 above. -AB [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1474.html [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html On 1/5/12 6:49 AM, ext Satish S wrote: 2) How does the draft incorporate with the existing input speech API[1]? It seems to me as if it'd be best to define both the attribute as the DOM APIs in a single specification, also because they share several events (yet don't seem to be interchangeable) and the attribute already has an implementation. The input speech API proposal was implemented as <input x-webkit-speech> in Chromium a while ago. A lot of the developer feedback we received was about finer-grained control, including a JavaScript API and letting the web application decide how to present the user interface rather than tying it to the input element. The HTML Speech Incubator Group's final report [1] includes a reco element which addresses both these concerns and provides automatic binding of speech recognition results to existing HTML elements. We are not sure if the WebApps WG is a good place to work on standardising such markup elements, hence we did not include it in the simplified Javascript API [2]. If there is sufficient interest and scope in the WebApps WG charter for the Javascript API and markup, we are happy to combine them both in the proposal.
[1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Thanks, Peter [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html On Thu, Jan 5, 2012 at 07:15, Glen Shires gshi...@google.com wrote: As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Hi All, As I indicated in [1], WebApps already has a relatively large number of specs in progress and the group has agreed to add some new specs. As such, to review any new charter addition proposals, I think we need at least the following: 1. Relatively clear scope of the feature(s). (This information should be detailed enough for WG members with relevant IP to be able to make an IP assessment.) 2. Editor commitment(s) 3. Implementation commitments from at least two WG members 4. Testing commitment(s) Re the APIs in this thread - I think Glen's API proposal [2] adequately addresses #1 above and his previous responses imply support for #2 but it would be good for Glen, et al. to confirm. Re #3, other than Google, I don't believe any other implementor has voiced their support for WebApps adding these APIs. As such, I think we need additional input on implementation support (e.g. Apple, Microsoft, Mozilla, Opera, etc.). Re the markup question - WebApps does have some precedent for defining markup (e.g. XBL2, Widget XML config). I don't have a strong opinion on whether or not WebApps should include the type of markup in the XG Report. I think the next step here is for WG members to submit comments on this question. In particular, proponents of including markup in WebApps' charter should respond to #1-4 above. -AB [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1474.html [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html On 1/5/12 6:49 AM, ext Satish S wrote: 2) How does the draft incorporate with the existing input speech API[1]? It seems to me as if it'd be best to define both the attribute as the DOM APIs in a single specification, also because they share several events (yet don't seem to be interchangeable) and the attribute already has an implementation. The input speech API proposal was implemented as <input x-webkit-speech> in Chromium a while ago.
A lot of the developer feedback we received was about finer-grained control, including a JavaScript API and letting the web application decide how to present the user interface rather than tying it to the input element. The HTML Speech Incubator Group's final report [1] includes a reco element which addresses both these concerns and provides automatic binding of speech recognition results to existing HTML elements. We are not sure if the WebApps WG is a good place to work on standardising such markup elements, hence we did not include it in the simplified Javascript API [2]. If there is sufficient interest and scope in the WebApps WG charter for the Javascript API and markup, we are happy to combine them both in the proposal. [1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Thanks, Peter [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html On Thu, Jan 5, 2012 at 07:15, Glen Shires gshi...@google.com wrote: As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space. Because that work was so broad, Art Barstow asked (below) for a relatively specific proposal. We at Google are proposing that a subset of it be accepted as a work item by the Web Applications WG. Specifically, we are proposing this Javascript API [3], which enables web developers to incorporate speech recognition and synthesis into their web pages.
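The shape such scripted speech use might take can be sketched. The `TextToSpeech` and `MockRecognition` objects below are mock stand-ins written for illustration only; they are not the interfaces defined in the proposal [3], and a real recognizer would capture audio asynchronously rather than returning a canned result.

```javascript
// Mock stand-ins sketching the usage shape of a scripted speech API; the
// names and members here are illustrative, not the proposal's IDL.
function TextToSpeech() { this.spoken = []; }
TextToSpeech.prototype.speak = function (text) { this.spoken.push(text); };

function MockRecognition() { this.onresult = null; }
MockRecognition.prototype.start = function () {
  // A real recognizer would capture audio; this one fires a canned result.
  this.onresult({ transcript: "search for restaurants" });
};

// Typical flow: recognize a phrase, let the app act on it, confirm via TTS.
var tts = new TextToSpeech();
var reco = new MockRecognition();
reco.onresult = function (event) {
  tts.speak("You said: " + event.transcript);
};
reco.start();
// tts.spoken is now ["You said: search for restaurants"]
```

This is the developer-feedback point in a nutshell: the application, not the input element, decides when recognition starts and what happens with the result.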
This simplified subset enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control, and it supports the majority of use-cases in the Incubator Group's Final Report. We welcome your feedback and ask that the Web Applications WG consider accepting this Javascript API [3] as a work item. [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter [2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Bjorn Bringert Satish Sampath Glen Shires On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires gshi...@google.com wrote: Milan, The IDLs contained
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
On 01/09/2012 04:59 PM, Arthur Barstow wrote: Hi All, As I indicated in [1], WebApps already has a relatively large number of specs in progress and the group has agreed to add some new specs. As such, to review any new charter addition proposals, I think we need at least the following: 1. Relatively clear scope of the feature(s). (This information should be detailed enough for WG members with relevant IP to be able to make an IP assessment.) 2. Editor commitment(s) 3. Implementation commitments from at least two WG members Is this really a requirement nowadays? Is there, for example, a commitment to implement the File System API? http://dev.w3.org/2009/dap/file-system/file-dir-sys.html But anyway, I'm interested to implement the speech API, and as far as I know, also other people involved with Mozilla have shown interest. 4. Testing commitment(s) Re the APIs in this thread - I think Glen's API proposal [2] adequately addresses #1 above and his previous responses imply support for #2 but it would be good for Glen, et al. to confirm. Re #3, other than Google, I don't believe any other implementor has voiced their support for WebApps adding these APIs. As such, I think we need additional input on implementation support (e.g. Apple, Microsoft, Mozilla, Opera, etc.). It doesn't matter too much to me in which group the API will be developed (except that I'm against doing it in the HTML WG). WebApps is a reasonably good place (if there won't be any IP issues.) -Olli Re the markup question - WebApps does have some precedent for defining markup (e.g. XBL2, Widget XML config). I don't have a strong opinion on whether or not WebApps should include the type of markup in the XG Report. I think the next step here is for WG members to submit comments on this question. In particular, proponents of including markup in WebApps' charter should respond to #1-4 above.
-AB [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1474.html [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html On 1/5/12 6:49 AM, ext Satish S wrote: 2) How does the draft incorporate with the existing input speech API[1]? It seems to me as if it'd be best to define both the attribute as the DOM APIs in a single specification, also because they share several events (yet don't seem to be interchangeable) and the attribute already has an implementation. The input speech API proposal was implemented as <input x-webkit-speech> in Chromium a while ago. A lot of the developer feedback we received was about finer-grained control, including a JavaScript API and letting the web application decide how to present the user interface rather than tying it to the input element. The HTML Speech Incubator Group's final report [1] includes a reco element which addresses both these concerns and provides automatic binding of speech recognition results to existing HTML elements. We are not sure if the WebApps WG is a good place to work on standardising such markup elements, hence we did not include it in the simplified Javascript API [2]. If there is sufficient interest and scope in the WebApps WG charter for the Javascript API and markup, we are happy to combine them both in the proposal. [1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Thanks, Peter [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html On Thu, Jan 5, 2012 at 07:15, Glen Shires gshi...@google.com wrote: As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report.
[2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space. Because that work was so broad, Art Barstow asked (below) for a relatively specific proposal. We at Google are proposing that a subset of it be accepted as a work item by the Web Applications WG. Specifically, we are proposing this Javascript API [3], which enables web developers to incorporate speech recognition and synthesis into their web pages. This simplified subset enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control, and it supports the majority of use-cases in the Incubator Group's Final Report. We welcome your feedback and ask that the Web Applications WG consider accepting this Javascript API [3] as a work item. [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter [2] report:
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
On 01/09/2012 06:17 PM, Young, Milan wrote: To clarify, are you interested in developing the entirety of the JS API we developed in the HTML Speech XG, or just the subset proposed by Google? Not sure if you sent the reply to me only on purpose. CCing the WG and XG lists. Since, from a practical point of view, the API+protocol the XG defined is a huge thing to implement at once, it makes sense to implement it in pieces. Something like: (1) Initial API implementation. Some subset of what the XG defined. Not necessarily exactly what Google proposed, but something close to it. Support for remote speech services could be in the initial API, but if the UA doesn't implement the protocol, it would just fail when trying to connect to remote services. (2) Simultaneously or later - depending on the protocol standardization in IETF or elsewhere - support remote speech services. (3) Implement some more of the API the XG defined (if needed by web developers or web services). (4) Implement reco? I'm not at all convinced we need the reco element, since automatic value binding makes it just a bit strange and inconsistent. This is the way web APIs tend to evolve: implement first something quite small, and then add new features if/when needed. -Olli Thanks -Original Message- From: Olli Pettay [mailto:olli.pet...@helsinki.fi] Sent: Monday, January 09, 2012 8:13 AM To: Arthur Barstow Cc: ext Satish S; Peter Beverloo; Glen Shires; public-webapps@w3.org; public-xg-htmlspe...@w3.org; Dan Burnett Subject: Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization On 01/09/2012 04:59 PM, Arthur Barstow wrote: Hi All, As I indicated in [1], WebApps already has a relatively large number of specs in progress and the group has agreed to add some new specs. As such, to review any new charter addition proposals, I think we need at least the following: 1. Relatively clear scope of the feature(s). 
(This information should be detailed enough for WG members with relevant IP to be able to make an IP assessment.) 2. Editor commitment(s) 3. Implementation commitments from at least two WG members Is this really a requirement nowadays? Is there, for example, a commitment to implement the File System API? http://dev.w3.org/2009/dap/file-system/file-dir-sys.html But anyway, I'm interested in implementing the speech API, and as far as I know, other people involved with Mozilla have also shown interest. 4. Testing commitment(s) Re the APIs in this thread - I think Glen's API proposal [2] adequately addresses #1 above and his previous responses imply support for #2, but it would be good for Glen, et al. to confirm. Re #3, other than Google, I don't believe any other implementor has voiced their support for WebApps adding these APIs. As such, I think we need additional input on implementation support (e.g. Apple, Microsoft, Mozilla, Opera, etc.). It doesn't matter too much to me in which group the API will be developed (except that I'm against doing it in the HTML WG). WebApps is a reasonably good place (if there won't be any IP issues). -Olli Re the markup question - WebApps does have some precedent for defining markup (e.g. XBL2, Widget XML config). I don't have a strong opinion on whether or not WebApps should include the type of markup in the XG Report. I think the next step here is for WG members to submit comments on this question. In particular, proponents of including markup in WebApps' charter should respond to #1-4 above. -AB [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1474.html [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html On 1/5/12 6:49 AM, ext Satish S wrote: 2) How does the draft incorporate with the existing input speech API [1]? 
It seems to me as if it'd be best to define both the attribute and the DOM APIs in a single specification, also because they share several events (yet don't seem to be interchangeable) and the attribute already has an implementation. The input speech API proposal was implemented as input x-webkit-speech in Chromium a while ago. A lot of the developer feedback we received was about finer-grained control, including a Javascript API and letting the web application decide how to present the user interface rather than tying it to the input element. The HTML Speech Incubator Group's final report [1] includes a reco element which addresses both these concerns and provides automatic binding of speech recognition results to existing HTML elements. We are not sure if the WebApps WG is a good place to work on standardising such markup elements, hence did not include it in the simplified Javascript API [2]. If there is sufficient interest and scope in the WebApps WG charter for the Javascript API and markup, we are happy to combine them both in the proposal. [1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [2] http://lists.w3.org
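The finer-grained control Satish describes is easier to see in code. The sketch below is illustrative only: Recognizer is a stand-in stub, not the draft's actual SpeechReco interface, and the event shape (results, transcript) is an assumption loosely modeled on the XG report. What it shows is the pattern developers asked for: the page, not the browser, decides where results go and how the UI behaves, instead of being tied to the input element's built-in binding.

```javascript
// Stub recognizer: a stand-in for the draft's SpeechReco interface
// (hypothetical shape; a real engine would capture and recognize audio).
class Recognizer {
  constructor() { this.onresult = null; }
  start() {
    // The stub emits a canned result instead of doing real recognition.
    if (this.onresult) {
      this.onresult({ results: [{ transcript: "hello world", confidence: 0.9 }] });
    }
  }
}

// Application-controlled binding: the page routes the top hypothesis
// wherever it likes -- a form field here, but it could equally drive
// continuous dictation or a command-and-control handler.
const field = { value: "" };   // stand-in for an <input> element
const reco = new Recognizer();
reco.onresult = (event) => {
  field.value = event.results[0].transcript;
};
reco.start();
console.log(field.value);      // "hello world"
```

The reco element's automatic binding covers the simple form-fill case without script; the script-driven shape above is what covers everything else.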
RE: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
The HTML Speech XG worked for over a year prioritizing use cases against timelines and packaged all of that into a recommendation complete with IDLs and examples. So while I understand that WebApps may not have the time to review the entirety of this work, it's hard to see how dissecting it would speed the process of understanding. Perhaps a better approach would be to find half an hour to present to select members of WebApps the content of the recommendation and the possible relevance to their group. Does that sound reasonable? Thanks From: Glen Shires [mailto:gshi...@google.com] Sent: Wednesday, January 04, 2012 11:15 PM To: public-webapps@w3.org Cc: public-xg-htmlspe...@w3.org; Arthur Barstow; Dan Burnett Subject: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space. Because that work was so broad, Art Barstow asked (below) for a relatively specific proposal. We at Google are proposing that a subset of it be accepted as a work item by the Web Applications WG. Specifically, we are proposing this Javascript API [3], which enables web developers to incorporate speech recognition and synthesis into their web pages. This simplified subset enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control, and it supports the majority of use-cases in the Incubator Group's Final Report. 
We welcome your feedback and ask that the Web Applications WG consider accepting this Javascript API [3] as a work item. [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter [2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Bjorn Bringert Satish Sampath Glen Shires On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires gshi...@google.com wrote: Milan, The IDLs contained in both documents are in the same format and order, so it's relatively easy to compare the two side-by-side (http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#speechreco-section and http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html#api_description). The semantics of the attributes, methods and events have not changed, and both IDLs link directly to the definitions contained in the Speech XG Final Report. As you mention, we agree that the protocol portions of the Speech XG Final Report are most appropriate for consideration by a group such as the IETF, and believe such work can proceed independently, particularly because the Speech XG Final Report has provided a roadmap for these to remain compatible. Also, as shown in the Speech XG Final Report - Overview (http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#introductory), the Speech Web API is not dependent on the Speech Protocol, and a Default Speech service can be used for local or remote speech recognition and synthesis. Glen Shires On Thu, Dec 22, 2011 at 10:32 AM, Young, Milan milan.yo...@nuance.com wrote: Hello Glen, The proposal says that it contains a simplified subset of the JavaScript API. Could you please clarify which elements of the HTMLSpeech recommendation's JavaScript API were omitted? I think this would be the most efficient way for those of us familiar with the XG recommendation to evaluate the new proposal. 
I'd also appreciate clarification on how you see the protocol being handled. In the HTMLSpeech group we were thinking about this as a hand-in-hand relationship between W3C and IETF like WebSockets. Is this still your (and Google's) vision? Thanks From: Glen Shires [mailto:gshi...@google.com] Sent: Thursday, December 22, 2011 11:14 AM To: public-webapps@w3.org; Arthur Barstow Cc: public-xg-htmlspe...@w3.org; Dan Burnett Subject: Re: HTML Speech XG Completes, seeks feedback for eventual standardization We at Google believe that a scripting-only (Javascript) subset of the API defined in the Speech XG Incubator Group Final Report is of appropriate scope for consideration by the WebApps WG. The enclosed scripting-only subset supports the majority of the use-cases and samples in the XG proposal. Specifically, it enables web-pages to generate speech output and to use speech recognition as an input for forms, continuous dictation and control. The Javascript API will allow
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
Hi Glen et al., I'd like to share two pieces of feedback which came to mind when reading through the unofficial draft. 1) The primary interfaces are abbreviated as TTS and SpeechReco. Personally I believe it'd be clearer for authors if these were defined as TextToSpeech and SpeechRecognition. TTS may not be directly obvious for those who have no experience with similar systems, whereas cutting off in the middle of Reco|gnition just seems a bit odd. Is the only benefit a shorter name, at the cost of clarity? 2) How does the draft incorporate with the existing input speech API [1]? It seems to me as if it'd be best to define both the attribute and the DOM APIs in a single specification, also because they share several events (yet don't seem to be interchangeable) and the attribute already has an implementation. Thanks, Peter [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html On Thu, Jan 5, 2012 at 07:15, Glen Shires gshi...@google.com wrote: As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space. Because that work was so broad, Art Barstow asked (below) for a relatively specific proposal. We at Google are proposing that a subset of it be accepted as a work item by the Web Applications WG. Specifically, we are proposing this Javascript API [3], which enables web developers to incorporate speech recognition and synthesis into their web pages. 
This simplified subset enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control, and it supports the majority of use-cases in the Incubator Group's Final Report. We welcome your feedback and ask that the Web Applications WG consider accepting this Javascript API [3] as a work item. [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter [2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Bjorn Bringert Satish Sampath Glen Shires On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires gshi...@google.com wrote: Milan, The IDLs contained in both documents are in the same format and order, so it's relatively easy to compare the two side-by-side. The semantics of the attributes, methods and events have not changed, and both IDLs link directly to the definitions contained in the Speech XG Final Report. As you mention, we agree that the protocol portions of the Speech XG Final Report are most appropriate for consideration by a group such as IETF, and believe such work can proceed independently, particularly because the Speech XG Final Report has provided a roadmap for these to remain compatible. Also, as shown in the Speech XG Final Report - Overview, the Speech Web API is not dependent on the Speech Protocol and a Default Speech service can be used for local or remote speech recognition and synthesis. Glen Shires On Thu, Dec 22, 2011 at 10:32 AM, Young, Milan milan.yo...@nuance.com wrote: Hello Glen, The proposal says that it contains a “simplified subset of the JavaScript API”. Could you please clarify which elements of the HTMLSpeech recommendation’s JavaScript API were omitted? I think this would be the most efficient way for those of us familiar with the XG recommendation to evaluate the new proposal. 
I’d also appreciate clarification on how you see the protocol being handled. In the HTMLSpeech group we were thinking about this as a hand-in-hand relationship between W3C and IETF like WebSockets. Is this still your (and Google’s) vision? Thanks From: Glen Shires [mailto:gshi...@google.com] Sent: Thursday, December 22, 2011 11:14 AM To: public-webapps@w3.org; Arthur Barstow Cc: public-xg-htmlspe...@w3.org; Dan Burnett Subject: Re: HTML Speech XG Completes, seeks feedback for eventual standardization We at Google believe that a scripting-only (Javascript) subset of the API defined in the Speech XG Incubator Group Final Report is of appropriate scope for consideration by the WebApps WG. The enclosed scripting-only subset supports the majority of the use-cases and samples in the XG proposal. Specifically, it enables web-pages to generate speech output and to use speech recognition as an input for forms, continuous dictation and control. The Javascript API will allow web pages to control activation and timing and to handle results and
Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
2) How does the draft incorporate with the existing input speech API [1]? It seems to me as if it'd be best to define both the attribute and the DOM APIs in a single specification, also because they share several events (yet don't seem to be interchangeable) and the attribute already has an implementation. The input speech API proposal was implemented as input x-webkit-speech in Chromium a while ago. A lot of the developer feedback we received was about finer-grained control, including a Javascript API and letting the web application decide how to present the user interface rather than tying it to the input element. The HTML Speech Incubator Group's final report [1] includes a reco element which addresses both these concerns and provides automatic binding of speech recognition results to existing HTML elements. We are not sure if the WebApps WG is a good place to work on standardising such markup elements, hence did not include it in the simplified Javascript API [2]. If there is sufficient interest and scope in the WebApps WG charter for the Javascript API and markup, we are happy to combine them both in the proposal. [1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Thanks, Peter [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html On Thu, Jan 5, 2012 at 07:15, Glen Shires gshi...@google.com wrote: As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space. 
Because that work was so broad, Art Barstow asked (below) for a relatively specific proposal. We at Google are proposing that a subset of it be accepted as a work item by the Web Applications WG. Specifically, we are proposing this Javascript API [3], which enables web developers to incorporate speech recognition and synthesis into their web pages. This simplified subset enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control, and it supports the majority of use-cases in the Incubator Group's Final Report. We welcome your feedback and ask that the Web Applications WG consider accepting this Javascript API [3] as a work item. [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter [2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Bjorn Bringert Satish Sampath Glen Shires On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires gshi...@google.com wrote: Milan, The IDLs contained in both documents are in the same format and order, so it's relatively easy to compare the two side-by-side. The semantics of the attributes, methods and events have not changed, and both IDLs link directly to the definitions contained in the Speech XG Final Report. As you mention, we agree that the protocol portions of the Speech XG Final Report are most appropriate for consideration by a group such as IETF, and believe such work can proceed independently, particularly because the Speech XG Final Report has provided a roadmap for these to remain compatible. Also, as shown in the Speech XG Final Report - Overview, the Speech Web API is not dependent on the Speech Protocol and a Default Speech service can be used for local or remote speech recognition and synthesis. 
Glen Shires On Thu, Dec 22, 2011 at 10:32 AM, Young, Milan milan.yo...@nuance.com wrote: Hello Glen, The proposal says that it contains a “simplified subset of the JavaScript API”. Could you please clarify which elements of the HTMLSpeech recommendation’s JavaScript API were omitted? I think this would be the most efficient way for those of us familiar with the XG recommendation to evaluate the new proposal. I’d also appreciate clarification on how you see the protocol being handled. In the HTMLSpeech group we were thinking about this as a hand-in-hand relationship between W3C and IETF like WebSockets. Is this still your (and Google’s) vision? Thanks From: Glen Shires [mailto:gshi...@google.com] Sent: Thursday, December 22, 2011 11:14 AM To: public-webapps@w3.org; Arthur Barstow Cc: public-xg-htmlspe...@w3.org; Dan Burnett Subject: Re: HTML Speech XG Completes, seeks feedback for eventual standardization
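Since the scripting-only subset is expected to ship incrementally across user agents (and possibly under vendor prefixes), pages would likely feature-detect before relying on it. The sketch below illustrates that progressive-enhancement pattern; the constructor names (TTS, webkitTTS) are assumptions echoing the XG draft's naming, not a confirmed API, and the text/play members are likewise hypothetical.

```javascript
// Feature detection for a hypothetical TTS constructor. Names follow
// the XG draft loosely ("TTS"); real UAs might prefix or rename them.
// Pages without speech support degrade gracefully instead of throwing.
function speak(global, text) {
  const Tts = global.TTS || global.webkitTTS;  // assumed names
  if (!Tts) return "tts-unavailable";          // no speech support: fall back
  const tts = new Tts();
  tts.text = text;                             // hypothetical attribute
  tts.play();                                  // hypothetical method
  return "spoke";
}

// With an empty global (no speech support), the page degrades cleanly:
console.log(speak({}, "Hello"));               // "tts-unavailable"

// With a stubbed implementation present, synthesis is invoked:
class FakeTTS { play() { /* would start audio output */ } }
console.log(speak({ TTS: FakeTTS }, "Hello")); // "spoke"
```

This detect-then-fallback shape is also what makes Olli's phased rollout workable: early UAs can ship the small subset first and pages adapt to whatever is present.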
Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization
As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently wrapped up its work on use cases, requirements, and proposals for adding automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work of the group is documented in the group's Final Report. [2] The members of the group intend this work to be input to one or more working groups, in W3C and/or other standards development organizations such as the IETF, as an aid to developing full standards in this space. Because that work was so broad, Art Barstow asked (below) for a relatively specific proposal. We at Google are proposing that a subset of it be accepted as a work item by the Web Applications WG. Specifically, we are proposing this Javascript API [3], which enables web developers to incorporate speech recognition and synthesis into their web pages. This simplified subset enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control, and it supports the majority of use-cases in the Incubator Group's Final Report. We welcome your feedback and ask that the Web Applications WG consider accepting this Javascript API [3] as a work item. [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter [2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/ [3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html Bjorn Bringert Satish Sampath Glen Shires On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires gshi...@google.com wrote: Milan, The IDLs contained in both documents are in the same format and order, so it's relatively easy to compare the two side-by-side (http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#speechreco-section and http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html#api_description). 
The semantics of the attributes, methods and events have not changed, and both IDLs link directly to the definitions contained in the Speech XG Final Report. As you mention, we agree that the protocol portions of the Speech XG Final Report are most appropriate for consideration by a group such as the IETF, and believe such work can proceed independently, particularly because the Speech XG Final Report has provided a roadmap for these to remain compatible. Also, as shown in the Speech XG Final Report - Overview (http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#introductory), the Speech Web API is not dependent on the Speech Protocol, and a Default Speech service can be used for local or remote speech recognition and synthesis. Glen Shires On Thu, Dec 22, 2011 at 10:32 AM, Young, Milan milan.yo...@nuance.com wrote: Hello Glen, The proposal says that it contains a “simplified subset of the JavaScript API”. Could you please clarify which elements of the HTMLSpeech recommendation’s JavaScript API were omitted? I think this would be the most efficient way for those of us familiar with the XG recommendation to evaluate the new proposal. I’d also appreciate clarification on how you see the protocol being handled. In the HTMLSpeech group we were thinking about this as a hand-in-hand relationship between W3C and IETF, like WebSockets. Is this still your (and Google’s) vision? Thanks From: Glen Shires [mailto:gshi...@google.com] Sent: Thursday, December 22, 2011 11:14 AM To: public-webapps@w3.org; Arthur Barstow Cc: public-xg-htmlspe...@w3.org; Dan Burnett Subject: Re: HTML Speech XG Completes, seeks feedback for eventual standardization We at Google believe that a scripting-only (Javascript) subset of the API defined in the Speech XG Incubator Group Final Report is of appropriate scope for consideration by the WebApps WG. 
The enclosed scripting-only subset supports the majority of the use-cases and samples in the XG proposal. Specifically, it enables web-pages to generate speech output and to use speech recognition as an input for forms, continuous dictation and control. The Javascript API will allow web pages to control activation and timing and to handle results and alternatives. We welcome your feedback and ask that the Web Applications WG consider accepting this as a work item. Bjorn Bringert Satish Sampath Glen Shires On Tue, Dec 13, 2011 at 11:39 AM, Glen Shires gshi...@google.com wrote: We at Google believe that a scripting-only (Javascript) subset of the API defined in the Speech XG Incubator Group Final Report [1] is of appropriate scope for consideration by the WebApps WG. A scripting-only subset supports the majority of the use-cases and samples in the XG proposal. Specifically, it enables web-pages to generate speech output and to use speech recognition as an input for