Re: [whatwg] Access to live/raw audio and video stream data from both local and remote sources
Rob, I'm sorry for the late answer. The W3C DAP and WebRTC chairs have discussed this, and come to the following:

- The WebRTC WG deals with access to live (audio and video) streams, and also currently has support for local recording of them in the API proposal [1].
- DAP has a note about the device element in the HTML Media Capture draft, but the device element has been replaced by getUserMedia [1].
- In the WebRTC charter there are references to DAP regarding device exploration and media capturing, as that was deemed to be in DAP scope at the time the WebRTC charter was written. This has however since been resolved: for media streams this will be handled by WebRTC.
- WebRTC is planning coordination with the Audio WG to ensure alignment regarding media streams.

A question: what do you mean by raw audio and video stream data? The MediaStreams discussed in WebRTC are more logical references (which you can attach to audio/video elements for rendering, to a PeerConnection for streaming to a peer, and so on).

Stefan (for the DAP and WebRTC chairs).

[1] http://dev.w3.org/2011/webrtc/editor/webrtc.html

On 2011-07-27 02:56, Rob Manson wrote:

Hi, sorry for posting across multiple groups, but I hope you'll see from my comments below that this is really needed. This is definitely not intended as criticism of any of the work going on. It's intended as constructive feedback that hopefully provides clarification on a key use case and its supporting requirements:

  Access to live/raw audio and video stream data from both local and remote sources in a consistent way

I've spent quite a bit of time trying to follow a clear thread of requirements/solutions that provide API access to raw stream data (e.g. audio, video, etc.). But I'm a bit concerned this is falling in the gap between the DAP and RTC WGs. If this is not the case then please point me to the relevant docs and I'll happily get back in my box 8)

Here's how the thread seems to flow at the moment based on public documents. On the DAP page [1] the mission states:

  the Device APIs and Policy Working Group is to create client-side APIs that enable the development of Web Applications and Web Widgets that interact with device services such as Calendar, Contacts, Camera, etc

So it seems clear that this is the place to start. Further down that page the HTML Media Capture and Media Capture APIs are listed. HTML Media Capture (camera/microphone interactions through HTML forms) initially seems like a good candidate, however the intro in the latest PWD [2] clearly states:

  Providing streaming access to these capabilities is outside of the scope of this specification.

Followed by a NOTE that states:

  The Working Group is investigating the opportunity to specify streaming access via the proposed device element.

The link on the proposed device element [3] points to a no longer maintained document that redirects to the top level of the whatwg current work page [4]. On that page the most relevant link is the video conferencing and peer-to-peer communication section [5]. More about that further below.

So back to the DAP page to explore the other Media Capture API (programmatic access to camera/microphone) [1] and its latest PWD [6]. The abstract states:

  This specification defines an Application Programming Interface (API) that provides access to the audio, image and video capture capabilities of the device.

And the introduction states:

  The Capture API defines a high-level interface for accessing the microphone and camera of a hosting device.
  It completes the HTML Form Based Media Capturing specification [HTMLMEDIACAPTURE] with programmatic access to start a parametrized capture process.

So it seems clear that this is not related to streams in any way either. The Notes column for this API on the DAP page [1] also states:

  Programmatic API that completes the form based approach
  Need to check if still interest in this
  How does it relate with the Web RTC Working Group?

Is there an updated position on this? So if you then head over to the WebRTC WG's charter [7] it states:

  ...to define client-side APIs to enable Real-Time Communications in Web browsers. These APIs should enable building applications that can be run inside a browser, requiring no extra downloads or plugins, that allow communication between parties using audio, video and supplementary real-time communication, without having to use intervening servers...

So this is clearly focused upon peer-to-peer communication between systems, and the stream related access is naturally just treated as an ancillary requirement. The scope section then states:

  Enabling real-time communications between Web browsers require
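To illustrate the point in Stefan's reply above that MediaStreams are logical references rather than raw data: in the draft [1], a captured stream is obtained once and then attached to sinks. A minimal sketch, assuming the draft's getUserMedia() signature and URL.createObjectURL() wiring (both still in flux at the time), and a pre-created peerConnection object:

  // Sketch only: the stream object carries no raw samples; it is a
  // handle that gets attached to consumers.
  navigator.getUserMedia('audio,video', function (stream) {
    // Render locally as a self-view...
    document.querySelector('video').src = URL.createObjectURL(stream); // wiring assumed
    // ...and/or hand the same handle to a peer connection.
    peerConnection.addStream(stream);
  }, function (error) {
    // User denied consent, no device available, etc.
  });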
[whatwg] PeerConnection, MediaStream, getUserMedia(), and other feedback
On Tue, Jul 26, 2011 at 07:30, Ian Hickson <ian at hixie.ch> wrote:

>> If you send two MediaStream objects constructed from the same LocalMediaStream over a PeerConnection there needs to be a way to separate them on the receiving side.
>
> What's the use case for sending the same feed twice?

There's no proper use case as such, but the spec allows this. The question is how serious a problem this is. If you want to fork, and make both (all) versions available at the peer, would you not transmit the full stream and fork at the receiving end for efficiency reasons? And if you really want to fork at the sender, one way to separate the forks is to use one PeerConnection per fork.

Stefan
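To make the one-PeerConnection-per-fork suggestion concrete, a minimal sketch assuming the draft constructor and addStream(); the configuration string and the signalling helper functions are placeholders, not spec'd names:

  // Each fork of the same LocalMediaStream travels over its own
  // PeerConnection, so the receiver can tell the copies apart by the
  // connection they arrive on.
  var pcFork1 = new PeerConnection('STUN stun.example.net', sendSignalFork1);
  var pcFork2 = new PeerConnection('STUN stun.example.net', sendSignalFork2);
  pcFork1.addStream(localStream);
  pcFork2.addStream(localStream);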
Re: [whatwg] MTU Size PeerConnection send method (was RE: PeerConnection feedback)
>> Wouldn't it be possible to abstract this away for the web developer? I.e. the send method should, like for WebSockets, not have a max size. Instead the sending UA would be responsible for chopping up (and the receiving UA for re-assembling) the message into packets not larger than the minimum path MTU. Depending on the UA (and how integrated with the IP stack of the device it is), different levels of implementation sophistication could be used (e.g. max 576 bytes, or selecting 576/1280 depending on IP version, or even using path MTU discovery to find the max size).
>
> Yes, we could reimplement UDP's defragmentation mechanism at the higher level. There are a few things to keep in mind if you do that (for instance, there's a well known resource exhaustion attack where an attacker sends you the first part of UDP packets and never sends you the rest, until you run out of reassembly buffers; and of course the chance of losing a packet goes up significantly when all the fragments need to make it in order to achieve correct reassembly).

The attacker in this case would be a (hacked) browser, as the web developer can do no such thing. Of course, larger data chunks increase the risk of the message not getting through. This may be problematic: how can you explain this to the web developer in an understandable way?
[whatwg] MTU Size PeerConnection send method (was RE: PeerConnection feedback)
On Fri, 22 Apr 2011, Ian Hickson wrote:

> On Mon, 11 Apr 2011, Justin Uberti wrote:
>> On Mon, Apr 11, 2011 at 7:09 PM, Ian Hickson <i...@hixie.ch> wrote:
>>> This has made UDP packets larger than the MTU pretty useless. So I guess the question is: do we want to limit the input to a fixed value that is the lowest used MTU (576 bytes per IPv4), or dynamically and regularly determine what the lowest possible MTU is? The former has a major advantage: if an application works in one environment, you know it'll work elsewhere, because the maximum packet size won't change. This is a serious concern on the Web, where authors tend to do limited testing and thus often fail to handle rare edge cases well. The latter has a major disadvantage: the path MTU might change, meaning we might start dropping data if we don't keep trying to determine the path MTU. Also, it's really hard to determine the path MTU in practice. For now I've gone with the IPv4 minimum maximum of 576 minus overhead, leaving 504 bytes for user data per packet. It seems small, but I don't know how much data people normally send along these low-latency unreliable channels. However, if people want to instead have the minimum be dynamically determined, I'm open to that too. I think the best way to approach that would be to have UAs implement it as an experimental extension at first, and for us to get implementation experience on how well it works. If anyone is interested in doing that I'm happy to work with them to work out a way to do this that doesn't interfere with UAs that don't yet implement that extension.
>>
>> In practice, applications assume that the minimum MTU is 1280 (the minimum IPv6 MTU), and limit payloads to about 1200 bytes so that with framing they will fit into a 1280-byte MTU. Going down to 576 would significantly increase the packetization overhead.
>
> Interesting. Is there any data out there about what works in practice? I've seen very conflicting information, ranging from "anything above what IPv4 allows is risky" to "Ethernet kills everything above 1500". Wikipedia seems to think that while IPv4's lowest MTU is 576, practical path MTUs are generally higher, which doesn't seem like a good enough guarantee for Web-platform APIs. I'm happy to change this, but I'd like solid data to base the decision on.

Wouldn't it be possible to abstract this away for the web developer? I.e. the send method should, like for WebSockets, not have a max size. Instead the sending UA would be responsible for chopping up (and the receiving UA for re-assembling) the message into packets not larger than the minimum path MTU. Depending on the UA (and how integrated with the IP stack of the device it is), different levels of implementation sophistication could be used (e.g. max 576 bytes, or selecting 576/1280 depending on IP version, or even using path MTU discovery to find the max size).

Like for WebSockets, a readonly bufferedAmount attribute could be added. Note: I take for granted that some kind of rate control must be added to the PeerConnection's data UDP media stream, so allowing large data chunks to be sent would not increase the risk of network congestion.

Stefan (this isn't really my area of expertise, so maybe I've misunderstood - then please disregard this input)
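As a strawman for what chop-and-reassemble could look like above the datagram layer: the header format below is invented for illustration, it counts characters rather than bytes, and a real design would also need the reassembly timeout discussed in this thread to defuse the buffer-exhaustion attack:

  var MAX_PAYLOAD = 504; // IPv4 "minimum maximum" of 576 minus overhead

  // Sender: prefix each chunk with "messageId/index/count:" so the
  // receiver can reassemble (header characters would have to be
  // counted against the limit in a real design).
  function sendChunked(connection, messageId, text) {
    var count = Math.ceil(text.length / MAX_PAYLOAD) || 1;
    for (var i = 0; i < count; i++) {
      var part = text.substr(i * MAX_PAYLOAD, MAX_PAYLOAD);
      connection.send(messageId + '/' + i + '/' + count + ':' + part);
    }
  }

  // Receiver: buffer chunks per message id until all have arrived.
  var pending = {};
  function receiveChunk(packet) {
    var header = packet.substring(0, packet.indexOf(':'));
    var fields = header.split('/');
    var id = fields[0], index = +fields[1], count = +fields[2];
    var entry = pending[id] || (pending[id] = { parts: [], got: 0 });
    if (entry.parts[index] === undefined) {
      entry.parts[index] = packet.substring(header.length + 1);
      entry.got++;
    }
    if (entry.got === count) {
      delete pending[id];
      return entry.parts.join(''); // complete message
    }
    return null; // still waiting for more chunks
  }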
[whatwg] Initial video resolution (Re: PeerConnection feedback)
On Wed Apr 13 07:08:21 PDT 2011, Harald Alvestrand wrote:

On 04/13/11 13:35, Stefan Håkansson LK wrote:

-----Original Message-----
From: Ian Hickson [mailto:ian at hixie.ch]
Sent: den 12 april 2011 04:09
To: whatwg
Subject: [whatwg] PeerConnection feedback

On Tue, 29 Mar 2011, Stefan Håkansson LK wrote:

The web application must be able to define the media format to be used for the streams sent to a peer.

Shouldn't this be automatic and renegotiated dynamically via SDP offer/answer?

Yes, this should be (re)negotiated via SDP, but what is unclear is how the SDP is populated based on the application's preferences.

Why would the Web application have any say on this? Surely the user agent is in a better position to know what to negotiate, since it will be doing the encoding and decoding itself.

The best format of the coded media being streamed from UA a to UA b depends on a lot of factors. An obvious one is that the codec used must be supported by both UAs. As you say, much of it can be handled without any involvement from the application. But let's say that the app in UA a does addStream. The application in UA b (the same application as in UA a) has two video elements, one using a large display size, one using a small size. The UAs don't know in which element the stream will be rendered at this stage (that will be known first when the app in UA b connects the stream to one of the elements at onaddstream), so I don't understand how the UAs can select a suitable video resolution without the application giving some input. (Once the stream is being rendered in an element the situation is different - then UA b has knowledge about the rendering and could somehow inform UA a.)

I had assumed that the video would at first be sent with some more or less arbitrary dimensions (maybe the native ones), and that the receiving UA would then renegotiate the dimensions once the stream was being displayed somewhere. Since the page can let the user change the video size dynamically, it seems the UA would likely need to be able to do that kind of dynamic update anyway.

Yeah, maybe that's the way to do it. But I think the media should be sent with some sensible default resolution initially. Having a very high resolution could congest the network, and a very low one would give a bad user experience until the format has been renegotiated.

One possible initial resolution is 0x0 (no video sent); if the initial addStream callback is called as soon as the ICE negotiation concludes, the video recipient can set up the destination path so that it knows what a sensible resolution is, and can signal that back. Of course, this means that after the session negotiation and the ICE negotiation, we have to wait for the resolution negotiation before we have any video worth showing.

I think this is an interesting idea. Don't transmit until someone consumes. I guess some assessment should be made of how long the extra wait would be - but the ICE channel is available so maybe it would not be too long.
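A rough sketch of the feedback step on the receiving side; the onaddstream handler is from the draft, while the signalling channel and element wiring are placeholders for whatever the page already uses:

  pc.onaddstream = function (event) {
    var remoteVideo = document.getElementById('remote'); // assumed element
    remoteVideo.src = URL.createObjectURL(event.stream); // wiring assumed
    // Now a sensible target resolution is known and can be reported
    // back to the sender (here over the app's own signalling channel).
    signallingChannel.send(JSON.stringify({
      width: remoteVideo.clientWidth,
      height: remoteVideo.clientHeight
    }));
  };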
Re: [whatwg] PeerConnection feedback
-----Original Message-----
From: Ian Hickson [mailto:i...@hixie.ch]
Sent: den 12 april 2011 04:09
To: whatwg
Subject: [whatwg] PeerConnection feedback

On Tue, 29 Mar 2011, Stefan Håkansson LK wrote:

The web application must be able to define the media format to be used for the streams sent to a peer.

Shouldn't this be automatic and renegotiated dynamically via SDP offer/answer?

Yes, this should be (re)negotiated via SDP, but what is unclear is how the SDP is populated based on the application's preferences.

Why would the Web application have any say on this? Surely the user agent is in a better position to know what to negotiate, since it will be doing the encoding and decoding itself.

The best format of the coded media being streamed from UA a to UA b depends on a lot of factors. An obvious one is that the codec used must be supported by both UAs. As you say, much of it can be handled without any involvement from the application. But let's say that the app in UA a does addStream. The application in UA b (the same application as in UA a) has two video elements, one using a large display size, one using a small size. The UAs don't know in which element the stream will be rendered at this stage (that will be known first when the app in UA b connects the stream to one of the elements at onaddstream), so I don't understand how the UAs can select a suitable video resolution without the application giving some input. (Once the stream is being rendered in an element the situation is different - then UA b has knowledge about the rendering and could somehow inform UA a.)

I had assumed that the video would at first be sent with some more or less arbitrary dimensions (maybe the native ones), and that the receiving UA would then renegotiate the dimensions once the stream was being displayed somewhere. Since the page can let the user change the video size dynamically, it seems the UA would likely need to be able to do that kind of dynamic update anyway.

Yeah, maybe that's the way to do it. But I think the media should be sent with some sensible default resolution initially. Having a very high resolution could congest the network, and a very low one would give a bad user experience until the format has been renegotiated.

//Stefan
[whatwg] Recording interface (Re: Peer-to-peer communication, video conferencing, and related topics (2))
-----Original Message-----
From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-boun...@lists.whatwg.org] On Behalf Of whatwg-requ...@lists.whatwg.org
Sent: den 29 mars 2011 20:33
To: whatwg@lists.whatwg.org
Subject: whatwg Digest, Vol 84, Issue 69

I also believe that the recording interface should be removed from this part of the specification; there should be no requirement that all streams be recordable.

Recording of streams is needed for some use cases unrelated to video conferencing, such as recording messages.

Having a recording function is needed in multiple use cases; I think we all agree on that.

This is mostly a matter of style, which I'm happy to defer on.

The streams should be regarded as a control surface, not as a data channel; in many cases, the question of "what is the format of the stream at this point" is literally unanswerable; it may be represented as hardware states, memory buffers, byte streams, or something completely different.

Agreed.

Recording any of these requires much more specification than just "record here".

Could you elaborate on what else needs specifying?

One thing I remember from an API design talk I viewed: an ability to record to a file means that the file format is part of your API. For instance, for audio recording, it's likely that you want control over whether the resulting file is in Ogg Vorbis format or in MP3 format; for video, it's likely that you may want to specify that it will be stored using the VP8 video codec, the Vorbis audio codec and the Matroska container format. These desires have to be communicated to the underlying audio/video engine, so that the proper transforms can be inserted into the processing stream, and I think they have to be communicated across this interface; since the output of these operations is a blob without any inherent type information, the caller has to already know which format the media is in.

This is absolutely correct, and it is not only about codecs or container formats. Maybe you need to supply info like audio sampling rate, video frame rate, video resolution, ... There was an input on this already last November: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-November/029069.html

Clearer?

------------------------------

Message: 2
Date: Tue, 29 Mar 2011 15:27:58 +0200
From: Wilhelm Joys Andersen wilhel...@opera.com
To: whatwg@lists.whatwg.org
Subject: [whatwg] details, summary and styling
Message-ID: op.vs3w0wvrm3w...@kunnskapet.oslo.osa
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes

Hi,

I'm currently writing tests in preparation for Opera's implementation of details and summary. In relation to this, I have a few questions about issues that, as far as I can tell, are currently undefined in the specification.

The spec says:

  If there is no child summary element [of the details element], the user agent should provide its own legend (e.g. "Details"). [1]

How exactly should this legend be provided? Should the user agent add an implied summary element to the DOM, similar to tbody, a pseudo-element, or a magic non-element behaving differently from both of the above? In the current WebKit implementation [2], the UA-provided legend behaves inconsistently from an author-provided summary in the following ways:

* Although it can be styled with rules applying to summary, it does not respond to :hover or :first-child.
* With regards to text selection, it behaves more like an input type='submit' than a user-provided summary. Text within this implied element may only be selected _together_ with the text preceding and following it.
* A different mouse cursor is used.

This indicates that it is slightly more magic than I would prefer. I believe a closer resemblance to an ordinary element would be more convenient for authors - a ::summary pseudo-element with "Details" as its content might be the cleanest approach, although that would require a few more bytes in the author's stylesheet to cater to both author- and UA-defined summaries:

  summary, ::summary { color: green; }

Furthermore, the rendering spec says:

  The first container is expected to contain at least one line box, and that line box is expected to contain a disclosure widget (typically a triangle), horizontally positioned within the left padding of the details element. [3]

For user agents aiming to support the suggested default rendering, how should the disclosure widget be embedded? Ideally, graphical browsers should all do this in a similar manner, and in a way that allows authors to style these elements to the same extent as any other element. There are several options:

* A ::marker pseudo-element [4].
* A default, non-repeating background image positioned within the recommended 40 pixel left padding.
* A method similar to
[whatwg] Media negotiation (RE: Peer-to-peer communication, video conferencing, and related topics (2))
Requirement: The web application must be able to define the media format to be used for the streams sent to a peer.
Comment: If the video is going to be displayed in a large window, use higher bit-rate/resolution. Should media settings be allowed to be changed during a session (e.g. at window resize)?

Shouldn't this be automatic and renegotiated dynamically via SDP offer/answer?

Yes, this should be (re)negotiated via SDP, but what is unclear is how the SDP is populated based on the application's preferences.

Why would the Web application have any say on this? Surely the user agent is in a better position to know what to negotiate, since it will be doing the encoding and decoding itself.

The best format of the coded media being streamed from UA a to UA b depends on a lot of factors. An obvious one is that the codec used must be supported by both UAs. As you say, much of it can be handled without any involvement from the application. But let's say that the app in UA a does addStream. The application in UA b (the same application as in UA a) has two video elements, one using a large display size, one using a small size. The UAs don't know in which element the stream will be rendered at this stage (that will be known first when the app in UA b connects the stream to one of the elements at onaddstream), so I don't understand how the UAs can select a suitable video resolution without the application giving some input. (Once the stream is being rendered in an element the situation is different - then UA b has knowledge about the rendering and could somehow inform UA a.)

Stefan
[whatwg] Peer-to-peer and stream APIs
All, there are now two different sets of APIs public: one documented in the spec (http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html#video-conferencing-and-peer-to-peer-communication) and one sent the other day (https://sites.google.com/a/alvestrand.com/rtc-web/w3c-activity/api-proposals).

A quick look at the API sets gives me the impression that they are, on a top level, quite similar. The model and the level of the two API sets seem to be more or less the same. The first set seems to me clearer, more thought through and better documented. The second one also lacks the possibility to send text peer-to-peer, something that can be very important for certain cases (e.g. gaming).

I could go on discussing details, but my main message is: given that the two API sets are, on a top level, quite similar, would we not be better off selecting one of them, and using it as a basis for further discussion, testing and refinement? Working on two parallel tracks could waste implementation efforts, lead to non-converging parallel discussions and possibly end up in a fragmented situation.

My view is that a good way forward would be to use the API set in the spec as a starting point, and propose enhancements/additions to it.

Stefan
Re: [whatwg] Peer-to-peer use case (was Peer-to-peer communication, video conferencing, device, and related topics)
Some feedback below. (Stuff where I agree and there is no question has been left out.)

On Mon, 31 Jan 2011, Stefan Håkansson LK wrote this use case:

We've since produced an updated use case doc: http://www.ietf.org/id/draft-holmberg-rtcweb-ucreqs-01.txt

... The web author developing the application has decided to display a self-view as well as the video from the remote side in rather small windows, but the user can change the display size during the session. The application also supports if a participant (for a longer or shorter time) would like to stop sending audio (but keep video) or video (keep audio) to the other peer (mute). ...

All of this except selectively muting audio vs video is currently possible in the proposed API. The simplest way to make selective muting possible too would be to change how the pause/resume thing works in GeneratedStream, so that instead of pause() and resume(), we have individual controls for audio and video. Something like:

  void muteAudio();
  void resumeAudio();
  readonly attribute boolean audioMuted;
  void muteVideo();
  void resumeVideo();
  readonly attribute boolean videoMuted;

Alternatively, we could just have mutable attributes:

  attribute boolean audioEnabled;
  attribute boolean videoEnabled;

Any opinions on this?

We're looking into this and will produce a more elaborate input related to this.

...

Requirement: The web application must be able to define the media format to be used for the streams sent to a peer.
Comment: If the video is going to be displayed in a large window, use higher bit-rate/resolution. Should media settings be allowed to be changed during a session (e.g. at window resize)?

Shouldn't this be automatic and renegotiated dynamically via SDP offer/answer?

Yes, this should be (re)negotiated via SDP, but what is unclear is how the SDP is populated based on the application's preferences.

...

Requirement: Streams being transmitted must be subject to rate control.
Comment: Do not starve other traffic (e.g. on an ADSL link).

Not sure whether this requires anything special. Could you elaborate?

What I am after is that the RTP/UDP streams sent from one UA to the other must have some rate adaptation implemented. HTTP uses TCP transport, and TCP reduces the send rate when a packet does not arrive (so that flows share the available throughput in a fair way when there is a bottleneck). For UDP there is no such mechanism, so unless something is added in the RTP implementation it could starve other traffic. I don't think it should be visible in the API though; it is a requirement on the implementation in the UA.

...

Requirement: Synchronization between audio and video must be supported.

If there's one stream, that's automatic, no?

One audiovisual stream is actually transmitted as two RTP streams (one audio, one video). And synchronization at playout is not automatic; it is something you do based on RTP timestamps and RTCP stuff. But again, this is a req on the implementation in the UA, not on the API.

...

Requirement: The web application must be made aware of when streams from a peer are no longer received.
Comment: To be able to inform the user and take action (one of the peers still has connection with the server).

Requirement: The browser must detect when no streams are received from a peer.

These aren't really yet supported in the API, but I intend for us to add this kind of thing at the same time as we add similar metrics to video and audio. To do this, though, it would really help to have a better idea what the requirements are. What information should be available?
Packets received per second (and sent, maybe) seems like an obvious one, but what other information can we collect?

I think more studies are required to answer this one.

//Stefan
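For comparison, the mutable-attribute variant quoted earlier in this message would make selective muting a one-liner in script. A sketch, using the audioEnabled/videoEnabled names from the proposal (nothing implemented yet):

  // Mute audio but keep sending video, or the reverse, on the local
  // GeneratedStream.
  function setSelectiveMute(stream, audioOn, videoOn) {
    stream.audioEnabled = audioOn;
    stream.videoEnabled = videoOn;
  }

  setSelectiveMute(localStream, false, true); // audio muted, video still sent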
Re: [whatwg] ConnectionPeer experiences
(Patrik has a couple of well-deserved days off.)

- In your experimentation, did you find any reasonable underlying protocol to map sendFile, sendBitmap and their corresponding callbacks to, or did you just ignore them for now?

We just ignored them for the time being.

- In connecting, did you operate with connections going via a server, or did you go between browsers? If so, how did you identify the server vs identifying the remote participant? Was the remoteConfiguration method flexible enough?

The connection that is set up using ConnectionPeer is browser-to-browser in our case (there is no need to use TURN in our setup). So what we supply as remoteConfiguration (note our proposal to change addRemoteConfiguration to setRemoteConfiguration) is basically the candidates obtained in contact with the STUN server. add/setRemoteConfiguration seems flexible enough (we have only experimented with a simple case though). Of course the data supplied (acquired from getLocalConfig at the remote side) must be able to express all required possibilities.

//Stefan

------------------------------

Message: 4
Date: Fri, 28 Jan 2011 16:14:52 -0800
From: Harald Alvestrand har...@alvestrand.no
To: whatwg@lists.whatwg.org
Subject: Re: [whatwg] ConnectionPeer experiences
Message-ID: 4d435bfc.7040...@alvestrand.no
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Thank you Patrik, I enjoyed reading that! Questions:

- In your experimentation, did you find any reasonable underlying protocol to map sendFile, sendBitmap and their corresponding callbacks to, or did you just ignore them for now?

- In connecting, did you operate with connections going via a server, or did you go between browsers? If so, how did you identify the server vs identifying the remote participant? Was the remoteConfiguration method flexible enough?

I'm currently staring at the problem of defining a set of semantics for mapping this to RTC-Web protocols, and am having some problems interpreting what's currently in the spec, so would like your interpretation.

Harald

On 01/26/11 01:04, Patrik Persson J wrote:

We have done some experimentation with the ConnectionPeer API. We have an initial implementation of a subset of the API, using ICE (RFC 5245) for the peer-to-peer handshaking. Our implementation is WebKit/GTK+/gstreamer-based, and we of course intend to submit it to WebKit, but the implementation is not quite ready for that yet. More information about our work so far can be found here: https://labs.ericsson.com/developer-community/blog/beyond-html5-peer-peer-conversational-video

However, we have bumped into some details that we'd like to discuss here right away. The following is our mix of proposals and questions.

1. We propose adding a readyState attribute, to decouple the onconnect() callback from any observers (such as the UI).

   const unsigned short CONNECTING = 0;
   const unsigned short CONNECTED = 1;
   const unsigned short CLOSED = 2;
   readonly attribute unsigned short readyState;

2. We propose replacing the onstream event with custom events of type RemoteStreamEvent, to distinguish between adding and removing streams.

   attribute Function onstreamadded;   // RemoteStreamEvent
   attribute Function onstreamremoved; // RemoteStreamEvent
   ...
   interface RemoteStreamEvent : Event {
     readonly attribute Stream stream;
   };

   The 'stream' attribute indicates which stream was added/removed.

3. We propose renaming addRemoteConfiguration to setRemoteConfiguration.
Our understanding of the ConnectionPeer is that it provides a single-point-to-single-point connection; hence, only one remote peer configuration is to be set, rather than many to be added.

   void setRemoteConfiguration(in DOMString configuration, in optional DOMString remoteOrigin);

4. We propose swapping the ConnectionPeerConfigurationCallback callback parameters. The current example seems to use only one (the second one). Swapping them allows clients that care about 'server' to do so, and clients that ignore it (such as the current example) to do so too.

   [Callback=FunctionOnly, NoInterfaceObject]
   interface ConnectionPeerConfigurationCallback {
     void handleEvent(in DOMString configuration, in ConnectionPeer server);
   };

5. Should a size limit to text messages be specified? Text messages with UDP-like behavior (unimportant=true) can't really be reliably split into several UDP packets. For such long chunks of data, file transfer seems like a better option anyway.

In summary, then, our proposal for a revised ConnectionPeer looks as follows:

   [Constructor(in DOMString serverConfiguration)]
   interface ConnectionPeer
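A short sketch of how a page might consume proposals 1 and 2 above; the names follow the proposal and are not in the current spec:

  // Observers such as the UI can poll readyState instead of hooking
  // onconnect(), and stream arrival/removal arrive as distinct events.
  peer.onstreamadded = function (event) {
    console.log('remote stream added:', event.stream);
  };
  peer.onstreamremoved = function (event) {
    console.log('remote stream removed:', event.stream);
  };
  if (peer.readyState === ConnectionPeer.CONNECTING) {
    // show a "connecting..." indicator
  }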
[whatwg] Simple video chat use case
Just to start the discussion, we put together the simplest use case you can imagine (with an additional error case). It has been sent to the RTC-Web community before, but as there is some discussion on ConnectionPeer ongoing we thought the use case could be of interest in this community as well. Some requirements are derived from the use case. The intention is to add more use cases, but before spending any more energy I thought it would be good to display it to start the discussion and to get feedback on whether this is a workable model or not.

The requirements are split in three categories:

1. User/privacy: the intention is to list what an end user can expect in terms of what a web application can do without user consent and so on.
2. API dimension: lists what must be accessible by JS and signalled to JS via APIs.
3. Functional dimension: lists functionality that must be supported by the browser (in combination with the underlying operating system, drivers and HW) even though it is not visible from the API.

I can't say that I am extremely satisfied with this division; perhaps we could come up with something better. Also, some of the requirements in the User/privacy and Functional dimensions are not derived from this use case per se (e.g. ask for user consent), but are instead generic requirements. Perhaps they should be handled separately? Connection is not mentioned as it is not a requirement (the requirement is rather to be able to send and receive streams; a connection would presumably be used as a foundation for the stream transport).

OK, to the use case:

1. Simple Video Chat

A simple video chat service has been developed. In the service the users are logged on to the same chat web server. The web server publishes information about user login status, pushing updates to the web apps in the browsers. By clicking on an online peer user name, a 1-1 video chat session between the two browsers is initiated. The invited peer is presented with a choice of joining or rejecting the session. The web author developing the application has decided to display a self-view as well as the video from the remote side in rather small windows, but the user can change the display size during the session. The application also supports if a participant (for a longer or shorter time) would like to stop sending audio (but keep video) or video (keep audio) to the other peer (mute). Either of the two participants can at any time end the chat by clicking a button.

In this specific case two users are using laptops in their respective homes. They are connected to the public Internet with a desktop browser using WiFi behind NATs. One of the users has an ADSL connection to the home, and the other fiber access. Most of the time headsets are used, but not always.

1.1 Requirements

1.1.1 User/privacy dimension

Requirement (the user must):
- give explicit consent before a device can be used to capture audio or video
- be able to in an intuitive way revoke and change capturing permissions
- be able to easily understand that audio or video is being captured
- be informed that an invitation to a peer video chat session has been received
- be able to accept or reject an invitation to a peer video chat session
- be able to stop a media stream from being transmitted

1.1.2 API dimension

Requirement | Comment