Re: [whatwg] Access to live/raw audio and video stream data from both local and remote sources

2011-08-24 Thread Stefan Håkansson LK

Rob,

I'm sorry for the late answer. The W3C DAP and WebRTC chairs have 
discussed this, and come to the following:


- The WebRTC WG deals with access to live (audio and video) streams, and 
also currently has support for local recording of them in the API 
proposal [1].


- DAP has a note about the device element in the HTML Media Capture 
draft, but the device element has been replaced by getUserMedia [1].


- In the WebRTC charter there are references to DAP regarding device 
exploration and media capture, as that was deemed to be in DAP scope at 
the time the WebRTC charter was written. This has however since been 
resolved: media streams will be handled by WebRTC.


- WebRTC is planning coordination with the Audio WG to ensure alignment 
regarding media streams.


A question: what do you mean by raw audio and video stream data? The 
MediaStreams discussed in WebRTC are more of logical references (which 
you can attach to audio/video elements for rendering, to a 
PeerConnection for streaming to a peer and so on).
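
To illustrate, with the API in the current draft a script only ever passes 
the stream object around; it never touches samples or pixels. A rough sketch 
(names per [1] as far as I can tell; exact signatures are still in flux, and 
config / sendSignalingMessage are application-defined placeholders):

    // Sketch only - draft API, subject to change.
    navigator.getUserMedia("audio,video", function (localStream) {
      // Render locally: the stream object is handed to a video element.
      document.querySelector("video#self").src = URL.createObjectURL(localStream);

      // Stream to a peer: the same object is handed to a PeerConnection.
      var pc = new PeerConnection(config, sendSignalingMessage);
      pc.addStream(localStream);
      // At no point does the script see the raw audio samples or video pixels.
    });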


Stefan (for the DAP and WebRTC chairs).

[1] http://dev.w3.org/2011/webrtc/editor/webrtc.html

On 2011-07-27 02:56, Rob Manson wrote:

Hi,

sorry for posting across multiple groups, but I hope you'll see from my
comments below that this is really needed.

This is definitely not intended as criticism of any of the work going
on.  It's intended as constructive feedback that hopefully provides
clarification on a key use case and its supporting requirements.

 Access to live/raw audio and video stream data from both local
 and remote sources in a consistent way

I've spent quite a bit of time trying to follow a clear thread of
requirements/solutions that provide API access to raw stream data (e.g.
audio, video, etc.).  But I'm a bit concerned this is falling in the gap
between the DAP and RTC WGs.  If this is not the case then please point
me to the relevant docs and I'll happily get back in my box 8)

Here's how the thread seems to flow at the moment based on public
documents.

On the DAP page [1] the mission states:
 the Device APIs and Policy Working Group is to create
 client-side APIs that enable the development of Web Applications
 and Web Widgets that interact with device services such as
 Calendar, Contacts, Camera, etc

So it seems clear that this is the place to start.  Further down that
page the HTML Media Capture and Media Capture APIs are listed.

HTML Media Capture (camera/microphone interactions through HTML forms)
initially seems like a good candidate, however the intro in the latest
PWD [2] clearly states:
 Providing streaming access to these capabilities is outside of
 the scope of this specification.

Followed by a NOTE that states:
 The Working Group is investigating the opportunity to specify
 streaming access via the proposed device element.

The link for the proposed device element [3] points to a no longer
maintained document that then redirects to the top level of the whatwg
current work page [4].  On that page the most relevant link is the
video conferencing and peer-to-peer communication section [5].  More
about that further below.

So back to the DAP page to explore the other Media Capture API
(programmatic access to camera/microphone) [1] and its latest PWD [6].
The abstract states:
 This specification defines an Application Programming Interface
 (API) that provides access to the audio, image and video capture
 capabilities of the device.

And the introduction states:
 The Capture API defines a high-level interface for accessing
 the microphone and camera of a hosting device. It completes the
 HTML Form Based Media Capturing specification [HTMLMEDIACAPTURE]
 with a programmatic access to start a parametrized capture
 process.

So it seems clear that this is not related to streams in any way either.

The Notes column for this API on the DAP page [1] also states:
 Programmatic API that completes the form based approach
 Need to check if still interest in this
 How does it relate with the Web RTC Working Group?

Is there an updated position on this?

So if you then head over to the WebRTC WG's charter [7] it states:
 ...to define client-side APIs to enable Real-Time
 Communications in Web browsers.

 These APIs should enable building applications that can be run
 inside a browser, requiring no extra downloads or plugins, that
 allow communication between parties using audio, video and
 supplementary real-time communication, without having to use
 intervening servers...

So this is clearly focused upon peer-to-peer communication between
systems, and the stream-related access is naturally just treated as an
ancillary requirement.  The scope section then states:
 Enabling real-time communications between Web browsers require
   

[whatwg] PeerConnection, MediaStream, getUserMedia(), and other feedback

2011-07-28 Thread Stefan Håkansson LK
On Tue, Jul 26, 2011 at 07:30, Ian Hickson ian at hixie.ch wrote:


  If you send two MediaStream objects constructed from the same
  LocalMediaStream over a PeerConnection there needs to be a way to
  separate them on the receiving side.

 What's the use case for sending the same feed twice?


There's no proper use case as such but the spec allows this.
The question is how serious a problem this is. If you want to fork, and make 
both (all) versions available at the peer, would you not transmit the full 
stream and fork at the receiving end for efficiency reasons? And if you really 
want to fork at the sender, one way to separate them is to use one 
PeerConnection per fork.
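
I.e. something like this (sketch; config, the signalling callbacks and 
localStream are placeholders):

    // Sketch: one PeerConnection per fork, both carrying the same local stream
    // to the same remote peer, so the receiver can tell them apart by connection.
    var pcA = new PeerConnection(config, signalViaChannelA);
    var pcB = new PeerConnection(config, signalViaChannelB);
    pcA.addStream(localStream);   // fork 1
    pcB.addStream(localStream);   // fork 2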

Stefan

Re: [whatwg] MTU Size PeerConnection send method (was RE: PeerConnection feedback)

2011-04-28 Thread Stefan Håkansson LK
 Wouldn't it be possible to abstract this away for the web developer? I.e. 
 the send method should, like for WebSockets, not have a max size. Instead 
 the sending UA would be responsible for chopping up (the receiving UA for 
 re-assembling) the message into packets not larger than the minimum path 
 MTU. Depending on the UA (and how integrated with the IP stack of the device 
 it is) different levels of implementation sophistication could be used (e.g. 
 max 576 byte, or select 576/1280 depending on IP version, or even using MTU 
 path discovery to find out max size).
Yes, we could reimplement UDP's defragmentation mechanism at the higher level.

There are a few things to keep in mind if you do that (for instance, there's a 
well known resource exhaustion attack where an attacker sends you the first 
part of UDP packets and never sends you the rest of it, until you run out of 
reassembly buffers, and of course the chance of losing a packet goes up 
significantly when all the fragments need to make it in order to achieve 
correct reassembly).
The attacker in this case would be a (hacked) browser, as the web developer can 
do no such thing. Of course, larger data chunks increase the risk of the data 
not getting through. This may be problematic: how can you explain this to the 
web developer in an understandable way?
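
Purely to illustrate the reassembly concern (this is not a proposal for the 
UA algorithm; the fragment header and the onMessage handler are made up):

    // Illustrative reassembly of fragments tagged (id, index, total), with a
    // hard cap on outstanding partial messages so a hostile sender cannot
    // exhaust the reassembly buffers by never sending the last fragment.
    var partial = {};        // id -> { total, got, parts }
    var order = [];          // insertion order, so the oldest can be dropped
    var MAX_PARTIAL = 16;    // arbitrary cap for this sketch

    function onFragment(id, index, total, text) {
      if (!partial[id]) {
        if (order.length >= MAX_PARTIAL) {
          delete partial[order.shift()];   // drop the oldest incomplete message
        }
        partial[id] = { total: total, got: 0, parts: [] };
        order.push(id);
      }
      var m = partial[id];
      if (m.parts[index] === undefined) { m.parts[index] = text; m.got++; }
      if (m.got === m.total) {             // every fragment has arrived
        onMessage(m.parts.join(""));       // onMessage: application handler (assumed)
        delete partial[id];
        order.splice(order.indexOf(id), 1);
      }
    }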





[whatwg] MTU Size PeerConnection send method (was RE: PeerConnection feedback)

2011-04-24 Thread Stefan Håkansson LK
On Fri, 22 Apr 2011, Ian Hickson wrote:
On Mon, 11 Apr 2011, Justin Uberti wrote:
 On Mon, Apr 11, 2011 at 7:09 PM, Ian Hickson i...@hixie.ch wrote:
  
   This has made UDP packets larger than the MTU pretty useless.
 
  So I guess the question is do we want to limit the input to a fixed 
  value that is the lowest used MTU (576 bytes per IPv4), or 
  dynamically and regularly determine what the lowest possible MTU is?
 
  The former has a major advantage: if an application works in one 
  environment, you know it'll work elsewhere, because the maximum 
  packet size won't change. This is a serious concern on the Web, where 
  authors tend to do limited testing and thus often fail to handle 
  rare edge cases well.
 
  The latter has a major disadvantage: the path MTU might change, 
  meaning we might start dropping data if we don't keep trying to 
  determine the Path MTU. Also, it's really hard to determine the Path 
  MTU in practice.
 
  For now I've gone with the IPv4 minimum maximum of 576 minus 
  overhead, leaving 504 bytes for user data per packet. It seems 
  small, but I don't know how much data people normally send along 
  these low-latency unreliable channels.
 
  However, if people want to instead have the minimum be dynamically 
  determined, I'm open to that too. I think the best way to approach 
  that would be to have UAs implement it as an experimental extension 
  at first, and for us to get implementation experience on how well it 
  works. If anyone is interested in doing that I'm happy to work with 
  them to work out a way to do this that doesn't interfere with UAs 
  that don't yet implement that extension.

 In practice, applications assume that the minimum MTU is 1280 (the 
 minimum IPv6 MTU), and limit payloads to about 1200 bytes so that with 
 framing they will fit into a 1280-byte MTU. Going down to 576 would 
 significantly increase the packetization overhead.

Interesting.

Is there any data out there about what works in practice? I've seen very 
conflicting information, ranging from "anything above what IPv4 allows is 
risky" to "Ethernet kills everything above 1500". Wikipedia seems to think 
that while IPv4's lowest MTU is 576, practical path MTUs are generally (but 
not always) higher, which doesn't seem like a good enough guarantee for 
Web-platform APIs.

I'm happy to change this, but I'd like solid data to base the decision on.

Wouldn't it be possible to abstract this away for the web developer? I.e. the 
send method should, like for WebSockets, not have a max size. Instead the 
sending UA would be responsible for chopping up (the receiving UA for 
re-assembling) the message into packets not larger than the minimum path MTU. 
Depending on the UA (and how integrated with the IP stack of the device it is) 
different levels of implementation sophistication could be used (e.g. max 576 
byte, or select 576/1280 depending on IP version, or even using MTU path 
discovery to find out max size).

Like for WebSockets, a readonly bufferedAmount attribute could be added. 

Note: I take for granted that some kind of rate control must be added to the 
PeerConnection's data UDP media stream, so allowing large data chunks to be 
sent would not increase the risk of network congestion.
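
The sender side of such a scheme is trivial to sketch (the fragment header 
format here is made up, and character count stands in for a real byte count 
for brevity):

    // Illustrative chopping of a message into chunks that, together with a
    // small made-up "id:index:total:" header, stay within a 504-byte payload.
    var MAX_PAYLOAD = 504;   // the per-packet value in Ian's current draft
    var HEADER = 24;         // rough allowance for the made-up header
    var CHUNK = MAX_PAYLOAD - HEADER;

    function sendChopped(peerConnection, id, text) {
      var total = Math.ceil(text.length / CHUNK) || 1;
      for (var i = 0; i < total; i++) {
        var body = text.substr(i * CHUNK, CHUNK);
        peerConnection.send(id + ":" + i + ":" + total + ":" + body);
      }
    }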

Stefan (this isn't really my area of expertise, so maybe I've misunderstood - 
then please disregard this input)

[whatwg] Initial video resolution (Re: PeerConnection feedback)

2011-04-17 Thread Stefan Håkansson LK
On Wed Apr 13 07:08:21 PDT 2011 Harald Alvestrand wrote
On 04/13/11 13:35, Stefan Håkansson LK wrote:


 -Original Message-
 From: Ian Hickson [mailto:ian at hixie.ch]
 Sent: den 12 april 2011 04:09
 To: whatwg
 Subject: [whatwg] PeerConnection feedback

 On Tue, 29 Mar 2011, Stefan Håkansson LK wrote:
 The web application must be able to define the media format to
 be used for the streams sent to a peer.
 Shouldn't this be automatic and renegotiated dynamically via SDP
 offer/answer?
 Yes, this should be (re)negotiated via SDP, but what is unclear is
 how the SDP is populated based on the application's preferences.
 Why would the Web application have any say on this? Surely the user
 agent is in a better position to know what to negotiate, since it will
 be doing the encoding and decoding itself.
 The best format of the coded media being streamed from UA a to UA b
 depends on a lot of factors. An obvious one is that the codec used is
 supported by both UAs. As you say, much of it can be handled without
 any involvement from the application.

 But let's say that the app in UA a does addStream. The application in
 UA b (the same application as in UA a) has two video elements, one
 using a large display size, one using a small size. The UAs don't know
 in which element the stream will be rendered at this stage (that will
 only be known when the app in UA b connects the stream to one of the
 elements at onaddstream), so I don't understand how the UAs can select
 a suitable video resolution without the application giving some input.
 (Once the stream is being rendered in an element the situation is
 different - then UA b has knowledge about the rendering and could
 somehow inform UA a.)
 I had assumed that the video would at first be sent with some more or less
 arbitrary dimensions (maybe the native ones), and that the receiving UA
 would then renegotiate the dimensions once the stream was being displayed
 somewhere. Since the page can let the user change the video size
 dynamically, it seems the UA would likely need to be able to do that kind
 of dynamic update anyway.
Yeah, maybe that's the way to do it. But I think the media should be sent with
some sensible default resolution initially. Having a very high resolution could
congest the network, and a very low one would give a bad user experience until
the format has been renegotiated.
One possible initial resolution is 0x0 (no video sent); if the initial 
addStream callback is called as soon as the ICE negotiation concludes, 
the video recipient can set up the destination path so that it knows 
what a sensible resolution is, and can signal that back.

Of course, this means that after the session negotiation and the ICE 
negotiation, we have to wait for the resolution negotiation before we 
have any video worth showing.
I think this is an interesting idea. Don't transmit until someone
consumes. I guess some assessment should be made of how long the extra
wait would be - but the ICE channel is available so maybe it would not
be too long.
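
To make the idea concrete (sketch; sendToPeer stands for whatever signalling 
channel the application already has, and the event/attribute names follow the 
draft as I read it):

    // Sketch: once the remote stream is attached to a video element, the
    // receiving app knows the rendered size and can hint it back to the sender.
    peerConnection.onaddstream = function (event) {
      var video = document.querySelector("video#remote");
      video.src = URL.createObjectURL(event.stream);
      sendToPeer({ preferredWidth: video.clientWidth,
                   preferredHeight: video.clientHeight });
    };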

Re: [whatwg] PeerConnection feedback

2011-04-13 Thread Stefan Håkansson LK
 

-Original Message-
From: Ian Hickson [mailto:i...@hixie.ch] 
Sent: den 12 april 2011 04:09
To: whatwg
Subject: [whatwg] PeerConnection feedback

On Tue, 29 Mar 2011, Stefan Håkansson LK wrote:
 The web application must be able to define the media format to 
 be used for the streams sent to a peer.

Shouldn't this be automatic and renegotiated dynamically via SDP 
offer/answer?
  
   Yes, this should be (re)negotiated via SDP, but what is unclear is 
   how the SDP is populated based on the application's preferences.
  
  Why would the Web application have any say on this? Surely the user 
  agent is in a better position to know what to negotiate, since it will 
  be doing the encoding and decoding itself.

 The best format of the coded media being streamed from UA a to UA b 
 depends on a lot of factors. An obvious one is that the codec used is 
 supported by both UAs. As you say, much of it can be handled without 
 any involvement from the application.
 
 But let's say that the app in UA a does addStream. The application in 
 UA b (the same application as in UA a) has two video elements, one 
 using a large display size, one using a small size. The UAs don't know 
 in which element the stream will be rendered at this stage (that will 
 only be known when the app in UA b connects the stream to one of the 
 elements at onaddstream), so I don't understand how the UAs can select 
 a suitable video resolution without the application giving some input. 
 (Once the stream is being rendered in an element the situation is 
 different - then UA b has knowledge about the rendering and could 
 somehow inform UA a.)

I had assumed that the video would at first be sent with some more or less 
arbitrary dimensions (maybe the native ones), and that the receiving UA 
would then renegotiate the dimensions once the stream was being displayed 
somewhere. Since the page can let the user change the video size 
dynamically, it seems the UA would likely need to be able to do that kind 
of dynamic update anyway.
Yeah, maybe that's the way to do it. But I think the media should be sent with
some sensible default resolution initially. Having a very high resolution could
congest the network, and a very low one would give a bad user experience until 
the format has been renegotiated.

//Stefan


[whatwg] Recording interface (Re: Peer-to-peer communication, video conferencing, and related topics (2))

2011-03-30 Thread Stefan Håkansson LK


 -Original Message-
 From: whatwg-boun...@lists.whatwg.org
 [mailto:whatwg-boun...@lists.whatwg.org] On Behalf Of
 whatwg-requ...@lists.whatwg.org
 Sent: den 29 mars 2011 20:33
 To: whatwg@lists.whatwg.org
 Subject: whatwg Digest, Vol 84, Issue 69
I also believe that the recording interface should be removed from
   this part of the specification; there should be no requirement
   that all streams be recordable.
  Recording of streams is needed for some use cases unrelated to video
  conferencing, such as recording messages.
 Having a recording function is needed in multiple use cases;
 I think we all agree on that.
 This is mostly a matter of style, which I'm happy to defer on.
The streams should be regarded as a control surface, not as a data
   channel; in many cases, the question of what is the format of the stream
   at this point is literally unanswerable; it may be represented as
   hardware states, memory buffers, byte streams, or something completely
   different.
  Agreed.
 
 
Recording any of these requires much more specification than just
   record here.
  Could you elaborate on what else needs specifying?
 One thing I remember from an API design talk I viewed:
 An ability to record to a file means that the file format is
 part of your API.

 For instance, for audio recording, it's likely that you want
 control over whether the resulting file is in Ogg Vorbis
 format or in MP3 format; for video, it's likely that you may
 want to specify that it will be stored using the VP8 video
 codec, the Vorbis audio codec and the Matroska container
 format. These desires have to be communicated to the
 underlying audio/video engine,  so that the proper transforms
 can be inserted into the processing stream, and I think they
 have to be communicated across this interface; since the
 output of these operations is a blob without any inherent
 type information, the caller has to already know which format
 the media is in.
This is absolutely correct, and it is not only about codecs or
container formats. Maybe you need to supply info like audio
sampling rate, video frame rate, video resolution, ...
There was an input on this already last November:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-November/029069.html
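
Just to make it concrete, a purely hypothetical way of passing that
information (no such options argument exists in any draft; it only
illustrates what would have to cross the interface):

    // Hypothetical sketch only - not the draft API.
    var recorder = stream.record({
      container: "matroska",
      videoCodec: "vp8",
      audioCodec: "vorbis",
      audioSampleRate: 44100,   // Hz
      videoFrameRate: 30,       // frames per second
      videoWidth: 640,
      videoHeight: 480
    });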

 Clearer?



 



 --

 Message: 2
 Date: Tue, 29 Mar 2011 15:27:58 +0200
 From: Wilhelm Joys Andersen wilhel...@opera.com
 To: whatwg@lists.whatwg.org
 Subject: [whatwg] details, summary and styling
 Message-ID: op.vs3w0wvrm3w...@kunnskapet.oslo.osa
 Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes

 Hi,

 I'm currently writing tests in preparation for Opera's implementation
 of details and summary. In relation to this, I have a few questions
 about issues that, as far as I can tell, are currently undefined in the
 specification.

 The spec says:

If there is no child summary element [of the details element], the
user agent should provide its own legend (e.g. Details). [1]

 How exactly should this legend be provided? Should the user agent add
 an implied summary element to the DOM, similar to tbody, a
 pseudo-element, or a magic non-element behaving differently from both
 of the above? In the current WebKit implementation[2], the UA-provided
 legend behaves inconsistently with an author-provided summary
 in the following ways:

   * Although it can be styled with rules applying to summary, it does
 not respond to :hover or :first-child.

   * With regards to text selection, it behaves more like an input
 type='submit' than a user-provided summary. Text within this
 implied element may only be selected _together_ with the text
 preceding and following it.

   * A different mouse cursor is used.

 This indicates that it is slightly more magic than I would prefer. I
 believe a closer resemblance to an ordinary element would be more
 convenient for authors - a ::summary pseudo element with Details as
 its content() might be the cleanest approach, although that would
 require a few more bytes in the author's stylesheet to cater to both
 author- and UA-defined summaries:

summary, ::summary {
  color: green;
}

 Furthermore, the rendering spec says:

The first container is expected to contain at least one line box,
and that line box is expected to contain a disclosure widget (typically
a triangle), horizontally positioned within the left padding of the
details element. [3]

 For user agents aiming to support the suggested default rendering, how
 should the disclosure widget be embedded? Ideally, graphical browsers
 should all do this in a similar manner, and in a way that allows authors
 to style these elements to the same extent as any other element.

 There are several options:

   * A ::marker pseudo element[4].
   * A default, non-repeating background image positioned within
 the recommended 40 pixel left padding.
   * A method similar to 

[whatwg] Media negotiation (RE: Peer-to-peer communication, video conferencing, and related topics (2))

2011-03-29 Thread Stefan Håkansson LK
!The web application must be able to     !If the video is going to be displayed !
!define the media format to be used for  !in a large window, use higher bit-    !
!the streams sent to a peer.             !rate/resolution. Should media settings!
!                                        !be allowed to be changed during a     !
!                                        !session (at e.g. window resize)?      !
   
   Shouldn't this be automatic and renegotiated dynamically via SDP 
   offer/answer?
 
  Yes, this should be (re)negotiated via SDP, but what is unclear is how 
  the SDP is populated based on the application's preferences.
 
 Why would the Web application have any say on this? Surely the user agent 
 is in a better position to know what to negotiate, since it will be doing 
 the encoding and decoding itself.
The best format of the coded media being streamed from UA a to UA b depends on 
a lot of factors. An obvious one is that the codec used is supported by both 
UAs. As you say, much of it can be handled without any involvement from the 
application.

But let's say that the app in UA a does addStream. The application in UA b 
(the same application as in UA a) has two video elements, one using a large 
display size, one using a small size. The UAs don't know in which element the 
stream will be rendered at this stage (that will only be known when the app in 
UA b connects the stream to one of the elements at onaddstream), so I don't 
understand how the UAs can select a suitable video resolution without the 
application giving some input. (Once the stream is being rendered in an element 
the situation is different - then UA b has knowledge about the rendering and 
could somehow inform UA a.)

Stefan

[whatwg] Peer-to-peer and stream APIs

2011-03-25 Thread Stefan Håkansson LK
All,

there are now two different sets of APIs public, one documented in the spec 
(http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html#video-conferencing-and-peer-to-peer-communication)
 and one sent the other day 
(https://sites.google.com/a/alvestrand.com/rtc-web/w3c-activity/api-proposals).

A quick look at the API sets gives me the impression that they are, on a top 
level, quite similar. The model and the level of the two API sets seem to be 
more or less the same. The first set seems to me clearer, more thought through 
and better documented. The second one also lacks the possibility to send text 
peer-to-peer, something that can be very important for certain use cases (e.g. 
gaming).

I could go on discussing details, but my main message is: given that the two 
API sets are, on a top level, quite similar, would we not be better off 
selecting one of them and using it as a basis for further discussion, testing 
and refinement? 

Working on two parallel tracks could waste implementation efforts, lead to 
non-converging parallel discussions and possibly end up in a fragmented 
situation.

My view is that a good way forward would be to use the API set in the spec as 
starting point, and propose enhancements/additions to it. 

Stefan




Re: [whatwg] Peer-to-peer use case (was Peer-to-peer communication, video conferencing, device, and related topics)

2011-03-22 Thread Stefan Håkansson LK
Some feedback below. (Stuff where I agree and there is no question has been 
left out.)

 
 On Mon, 31 Jan 2011, Stefan Håkansson LK wrote this use case:
 

We've since produced an updated use case doc: 
http://www.ietf.org/id/draft-holmberg-rtcweb-ucreqs-01.txt

...

  The web author developing the application has decided to display a 
  self-view as well as the video from the remote side in rather small 
  windows, but the user can change the display size during the session. 
  The application also allows a participant to (for a longer or shorter 
  time) stop sending audio (but keep video) or video (keep audio) to the 
  other peer (mute).
...
 
 All of this except selectively muting audio vs video is currently 
 possible in the proposed API.
 
 The simplest way to make selective muting possible too would be to change 
 how the pause/resume thing works in GeneratedStream, so that instead of 
 pause() and resume(), we have individual controls for audio and video. 
 Something like:
 
void muteAudio();
void resumeAudio();
readonly attribute boolean audioMuted;
void muteVideo();
    void resumeVideo();
readonly attribute boolean videoMuted;
 
 Alternatively, we could just have mutable attributes:
 
attribute boolean audioEnabled;
attribute boolean videoEnabled;
 
 Any opinions on this?
We're looking into this and will produce a more elaborate input related to this.
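
As a quick illustration, the attribute variant keeps the application side 
minimal (sketch; the button ids and the localStream variable are made up, and 
it assumes the attributes live on the local GeneratedStream):

    // Sketch: mute audio but keep video, and vice versa, from two buttons.
    document.querySelector("#muteAudio").onclick = function () {
      localStream.audioEnabled = !localStream.audioEnabled;   // video keeps flowing
    };
    document.querySelector("#muteVideo").onclick = function () {
      localStream.videoEnabled = !localStream.videoEnabled;   // audio keeps flowing
    };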

...

  !The web application must be able to     !If the video is going to be displayed !
  !define the media format to be used for  !in a large window, use higher bit-    !
  !the streams sent to a peer.             !rate/resolution. Should media settings!
  !                                        !be allowed to be changed during a     !
  !                                        !session (at e.g. window resize)?      !
 
 Shouldn't this be automatic and renegotiated dynamically via SDP 
 offer/answer?
Yes, this should be (re)negotiated via SDP, but what is unclear is how the SDP 
is populated based on the application's preferences.

...

  !Streams being transmitted must be       !Do not starve other traffic (e.g. on  !
  !subject to rate control                 !ADSL link)                            !
 
 Not sure whether this requires anything special. Could you elaborate?
What I am after is that the RTP/UDP streams sent from one UA to the other must 
have some rate adaptation implemented. HTTP uses TCP transport, and TCP reduces 
the send rate when a packet does not arrive (so that flows share the available 
throughput in a fair way when there is a bottleneck). For UDP there is no such 
mechanism, so unless something is added in the RTP implementation it could 
starve other traffic. I don't think it should be visible in the API though; it 
is a requirement on the implementation in the UA.

...

 
  !Synchronization between audio and video !                                      !
  !must be supported                       !                                      !
 
 If there's one stream, that's automatic, no?
One audiovisual stream is actually transmitted as two RTP streams (one audio, 
one video). And synchronization at playout is not automatic; it is something 
you do based on RTP timestamps and RTCP. But again, this is a requirement on 
the implementation in the UA, not on the API.

...

  !The web application must be made aware  !To be able to inform user and take    !
  !of when streams from a peer are no      !action (one of the peers still has    !
  !longer received                         !connection with the server)           !
  ---------------------------------------------------------------------------------
  !The browser must detect when no streams !                                       !
  !are received from a peer                !                                       !
 
 These aren't really yet supported in the API, but I intend for us to add 
 this kind of thing at the same time as we add similar metrics to video 
 and audio. To do this, though, it would really help to have a better 
 idea what the requirements are. What information should be available? 
 Packets received per second (and sent, maybe) seems like an obvious 
 one, but what other information can we collect?
I think more studies are required to answer this one.

//Stefan

Re: [whatwg] ConnectionPeer experiences

2011-01-31 Thread Stefan Håkansson LK
(Patrik has a couple of well deserved days off):

 - In your experimentation, did you find any reasonable underlying 
 protocol to map sendFile, sendBitmap and their corresponding callbacks 
 to, or did you just ignore them for now?
We just ignored them for the time being.
 - In connecting, did you operate with connections going via a server, or 
 did you go between browsers? If so, how did you identify the server vs 
 identifying the remote participant? Was the remoteConfiguration method 
 flexible enough?
The connection that is set up using ConnectionPeer is browser-to-browser in our 
case (there is no need to use TURN in our setup). So what we do supply as 
remoteConfiguration (note our proposal to change addRemoteConfiguration to 
setRemoteConfiguration) is basically the candidates obtained in contact with 
the STUN server.

add/setRemoteConfiguration seems flexible enough (we have only experimented 
with a simple case though). Of course the data supplied (acquired from 
getLocalConfig at the remote side) must be able to express all required 
possibilities.
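
A heavily simplified sketch of that exchange (method names are approximate and 
follow the proposals quoted below; signalChannel stands for the application's 
own channel to the chat server, e.g. a WebSocket; serverConfig is the STUN 
server description string):

    // Sketch: exchange configuration blobs over the app's signalling channel,
    // then hand the peer's blob to setRemoteConfiguration (proposed rename of
    // addRemoteConfiguration).
    var peer = new ConnectionPeer(serverConfig);
    peer.getLocalConfiguration(function (configuration) {   // parameter order per proposal 4 below
      signalChannel.send(configuration);                    // our candidates, to the peer
    });
    signalChannel.onmessage = function (event) {
      peer.setRemoteConfiguration(event.data);              // the peer's candidates
    };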

//Stefan


 
 Message: 4
 Date: Fri, 28 Jan 2011 16:14:52 -0800
 From: Harald Alvestrand har...@alvestrand.no
 To: whatwg@lists.whatwg.org
 Subject: Re: [whatwg] ConnectionPeer experiences
 Message-ID: 4d435bfc.7040...@alvestrand.no
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 
 Thank you Patrik, I enjoyed reading that!
 
 Questions:
 - In your experimentation, did you find any reasonable underlying 
 protocol to map sendFile, sendBitmap and their corresponding callbacks 
 to, or did you just ignore them for now?
 - In connecting, did you operate with connections going via a server, or 
 did you go between browsers? If so, how did you identify the server vs 
 identifying the remote participant? Was the remoteConfiguration method 
 flexible enough?
 
 I'm currently staring at the problem of defining a set of semantics for 
 mapping this to RTC-Web protocols, and am having some problems 
 interpreting what's currently in the spec, so would like your 
 interpretation.
 
  Harald
 
 On 01/26/11 01:04, Patrik Persson J wrote:
  We have done some experimentation with the ConnectionPeer API. We have
  an initial implementation of a subset of the API, using ICE (RFC 5245)
  for the peer-to-peer handshaking.  Our implementation is
  WebKit/GTK+/gstreamer-based, and we of course intend to submit it to
  WebKit, but the implementation is not quite ready for that yet.
 
  More information about our work so far can be found here:
  https://labs.ericsson.com/developer-community/blog/beyond-html5-peer-peer-conversational-video
 
  However, we have bumped into some details that we'd like to discuss
  here right away.  The following is our mix of proposals and questions.
 
  1. We propose adding a readyState attribute, to decouple the
  onconnect() callback from any observers (such as the UI).
 
 const unsigned short CONNECTING = 0;
 const unsigned short CONNECTED = 1;
 const unsigned short CLOSED = 2;
 readonly attribute unsigned short readyState;
 
  2. We propose replacing the onstream event with custom events of type
  RemoteStreamEvent, to distinguish between adding and removing
  streams.
 
 attribute Function onstreamadded;   // RemoteStreamEvent
 attribute Function onstreamremoved; // RemoteStreamEvent
 ...
 interface RemoteStreamEvent : Event {
readonly attribute Stream stream;
 };
 
  The 'stream' attribute indicates which stream was added/removed.
 
  3. We propose renaming addRemoteConfiguration to
  setRemoteConfiguration.  Our understanding of the ConnectionPeer is
  that it provides a single-point-to-single-point connection; hence,
  only one remote peer configuration is to be set, rather than many
  to be added.
 
     void setRemoteConfiguration(in DOMString configuration, in optional DOMString remoteOrigin);
 
  4. We propose swapping the ConnectionPeerConfigurationCallback
  callback parameters. The current example seems to use only one (the
  second one).  Swapping them allows clients that care about 'server'
  to do so, and clients that ignore it (such as the current example)
  to do so too.
 
     [Callback=FunctionOnly, NoInterfaceObject]
     interface ConnectionPeerConfigurationCallback {
        void handleEvent(in DOMString configuration, in ConnectionPeer server);
     };
 
  5. Should a size limit to text messages be specified? Text messages
  with UDP-like behavior (unimportant=true) can't really be reliably
  split into several UDP packets.  For such long chunks of data, file
  transfer seems like a better option anyway.
 
  In summary, then, our proposal for a revised ConnectionPeer looks as follows:
 
  [Constructor(in DOMString serverConfiguration)]
  interface ConnectionPeer 

[whatwg] Simple video chat use case

2011-01-31 Thread Stefan Håkansson LK
Just to start the discussion, we put together the simplest use case you can 
imagine (with an additional error case). It has been sent to the RTC-Web 
community before, but as there is some discussion on ConnectionPeer ongoing we 
thought the use case could be of interest in this community as well. 

Some requirements are derived from the use case. The intention is to add more 
use cases, but before spending any more energy I thought it would be good to 
display it to start the discussion and to get feedback on whether this is a 
workable model or not.
 
The requirements are split in three categories: 
1. User/privacy: the intention is to list what an end user can expect in
   terms of what a web application can do without user consent, and so on
2. API dimension: lists what must be accessible by JS and signalled to JS
   via APIs
3. Functional dimension: lists functionality that must be supported by the
   browser (in combination with the underlying operating system, drivers 
   and HW) even though it is not visible from the API. 
 
I can't say that I am extremely satisfied with this division. Perhaps we could 
come up with something better. Also, some of the requirements in the 
User/privacy and Functional dimensions are not derived from this use case per 
se (e.g. ask for user consent), but are instead generic requirements.
Perhaps they should be handled separately?

Connection is not mentioned as it is not a requirement (the requirement is 
rather to be able to send and receive streams; a connection would presumably be 
used as a foundation for the stream transport).
 
OK, to the use case:


1. Simple Video Chat

A simple video chat service has been developed. In the service the users are 
logged on to the same chat web server. The web server publishes information 
about user login status, pushing updates to the web apps in the browsers. By 
clicking on an online peer user name, a 1-1 video chat session between the two 
browsers is initiated. The invited peer is presented with a choice of joining 
or rejecting the session.
 
The web author developing the application has decided to display a self-view as 
well as the video from the remote side in rather small windows, but the user 
can change the display size during the session. The application also allows a 
participant to (for a longer or shorter time) stop sending audio (but keep 
video) or video (keep audio) to the other peer (mute).
 
Either of the two participants can at any time end the chat by clicking a button.
 
In this specific case two users are using laptops in their respective homes. 
They are connected to the public Internet with a desktop browser using WiFi 
behind NATs. One of the users has an ADSL connection to the home, and the other 
has fiber access. Most of the time headsets are used, but not always.

1.1 Requirements


1.1.1 User/privacy dimension


!Requirement. The user must:             !Comment   !

!give explicit consent before a device   !          !
!can be used to capture audio or video   !          !

!be able to in an intuitive way revoke   !          !
!and change capturing permissions        !          !

!be able to easily understand that audio !          !
!or video is being captured              !          !

!be informed when an invitation to a     !          !
!peer video chat session has been        !          !
!received                                !          !

!be able to accept or reject an          !          !
!invitation to a peer video chat session !          !

!be able to stop a media stream from     !          !
!being transmitted                       !          !

 
 
1.1.2 API dimension
---

!Requirement.   !Comment   !