Re: [whatwg] Fetch, MSE, and MIX

2015-02-20 Thread Aaron Colwell
Hi Ryan,

Thanks for writing this up. I know you already know this, but I wanted to
publicly declare my support as one of the MSE editors. While I wish we
didn't need this, I can understand the concerns of content providers and I
think this is a reasonable compromise.

Aaron

On Thu Feb 19 2015 at 9:06:17 PM Ryan Sleevi sle...@google.com wrote:

 Cross-posting, as this touches on the Fetch [1] spec, Media Source
 Extensions [2], and Mixed Content [3]. This does cross-post WHATWG and
 W3C, apologies if this is a mortal sin.

 TL;DR Proposal first:
 - Amend MIX in [4] to add fetch as an optionally-blockable-request-
 context
   * This means that fetch() can now return HTTP content from HTTPS
 pages. The implications of this, however, are described below, if you
 can handle reading it all.
 - Amend MSE in [5] to introduce a new method, appendResponse(Response
 response), which accepts a Response [6] class
 - In MSE, define a Response Append Loop similar to the Stream Append
 Loop [7], that calls the consume body algorithm [8] on the internal
 response [9] of Response to yield an ArrayBuffer, then executes the
 buffer append [10] algorithm on the SourceBuffer

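 A rough sketch of how the proposed flow could look from script follows.
 appendResponse() is only the method proposed above, not a shipped API, and
 the URL is a placeholder:

  // Hypothetical usage of the proposed appendResponse() method (not a
  // shipped API). The HTTPS page requests an HTTP media segment; the
  // request mode falls to "no-cors", so the Response is opaque and script
  // never sees the bytes -- it only hands the Response to the SourceBuffer.
  var video = document.querySelector('video');
  var mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);

  mediaSource.addEventListener('sourceopen', function() {
    var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vp9,opus"');
    fetch('http://media.example.com/segment0.webm', { mode: 'no-cors' })
      .then(function(response) {
        // Proposed: the UA runs the "Response Append Loop" internally.
        sourceBuffer.appendResponse(response);
      });
  });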

 MUCH longer justification why:
 As it stands, <audio>/<video>/<source> tags today are optionally
 blockable content, as noted in [4]. Thus, an HTTPS page may set the
 source to HTTP content and load the content (although typically with
 user-agent indication). MSE poses itself as a spec to offer much
 greater control to site authors than <audio>/<video>, as noted in its
 use cases, and as a result, has seen a rapid adoption among a number
 of popular video streaming sites. Most notably, the ability to do
 adaptive streaming with MSE helps provide a better quality, better
 performing experience for users. Finally, in some user agents, MSE is
 a pre-requisite for the use of Encrypted Media Extensions [11].

 However, there are limitations to using MSE that don't exist with
 <video>/<audio>. The most notable of these is that in order to
 implement the adaptive streaming capabilities, most sites make use of
 XMLHttpRequest to request portions of media content, which can then be
 supplied to the SourceBuffer. Based on the feedback that MSE provides,
 the script author can then adjust the XHRs it makes to use a
 lower bitrate media source, to drop segments, etc. When using XHR, the
 site author loses the ability to mix HTTPS pages with HTTP media, as
 XHR is (rightfully so) treated as blocked content.
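 For reference, the XHR-based pattern described above looks roughly like
 this (a sketch; the segment URL and any bitrate-selection logic are the
 application's own):

  // Common XHR + MSE pattern: script sees the raw segment bytes, which is
  // why an HTTPS page cannot point these requests at HTTP media servers.
  function fetchSegment(url, sourceBuffer) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url);
    xhr.responseType = 'arraybuffer';
    xhr.onload = function() {
      sourceBuffer.appendBuffer(new Uint8Array(xhr.response));
    };
    xhr.send();
  }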

 The justification for why XHR does this is that it returns the full
 buffer to the page author. In practice, we saw many sites then taking
 that buffer and making security decisions on it - ranging from
 clearly bad things such as eval()ing the content to more subtle
 things like adjusting UI or links. All of these undermine all of the
 security guarantees that HTTPS tries to provide, and thus XHR is
 blocked.

 The result is that if an HTTPS site wants to use MSE with XHR, all of
 the content needs to be served via HTTPS. We've already seen some
 providers complain that this is prohibitively expensive in their
 current networks [12], although it may be solvable in time, as
 demonstrated by other video sharing sites [13].

 In a choice between using MSE - which offers a better user experience
 over <video>/<audio> by reducing bandwidth and improving quality - and
 using HTTPS - which offers better privacy and security controls -
 sites are likely to choose solutions that reduce their costs rather
 than protect their users, a reasonable but unfortunate business
 reality.

 I'm hoping to find a way to close that gap - to allow sites to use MSE
 (and potentially EME) via HTTPS documents, while still sourcing their
 media content via HTTP. This may seem counter-intuitive, and a step
 back from the efforts of the Chrome security team, but I think it is
 actually consistent with our goals and our past comments. In
 particular, this solution tries to provide a means and incentive for
 sites to adopt MSE (improving user experience) AND to begin migrating
 to HTTPS; first with their main document, and then, in time, all of
 their media content.

 This won't prevent adversaries from learning what content the user is
 actively watching, for example, but it will help protect other vital
 assets - such as their cookies, session identifiers, user information,
 friends list, past viewing history, etc.

 Allowing fetch() to return HTTP content sourced from HTTPS pages seems
 like it would re-open the XHR hole, but this isn't the case. As
 described in [14], all requests whose mode is CORS or
 CORS-with-forced-preflight are force-failed. This only leaves the
 request modes of no-cors, same-origin, about, and data. Because
 the origins are different between the document (https) and the request
 URL (http), the request mode will be no-cors, and thus the returned
 Response object will be set to opaque.

 The opaque response prevents direct access to the Response data.
 Similarly, the 

[whatwg] HTML5 video seeking

2011-11-14 Thread Aaron Colwell
Hi,

I was looking at the seeking algorithm
(http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#seeking)
and had a question about step 10.

10. Wait until the user agent has established whether or not the media
data for the new playback position is available, and, if it is, until it
has decoded enough data to play back that position.


Does this mean the user agent must resume playback at the exact location
specified?
What if the nearest keyframe is several seconds away?
Is the UA expected to decode and toss the frames instead of starting
playback at the nearest keyframe?

On desktop machines I don't think this would be a problem, but on mobile
devices it might be since the hardware may not be able to decode
significantly faster than realtime. What is the intended behavior for such
constrained devices?

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-08-12 Thread Aaron Colwell
Hi Mark,

comments inline...

On Thu, Aug 11, 2011 at 9:46 AM, Mark Watson wats...@netflix.com wrote:

 I think it would be good if the API recognized the fact that the media data
  may be coming from several different original files/streams (e.g. different
 bitrates) as the player adapts to network or other conditions.


I agree. I intend to document this when I spec out the format of the byte
stream that is passed into this API. Initially I'm focusing on WebM which
requires this type of functionality if the Vorbis initialization data ever
needs to change during playback. My intuition says that Ogg & MP4 will
require similar solutions.



 The different files may have different initialization information (Info and
 Tracks in WebM, Movie Box in mp4 etc.), which could be provided either in
 the first append call for each stream or with a separate API call. But
 subsequently you need to know which initialization information is relevant
 for each appended block. An integer streamId in the append call would be
 sufficient - the absolute value has no meaning - it would just associate
 data from the same stream across calls.


Since I'm using WebM for the byte stream I don't need to add explicit
streamIds to the API or data. StreamIDs are already in the byte stream. Ogg
bitstream serial numbers and MP4 track numbers should serve the same
purpose.



 The alternatives are:
 (a) to require that all streams have the same or compatible initialization
 information or
 (b) to pass the initialization information every time you change streams

 (a) has the disadvantage of constraining encoding, and making adding new
 streams more dependent on the details of how the existing streams were
 encoded/packaged
 (b) is ok, except that it is nice for the player to know this data is from
 the same stream you were playing a while ago - it can re-use some
 previously established state - rather than every stream change being 'out of
 the blue'.


I'm leaning toward (b) right now. Any time a change in stream parameters is
needed, new INFO & TRACKS elements will be appended before the media data
from the new source. This is similar to how Ogg chaining works. I don't
think we need unique IDs for marking this state. The media engine can look
at the new codec config data and see if it matches anything it has seen
before. If so, then it can simply reuse whatever resources it sees fit.
Another thing to note is that just because we append this data every time a
stream switch occurs, it doesn't mean we have to transfer that data across
the network each time. JavaScript can cache this data and simply append it
when necessary.
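
A small sketch of that caching idea, using the proposed append call (the
helper names, the bitrate-keyed cache, and the exact spelling of append()
are illustrative, not part of the draft spec):

  // Cache each stream's INFO & TRACKS data once, then re-append it before
  // the media data whenever a stream switch happens -- no re-download needed.
  var initSegments = {};  // bitrate -> Uint8Array holding INFO & TRACKS

  function switchToBitrate(bitrate, firstCluster) {
    mediaSource.append(initSegments[bitrate]);  // cached config for new stream
    mediaSource.append(firstCluster);           // media data from the new source
  }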



 A separate comment is that practically we have found it very useful for the
 media player to know the maximum resolution, frame rate and codec
 level/profile that will be used, which may be different from the resolution
 and codec/level/profile of the first stream.


I agree that this info is useful, but it isn't clear to me that this API
needs to support that. Existing APIs like canPlayType()
(http://www.w3.org/TR/html5/video.html#dom-navigator-canplaytype)
could be used to determine whether specific codec parameters are supported.
Other DOM APIs could be used to determine max screen size. This could all be
used to prune the candidate streams sent to the MediaSource API.
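
For example, something along these lines (a sketch; the streams list and its
fields are illustrative):

  // Prune candidate streams with existing APIs instead of a new one.
  var probe = document.createElement('video');
  var candidates = streams.filter(function(s) {
    return probe.canPlayType(s.mimeType) !== '' &&   // codec/profile supported?
           s.width <= screen.width &&                // fits the display?
           s.height <= screen.height;
  });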


Aaron


Re: [whatwg] File API Streaming Blobs

2011-08-11 Thread Aaron Colwell
Comments inline...

On Wed, Aug 10, 2011 at 2:05 PM, Charles Pritchard ch...@jumis.com wrote:

  On 8/9/2011 9:38 AM, Aaron Colwell wrote:

 FYI I'm working on an experimental extension to Chromium to allow media
 data to be streamed into a media element via JavaScript. Here is the draft
 spec (http://html5-mediasource-api.googlecode.com/svn/tags/0.2/draft-spec/mediasource-draft-spec.html)
 and the pending WebKit patch (https://bugs.webkit.org/show_bug.cgi?id=64731) related
 to this work. I have simple WebM VOD playback w/ seeking working where all
 media data is fetched via XHR.


 It's nice to see this patch.


Thanks. Hopefully I can get it landed soon so people can start playing with
it in Chrome Dev Channel builds.


 I'm hoping to see streamed array buffers in XHR, though fetching in chunks
 can work,
 given the relatively small overhead of HTTP headers vs Video content.


Eventually I'd like to see streamed array buffers in XHR. For now I'm just
using range requests and allowing the JavaScript code to determine how large
the ranges should be to control overhead.
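
Roughly like this (a sketch; the URL and byte offsets are placeholders):

  // Fetch one byte range of the media file; the script decides how large
  // each range should be.
  function fetchRange(url, begin, end, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url);
    xhr.responseType = 'arraybuffer';
    xhr.setRequestHeader('Range', 'bytes=' + begin + '-' + end);
    xhr.onload = function() { callback(new Uint8Array(xhr.response)); };
    xhr.send();
  }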


 The WHATWG specs have a Media Stream example which uses URL.createObjectURL:
 navigator.getUserMedia('video user', gotStream, noStream);
 function gotStream(stream) {
 video.src = URL.createObjectURL(stream);

 http://www.whatwg.org/specs/web-apps/current-work/complete/video-conferencing-and-peer-to-peer-communication.html#dom-mediastream

 The WHATWG spec seems closer to (mediaElement.createStream()).append()
 semantics.


There was a previous discussion about this on WHATWG. There was concern
about providing compressed data to a MediaStream object since they are
basically format agnostic right now.


 Both WHATWG and the draft spec agree on src=uri;


The benefit of src=uri is that it allows you to leverage all the existing
state transition and behavior defined in the spec.


 File API has toURL semantics on objects, similar to the draft spec, for
 getting filesystem:// uris.

 My understanding: The draft spec is simpler, intended only to be used by
 HTMLMediaElement
 and only by one element at a time, without introducing a new object. In the
 long
 run, it may make sense to create a media stream object, consistent with the
 WHATWG direction.


The draft spec was intended to be as simple as possible. Attaching this
functionality to HTMLMediaElement instead of creating a MediaStream came out
of discussions on whatwg here
(http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-July/032283.html)
and here
(http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-July/032384.html).
I'm definitely open to revisiting this, but I got the feeling that people
wanted to see a more concrete implementation first. I also like having this
functionality be part of HTMLMediaElement because then I only have to deal
with the HTMLMediaElement during seeking instead of having to coordinate
behavior between the MediaStream & the HTMLMediaElement.



 On another note, Mozilla Labs has some experiments on recording video from
 canvas (as well as general webcam, etc):
 https://mozillalabs.com/rainbow/
 https://github.com/mozilla/rainbow
 https://github.com/mozilla/rainbow/blob/master/content/example_canvas.html


I'll take a look at this.

Aaron


Re: [whatwg] File API Streaming Blobs

2011-08-09 Thread Aaron Colwell
FYI I'm working on an experimental extension to Chromium to allow media data
to be streamed into a media element via JavaScript. Here is the draft
spec (http://html5-mediasource-api.googlecode.com/svn/tags/0.2/draft-spec/mediasource-draft-spec.html)
and the pending WebKit patch (https://bugs.webkit.org/show_bug.cgi?id=64731) related
to this work. I have simple WebM VOD playback w/ seeking working where all
media data is fetched via XHR.

Aaron

On Mon, Aug 8, 2011 at 7:16 PM, Charles Pritchard ch...@jumis.com wrote:

 On 8/8/2011 2:51 PM, Glenn Maynard wrote:

  On Mon, Aug 8, 2011 at 4:31 PM, Simon Heckmann si...@simonheckmann.de wrote:

Well, not directly an answer to your question, but the use case I
had in mind is the following:

A large encrypted video (e.g. HD movie with 2GB) file is stored
using the File API, I then want to decrypt this file and start
playing with only a minor delay. I do not want to decrypt the
entire file before it can be viewed. As long as such a use case
gets covered I am fine with everything.


 Assuming you're thinking of DRM, are there any related use cases other
 than crypto?  Encryption for DRM, at least, isn't a very compelling use
 case; client-side Javascript encryption is a very weak level of protection
 (putting aside, for now, the question of whether the web can or should be
 attempting to handle DRM in the first place).  If it's not DRM you're
 thinking of, can you clarify?


  Jonas Sicking brought up a few cases for XHR-based streaming of
 arraybuffers: progressive rendering of word docs and PDFs.
 WebP and WebM have had interesting packaging hacks. Packaging itself,
 whether DRM or not, is compelling.
 PDF supports embedded data, a wide range of formats. GPAC provides many
 related tools (MP4 based, I believe):
  http://gpac.wp.institut-telecom.fr/

  The <audio> and <video> tags drop frames
 It seems to me that if a listener is not registered to the stream, data
 would just be dropped.

 As an alternative, the author could register a fixed length circular
 buffer.

  For instance, I could create a 1 megabyte arrayview, run
  URL.createBlobStream(ArrayView)
 and use .append(data). That kind of structure may support multicast
 (multiple audio/video elements)
 and improved XHR2 semantics. The circular buffer, itself, is easy to
 prototype: subarray
 works well with typed arrays.

 Otherwise relevant, is the work on raw audio data
 that Firefox and Chromium have released as experimental extensions.
 It does work on a buffer-based system.

 -Charles


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-14 Thread Aaron Colwell
On Wed, Jul 13, 2011 at 8:00 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Thu, Jul 14, 2011 at 4:35 AM, Aaron Colwell acolw...@google.com wrote:

 I am open to suggestions. My intent was that the browser would not attempt
 to cache any data passed into append(). It would just demux the buffers that
 are sent in. When a seek is requested, it flushes whatever it has and waits
 for more data from append().  If the web application wants to do caching it
 can use the WebStorage or File APIs. If the browser's media engine needs a
 certain amount of preroll data before it starts playback it can signal
  this explicitly through new attributes or just use HAVE_FUTURE_DATA
  & HAVE_ENOUGH_DATA readyStates to signal when it has enough.


 OK, I sorta get the idea. I think you're defining a new interface to the
 media processing pipeline that integrates with the demuxer and codecs at a
 different level to regular media resource loading. (For example, all the
 browser's built-in logic for seeking and buffering would have to be disabled
 and/or bypassed.)


Yes.


 As such, it would have to be carefully specified, potentially in a
 container- or codec-dependent way, unlike APIs like Blobs which work just
 like regular media resource loading and can thus work with any
 container/codec.


My hope is that the data passed to append will basically look like the live
streaming form of containers like Ogg & WebM so this isn't totally foreign
to the existing browser code. We'd probably have to spec the level of
support for Ogg chaining and multiple WebM segments but I don't think that
should be too bad. Seeking is where the trickiness happens and I was just
planning on making it look like a new live stream whose starting timestamp
indicates the actual point seeked to.

I was tempted to create an API that just passed in compressed video/audio
frames and made JavaScript do all of the demuxing, but I thought people
might find that too radical.



 I'm not sure what the best way to do this is, to be honest. It comes down
 to the use-cases. If you want to experiment with different seeking
 strategies, can't you just do that in Chrome itself? If you want scriptable
 adaptive streaming (or even if you don't) then I think we want APIs for
 seamless transitioning along a sequence of media resources, or between
 resources loaded in parallel.


I think the best course of action is for me to get my prototype in a state
where others can play with it and I can demonstrate some of the uses that
I'm trying to enable. I think that will make this a little more concrete.
 I'll keep this list posted on my progress.

Thanks for your help,
Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-13 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 5:05 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 12:00 PM, Aaron Colwell acolw...@google.com wrote:

 On Tue, Jul 12, 2011 at 4:44 PM, Robert O'Callahan 
  rob...@ocallahan.org wrote:

 I had imagined that this API would let the author feed in the same data
 as you would load from some URI. But that can't be what's happening, since
 in some element implementations (e.g., Gecko's) loaded data is buffered
 internally and seeking might not require any new data to be loaded.


  No. The idea is to allow JavaScript to manage fetching the media data so
 various fetching strategies could be implemented without needing to change
 the browser. My initial motivation is for supporting adaptive streaming with
 this mechanism, but I think various media mashup and delivery scenarios
 could be explored with this.


 I don't think you can do that with this API without making huge assumptions
 about what the browser's demuxer, internal caching, etc are doing.


I am open to suggestions. My intent was that the browser would not attempt
to cache any data passed into append(). It would just demux the buffers that
are sent in. When a seek is requested, it flushes whatever it has and waits
for more data from append().  If the web application wants to do caching it
can use the WebStorage or File APIs. If the browser's media engine needs a
certain amount of preroll data before it starts playback it can signal
this explicitly through new attributes or just use HAVE_FUTURE_DATA
& HAVE_ENOUGH_DATA readyStates to signal when it has enough.
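
A sketch of what that signalling could look like from the script's side;
append() here stands in for the proposed call, and getNextCluster() is
placeholder application code:

  // Append more data whenever the element reports it still needs preroll.
  function maybeAppendMore() {
    if (video.readyState < HTMLMediaElement.HAVE_ENOUGH_DATA) {
      mediaSource.append(getNextCluster());
    }
  }

  video.addEventListener('canplay', function() {
    // readyState has reached HAVE_FUTURE_DATA; playback can begin.
  });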

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Mon, Jul 11, 2011 at 5:54 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 It seems to me that the spec is written assuming only one media element is
 consuming the MediaSource. But nothing stops multiple elements consuming the
 same URL simultaneously. Maybe instead of going through a URL you should add
 API directly to media elements.


You are right that I don't have anything preventing the MediaSource URL from
being passed to multiple media elements. Only one media element will accept
the URL though because whichever one opens the URL first will transition the
source to the OPEN state. Media elements can only open sources in the CLOSED
state. I'm using a URL for initialization to be consistent with how the
media element is initialized in all other cases. I didn't want to create a
new initialization path.

I thought about adding an attribute to HTMLMediaElement that provided a URL
for signalling MediaSource usage. That mechanism would allow you to create a
URL that only works with that element. When this URL is specified, a
MediaSource attribute would be updated on the media element during loading
and JavaScript could use that to pass data to the tag. I couldn't find a
similar pattern in other APIs so I didn't take that path. If people think
that is a better route then I'm all for it.



 bytesAvailable is for flow control? Instead of doing it this way, I would
 follow WebSockets and use a bufferedAmount attribute to indicate how much
 data is currently buffered up. That makes it easy for authors who don't want
 to care about flow control to just append stuff without encountering errors,
 while still allowing authors who care about flow control to do it.


Yes. The intent was to provide a way for the browser to control how much
data was being pushed into it. It looks like WebSocket will just close the
connection if it doesn't have enough buffer space and the API doesn't appear
to provide a mechanism to predict how much buffered data will trigger a
close. Do we want similar semantics for media? It seems like the browser
should provide some hints to indicate that it is not ok to push hours/days
of data into this interface.
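
For comparison, flow control modelled on WebSocket's bufferedAmount might
look like this (entirely hypothetical; neither the attribute nor the
threshold is in the current proposal):

  // Hypothetical bufferedAmount-style flow control (not in the proposal).
  var SOFT_LIMIT = 4 * 1024 * 1024;  // arbitrary example threshold

  function throttledAppend(chunk) {
    if (mediaSource.bufferedAmount < SOFT_LIMIT) {
      mediaSource.append(chunk);
    } else {
      setTimeout(function() { throttledAppend(chunk); }, 100);  // back off
    }
  }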

Thanks for your comments.

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
Hi Harald,

Please point me to specific threads that talk about this. I looked through
the public-web...@w3.org archive and didn't see anything about interactive
media handling. I did look through the Mozilla/Cisco proposal thread
(http://lists.w3.org/Archives/Public/public-webrtc/2011Jul/0010.html)
and didn't see anything in my proposal that is incompatible with what is being
proposed there.

Aaron

On Tue, Jul 12, 2011 at 12:31 AM, Harald Alvestrand har...@alvestrand.no wrote:

 Not a comment directly on the spec, but you might want to check what people
 are suggesting for interactive media handling in the WEBRTC working group.

 Streaming is different from interactive media, but it would be a shame to
 have incompatibilities that can be avoided.


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 3:28 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 8:45 AM, Aaron Colwell acolw...@google.com wrote:

 I thought about adding an attribute to HTMLMediaElement that provided a
 URL for signalling MediaSource usage. That mechanism would allow you to
 create a URL that only works with that element. When this URL is specified,
 a MediaSource attribute would be updated on the media element during loading
 and JavaScript could use that to pass data to the tag. I couldn't find a
 similar pattern in other APIs so I didn't take that path. If people think
 that is a better route then I'm all for it.


 I was thinking more of putting the MediaSource functionality
 (open/append/close) on the media element itself.


I'm open to that. In fact that is how my current prototype is implemented
because it was the least painful way to test these ideas in WebKit. My
prototype only implements append() and uses existing media element events as
proxies for the events I've proposed. I only separated this out into a
separate object because I thought people might prefer an object to represent
the source of the media and leave the media element object an endpoint for
controlling media playback.



 Do you need to support seeking in with this API? That's hard. It would be
 simpler if we didn't have to support seeking. Instead of seeking you could
 just open a new stream and pour data in for the new offset.


 I'd like to be able to support seeking so you can use this mechanism for
on-demand playback. In my prototype seeking wasn't too difficult to
implement. I just triggered it off the seeking event. Any append() that
happens after the seeking event fires is associated with the new seek
location. currentTime is updated with the timestamp in the first cluster
passed to append() after the seeking event fires. Once the media engine has
this timestamp and enough preroll data, then it will fire the seeked event
like normal. I haven't tested this with rapid fire seeking yet, but I think
this mechanism should work.

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 4:17 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 11:14 AM, Aaron Colwell acolw...@google.com wrote:


 I'm open to that. In fact that is how my current prototype is implemented
 because it was the least painful way to test these ideas in WebKit. My
 prototype only implements append() and uses existing media element events as
 proxies for the events I've proposed. I only separated this out into a
 separate object because I thought people might prefer an object to represent
 the source of the media and leave the media element object an endpoint for
 controlling media playback.


 We're kinda stuck with media elements handling both playback endpoints and
 resource loading.


Ok.  This makes implementation in WebKit easier for me so I won't push too
hard to keep it separate from the media element. :)





 Do you need to support seeking in with this API? That's hard. It would be
 simpler if we didn't have to support seeking. Instead of seeking you could
 just open a new stream and pour data in for the new offset.


  I'd like to be able to support seeking so you can use this mechanism for
 on-demand playback. In my prototype seeking wasn't too difficult to
 implement. I just triggered it off the seeking event. Any append() that
 happens after the seeking event fires is associated with the new seek
 location. currentTime is updated with the timestamp in the first cluster
 passed to append() after the seeking event fires. Once the media engine has
 this timestamp and enough preroll data, then it will fire the seeked event
 like normal. I haven't tested this with rapid fire seeking yet, but I think
 this mechanism should work.


 How do you communicate the data offset that the element wants to read at
 over to the script that provides the data? In general you can't know the
 strategy the decoder/demuxer uses for seeking, so you don't know what data
 it will request.


I'm doing WebM demuxing and media fetching in JavaScript. When a seek
occurs, I look at currentTime to see where we are seeking to. I then look at
the CUES index data I've fetched to find the file offset for the closest
seek point to the desired time. The appropriate data is fetched and pushed
into the element via append(). The seeked event firing and readyState
transitioning to HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA tells me when I've
sent the element enough data. During playback I just monitor the buffered
attribute to keep a specific duration ahead of the current playback time.
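
In code, that flow looks roughly like this (a sketch; cueLookup(),
fetchRange(), mediaUrl, and CHUNK_SIZE stand in for the application's own
index and network code, and append() for the proposed call):

  // Seek handling: map currentTime to a file offset via the CUES index,
  // fetch that range, and make it the first append after 'seeking'.
  video.addEventListener('seeking', function() {
    var offset = cueLookup(video.currentTime);
    fetchRange(mediaUrl, offset, offset + CHUNK_SIZE, function(cluster) {
      mediaSource.append(cluster);
    });
  });

  // During playback, keep a fixed duration buffered ahead of currentTime.
  setInterval(function() {
    var b = video.buffered;
    var ahead = b.length ? b.end(b.length - 1) - video.currentTime : 0;
    if (ahead < 10) { /* fetch and append the next cluster */ }
  }, 250);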

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 4:44 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 11:30 AM, Aaron Colwell acolw...@google.com wrote:

 I'm doing WebM demuxing and media fetching in JavaScript. When a seek
 occurs, I look at currentTime to see where we are seeking to. I then look at
 the CUES index data I've fetched to find the file offset for the closest
 seek point to the desired time. The appropriate data is fetched and pushed
 into the element via append(). The seeked event firing and readyState
 transitioning to HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA tells me when I've
 sent the element enough data. During playback I just monitor the buffered
 attribute to keep a specific duration ahead of the current playback time.


 Now I'm rather confused about what you're doing and how you're using this
 feature. What format is the data that you're feeding into the element?


Sorry I wasn't clear about my intent. Currently I'm feeding it WebM. I could
see this expanding to Ogg and perhaps MP4. Theoretically any format that
looks like a packet stream could work.



 I had imagined that this API would let the author feed in the same data as
 you would load from some URI. But that can't be what's happening, since in
 some element implementations (e.g., Gecko's) loaded data is buffered
 internally and seeking might not require any new data to be loaded.


 No. The idea is to allow JavaScript to manage fetching the media data so
various fetching strategies could be implemented without needing to change
the browser. My initial motivation is for supporting adaptive streaming with
this mechanism, but I think various media mashup and delivery scenarios
could be explored with this.

Aaron


[whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-11 Thread Aaron Colwell
Hi,

Based on comments in the File API Streaming Blobs thread
(http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-January/029973.html)
and my Extending HTML 5 video for adaptive streaming thread
(http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032277.html),
I decided to take a stab at writing a MediaSource API spec
(http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html)
for streaming data to a media tag.

Please take a look at the spec
(http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html)
and provide some feedback.

I've tried to start with the simplest thing that would work and hope to
expand from there if need be. For now, I'm intentionally not trying to solve
the generic streaming file case because I believe there might be
media-specific requirements around handling seeking, especially if we intend
to support non-packetized media streams like WAV.

If the feedback is generally positive on this approach, I'll start working
on patches for WebKit & Chrome so people can experiment with an actual
implementation.

Thanks,
Aaron


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-01 Thread Aaron Colwell
Hi Robert,

comments inline.

On Thu, Jun 30, 2011 at 4:13 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Fri, Jul 1, 2011 at 4:59 AM, Aaron Colwell acolw...@google.com wrote:

 I've also been looking at the WebRTC MediaStream API and was wondering if
 it
 makes more sense to create an object similar to the LocalMediaStream
 object.
 This has the benefits of unifying how media streams are handled
 independent
 of whether they come from a camera or a JavaScript based streaming
 algorithm. This could also enable sending the media stream through a
 Peer-to-peer connection instead of only allowing a camera as a source.
 Here
 is an example of the type of object I'm talking about.


 I think MediaStreams should not be dealing with compressed data except as
 an optimization when access to decoded data is not required anywhere in the
 stream pipeline. If you want to do processing of decoded stream data (which
 I do --- see
 http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html),
 then introducing a decoder inside the stream processing graph creates all
 sorts of complications.

 Nice spec. If I understand correctly, your position is that MediaStreams
should only represent uncompressed media? In the case of camera/mic data
they represent the uncompressed bits before they go to the codec for
transmission over a PeerConnection or before they are rendered by an
<audio>/<video>. In the case of standard audio/video playback they would
represent the uncompressed audio before it is sent to the audio card and the
uncompressed video before it is blitted on the screen. From a stream
processing point of view I can see how this makes sense.  I was just
thinking that LocalMediaStream is just a wrapper around a source of media
data and all I was doing was providing a mechanism to provide media data
from JavaScript instead of from hardware.

I think the natural way to support the functionality you're looking for is
 to extend the concept of Blob URLs. Right now you can create a binary Blob,
 mint a URL for it and set that URL as the source for a media element. The
 only extension you need is the ability to append data to the Blob while
 retaining the same URL; you would need to initially mark the Blob as open
 to indicate to URL consumers that the data stream has not ended. That
 extension would be useful for all sorts of things because you can use those
 Blob URLs anywhere. An alternative would be to create a new kind of object
 representing an appendable sequence of Blobs and create an API to mint URLs
 for it.


 I thought about that, but I saw an earlier WHATWG thread
(http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032221.html)
which led me down this MediaStream path. Using MediaStreams made more sense
to me because my use case felt similar to the live capture case except that
I'm using compressed media and it comes from JavaScript instead of hardware.
Also MediaStream already had a way to pass stream URLs to <audio> & <video>
for camera and remote peer stream data so I figured I could just leverage
that.


 Note that with my API proposal above, you can get a MediaStream from a
 media element that's using any URL and send that through a PeerConnection.

 I see that. Interactions with PeerConnection were not a primary concern for
me. I was only mentioning it as a side benefit of using MediaStream.

Thanks for your comments. I appreciate them.

Aaron


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-01 Thread Aaron Colwell
Hi Adam,

On Thu, Jun 30, 2011 at 5:20 PM, Adam Malcontenti-Wilson adman.com@gmail.com wrote:

 @acolwell:
  Is the appendData method one you're suggesting or one already
 specified/existing?

 I'm suggesting it. It was a quick and dirty way to try out some ideas I had
while working on a prototype for Chromium. Now that I actually want to take
this out of the prototype stage, I'm trying to get a sense of whether
appendData() or a MediaStream based solution is more desirable.


 @robert:
 Some problems with concept of blobs being appended to, or as I have
 previously described as Streaming Blobs was mentioned at
 http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032221.html
 I'm not exactly sure what that meant - but I'd expect the ideas
 discussed are similar.

 I saw this thread as well which is why I went down the MediaStream path.
:)

Aaron


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-01 Thread Aaron Colwell
Hi Bob,

Comments inline

On Fri, Jul 1, 2011 at 8:40 AM, Bob Lund b.l...@cablelabs.com wrote:

 Hi Aaron,

 Here are some other aspects of script controlled adaptive bit rate that
 occur to me, perhaps you have already considered these.

 1) I guess script will be responsible for maintaining its own playback
 buffer, monitoring buffer behavior and selecting the appropriate bit rate
 for new fragments. Are there any other network related events/metrics script
 might need to determine which bit-rate to fetch for the next segment? Is
 there any other information from the user agent about playback performance
 that script might need?


The script would be responsible for managing buffering. It can use the
currentTime & buffered attributes on the video tag to monitor the
consumption of the data passed in via appendData(). I believe the attributes
being proposed in the video metrics proposal
(http://wiki.whatwg.org/wiki/Video_Metrics#Proposal) could also be helpful.
Right now I'm just using XMLHttpRequest to fetch WebM clusters and measuring
how long it takes to fetch them to create a bandwidth estimate. I haven't
spent much time on the BW measurement & adaptation algorithms yet. I'm just
trying to nail down the mechanism for passing the media data to the browser
first.
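
A crude version of that bandwidth measurement (illustrative only; as noted
above, the real measurement & adaptation algorithms aren't written yet):

  // Time an XMLHttpRequest for one WebM cluster to estimate bandwidth.
  function fetchClusterAndMeasure(url, callback) {
    var start = Date.now();
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url);
    xhr.responseType = 'arraybuffer';
    xhr.onload = function() {
      var seconds = (Date.now() - start) / 1000;
      var bitsPerSecond = (xhr.response.byteLength * 8) / seconds;
      callback(new Uint8Array(xhr.response), bitsPerSecond);
    };
    xhr.send();
  }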


 2) If a media resource is a multi-track resource then it would seem script
 will also have to fetch fragments for those tracks which implies that the
 audio element would need the append method. Timed text tracks would also
 need to be processed and Cues appended.


The idea is that appendData() can receive media for multiple tracks. In the
case of WebM each cluster can have blocks from different tracks multiplexed
together. The initial stream config information contains the track
mappings necessary to demux the cluster. I was also planning to allow both
multiplexed and demultiplexed clusters. Cluster timecodes must be in
monotonically increasing order, but it would be possible to call
appendData() with a cluster with only audio data followed by a cluster with
only video data. This would allow straightforward support for deployments
where audio & video tracks for a single presentation are in separate WebM
files.


 There is a new media pipeline task force in the Web and TV IG (
 http://www.w3.org/2011/webtv/wiki/MPTF) that is also planning to examine
 this topic. You may want to participate.


I have signed up to the mailing list and will take some time to catch up
with the archives.

Thanks for your comments.

Aaron


[whatwg] Extending HTML 5 video for adaptive streaming

2011-06-30 Thread Aaron Colwell
Hi,

I've been working on an adaptive streaming prototype that uses JavaScript to
fetch chunks of media and feeds them to the video tag for decoding. The idea
is to let the adaptation algorithm and CDN interactions happen in JavaScript
so that they can evolve without the need for browser changes. I'm looking
for some guidance about the preferred method for adding this type of
functionality. I'm new to this process so please bear with me.

My initial implementation is built around WebM, but I believe this could
work for Ogg & MP4 as well. The basic idea is to initialize the video tag
with stream initialization data (ie WebM info & tracks elements) via the
video src attribute and then send media chunks (ie WebM clusters) to the
tag via a new appendData() method on <video>. Here is a simple example of
what I'm talking about.

  <video id=v autoplay></video>
  <script>
    function needMoreData(e) {
      e.target.appendData(getNextCluster());
    }

    function onSeeking(e) {
      var video = e.target;
      video.appendData(findClusterForTime(video.currentTime));
    }

    var video = document.getElementById('v');

    video.addEventListener('loadstart', needMoreData);
    video.addEventListener('stalled', needMoreData);
    video.addEventListener('seeking', onSeeking);

    video.src = URL.createObjectURL(createStreamInitBlob());
  </script>

AppendData() expects to receive a Uint8Array that contains WebM cluster
elements. The first cluster passed to appendData() initializes the starting
playback position. Also after a seeking event fires the first appendData()
updates the current position to the seek point.

I've also been looking at the WebRTC MediaStream API and was wondering if it
makes more sense to create an object similar to the LocalMediaStream object.
This has the benefits of unifying how media streams are handled independent
of whether they come from a camera or a JavaScript based streaming
algorithm. This could also enable sending the media stream through a
Peer-to-peer connection instead of only allowing a camera as a source. Here
is an example of the type of object I'm talking about.

interface GeneratedMediaStream : MediaStream {
  void init(in DOMString type, in UInt8Array init_data);
  void appendData(in DOMString trackId, in UInt8Array data);
  void endOfStream();

  readonly attribute MultipleTrackList audioTracks;
  readonly attribute ExclusiveTrackList videoTracks;
};

type - identifies the type of stream we are generating (ie
video/x-webm-cluster-stream or video/ogg-page-stream)
init_data - Provides initialization data that indicates the number of
tracks, codec configs, etc. (ie WebM info & tracks elements or Ogg header
pages)
trackId - Indicates what track the data is for. If this is an empty string
then multiplexed data is being passed in. If not empty, trackId matches an id
of a track in the TrackList objects.
data - media data chunk (ie WebM cluster or Ogg page). Data is expected to
have monotonically increasing timestamps, no gaps, etc.
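
A short usage sketch of the proposed object, assuming direct construction
(which is one of the open questions below); none of this exists in any
implementation:

  // Hypothetical usage of the proposed GeneratedMediaStream.
  var stream = new GeneratedMediaStream();
  stream.init('video/x-webm-cluster-stream', webmInfoAndTracks);  // Uint8Array

  var video = document.getElementById('v');
  video.src = URL.createObjectURL(stream);

  stream.appendData('', getNextCluster());  // empty trackId => multiplexed data
  // ... keep appending clusters as playback progresses ...
  stream.endOfStream();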

Here are my questions:
- Is there a preference for appendData() vs new MediaStream object?
- If the MediaStream object is preferred, should this be constructed through
Navigator.getUserMedia()? I'm unclear about what the criteria is for adding
this to Navigator vs allowing direct object construction.
- Are there existing efforts along these lines? If so, please point me to
them.

Thanks for your help,

Aaron


[whatwg] Redirect handling for audio video

2011-03-03 Thread Aaron Colwell
Hi,

I was looking at the resource fetch algorithm
(http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#concept-media-load-resource)
and fetching resources
(http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#fetch)
sections of the HTML5 spec to determine what the proper behavior is for
handling
redirects. Both YouTube and Vimeo do 302 redirects to different hostnames
from
the URLs specified in the src attribute. It looks like the spec says that
playback should fail in these cases because they are from different
origins (Section 2.7 Fetching resources bullet 7). This leads me to a few
questions.

1. Is my interpretation of the spec correct? Sample YouTube & Vimeo URLs are
   shown below.
   YouTube : src  : http://v22.lscache6.c.youtube.com/videoplayback? ...
 redirect : http://tc.v22.cache6.c.youtube.com/videoplayback?
...

   Vimeo   : src  : http://player.vimeo.com/play_redirect? ...
 redirect : http://av.vimeo.com/05 ...

2. What about http: -> https: redirects? Some content is required to be
   delivered only via https and this sort of redirect enforces that but
   isn't really a different origin.

3. If my interpretation of the spec is correct, are there proposals to
   change this or other specs that allow content providers to signal that
   these different hostnames actually represent the same origin?

Thanks for your help,
Aaron