Re: [whatwg] HTTP/2 push detection and control in JavaScript

2015-02-24 Thread Brendan Long
Discussion within MPEG just reminded me of another thing we could use in
XHR or Fetch: the ability to control HTTP/2 stream priorities
(http://http2.github.io/http2-spec/#StreamPriority).

For example, I might want to request the next 5 segments, but indicate
that later segments are dependent on earlier segments
(http://http2.github.io/http2-spec/#pri-depend). This would let us fully
use the available bandwidth without having later segments cannibalize
bandwidth from earlier ones.

HTTP/2 associates priority information with HEADERS frames
(http://http2.github.io/http2-spec/#HEADERS), but via a flag and priority
fields on the frame rather than as a normal header, so maybe it would make
sense to add this to Fetch's Headers class
(https://fetch.spec.whatwg.org/#headers-class). I'm not sure it makes sense
to put it on Request (https://fetch.spec.whatwg.org/#request-class), since
that seems to only expose readonly attributes.
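
To make the shape of the request concrete, something like the sketch below is
what I have in mind. Neither a "priority" init option nor "dependsOn" exist in
Fetch today; they are purely hypothetical, and the segment URLs are just
placeholders.

// Purely hypothetical sketch: "priority" and "dependsOn" are not part of the
// Fetch spec; this only illustrates the kind of control being asked for.
var requests = [];
for (var i = 0; i < 5; i++) {
  requests.push(fetch('segment' + i + '.m4s', {
    // Hypothetical option: make each segment's HTTP/2 stream depend on the
    // previous one, so later segments don't starve earlier ones.
    priority: { dependsOn: i > 0 ? requests[i - 1] : null }
  }));
}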

On 02/20/2015 05:37 AM, Brendan Long wrote:

 On Feb 20, 2015, at 11:53 AM, Kornel Lesiński kor...@geekhood.net wrote:

 For server push we already have Server-Sent Events:

 https://html.spec.whatwg.org/multipage/comms.html#server-sent-events
 Using an entirely different protocol to inform the JavaScript client of 
 events that the browser is already aware of seems like overkill.

 This also doesn’t solve the problem of canceling pushed streams.

 I’m not really concerned with how this is solved, but an example would be
 to add to XMLHttpRequest:
 XHR is dead.

 https://fetch.spec.whatwg.org/
 I’ll look into what would need to be added to this. Presumably we could just 
 add an onpush event to Request which is fired with an event containing a new 
 Request (containing info from the PUSH_PROMISE).

 It’s not clear to me how we would cancel a pushed stream, or retrieve 
 streaming body data without waiting for the request to completely finish.



[whatwg] HTTP/2 push detection and control in JavaScript

2015-02-20 Thread Brendan Long
Hi,

I’m wondering if there is any work in progress to define an API for JavaScript 
clients to detect that an HTTP/2 server has pushed content, and to control the 
“half closed” vs “closed” state of HTTP/2 streams (to control whether or not a 
server can push content).

Use Case

The use-case I have in mind is MPEG DASH or HLS live streaming:

A server updates the manifest (MPD or m3u8) and pushes it. The client browser
will cache it, but the JavaScript application doesn't know that it exists. For
newly generated segments we need the updated MPD before we can use them anyway,
and for adaptive streaming we need to know when each request started and
finished so we can estimate bandwidth. If a server is pushing MPD updates, we
also want a way to tell it to stop (RST_STREAM on a pushed stream to end that
transfer, or RST_STREAM on the original client request to prevent future pushes
for that request).

The obvious question to ask is “why not just poll the server?” The answer is
that live streaming latency depends (among other things) on how quickly you
poll. Unless you can perfectly predict when the server will have an update
available, you need to either poll slightly late (introducing latency) or poll
significantly more often than the server creates updates. For example, if the
server produces an update every two seconds, polling every two seconds adds up
to two seconds of latency in the worst case, while polling every 200 ms cuts
that latency but multiplies the request load by ten. Using server push is
equivalent to polling infinitely fast, while simultaneously reducing load on
the server by making fewer requests (win/win).

I imagine that this would be useful for JavaScript implementations of HTTP/2
REST APIs, especially APIs using something like Atom or RSS. Firefox has an
API for add-ons to detect pushed streams, so presumably they think there is
some use for this (but they didn't propose it as a public API, so maybe not?).

Solution

I’m not really concerned with how this is solved, but an example would be to
add to XMLHttpRequest:

interface XMLHttpRequestEventTarget : EventTarget {
  // event handlers
  attribute EventHandler onloadstart;
  attribute EventHandler onprogress;
  attribute EventHandler onabort;
  attribute EventHandler onerror;
  attribute EventHandler onload;
  attribute EventHandler ontimeout;
  attribute EventHandler onloadend;
  attribute EventHandler onpush;
};

enum XMLHttpRequestLeaveRequestOpenType {
  "true",
  "false",
  "no-preference"
};

interface XMLHttpRequest : XMLHttpRequestEventTarget {
  // most of this interface left out for readability

  void open(ByteString method, USVString url, boolean async,
            optional USVString? username = null,
            optional USVString? password = null,
            optional XMLHttpRequestLeaveRequestOpenType leaveRequestOpen = "no-preference");

  void abort(); // can already be used to fully close client request stream
};

interface XMLHttpPushResponse {
  // event handlers
  attribute EventHandler onprogress;
  attribute EventHandler onabort;
  attribute EventHandler onerror;
  attribute EventHandler onload;
  attribute EventHandler ontimeout;
  attribute EventHandler onloadend;

  void abort();

  // response
  readonly attribute USVString responseURL;
  readonly attribute unsigned short status;
  readonly attribute ByteString statusText;
  ByteString? getResponseHeader(ByteString name);
  ByteString getAllResponseHeaders();
  void overrideMimeType(DOMString mime);
  attribute XMLHttpRequestResponseType responseType;
  readonly attribute any response;
  readonly attribute USVString responseText;
  [Exposed=Window] readonly attribute Document? responseXML;
};

interface XMLHttpRequestPushEvent : Event {
  attribute XMLHttpPushResponse response;
};

And then we’d need some text to say:

If leaveRequestOpen is “true” and the request was made over HTTP/2, the user
agent should not fully close the request stream until the XMLHttpRequest is
garbage collected or abort() is called on it. If it is “false”, the user agent
must immediately close the HTTP/2 stream after receiving a complete response.

If a user agent receives a PUSH_PROMISE on the HTTP/2 stream for an
XMLHttpRequest, it must create an XMLHttpPushResponse, attach it to an
XMLHttpRequestPushEvent, and fire that event at the XMLHttpRequest, to be
handled by XMLHttpRequestEventTarget.onpush.

XMLHttpPushResponse's attributes and methods all have identical semantics to 
the versions on XMLHttpRequest.
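
To illustrate, here is roughly how a DASH player might use the proposed
extension. leaveRequestOpen, onpush and XMLHttpPushResponse are all part of the
proposal above (not of any existing spec), and applyManifestUpdate is a
hypothetical application function.

// Sketch of using the proposed API for pushed MPD updates.
var xhr = new XMLHttpRequest();
xhr.onpush = function (event) {
  var pushed = event.response; // an XMLHttpPushResponse per the proposal
  pushed.onload = function () {
    applyManifestUpdate(pushed.responseText); // hypothetical application code
  };
};
xhr.open('GET', '/live/manifest.mpd', true, null, null, 'true'); // leaveRequestOpen = "true"
xhr.onload = function () {
  applyManifestUpdate(xhr.responseText);
};
xhr.send();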

Thoughts?

Brendan Long


Re: [whatwg] HTTP/2 push detection and control in JavaScript

2015-02-20 Thread Brendan Long


 On Feb 20, 2015, at 11:53 AM, Kornel Lesiński kor...@geekhood.net wrote:
 
 For server push we already have Server-Sent Events:
 
 https://html.spec.whatwg.org/multipage/comms.html#server-sent-events

Using an entirely different protocol to inform the JavaScript client of events 
that the browser is already aware of seems like overkill.

This also doesn’t solve the problem of canceling pushed streams.

 
 I’m not really concerned with how this is solved, but an example would be to
 add to XMLHttpRequest:
 
 XHR is dead.
 
 https://fetch.spec.whatwg.org/

I’ll look into what would need to be added to this. Presumably we could just 
add an onpush event to Request which is fired with an event containing a new 
Request (containing info from the PUSH_PROMISE).

It’s not clear to me how we would cancel a pushed stream, or retrieve streaming 
body data without waiting for the request to completely finish.
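
As a rough illustration of the shape I mean (none of this exists in Fetch;
onpush and the event's request member are hypothetical):

// Hypothetical sketch only: Request has no onpush member in the Fetch spec.
var request = new Request('/live/manifest.mpd');
request.onpush = function (event) {
  // event.request would be a new Request built from the PUSH_PROMISE headers.
  console.log('Server promised to push', event.request.url);
};
fetch(request);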


Re: [whatwg] How to expose caption tracks without TextTrackCues

2014-11-03 Thread Brendan Long

On 10/27/2014 08:43 PM, Silvia Pfeiffer wrote:
 On Tue, Oct 28, 2014 at 2:41 AM, Philip Jägenstedt phil...@opera.com wrote:
 On Sun, Oct 26, 2014 at 8:28 AM, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:
 On Thu, Oct 23, 2014 at 2:01 AM, Philip Jägenstedt phil...@opera.com 
 wrote:
 On Sun, Oct 12, 2014 at 11:45 AM, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:
 Using the VideoTrack interface it would list them as a kind=captions
 and would thus also be able to be activated by JavaScript. The
 downside would be that if you have N video tracks and M caption tracks in
 the media file, you'd have to expose NxM videoTracks in the interface.
 VideoTrackList can have at most one video track selected at a time, so
 representing this as a VideoTrack would require some additional
 tweaking to the model.
 The captions video track is one that has video and captions rendered
 together, so you only need the one video track active. If you want to
 turn off captions, you merely activate a different video track which
 is one without captions.

 There is no change to the model necessary - in fact, it fits perfectly
 to what the spec is currently describing without any change.
 Ah, right! Unless I'm misunderstanding again, your suggestion is to
 expose extra video tracks with kind captions or subtitles, requiring
 no spec change at all. That sounds good to me.
 Yes, that was my suggestion for dealing with UA rendered tracks.
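
For concreteness, activating such a burnt-in rendition from JavaScript would
look roughly like this (a sketch using the existing VideoTrackList API; the
kind and language values are just examples):

// Sketch: switch to a video track that has captions rendered into the picture.
var video = document.querySelector('video');
var tracks = video.videoTracks;
for (var i = 0; i < tracks.length; i++) {
  if (tracks[i].kind === 'captions' && tracks[i].language === 'en') {
    tracks[i].selected = true; // deselects the previously selected video track
    break;
  }
}
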
But doesn't this still leave us with the issue that if you have N video tracks
and M caption tracks in the media file, you'd have to expose NxM videoTracks
in the interface?
We would also need to consider:

  * How do you label this combined video and text track?
  * What is the track's id?
  * How do you present this to users in a way that isn't confusing?
  * What if the video track's kind isn't main? For example, what if we
have a sign language track and we also want to display captions?
What is the generated track's kind?
  * The language attribute could also have conflicts.
  * I think it might also be possible to create files where the video
track and text track are different lengths, so we'd need to figure
out what to do when one of them ends.



Re: [whatwg] How to expose caption tracks without TextTrackCues

2014-11-03 Thread Brendan Long

On 11/03/2014 04:20 PM, Silvia Pfeiffer wrote:
 On Tue, Nov 4, 2014 at 3:56 AM, Brendan Long s...@brendanlong.com wrote:
 Right, that was the original concern. But how realistic is the
 situation of n video tracks and m caption tracks with n being larger
 than 2 or 3 without a change of the audio track anyway?
I think the situation gets confusing at N=2. See below.

 We would also need to consider:

   * How do you label this combined video and text track?
 That's not specific to the approach that we pick and will always need
 to be decided. Note that label isn't something that needs to be unique
 to a track, so you could just use the same label for all burnt-in
 video tracks and identify them to be different only in the language.
But the video and the text track might both have their own label in the
underlying media file. Presumably we'd want to preserve both.

   * What is the track's id?
 This would need to be unique, but I think it will be easy to come up
 with a scheme that works. Something like video_[n]_[captiontrackid]
 could work.
This sounds much more complicated and likely to cause problems for
JavaScript developers than just indicating that a text track has cues
that can't be represented in JavaScript.

   * How do you present this to users in a way that isn't confusing?
 No different to presenting caption tracks.
I think VideoTracks with kind=caption are confusing too, and we should
avoid creating more situations where we need to do that.

Even when we only have one video, it's confusing that captions could
exist in multiple places.

   * What if the video track's kind isn't main? For example, what if we
 have a sign language track and we also want to display captions?
 What is the generated track's kind?
 How would that work? Are you saying we're not displaying the main
 video, but only displaying the sign language track? Is that realistic
 and something anybody would actually do?
It's possible, so the spec should handle it. Maybe it doesn't matter though?

   * The language attribute could also have conflicts.
 How so?
The underlying streams could have their own metadata, and it could
conflict. I'm not sure if it would ever be reasonable to author a file
like that, but it would be trivial to create. At the very least, we'd
need language to say which takes precedence if the two streams have
conflicting metadata.

   * I think it might also be possible to create files where the video
 track and text track are different lengths, so we'd need to figure
 out what to do when one of them ends.
 The timeline of a video is well defined in the spec - I don't think we
 need to do more than what is already defined.
What I mean is that this could be confusing for users. Say I'm watching
a video with two video streams (main camera angle, secondary camera
angle) and two captions tracks (for sports for example). If I'm watching
the secondary camera angle and looking at one of the captions tracks,
but then the secondary camera angle goes away, my player is now forced
to randomly select one of the caption tracks combined with the primary
video, because it's not obvious which one corresponds with the captions
I was reading before.

In fact, if I were making a video player for my website where multiple
people give commentary on baseball games with multiple camera angles, I
would probably create my own controls that parse the video track ids and
separate them back into video and text tracks, so that I could offer
separate video and text controls, since combining them just makes the UI
more complicated.


So, what's the advantage of combining video and captions, rather than
just indicating that a text track can't be represented as TextTrackCues?


Re: [whatwg] How to expose caption tracks without TextTrackCues

2014-11-03 Thread Brendan Long

On 11/03/2014 05:41 PM, Silvia Pfeiffer wrote:
 On Tue, Nov 4, 2014 at 10:24 AM, Brendan Long s...@brendanlong.com wrote:
 On 11/03/2014 04:20 PM, Silvia Pfeiffer wrote:
 On Tue, Nov 4, 2014 at 3:56 AM, Brendan Long s...@brendanlong.com wrote:
 Right, that was the original concern. But how realistic is the
 situation of n video tracks and m caption tracks with n being larger
 than 2 or 3 without a change of the audio track anyway?
 I think the situation gets confusing at N=2. See below.

 We would also need to consider:

   * How do you label this combined video and text track?
 That's not specific to the approach that we pick and will always need
 to be decided. Note that label isn't something that needs to be unique
 to a track, so you could just use the same label for all burnt-in
 video tracks and identify them to be different only in the language.
 But the video and the text track might both have their own label in the
 underlying media file. Presumably we'd want to preserve both.

   * What is the track's id?
 This would need to be unique, but I think it will be easy to come up
 with a scheme that works. Something like video_[n]_[captiontrackid]
 could work.
 This sounds much more complicated and likely to cause problems for
 JavaScript developers than just indicating that a text track has cues
 that can't be represented in JavaScript.

   * How do you present this to users in a way that isn't confusing?
 No different to presenting caption tracks.
 I think VideoTracks with kind=caption are confusing too, and we should
 avoid creating more situations where we need to do that.

 Even when we only have one video, it's confusing that captions could
 exist in multiple places.

   * What if the video track's kind isn't main? For example, what if we
 have a sign language track and we also want to display captions?
 What is the generated track's kind?
 How would that work? Are you saying we're not displaying the main
 video, but only displaying the sign language track? Is that realistic
 and something anybody would actually do?
 It's possible, so the spec should handle it. Maybe it doesn't matter though?

   * The language attribute could also have conflicts.
 How so?
 The underlying streams could have their own metadata, and it could
 conflict. I'm not sure if it would ever be reasonable to author a file
 like that, but it would be trivial to create. At the very least, we'd
 need language to say which takes precedence if the two streams have
 conflicting metadata.

   * I think it might also be possible to create files where the video
 track and text track are different lengths, so we'd need to figure
 out what to do when one of them ends.
 The timeline of a video is well defined in the spec - I don't think we
 need to do more than what is already defined.
 What I mean is that this could be confusing for users. Say I'm watching
 a video with two video streams (main camera angle, secondary camera
 angle) and two captions tracks (for sports for example). If I'm watching
 the secondary camera angle and looking at one of the captions tracks,
 but then the secondary camera angle goes away, my player is now forced
 to randomly select one of the caption tracks combined with the primary
 video, because it's not obvious which one corresponds with the captions
 I was reading before.

 In fact, if I were making a video player for my website where multiple
 people give commentary on baseball games with multiple camera angles, I
 would probably create my own controls that parse the video track ids and
 separate them back into video and text tracks, so that I could offer
 separate video and text controls, since combining them just makes the UI
 more complicated.
 That's what I meant with multiple video tracks: if you have several
 that require different captions, then you're in a world of hurt in any
 case and this has nothing to do with whether you're representing the
 non-cue-exposed caption tracks as UARendered or as a video track.
I mean multiple video tracks that are valid for multiple caption tracks.
The example I had in my head was sports commentary, with multiple people
commenting on the same game, which is available from multiple camera angles.

We probably do need a way to indicate which tracks go together when they
don't all go together, though. I think it's come up before. Maybe the
obvious answer is: don't put tracks that don't go together in the same
file.

 So, what's the advantage of combining video and captions, rather than
 just indicating that a text track can't be represented as TextTrackCues?
 One important advantage: there's no need to change the spec.

 If we change the spec, we still have to work through all the issues
 that you listed above and find a solution.

 Silvia.
I suppose not changing the spec is nice, but I think the changes are
simpler if we have no-cue text tracks, since the answer to all of my
questions becomes "we don't do that, we just keep the two tracks
separate".

Re: [whatwg] Proposal: Specify SHA512 hash of JavaScript files in script tag

2014-06-26 Thread Brendan Long

On 06/26/2014 01:18 AM, Mikko Rantalainen wrote:
 However, the suggested hash signature is far from enough. Most popular
 libraries have means to load additional files and plugins and the
 suggested hash is able to sign only the main file. If you cannot
 trust the CDN provider, you cannot trust that the rest of the files
 have not been modified. An attacker could use *any* file in the CDN
 network for an attack. If your signature cannot cover *all* files,
 adding the signature is wasted effort. There's no need to provide any
 additional tools for a false sense of security.
Couldn't the main file check any additional files it downloads, either
by loading them via a script tag with a hash, or by manually hashing them
as they're downloaded (presumably easier once WebCrypto is adopted)?
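
For example, something like the sketch below could verify a plugin before
executing it. This assumes today's fetch() and WebCrypto (crypto.subtle.digest);
the function and parameter names are just illustrative.

// Illustrative sketch: verify a downloaded script against an expected SHA-512
// digest before executing it.
async function loadScriptWithHash(url, expectedSha512Hex) {
  const response = await fetch(url);
  const bytes = await response.arrayBuffer();
  const digest = await crypto.subtle.digest('SHA-512', bytes);
  const hex = Array.from(new Uint8Array(digest))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
  if (hex !== expectedSha512Hex) {
    throw new Error('Hash mismatch for ' + url);
  }
  // Only execute the code once the digest has been verified.
  const script = document.createElement('script');
  script.src = URL.createObjectURL(new Blob([bytes], { type: 'text/javascript' }));
  document.head.appendChild(script);
}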


Re: [whatwg] NoDatabase databases

2013-08-16 Thread Brendan Long
On 05/01/2013 10:57 PM, Brett Zamir wrote:
 I wanted to propose (if work has not already been done in this area)
 creating an HTTP extension to allow querying for retrieval and
 updating of portions of HTML (or XML) documents where the server is so
 capable and enabled, obviating the need for a separate database (or
 more accurately, bringing the database to the web server layer).
Can't you use JavaScript to do this already? Just put each part of the
page in a separate HTML or XML file, then have JavaScript request the
parts it needs and insert them into the DOM as needed.
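
For example (a minimal sketch; the URL and element id are made up):

// Fetch one fragment of the page and splice it into the document.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/fragments/comments.html');
xhr.onload = function () {
  document.getElementById('comments').innerHTML = xhr.responseText;
};
xhr.send();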


Re: [whatwg] Proposal: Media element - add attributes for discovery of playback rate support

2013-07-19 Thread Brendan Long
On Jul 19, 2013 3:14 PM, Ian Hickson i...@hixie.ch wrote:

  What if we added a supportedPlaybackRates attribute, which holds an
  array of playback rates supported by the server, plus (optionally) any
  rates the user agent can support due to the currently buffered data
  (possibly requiring that the user agent have enough data buffered to
  play at that speed for some amount of time).

 Wouldn't that be 0.0 .. Infinity, basically?

I've been thinking about this more, and it seems like the buffered
attribute[1] is enough for a JavaScript application to determine for itself
if it's safe to play faster. It does seem useful to expose the rates
supported by the server though, so an application can realize that it's
safe to play at those rates, even if there's not much buffered yet.

[1]
http://www.w3.org/TR/2011/WD-html5-20110405/video.html#dom-media-buffered
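
For instance, a player could check how much contiguous data is buffered ahead
of the playhead before speeding up (a sketch; the thresholds are arbitrary):

// Returns true if enough data is buffered to sustain `rate` for `seconds`
// of wall-clock playback starting at the current position.
function canPlayAt(video, rate, seconds) {
  var buffered = video.buffered;
  for (var i = 0; i < buffered.length; i++) {
    if (buffered.start(i) <= video.currentTime &&
        buffered.end(i) >= video.currentTime + rate * seconds) {
      return true;
    }
  }
  return false;
}

var video = document.querySelector('video');
if (canPlayAt(video, 2.0, 30)) {
  video.playbackRate = 2.0;
}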


Re: [whatwg] Proposal: Media element - add attributes for discovery of playback rate support

2013-07-18 Thread Brendan Long
On 07/18/2013 06:54 AM, John Mellor wrote:
 If the user is speeding up playback to improve their productivity (spend
 less time watching e.g. a lecture), then they may well be willing to wait
 until enough of the video is buffered, since they can do something else in
 the meantime.

 For example by spending 30m buffering the first half of a 1 hour live
 stream, the user could then watch the whole hour at double speed.
This is how DVRs work with live TV, and people seem to like it (well,
they like it more than not being able to fast-forward at all...).


Re: [whatwg] Proposal: Media element - add attributes for discovery of playback rate support

2013-07-18 Thread Brendan Long
On 07/18/2013 03:17 PM, Eric Carlson wrote:
 Even a DVR, however, won't always let you change the playback speed.
 For example it isn't possible to play at greater than 1x past the
 current time when watching a live stream. If I am watching a live
 stream and I try to play past the end of the buffered video, my DVR
 drops back to 1x and won't let me change the speed. It doesn't
 automatically pause and buffer for a while so it can play at a faster
 rate. It isn't always possible to play a media stream at an arbitrary
 speed. It is foolish to pretend otherwise as the current spec does. 
That makes sense, but we also don't want to limit ourselves to playback
speeds that a server supports when the client /does/ have data buffered.

What if we added a supportedPlaybackRates attribute, which holds an
array of playback rates supported by the server, plus (optionally) any
rates the user agent can support due to the currently buffered data
(possibly requiring that the user agent have enough data buffered to
play at that speed for some amount of time)?
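
Usage might then look like this (hypothetical; supportedPlaybackRates is only
a proposal and does not exist in any spec):

// Hypothetical: supportedPlaybackRates is the attribute proposed above.
var video = document.querySelector('video');
if (video.supportedPlaybackRates &&
    video.supportedPlaybackRates.indexOf(2.0) !== -1) {
  // The server can feed data fast enough for 2x even with little buffered.
  video.playbackRate = 2.0;
}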