Re: [whatwg] video feedback

2012-12-21 Thread Ian Hickson
On Thu, 20 Dec 2012, Jer Noble wrote:
 On Dec 17, 2012, at 4:01 PM, Ian Hickson i...@hixie.ch wrote:
  
  Should we add a preciseSeek() method with two arguments that does a 
  seek using the given rational time?
 
 This method would be more useful if there were a way to retrieve the 
 media's time scale.  Otherwise, the script would have to pick an 
 arbitrary scale value, or provide the correct media scale through other 
 means (such as querying the server hosting the media).  Additionally, 
 authors like Rob are going to want to retrieve this precise 
 representation of the currentTime.  If rational time values were 
 encapsulated into their own interface, a preciseCurrentTime (or 
 similar) read-write attribute could be used instead.

Ok. I assume this is something you (Apple) are interested in implementing; 
is this something any other browser vendors want to support? If so, I'll 
be happy to add something along these lines.
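
For illustration only: nothing like this exists in any spec or browser. The
sketch below just shows the shape of the rational-time API being discussed
here; every name in it (MediaRationalTime, mediaTimeScale, preciseSeek,
preciseCurrentTime) is hypothetical.

    // Hypothetical sketch only -- not part of HTML or of any browser API.
    interface MediaRationalTime {
      timeValue: number;   // integer tick count
      timeScale: number;   // integer ticks per second, e.g. 30000 for 29.97 fps media
    }

    interface PreciseTimeMediaElement extends HTMLMediaElement {
      readonly mediaTimeScale: number;       // hypothetical: the media's own time scale
      preciseCurrentTime: MediaRationalTime; // hypothetical: rational read/write currentTime
      preciseSeek(timeValue: number, timeScale: number): void; // hypothetical
    }

    // Seeking a 29.97 fps movie exactly to the start of frame 3:
    declare const video: PreciseTimeMediaElement;
    video.preciseSeek(3 * 1001, 30000);      // 3003/30000 s, no float round-trip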

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] video feedback

2012-12-20 Thread Mark Callow
On 2012/12/18 9:01, Ian Hickson wrote:
 On Tue, 2 Oct 2012, Jer Noble wrote:
 The nature of floating point math makes precise frame navigation 
 difficult, if not impossible.  Rob's test is especially hairy, given 
 that each frame has a timing bound of [startTime, endTime), and his test 
 attempts to navigate directly to the startTime of a given frame, a value 
 which gives approximately zero room for error.

 ...
 That makes sense.

 Should we add a preciseSeek() method with two arguments that does a seek 
 using the given rational time?


I draw your attention to Don't Store that in a float
http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/
and its suggestion to use a double starting at 2^32 to avoid the issue
around precision changing with magnitude as the time increases.
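
A quick numeric illustration of the linked post's point, using nothing but
plain Number arithmetic: the gap between adjacent doubles grows with
magnitude, so a timeline whose zero is 2^32 keeps a roughly constant
(sub-microsecond) resolution instead of one that degrades as playback time
grows.

    // Approximate spacing between adjacent doubles (one ulp) at magnitude x >= 1.
    const ulp = (x: number) => 2 ** Math.floor(Math.log2(x)) * Number.EPSILON;

    ulp(1);               // ~2.2e-16 s of resolution near t = 1 s
    ulp(3600);            // ~4.5e-13 s after an hour -- already much coarser
    ulp(2 ** 32);         // ~9.5e-7 s, about a microsecond
    ulp(2 ** 32 + 86400); // still ~9.5e-7 s a day later -- precision barely moves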

Regards

-Mark




Re: [whatwg] video feedback

2012-12-20 Thread Ian Hickson
On Thu, 20 Dec 2012, Mark Callow wrote:
 On 2012/12/18 9:01, Ian Hickson wrote:
  On Tue, 2 Oct 2012, Jer Noble wrote:
  The nature of floating point math makes precise frame navigation 
  difficult, if not impossible.  Rob's test is especially hairy, given 
  that each frame has a timing bound of [startTime, endTime), and his 
  test attempts to navigate directly to the startTime of a given frame, 
  a value which gives approximately zero room for error.
 
  That makes sense.
 
  Should we add a preciseSeek() method with two arguments that does a 
  seek using the given rational time?
 
 I draw your attention to Don't Store that in a float 
 http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/ 
 and its suggestion to use a double starting at 2^32 to avoid the issue 
 around precision changing with magnitude as the time increases.

Everything in the Web platform already uses doubles.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] video feedback

2012-12-20 Thread Boris Zbarsky

On 12/20/12 9:54 AM, Ian Hickson wrote:

Everything in the Web platform already uses doubles.


Except WebGL.  And Audio API wave tables, sample rates, AudioParams, PCM 
data (though thankfully times in Audio API do use doubles).  And 
graphics libraries used to implement canvas, in many cases...


I think the only safe claim about everything in the web platform is 
that it's all different.  ;)


-Boris


Re: [whatwg] video feedback

2012-12-20 Thread Mark Callow
On 2012/12/21 2:54, Ian Hickson wrote:
 On Thu, 20 Dec 2012, Mark Callow wrote:
 I draw your attention to Don't Store that in a float 
 http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/ 
 and its suggestion to use a double starting at 2^32 to avoid the issue 
 around precision changing with magnitude as the time increases.
 Everything in the Web platform already uses doubles.
Yes, except as noted by Boris. The important point is the idea of using
2^32 as zero time which means the precision barely changes across the
range of time values of interest to games, videos, etc.

Regards

-Mark




Re: [whatwg] video feedback

2012-12-20 Thread Ian Hickson
On Fri, 21 Dec 2012, Mark Callow wrote:
 On 2012/12/21 2:54, Ian Hickson wrote:
  On Thu, 20 Dec 2012, Mark Callow wrote:
  I draw your attention to Don't Store that in a float 
  http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/ 
  and its suggestion to use a double starting at 2^32 to avoid the issue 
  around precision changing with magnitude as the time increases.
  Everything in the Web platform already uses doubles.
 Yes, except as noted by Boris. The important point is the idea of using 
 2^32 as zero time which means the precision barely changes across the 
 range of time values of interest to games, videos, etc.

Ah, well, for video that ship has sailed, really.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] video feedback

2012-12-20 Thread Jer Noble

On Dec 20, 2012, at 7:27 PM, Mark Callow callow.m...@artspark.co.jp wrote:

 On 2012/12/21 2:54, Ian Hickson wrote:
 On Thu, 20 Dec 2012, Mark Callow wrote:
 I draw your attention to Don't Store that in a float 
 http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/ 
 and its suggestion to use a double starting at 2^32 to avoid the issue 
 around precision changing with magnitude as the time increases.
 Everything in the Web platform already uses doubles.
 Yes, except as noted by Boris. The important point is the idea of using 2^32 
 as zero time which means the precision barely changes across the range of 
 time values of interest to games, videos, etc. 

I don't believe the frame accuracy problem in question had to do with 
precision instability, per se.  Many of Rob Coenen's frame accuracy issues were 
found within the first second of video.  Admittedly, this is where the 
available precision is changing most rapidly, but it is also where available 
precision is greatest by far.

An integral rational number has a benefit over even the 2^32 zero time 
suggestion: for common time scale values[1], it is intrinsically stable over 
the range of time t=[0..2^43).  It has the added benefit of being exactly the 
representation used by the underlying media engine.
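
To make that concrete (plain arithmetic, not any particular engine's API):
keeping time as an integer tick count over an integer time scale means a
frame's start is exact, with no float conversion to go wrong.

    // Integer rational time: exact as long as the tick count stays below 2^53.
    const FRAME_TICKS = 1001;   // ticks per frame at a 30000 ticks/s time scale
    const frameStartTicks = (frame: number) => frame * FRAME_TICKS;

    frameStartTicks(3);         // 3003 ticks = 3003/30000 s, exactly frame 3's start
    // The float route (3 * (1001/30000) seconds, multiplied back by the time
    // scale later) need not land exactly on 3003 ticks -- that is the
    // 3002-vs-3003 failure described elsewhere in this thread.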


On Dec 17, 2012, at 4:01 PM, Ian Hickson i...@hixie.ch wrote:

 Should we add a preciseSeek() method with two arguments that does a seek 
 using the given rational time?


This method would be more useful if there were a way to retrieve the media's 
time scale.  Otherwise, the script would have to pick an arbitrary scale value, 
or provide the correct media scale through other means (such as querying the 
server hosting the media).  Additionally, authors like Rob are going to want to 
retrieve this precise representation of the currentTime.  If rational time 
values were encapsulated into their own interface, a preciseCurrentTime (or 
similar) read-write attribute could be used instead.

-Jer

[1] E.g., 1001 is a common time scale for 29.97 and 23.976 FPS video.


Re: [whatwg] video feedback

2012-12-17 Thread Ian Hickson
On Tue, 2 Oct 2012, Jer Noble wrote:
 On Sep 17, 2012, at 12:43 PM, Ian Hickson i...@hixie.ch wrote:
  On Mon, 9 Jul 2012, adam k wrote:
 
  i'm aware that crooked framerates (i.e. the notorious 29.97) were not 
  supported when frame accuracy was implemented.  in my tests, 29.97DF 
  timecodes were incorrect by 1 to 3 frames at any given point.
  
  will there ever be support for crooked framerate accuracy?  i would 
  be more than happy to contribute whatever i can to help test it and 
  make it possible.  can someone comment on this?
  
  This is a Quality of Implementation issue, basically. I believe 
  there's nothing inherently in the API that would make accuracy to such 
  timecodes impossible.
 
 The nature of floating point math makes precise frame navigation 
 difficult, if not impossible.  Rob's test is especially hairy, given 
 that each frame has a timing bound of [startTime, endTime), and his test 
 attempts to navigate directly to the startTime of a given frame, a value 
 which gives approximately zero room for error.
 
 I'm most familiar with MPEG containers, but I believe the following is 
 also true of the WebM container: times are represented by a rational 
 number, timeValue / timeScale, where both numerator and denominator are 
 unsigned integers.  To seek to a particular media time, we must convert 
 a floating-point time value into this rational time format (e.g. when 
 calculating the 4th frame's start time, from 3 * 1/29.97 to 3 * 
 1001/30000).  If there is a floating-point error in the wrong direction 
 (e.g., as above, a numerator of 3002 vs 3003), the end result will not 
 be the frame's startTime, but one timeScale before it.
 
 We've fixed some frame accuracy bugs in WebKit (and Chromium) by 
 carefully rounding the incoming floating point time value, taking into 
 account the media's time scale, and rounding to the nearest 1/timeScale 
 value.  This fixes Rob's precision test, but at the expense of 
 precision. (I.e. in a 30 fps movie, currentTime = 0.99 / 30 will 
 navigate to the second frame, not the first, due to rounding, which is 
 technically incorrect.)
 
 This is a common problem, and Apple media frameworks (for example) 
 therefore provide rational time classes which provide enough accuracy 
 for precise navigation (e.g. QTTime, CMTime). Using a floating point 
 number to represent time with any precision is not generally accepted as 
 good practice when these rational time classes are available.

That makes sense.

Should we add a preciseSeek() method with two arguments that does a seek 
using the given rational time?

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] video feedback

2012-10-02 Thread Jer Noble
On Sep 17, 2012, at 12:43 PM, Ian Hickson i...@hixie.ch wrote:

 On Mon, 9 Jul 2012, adam k wrote:
 
 i have a 25fps video, h264, with a burned in timecode.  it seems to be 
 off by 1 frame when i compare the burned in timecode to the calculated 
 timecode.  i'm using rob coenen's test app at 
 http://www.massive-interactive.nl/html5_video/smpte_test_universal.html 
 to load my own video.
 
 what's the process here to report issues?  please let me know whatever 
 formal or informal steps are required and i'll gladly follow them.
 
 Depends on the browser. Which browser?
 
 
 i'm aware that crooked framerates (i.e. the notorious 29.97) were not 
 supported when frame accuracy was implemented.  in my tests, 29.97DF 
 timecodes were incorrect by 1 to 3 frames at any given point.
 
 will there ever be support for crooked framerate accuracy?  i would be 
 more than happy to contribute whatever i can to help test it and make it 
 possible.  can someone comment on this?
 
 This is a Quality of Implementation issue, basically. I believe there's 
 nothing inherently in the API that would make accuracy to such timecodes 
 impossible.

TL;DR: for precise navigation, you need to use a rational time class, rather 
than a float value.

The nature of floating point math makes precise frame navigation difficult, if 
not impossible.  Rob's test is especially hairy, given that each frame has a 
timing bound of [startTime, endTime), and his test attempts to navigate 
directly to the startTime of a given frame, a value which gives approximately 
zero room for error.

I'm most familiar with MPEG containers, but I believe the following is also 
true of the WebM container: times are represented by a rational number, 
timeValue / timeScale, where both numerator and denominator are unsigned 
integers.  To seek to a particular media time, we must convert a floating-point 
time value into this rational time format (e.g. when calculating the 4th 
frame's start time, from 3 * 1/29.97 to 3 * 1001/30000).  If there is a 
floating-point error in the wrong direction (e.g., as above, a numerator of 
3002 vs 3003), the end result will not be the frame's startTime, but one 
timeScale before it. 

We've fixed some frame accuracy bugs in WebKit (and Chromium) by carefully 
rounding the incoming floating point time value, taking into account the 
media's time scale, and rounding to the nearest 1/timeScale value.  This fixes 
Rob's precision test, but at the expense of precision. (I.e. in a 30 fps movie, 
currentTime = 0.99 / 30 will navigate to the second frame, not the first, 
due to rounding, which is technically incorrect.)
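
A sketch of the rounding approach described above; the function is mine, not
WebKit's actual code, and it simply snaps an incoming float to the nearest
media tick instead of truncating.

    // Convert a floating-point time (seconds) into media ticks, rounding to the
    // nearest 1/timeScale so a float meant to be a frame's startTime still
    // lands on that frame.
    function toMediaTicks(seconds: number, timeScale: number): number {
      return Math.round(seconds * timeScale);
    }

    toMediaTicks(3 * (1001 / 30000), 30000); // 3003: frame 3 of 29.97 fps media,
                                             // even if the float fell a hair short
    // The trade-off noted above: with one tick per frame of a 30 fps movie,
    // toMediaTicks(0.99 / 30, 30) === 1, i.e. the second frame, not the first.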

This is a common problem, and Apple media frameworks (for example) therefore 
provide rational time classes which provide enough accuracy for precise 
navigation (e.g. QTTime, CMTime). Using a floating point number to represent 
time with any precision is not generally accepted as good practice when these 
rational time classes are available.

-Jer


Re: [whatwg] video feedback

2012-10-02 Thread Silvia Pfeiffer
On Wed, Oct 3, 2012 at 6:41 AM, Jer Noble jer.no...@apple.com wrote:
 On Sep 17, 2012, at 12:43 PM, Ian Hickson i...@hixie.ch wrote:

 On Mon, 9 Jul 2012, adam k wrote:

 i have a 25fps video, h264, with a burned in timecode.  it seems to be
 off by 1 frame when i compare the burned in timecode to the calculated
 timecode.  i'm using rob coenen's test app at
 http://www.massive-interactive.nl/html5_video/smpte_test_universal.html
 to load my own video.

 what's the process here to report issues?  please let me know whatever
 formal or informal steps are required and i'll gladly follow them.

 Depends on the browser. Which browser?


 i'm aware that crooked framerates (i.e. the notorious 29.97) were not
 supported when frame accuracy was implemented.  in my tests, 29.97DF
 timecodes were incorrect by 1 to 3 frames at any given point.

 will there ever be support for crooked framerate accuracy?  i would be
 more than happy to contribute whatever i can to help test it and make it
 possible.  can someone comment on this?

 This is a Quality of Implementation issue, basically. I believe there's
 nothing inherently in the API that would make accuracy to such timecodes
 impossible.

 TL;DR: for precise navigation, you need to use a rational time class, rather 
 than a float value.

 The nature of floating point math makes precise frame navigation difficult, 
 if not impossible.  Rob's test is especially hairy, given that each frame has 
 a timing bound of [startTime, endTime), and his test attempts to navigate 
 directly to the startTime of a given frame, a value which gives approximately 
 zero room for error.

 I'm most familiar with MPEG containers, but I believe the following is also 
 true of the WebM container: times are represented by a rational number, 
 timeValue / timeScale, where both numerator and denominator are unsigned 
 integers.


FYI: the Ogg container also uses rational numbers to represent time.


  To seek to a particular media time, we must convert a floating-point time 
 value into this rational time format (e.g. when calculating the 4th frame's 
 start time, from 3 * 1/29.97 to 3 * 1001/30000).  If there is a 
 floating-point error in the wrong direction (e.g., as above, a numerator of 
 3002 vs 3003), the end result will not be the frame's startTime, but one 
 timeScale before it.

 We've fixed some frame accuracy bugs in WebKit (and Chromium) by carefully 
 rounding the incoming floating point time value, taking into account the 
 media's time scale, and rounding to the nearest 1/timeScale value.  This 
 fixes Rob's precision test, but at the expense of precision. (I.e. in a 30 
 fps movie, currentTime = 0.99 / 30 will navigate to the second frame, 
 not the first, due to rounding, which is technically incorrect.)

 This is a common problem, and Apple media frameworks (for example) therefore 
 provide rational time classes which provide enough accuracy for precise 
 navigation (e.g. QTTime, CMTime). Using a floating point number to represent 
 time with any precision is not generally accepted as good practice when these 
 rational time classes are available.

 -Jer


Re: [whatwg] Video feedback

2011-07-08 Thread Ian Hickson
On Thu, 7 Jul 2011, Eric Winkelman wrote:
 On Thursday, June 02 Ian Hickson wrote:
  On Fri, 18 Mar 2011, Eric Winkelman wrote:
  
   For in-band metadata tracks, there is neither a standard way to 
   represent the type of metadata in the HTMLTrackElement interface nor 
   is there a standard way to represent multiple different types of 
   metadata tracks.
  
  There can be a standard way. The idea is that all the types of 
  metadata tracks that browsers will support should be specified so that 
  all browsers can map them the same way. I'm happy to work with anyone 
  interested in writing such a mapping spec, just let me know.
 
 I would be very interested in working on this spec.

It would be several specs, probably, each focusing on a particular set of 
metadata in a particular format (e.g. advertising timings in an MPEG 
wrapper, or whatever).


 What's the next step?

First, research: what formats and metadata streams are you interested in? 
Who uses them? How are they implemented in producers and (more 
importantly) consumers today? What are the use cases?

Second, describe the problem: make a clear statement of purpose that 
scopes the effort to provide guidelines to prevent feature creep.

Third, listen to implementors: find those that are interested in 
implementing this particular mapping of metadata to the DOM API, get their 
input, see what they want.

Fourth, implement: make or have someone else make an experimental 
implementation of a mapping that addresses the problem described in the 
earlier steps.

Fifth, specify: write a specification that describes the mapping described 
in step two, based on what you've researched in step one and based on the 
feedback from steps three and four.

Sixth, test: update the experimental implementation to fit the spec, get other 
implementations to implement the spec. Have real users play with it.

Seventh, simplify: remove what you don't need.

Finally, iterate: repeat all these steps for as long as there's any 
interest in this mapping, fixing problems, adding new features if they're 
needed, removing old features that didn't get used or implemented, etc.
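
As a concrete sketch of what such a mapping would enable, assume the mapping
spec exposes the in-band metadata as a text track of kind "metadata" and
serializes each sample into a cue; how the payload is encoded is exactly what
the mapping spec would have to define, so the handler below is a placeholder.

    // Consumer side of a hypothetical in-band metadata mapping.
    const video = document.querySelector('video')!;

    video.textTracks.addEventListener('addtrack', (e) => {
      const track = (e as TrackEvent).track as TextTrack | null;
      if (!track || track.kind !== 'metadata') return;

      track.mode = 'hidden';                 // receive cues without rendering them
      track.addEventListener('cuechange', () => {
        for (const cue of Array.from(track.activeCues ?? [])) {
          // Payload shape depends on the mapping spec -- placeholder handling.
          console.log('metadata cue', cue.startTime, cue.endTime);
        }
      });
    });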

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Video feedback

2011-07-07 Thread Bob Lund

 -Original Message-
 From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-
 boun...@lists.whatwg.org] On Behalf Of Mark Watson
 Sent: Monday, June 20, 2011 2:29 AM
 To: Eric Carlson
 Cc: Silvia Pfeiffer; whatwg Group; Simon Pieters
 Subject: Re: [whatwg] Video feedback
 
 
 On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:
 
 
  On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:
 
  On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters sim...@opera.com
 wrote:
  On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
  silviapfeiff...@gmail.com wrote:
 
  For commercial video providers, the tracks in a live stream change
  all the time; this is not limited to audio and video tracks but
  would include text tracks as well.
 
  OK, all this indicates to me that we probably want a
 metadatachanged
  event to indicate there has been a change and that JS may need to
  check some of its assumptions.
 
  We already have durationchange. Duration is metadata. If we want to
  support changes to width/height, and the script is interested in
  when that happens, maybe there should be a dimensionchange event
  (but what's the use case for changing width/height mid-stream?).
  Does the spec support changes to text tracks mid-stream?
 
  It's not about what the spec supports, but what real-world streams
 provide.
 
  I don't think it makes sense to put an event on every single type of
  metadata that can change. Most of the time, when you have a stream
  change, many variables will change together, so a single event is a
  lot less events to raise. It's an event that signifies that the media
  framework has reset the video/audio decoding pipeline and loaded a
  whole bunch of new stuff. You should imagine it as a concatenation of
  different media resources. And yes, they can have different track
  constitution and different audio sampling rate (which the audio API
  will care about) etc etc.
 
   In addition, it is possible for a stream to lose or gain an audio
 track. In this case the dimensions won't change but a script may want to
 react to the change in audioTracks.
 
 The TrackList object has an onchanged event, which I assumed would fire
 when any of the information in the TrackList changes (e.g. tracks added
 or removed). But actually the spec doesn't state when this event fires
 (as far as I could tell - unless it is implied by some general
 definition of events called onchanged).
 
 Should there be some clarification here ?
 
 
   I agree with Silvia, a more generic metadata changed event makes
 more sense.
 
 Yes, and it should support the case in which text tracks are
 added/removed too.

Has there been a bug submitted to add a metadata changed event when video, 
audio or text tracks are added or deleted from a media resource?

Thanks,
Bob Lund

 
 Also, as Eric (C) pointed out, one of the things which can change is
 which of several available versions of the content is being rendered
 (for adaptive bitrate cases). This doesn't necessarily change any of the
 metadata currently exposed on the video element, but nevertheless it's
 information that the application may need. It would be nice to expose
 some kind of identifier for the currently rendered stream and have an
 event when this changes. I think that a stream-format-supplied
 identifier would be sufficient.
 
 ...Mark
 
 
  eric
 
 



Re: [whatwg] Video feedback

2011-07-07 Thread Eric Winkelman
On Thursday, June 02 Ian Hickson wrote:

 On Fri, 18 Mar 2011, Eric Winkelman wrote:
 
  For in-band metadata tracks, there is neither a standard way to 
  represent the type of metadata in the HTMLTrackElement interface nor 
  is there a standard way to represent multiple different types of 
  metadata tracks.
 
 There can be a standard way. The idea is that all the types of 
 metadata tracks that browsers will support should be specified so that 
 all browsers can map them the same way. I'm happy to work with anyone 
 interested in writing such a mapping spec, just let me know.

I would be very interested in working on this spec.  

CableLabs works with numerous groups delivering content containing a variety of 
metadata, so we have a good idea what is currently used.  We're also working 
with the groups defining adaptive bit rate delivery protocols about how 
metadata might be carried.

What's the next step?

Eric


Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:

 
 On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:
 
 On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters sim...@opera.com wrote:
 On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:
 
 For commercial video providers, the tracks in a live stream change all
 the time; this is not limited to audio and video tracks but would include
 text tracks as well.
 
 OK, all this indicates to me that we probably want a metadatachanged
 event to indicate there has been a change and that JS may need to
 check some of its assumptions.
 
 We already have durationchange. Duration is metadata. If we want to support
 changes to width/height, and the script is interested in when that happens,
 maybe there should be a dimensionchange event (but what's the use case for
 changing width/height mid-stream?). Does the spec support changes to text
 tracks mid-stream?
 
 It's not about what the spec supports, but what real-world streams provide.
 
 I don't think it makes sense to put an event on every single type of
 metadata that can change. Most of the time, when you have a stream
 change, many variables will change together, so a single event is a
 lot less events to raise. It's an event that signifies that the media
 framework has reset the video/audio decoding pipeline and loaded a
 whole bunch of new stuff. You should imagine it as a concatenation of
 different media resources. And yes, they can have different track
 constitution and different audio sampling rate (which the audio API
 will care about) etc etc.
 
  In addition, it is possible for a stream to lose or gain an audio track. In 
 this case the dimensions won't change but a script may want to react to the 
 change in audioTracks. 

The TrackList object has an onchanged event, which I assumed would fire when 
any of the information in the TrackList changes (e.g. tracks added or removed). 
But actually the spec doesn't state when this event fires (as far as I could 
tell - unless it is implied by some general definition of events called 
onchanged).

Should there be some clarification here ?

 
  I agree with Silvia, a more generic metadata changed event makes more 
 sense. 

Yes, and it should support the case in which text tracks are added/removed too.

Also, as Eric (C) pointed out, one of the things which can change is which of 
several available versions of the content is being rendered (for adaptive 
bitrate cases). This doesn't necessarily change any of the metadata currently 
exposed on the video element, but nevertheless it's information that the 
application may need. It would be nice to expose some kind of identifier for 
the currently rendered stream and have an event when this changes. I think that 
a stream-format-supplied identifier would be sufficient.
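
Nothing like this exists in any spec; purely to illustrate the suggestion,
here is a hypothetical shape. Every name below (currentRenditionId,
renditionchange) is invented for the sketch.

    // Hypothetical: expose the identifier of the adaptive-bitrate stream
    // currently being rendered, plus an event for when it changes.
    interface AdaptiveMediaElement extends HTMLVideoElement {
      readonly currentRenditionId: string;   // stream-format-supplied identifier
    }

    declare const video: AdaptiveMediaElement;
    video.addEventListener('renditionchange', () => {
      // e.g. record a metric or update a quality indicator in the UI
      console.log('now rendering rendition', video.currentRenditionId);
    });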

...Mark

 
 eric
 
 



Re: [whatwg] Video feedback

2011-06-20 Thread Silvia Pfeiffer
On Mon, Jun 20, 2011 at 6:29 PM, Mark Watson wats...@netflix.com wrote:

 On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:


 On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:

 On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters sim...@opera.com wrote:
 On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:

 For commercial video providers, the tracks in a live stream change all
 the time; this is not limited to audio and video tracks but would include
 text tracks as well.

 OK, all this indicates to me that we probably want a metadatachanged
 event to indicate there has been a change and that JS may need to
 check some of its assumptions.

 We already have durationchange. Duration is metadata. If we want to support
 changes to width/height, and the script is interested in when that happens,
 maybe there should be a dimensionchange event (but what's the use case for
 changing width/height mid-stream?). Does the spec support changes to text
 tracks mid-stream?

 It's not about what the spec supports, but what real-world streams provide.

 I don't think it makes sense to put an event on every single type of
 metadata that can change. Most of the time, when you have a stream
 change, many variables will change together, so a single event is a
 lot less events to raise. It's an event that signifies that the media
 framework has reset the video/audio decoding pipeline and loaded a
 whole bunch of new stuff. You should imagine it as a concatenation of
 different media resources. And yes, they can have different track
 constitution and different audio sampling rate (which the audio API
 will care about) etc etc.

  In addition, it is possible for a stream to lose or gain an audio track. In 
 this case the dimensions won't change but a script may want to react to the 
 change in audioTracks.

 The TrackList object has an onchanged event, which I assumed would fire when 
 any of the information in the TrackList changes (e.g. tracks added or 
 removed). But actually the spec doesn't state when this event fires (as far 
 as I could tell - unless it is implied by some general definition of events 
 called onchanged).

 Should there be some clarification here ?

I understood that to relate to a change of cues only, since it is on
the tracklist. I.e. it's an aggregate event from the oncuechange event
of a cue inside the track. I didn't think it would relate to a change
of existence of that track.

Note that the event is attached to the TrackList, not the TrackList[],
so it cannot be raised when a track is added or removed, only when
something inside the TrackList changes.


  I agree with Silvia, a more generic metadata changed event makes more 
 sense.

 Yes, and it should support the case in which text tracks are added/removed 
 too.

Yes, it needs to be an event on the MediaElement.


 Also, as Eric (C) pointed out, one of the things which can change is which of 
 several available versions of the content is being rendered (for adaptive 
 bitrate cases). This doesn't necessarily change any of the metadata currently 
 exposed on the video element, but nevertheless it's information that the 
 application may need. It would be nice to expose some kind of identifier for 
 the currently rendered stream and have an event when this changes. I think 
 that a stream-format-supplied identifier would be sufficient.


I don't know about the adaptive streaming situation. I think that is
more about statistics/metrics rather than about change of resource.
All the alternatives in an adaptive streaming resource should
provide the same number of tracks and the same video dimensions, just
at different bitrate/quality, no? Different video dimensions should be
provided through the source element and @media attribute, but within
an adaptive stream, the alternatives should be consistent because the
target device won't change. I guess this is a discussion for another
thread... :-)

Cheers,
Silvia.


Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 20, 2011, at 10:42 AM, Silvia Pfeiffer wrote:

On Mon, Jun 20, 2011 at 6:29 PM, Mark Watson 
wats...@netflix.commailto:wats...@netflix.com wrote:

On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:


On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:

On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters 
sim...@opera.commailto:sim...@opera.com wrote:
On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
silviapfeiff...@gmail.commailto:silviapfeiff...@gmail.com wrote:

For commercial video providers, the tracks in a live stream change all
the time; this is not limited to audio and video tracks but would include
text tracks as well.

OK, all this indicates to me that we probably want a metadatachanged
event to indicate there has been a change and that JS may need to
check some of its assumptions.

We already have durationchange. Duration is metadata. If we want to support
changes to width/height, and the script is interested in when that happens,
maybe there should be a dimensionchange event (but what's the use case for
changing width/height mid-stream?). Does the spec support changes to text
tracks mid-stream?

It's not about what the spec supports, but what real-world streams provide.

I don't think it makes sense to put an event on every single type of
metadata that can change. Most of the time, when you have a stream
change, many variables will change together, so a single event is a
lot less events to raise. It's an event that signifies that the media
framework has reset the video/audio decoding pipeline and loaded a
whole bunch of new stuff. You should imagine it as a concatenation of
different media resources. And yes, they can have different track
constitution and different audio sampling rate (which the audio API
will care about) etc etc.

 In addition, it is possible for a stream to lose or gain an audio track. In 
this case the dimensions won't change but a script may want to react to the 
change in audioTracks.

The TrackList object has an onchanged event, which I assumed would fire when 
any of the information in the TrackList changes (e.g. tracks added or removed). 
But actually the spec doesn't state when this event fires (as far as I could 
tell - unless it is implied by some general definition of events called 
onchanged).

Should there be some clarification here ?

I understood that to relate to a change of cues only, since it is on
the tracklist. I.e. it's an aggregate event from the oncuechange event
of a cue inside the track. I didn't think it would relate to a change
of existence of that track.

Note that the event is attached to the TrackList, not the TrackList[],
so it cannot be raised when a track is added or removed, only when
something inside the TrackList changes.

Are we talking about the same thing ? There is no TrackList array and TrackList 
is only used for audio/video, not text, so I don't understand the comment about 
cues.

I'm talking about 
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
 which is the base class for MultipleTrackList and ExclusiveTrackList used to 
represent all the audio and video tracks (respectively). One instance of the 
object represents all the tracks, so I would assume that a change in the number 
of tracks is a change to this object.




 I agree with Silvia, a more generic metadata changed event makes more sense.

Yes, and it should support the case in which text tracks are added/removed too.

Yes, it needs to be an event on the MediaElement.


Also, as Eric (C) pointed out, one of the things which can change is which of 
several available versions of the content is being rendered (for adaptive bitrate 
cases). This doesn't necessarily change any of the metadata currently exposed 
on the video element, but nevertheless it's information that the application 
may need. It would be nice to expose some kind of identifier for the currently 
rendered stream and have an event when this changes. I think that a 
stream-format-supplied identifier would be sufficient.


I don't know about the adaptive streaming situation. I think that is
more about statistics/metrics rather than about change of resource.
All the alternatives in an adaptive streaming resource should
provide the same number of tracks and the same video dimensions, just
at different bitrate/quality, no?

I think of the different adaptive versions on a per-track basis (i.e. the 
alternatives are *within* each track), not a bunch of alternatives each of 
which contains several tracks. Both are possible, of course.

It's certainly possible (indeed common) for different bitrate video encodings 
to have different resolutions - there are video encoding reasons to do this. Of 
course the aspect ratio should not change and nor should the dimensions on the 
screen (both would be a little peculiar for the user).

Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not the 
same as the resolution (for a start, they are in CSS pixels, which are 
square), but I think it quite likely that if the resolution of the video 
changes then the videoWidth and videoHeight might change. I'd be interested 
to hear how existing implementations relate resolution to videoWidth and 
videoHeight.

Re: [whatwg] Video feedback

2011-06-20 Thread Silvia Pfeiffer
On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wats...@netflix.com wrote:

 The TrackList object has an onchanged event, which I assumed would fire when
 any of the information in the TrackList changes (e.g. tracks added or
 removed). But actually the spec doesn't state when this event fires (as far
 as I could tell - unless it is implied by some general definition of events
 called onchanged).

 Should there be some clarification here ?

 I understood that to relate to a change of cues only, since it is on
 the tracklist. I.e. it's an aggregate event from the oncuechange event
 of a cue inside the track. I didn't think it would relate to a change
 of existence of that track.

 Note that the event is attached to the TrackList, not the TrackList[],
 so it cannot be raised when a track is added or removed, only when
 something inside the TrackList changes.

 Are we talking about the same thing ? There is no TrackList array and
 TrackList is only used for audio/video, not text, so I don't understand the
 comment about cues.
 I'm talking
 about http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist which
 is the base class for MultipleTrackList and ExclusiveTrackList used to
 represent all the audio and video tracks (respectively). One instance of the
 object represents all the tracks, so I would assume that a change in the
 number of tracks is a change to this object.

Ah yes, you're right: I got confused.

It says "Whenever the selected track is changed, the user agent must
queue a task to fire a simple event named change at the
MultipleTrackList object." This means it fires when the selectedIndex
is changed, i.e. the user chooses a different track for rendering. I
still don't think it relates to changes in the composition of tracks
of a resource. That should be something different and should probably
be on the MediaElement and not on the track list to also cover changes
in text tracks.
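
To put the distinction in script terms: the event names below are the ones
later drafts of the spec settled on (change for selection changes,
addtrack/removetrack for composition changes), so treat them as an
assumption about subsequent specs rather than something this thread had
available.

    const video = document.querySelector('video')!;
    // audioTracks is not in every TypeScript lib.dom build; cast for the sketch.
    const audioTracks = (video as any).audioTracks as EventTarget;

    // Selection change: a different track was enabled or chosen for rendering.
    audioTracks.addEventListener('change', () => {
      console.log('audio track selection changed');
    });

    // Composition change: the stream gained or lost a track mid-playback.
    audioTracks.addEventListener('addtrack', () => {
      console.log('an audio track appeared -- re-check assumptions');
    });
    audioTracks.addEventListener('removetrack', () => {
      console.log('an audio track went away -- re-check assumptions');
    });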


 Also, as Eric (C) pointed out, one of the things which can change is which
 of several available versions of the content is being rendered (for adaptive
 bitrate cases). This doesn't necessarily change any of the metadata
 currently exposed on the video element, but nevertheless it's information
 that the application may need. It would be nice to expose some kind of
 identifier for the currently rendered stream and have an event when this
 changes. I think that a stream-format-supplied identifier would be
 sufficient.

 I don't know about the adaptive streaming situation. I think that is
 more about statistics/metrics rather than about change of resource.
 All the alternatives in an adaptive streaming resource should
 provide the same number of tracks and the same video dimensions, just
 at different bitrate/quality, no?

 I think of the different adaptive versions on a per-track basis (i.e. the
 alternatives are *within* each track), not a bunch of alternatives each of
 which contains several tracks. Both are possible, of course.

 It's certainly possible (indeed common) for different bitrate video
 encodings to have different resolutions - there are video encoding reasons
 to do this. Of course the aspect ratio should not change and nor should the
 dimensions on the screen (both would be a little peculiar for the user).

 Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
 the same as the resolution (for a start, they are in CSS pixels, which are
 square), but I think it quite likely that if the resolution of the video
 changes then the videoWidth and videoHeight might change. I'd be interested
 to hear how existing implementations relate resolution to videoWidth and
 videoHeight.

Well, if videoWidth and videoHeight change and no dimensions on the
video are provided through CSS, then surely the video will change size
and the display will shrink. That would be a terrible user experience.
For that reason I would suggest that such a change not be made in
alternative adaptive streams.


 Different video dimensions should be
 provided through the source element and @media attribute, but within
 an adaptive stream, the alternatives should be consistent because the
 target device won't change. I guess this is a discussion for another
 thread... :-)

 Possibly ;-) The device knows much better than the page author what
 capabilities it has and so what resolutions are suitable for the device. So
 it is better to provide all the alternatives as a single resource and have
 the device work out which subset it can support. Or at least, the list
 should be provided all at the same level - there is no rationale for a
 hierarchy of alternatives.

The way in which HTML deals with different devices and their different
capabilities is through media queries. As a author you provide your
content with different versions of media-dependent style sheets and
content, so that when you view the page with a different device, the
capabilities of the device select the right style sheet and 

Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:

 On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wats...@netflix.com wrote:
 
 The TrackList object has an onchanged event, which I assumed would fire 
 when
 any of the information in the TrackList changes (e.g. tracks added or
 removed). But actually the spec doesn't state when this event fires (as far
 as I could tell - unless it is implied by some general definition of events
 called onchanged).
 
 Should there be some clarification here ?
 
 I understood that to relate to a change of cues only, since it is on
 the tracklist. I.e. it's an aggregate event from the oncuechange event
 of a cue inside the track. I didn't think it would relate to a change
 of existence of that track.
 
 Note that the event is attached to the TrackList, not the TrackList[],
 so it cannot be raised when a track is added or removed, only when
 something inside the TrackList changes.
 
 Are we talking about the same thing ? There is no TrackList array and
 TrackList is only used for audio/video, not text, so I don't understand the
 comment about cues.
 I'm talking
 about 
 http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
  which
 is the base class for MultipleTrackList and ExclusiveTrackList used to
 represent all the audio and video tracks (respectively). One instance of the
 object represents all the tracks, so I would assume that a change in the
 number of tracks is a change to this object.
 
 Ah yes, you're right: I got confused.
 
 It says Whenever the selected track is changed, the user agent must
 queue a task to fire a simple event named change at the
 MultipleTrackList object. This means it fires when the selectedIndex
 is changed, i.e. the user chooses a different track for rendering. I
 still don't think it relates to changes in the composition of tracks
 of a resource. That should be something different and should probably
 be on the MediaElement and not on the track list to also cover changes
 in text tracks.

Fair enough.

 
 
 Also, as Eric (C) pointed out, one of the things which can change is which
 of several available versions of the content is being rendered (for 
 adaptive
 bitrate cases). This doesn't necessarily change any of the metadata
 currently exposed on the video element, but nevertheless it's information
 that the application may need. It would be nice to expose some kind of
 identifier for the currently rendered stream and have an event when this
 changes. I think that a stream-format-supplied identifier would be
 sufficient.
 
 I don't know about the adaptive streaming situation. I think that is
 more about statistics/metrics rather than about change of resource.
 All the alternatives in an adaptive streaming resource should
 provide the same number of tracks and the same video dimensions, just
 at different bitrate/quality, no?
 
 I think of the different adaptive versions on a per-track basis (i.e. the
 alternatives are *within* each track), not a bunch of alternatives each of
 which contains several tracks. Both are possible, of course.
 
 It's certainly possible (indeed common) for different bitrate video
 encodings to have different resolutions - there are video encoding reasons
 to do this. Of course the aspect ratio should not change and nor should the
 dimensions on the screen (both would be a little peculiar for the user).
 
 Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
 the same as the resolution (for a start, they are in CSS pixels, which are
 square), but I think it quite likely that if the resolution of the video
 changes than the videoWidth and videoHeight might change. I'd be interested
 to hear how existing implementations relate resolution to videoWidth and
 videoHeight.
 
 Well, if videoWidth and videoHeight change and no dimensions on the
 video are provided through CSS, then surely the video will change size
 and the display will shrink. That would be a terrible user experience.
 For that reason I would suggest that such a change not be made in
 alternative adaptive streams.

That seems backwards to me! I would say "For that reason I would suggest that 
dimensions are provided through CSS or through the width and height attributes."

Alternatively, we change the specification of the video element to accommodate 
this aspect of adaptive streaming (for example, the videoWidth and videoHeight 
could be defined to be based on the highest resolution bitrate being 
considered.)

There are good video encoding reasons for different bitrates to be encoded at 
different resolutions which are far more important than any reasons not to do 
either of the above.
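
Mark's first option as a sketch: pin the layout size so a change in the
encoded resolution (and hence in videoWidth/videoHeight) never reflows the
page. The resize event on media elements used below was only added to the
spec later, so its availability here is an assumption.

    const video = document.querySelector('video')!;

    // Explicit dimensions (CSS here; the width/height content attributes would
    // do the same) keep the rendered size fixed across adaptive switches.
    video.style.width = '640px';
    video.style.height = '360px';

    // Intrinsic-size changes can still be observed without affecting layout.
    video.addEventListener('resize', () => {
      console.log('encoded size now', video.videoWidth, 'x', video.videoHeight);
    });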

 
 
 Different video dimensions should be
 provided through the source element and @media attribute, but within
 an adaptive stream, the alternatives should be consistent because the
 target device won't change. I guess this is a discussion for another
 thread... :-)
 
 Possibly ;-) The device knows much 

Re: [whatwg] Video feedback

2011-06-20 Thread Silvia Pfeiffer
On Tue, Jun 21, 2011 at 12:07 AM, Mark Watson wats...@netflix.com wrote:

 On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:

 On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wats...@netflix.com wrote:

 The TrackList object has an onchanged event, which I assumed would fire 
 when
 any of the information in the TrackList changes (e.g. tracks added or
 removed). But actually the spec doesn't state when this event fires (as 
 far
 as I could tell - unless it is implied by some general definition of 
 events
 called onchanged).

 Should there be some clarification here ?

 I understood that to relate to a change of cues only, since it is on
 the tracklist. I.e. it's an aggregate event from the oncuechange event
 of a cue inside the track. I didn't think it would relate to a change
 of existence of that track.

 Note that the event is attached to the TrackList, not the TrackList[],
 so it cannot be raised when a track is added or removed, only when
 something inside the TrackList changes.

 Are we talking about the same thing ? There is no TrackList array and
 TrackList is only used for audio/video, not text, so I don't understand the
 comment about cues.
 I'm talking
 about 
 http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
  which
 is the base class for MultipleTrackList and ExclusiveTrackList used to
 represent all the audio and video tracks (respectively). One instance of the
 object represents all the tracks, so I would assume that a change in the
 number of tracks is a change to this object.

 Ah yes, you're right: I got confused.

 It says Whenever the selected track is changed, the user agent must
 queue a task to fire a simple event named change at the
 MultipleTrackList object. This means it fires when the selectedIndex
 is changed, i.e. the user chooses a different track for rendering. I
 still don't think it relates to changes in the composition of tracks
 of a resource. That should be something different and should probably
 be on the MediaElement and not on the track list to also cover changes
 in text tracks.

 Fair enough.



 Also, as Eric (C) pointed out, one of the things which can change is which
 of several available versions of the content is being rendered (for 
 adaptive
 bitrate cases). This doesn't necessarily change any of the metadata
 currently exposed on the video element, but nevertheless it's information
 that the application may need. It would be nice to expose some kind of
 identifier for the currently rendered stream and have an event when this
 changes. I think that a stream-format-supplied identifier would be
 sufficient.

 I don't know about the adaptive streaming situation. I think that is
 more about statistics/metrics rather than about change of resource.
 All the alternatives in an adaptive streaming resource should
 provide the same number of tracks and the same video dimensions, just
 at different bitrate/quality, no?

 I think of the different adaptive versions on a per-track basis (i.e. the
 alternatives are *within* each track), not a bunch of alternatives each of
 which contains several tracks. Both are possible, of course.

 It's certainly possible (indeed common) for different bitrate video
 encodings to have different resolutions - there are video encoding reasons
 to do this. Of course the aspect ratio should not change and nor should the
 dimensions on the screen (both would be a little peculiar for the user).

 Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
 the same as the resolution (for a start, they are in CSS pixels, which are
 square), but I think it quite likely that if the resolution of the video
 changes than the videoWidth and videoHeight might change. I'd be interested
 to hear how existing implementations relate resolution to videoWidth and
 videoHeight.

 Well, if videoWidth and videoHeight change and no dimensions on the
 video are provided through CSS, then surely the video will change size
 and the display will shrink. That would be a terrible user experience.
 For that reason I would suggest that such a change not be made in
 alternative adaptive streams.

 That seems backwards to me! I would say For that reason I would suggest that 
 dimensions are provided through CSS or through the width and height 
 attributes.

 Alternatively, we change the specification of the video element to 
 accommodate this aspect of adaptive streaming (for example, the videoWidth 
 and videoHeight could be defined to be based on the highest resolution 
 bitrate being considered.)

 There are good video encoding reasons for different bitrates to be encoded at 
 different resolutions which are far more important than any reasons not to do 
 either of the above.



 Different video dimensions should be
 provided through the source element and @media attribute, but within
 an adaptive stream, the alternatives should be consistent because the
 target device won't change. I guess this is a 

Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 20, 2011, at 5:28 PM, Silvia Pfeiffer wrote:

 On Tue, Jun 21, 2011 at 12:07 AM, Mark Watson wats...@netflix.com wrote:
 
 On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:
 
 On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wats...@netflix.com wrote:
 
 The TrackList object has an onchanged event, which I assumed would fire 
 when
 any of the information in the TrackList changes (e.g. tracks added or
 removed). But actually the spec doesn't state when this event fires (as 
 far
 as I could tell - unless it is implied by some general definition of 
 events
 called onchanged).
 
 Should there be some clarification here ?
 
 I understood that to relate to a change of cues only, since it is on
 the tracklist. I.e. it's an aggregate event from the oncuechange event
 of a cue inside the track. I didn't think it would relate to a change
 of existence of that track.
 
 Note that the event is attached to the TrackList, not the TrackList[],
 so it cannot be raised when a track is added or removed, only when
 something inside the TrackList changes.
 
 Are we talking about the same thing ? There is no TrackList array and
 TrackList is only used for audio/video, not text, so I don't understand the
 comment about cues.
 I'm talking
 about 
 http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
  which
 is the base class for MultipleTrackList and ExclusiveTrackList used to
 represent all the audio and video tracks (respectively). One instance of 
 the
 object represents all the tracks, so I would assume that a change in the
 number of tracks is a change to this object.
 
 Ah yes, you're right: I got confused.
 
 It says Whenever the selected track is changed, the user agent must
 queue a task to fire a simple event named change at the
 MultipleTrackList object. This means it fires when the selectedIndex
 is changed, i.e. the user chooses a different track for rendering. I
 still don't think it relates to changes in the composition of tracks
 of a resource. That should be something different and should probably
 be on the MediaElement and not on the track list to also cover changes
 in text tracks.
 
 Fair enough.
 
 
 
 Also, as Eric (C) pointed out, one of the things which can change is 
 which
 of several available versions of the content is being rendered (for 
 adaptive
 bitrate cases). This doesn't necessarily change any of the metadata
 currently exposed on the video element, but nevertheless it's information
 that the application may need. It would be nice to expose some kind of
 identifier for the currently rendered stream and have an event when this
 changes. I think that a stream-format-supplied identifier would be
 sufficient.
 
 I don't know about the adaptive streaming situation. I think that is
 more about statistics/metrics rather than about change of resource.
 All the alternatives in an adaptive streaming resource should
 provide the same number of tracks and the same video dimensions, just
 at different bitrate/quality, no?
 
 I think of the different adaptive versions on a per-track basis (i.e. the
 alternatives are *within* each track), not a bunch of alternatives each of
 which contains several tracks. Both are possible, of course.
 
 It's certainly possible (indeed common) for different bitrate video
 encodings to have different resolutions - there are video encoding reasons
 to do this. Of course the aspect ratio should not change and nor should the
 dimensions on the screen (both would be a little peculiar for the user).
 
 Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
 the same as the resolution (for a start, they are in CSS pixels, which are
 square), but I think it quite likely that if the resolution of the video
 changes than the videoWidth and videoHeight might change. I'd be interested
 to hear how existing implementations relate resolution to videoWidth and
 videoHeight.
 
 Well, if videoWidth and videoHeight change and no dimensions on the
 video are provided through CSS, then surely the video will change size
 and the display will shrink. That would be a terrible user experience.
 For that reason I would suggest that such a change not be made in
 alternative adaptive streams.
 
 That seems backwards to me! I would say For that reason I would suggest 
 that dimensions are provided through CSS or through the width and height 
 attributes.
 
 Alternatively, we change the specification of the video element to 
 accommodate this aspect of adaptive streaming (for example, the videoWidth 
 and videoHeight could be defined to be based on the highest resolution 
 bitrate being considered.)
 
 There are good video encoding reasons for different bitrates to be encoded 
 at different resolutions which are far more important than any reasons not 
 to do either of the above.
 
 
 
 Different video dimensions should be
 provided through the source element and @media attribute, but within
 an adaptive stream, the alternatives 

Re: [whatwg] Video feedback

2011-06-09 Thread Simon Pieters
On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


For commercial video providers, the tracks in a live stream change all  
the time; this is not limited to audio and video tracks but would  
include text tracks as well.


OK, all this indicates to me that we probably want a metadatachanged
event to indicate there has been a change and that JS may need to
check some of its assumptions.


We already have durationchange. Duration is metadata. If we want to  
support changes to width/height, and the script is interested in when that  
happens, maybe there should be a dimensionchange event (but what's the use  
case for changing width/height mid-stream?). Does the spec support changes  
to text tracks mid-stream?


--
Simon Pieters
Opera Software


Re: [whatwg] Video feedback

2011-06-09 Thread Silvia Pfeiffer
On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters sim...@opera.com wrote:
 On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:

 For commercial video providers, the tracks in a live stream change all
 the time; this is not limited to audio and video tracks but would include
 text tracks as well.

 OK, all this indicates to me that we probably want a metadatachanged
 event to indicate there has been a change and that JS may need to
 check some of its assumptions.

 We already have durationchange. Duration is metadata. If we want to support
 changes to width/height, and the script is interested in when that happens,
 maybe there should be a dimensionchange event (but what's the use case for
 changing width/height mid-stream?). Does the spec support changes to text
 tracks mid-stream?

It's not about what the spec supports, but what real-world streams provide.

I don't think it makes sense to put an event on every single type of
metadata that can change. Most of the time, when you have a stream
change, many variables will change together, so a single event means
far fewer events to raise. It's an event that signifies that the media
framework has reset the video/audio decoding pipeline and loaded a
whole bunch of new stuff. You should imagine it as a concatenation of
different media resources. And yes, they can have different track
constitution and different audio sampling rate (which the audio API
will care about) etc etc.

The durationchange is a different type of event. It has not much to do
with having a change of a media format, but more one with getting new
information that more data is available than previously expected. It's
one that allows streaming of long video resources, even if they are
just of a single encoding setting. In contrast, what we are talking
about is that the encoding settings change mid-stream.

Cheers,
Silvia.
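For illustration, a minimal sketch of how a page might consume the
metadatachange event being proposed here (the event name is hypothetical;
nothing fires it at the time of this thread):

  var video = document.querySelector('video');
  video.addEventListener('metadatachange', function () {
    // Re-check whatever assumptions were made when metadata first loaded.
    console.log('dimensions: ' + video.videoWidth + 'x' + video.videoHeight);
    console.log('duration: ' + video.duration);
    // The audio/video/text track lists may also have a different composition now.
  });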


Re: [whatwg] Video feedback

2011-06-09 Thread Eric Carlson

On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:

 On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters sim...@opera.com wrote:
 On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:
 
 For commercial video providers, the tracks in a live stream change all
 the time; this is not limited to audio and video tracks but would include
 text tracks as well.
 
 OK, all this indicates to me that we probably want a metadatachanged
 event to indicate there has been a change and that JS may need to
 check some of its assumptions.
 
 We already have durationchange. Duration is metadata. If we want to support
 changes to width/height, and the script is interested in when that happens,
 maybe there should be a dimensionchange event (but what's the use case for
 changing width/height mid-stream?). Does the spec support changes to text
 tracks mid-stream?
 
 It's not about what the spec supports, but what real-world streams provide.
 
 I don't think it makes sense to put an event on every single type of
 metadata that can change. Most of the time, when you have a stream
 change, many variables will change together, so a single event is a
 lot less events to raise. It's an event that signifies that the media
 framework has reset the video/audio decoding pipeline and loaded a
 whole bunch of new stuff. You should imagine it as a concatenation of
 different media resources. And yes, they can have different track
 constitution and different audio sampling rate (which the audio API
 will care about) etc etc.
 
  In addition, it is possible for a stream to lose or gain an audio track. In 
this case the dimensions won't change but a script may want to react to the 
change in audioTracks. 

  I agree with Silvia, a more generic metadata changed event makes more 
sense. 

eric



Re: [whatwg] Video feedback

2011-06-08 Thread Philip Jägenstedt
On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt phil...@opera.com  
wrote:

On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer
silviapfeiff...@gmail.com wrote:



On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson i...@hixie.ch wrote:


On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:


I do not know how technically the change of stream composition works in
MPEG, but in Ogg we have to end a current stream and start a new one to
switch compositions. This has been called sequential multiplexing or
chaining. In this case, stream setup information is repeated, which
would probably lead to creating a new stream handler and possibly a new
firing of loadedmetadata. I am not sure how chaining is implemented in
browsers.


Per spec, chaining isn't currently supported. The closest thing I can find
in the spec to this situation is handling a non-fatal error, which causes
the unexpected content to be ignored.


On Fri, 17 Dec 2010, Eric Winkelman wrote:


The short answer for changing stream composition is that there is a
Program Map Table (PMT) that is repeated every 100 milliseconds and
describes the content of the stream.  Depending on the programming, the
stream's composition could change entering/exiting every advertisement.


If this is something that browser vendors want to support, I can specify
how to handle it. Anyone?


Icecast streams have chained files, so streaming Ogg to an audio
element would hit this problem. There is a bug in FF for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
also a webkit bug for icecast streaming, which is probably related
https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
is able to deal with icecast streams, but it seems to deal with it.

The thing is: you can implement playback and seeking without any
further changes to the spec. But then the browser-internal metadata
states will change depending on the chunk you're on. Should that also
update the exposed metadata in the API then? Probably yes, because
otherwise the JS developer may deal with contradictory information.
Maybe we need a metadatachange event for this?


An Icecast stream is conceptually just one infinite audio stream, even
though at the container level it is several chained Ogg streams. duration
will be Infinity and currentTime will be constantly increasing. This
doesn't seem to be a case where any spec change is needed. Am I missing
something?



That is all correct. However, because it is a sequence of Ogg streams,
there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. So, therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change. Or if it does change, the JS developer is
not informed of it because it is a single infinite audio (or video)
stream. Thus the question whether we need a new metadatachange event
to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.


Nothing exposed via the current API would change, AFAICT. I agree that if  
we start exposing things like sampling rate or want to support arbitrary  
chained Ogg, then there is a problem.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-08 Thread Silvia Pfeiffer
On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt phil...@opera.com wrote:
 On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:

 On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt phil...@opera.com
 wrote:

 On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:


 On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson i...@hixie.ch wrote:

 On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:

 I do not know how technically the change of stream composition works
 in
 MPEG, but in Ogg we have to end a current stream and start a new one
 to
 switch compositions. This has been called sequential multiplexing or
 chaining. In this case, stream setup information is repeated, which
 would probably lead to creating a new stream handler and possibly a new
 firing of loadedmetadata. I am not sure how chaining is implemented
 in
 browsers.

 Per spec, chaining isn't currently supported. The closest thing I can
 find
 in the spec to this situation is handling a non-fatal error, which
 causes
 the unexpected content to be ignored.


 On Fri, 17 Dec 2010, Eric Winkelman wrote:

 The short answer for changing stream composition is that there is a
 Program Map Table (PMT) that is repeated every 100 milliseconds and
 describes the content of the stream.  Depending on the programming,
 the
 stream's composition could change entering/exiting every
 advertisement.

 If this is something that browser vendors want to support, I can
 specify
 how to handle it. Anyone?

 Icecast streams have chained files, so streaming Ogg to an audio
 element would hit this problem. There is a bug in FF for this:
 https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
 bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
 also a webkit bug for icecast streaming, which is probably related
 https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
 is able to deal with icecast streams, but it seems to deal with it.

 The thing is: you can implement playback and seeking without any
 further changes to the spec. But then the browser-internal metadata
 states will change depending on the chunk you're on. Should that also
 update the exposed metadata in the API then? Probably yes, because
 otherwise the JS developer may deal with contradictory information.
 Maybe we need a metadatachange event for this?

 An Icecast stream is conceptually just one infinite audio stream, even
 though at the container level it is several chained Ogg streams. duration
 will be Infinity and currentTime will be constantly increasing. This
 doesn't
 seem to be a case where any spec change is needed. Am I missing
 something?


 That is all correct. However, because it is a sequence of Ogg streams,
 there are new Ogg headers in the middle. These new Ogg headers will
 lead to new metadata loaded in the media framework - e.g. because the
 new Ogg stream is encoded with a different audio sampling rate and a
 different video width/height etc. So, therefore, the metadata in the
 media framework changes. However, what the browser reports to the JS
 developer doesn't change. Or if it does change, the JS developer is
 not informed of it because it is a single infinite audio (or video)
 stream. Thus the question whether we need a new metadatachange event
 to expose this to the JS developer. It would then also signify that
 potentially the number of tracks that are available may have changed
 and other such information.

 Nothing exposed via the current API would change, AFAICT.

Thus, after a change mid-stream to, say,  a smaller video width and
height, would the video.videoWidth and video.videoHeight attributes
represent the width and height of the previous stream or the current
one?


 I agree that if we
 start exposing things like sampling rate or want to support arbitrary
 chained Ogg, then there is a problem.

I think we already have a problem with width and height for chained
Ogg and we cannot stop people from putting chained Ogg into the @src.

I actually took this discussion away from MPEG PMT, which is where
Eric's question came from, because I don't understand how it works
with MPEG. But I can see that it's not just a problem of MPEG, but
also of Ogg (and possibly of WebM which can have multiple Segments).
So, I think we need a generic solution for it.

Cheers,
Silvia.


Re: [whatwg] Video feedback

2011-06-08 Thread Philip Jägenstedt
On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt phil...@opera.com  
wrote:

On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer
silviapfeiff...@gmail.com wrote:


That is all correct. However, because it is a sequence of Ogg streams,
there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. So, therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change. Or if it does change, the JS developer is
not informed of it because it is a single infinite audio (or video)
stream. Thus the question whether we need a new metadatachange event
to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.


Nothing exposed via the current API would change, AFAICT.


Thus, after a change mid-stream to, say,  a smaller video width and
height, would the video.videoWidth and video.videoHeight attributes
represent the width and height of the previous stream or the current
one?



I agree that if we
start exposing things like sampling rate or want to support arbitrary
chained Ogg, then there is a problem.


I think we already have a problem with width and height for chained
Ogg and we cannot stop people from putting chained Ogg into the @src.

I actually took this discussion away from MPEG PMT, which is where
Eric's question came from, because I don't understand how it works
with MPEG. But I can see that it's not just a problem of MPEG, but
also of Ogg (and possibly of WebM which can have multiple Segments).
So, I think we need a generic solution for it.


OK, I don't think we disagree. I'm just saying that for Icecast audio  
streams, there is no problem.


As for Ogg and WebM, I'm inclined to say that we just shouldn't support  
that, unless there's some compelling use case for it. There's also the  
option of tweaking the muxers so that all the streams are known up-front,  
even if there won't be any data arriving for them until half-way through  
the file.


I also know nothing about MPEG or the use cases involved, so no opinions  
there.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-08 Thread Silvia Pfeiffer
On Wed, Jun 8, 2011 at 9:18 PM, Philip Jägenstedt phil...@opera.com wrote:
 On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:

 On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt phil...@opera.com
 wrote:

 On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer
 silviapfeiff...@gmail.com wrote:

 That is all correct. However, because it is a sequence of Ogg streams,
 there are new Ogg headers in the middle. These new Ogg headers will
 lead to new metadata loaded in the media framework - e.g. because the
 new Ogg stream is encoded with a different audio sampling rate and a
 different video width/height etc. So, therefore, the metadata in the
 media framework changes. However, what the browser reports to the JS
 developer doesn't change. Or if it does change, the JS developer is
 not informed of it because it is a single infinite audio (or video)
 stream. Thus the question whether we need a new metadatachange event
 to expose this to the JS developer. It would then also signify that
 potentially the number of tracks that are available may have changed
 and other such information.

 Nothing exposed via the current API would change, AFAICT.

 Thus, after a change mid-stream to, say,  a smaller video width and
 height, would the video.videoWidth and video.videoHeight attributes
 represent the width and height of the previous stream or the current
 one?


 I agree that if we
 start exposing things like sampling rate or want to support arbitrary
 chained Ogg, then there is a problem.

 I think we already have a problem with width and height for chained
 Ogg and we cannot stop people from putting chained Ogg into the @src.

 I actually took this discussion away from MPEG PMT, which is where
 Eric's question came from, because I don't understand how it works
 with MPEG. But I can see that it's not just a problem of MPEG, but
 also of Ogg (and possibly of WebM which can have multiple Segments).
 So, I think we need a generic solution for it.

 OK, I don't think we disagree. I'm just saying that for Icecast audio
 streams, there is no problem.

Hmm.. because there is nothing in the API that actually exposes audio metadata?


 As for Ogg and WebM, I'm inclined to say that we just shouldn't support
 that, unless there's some compelling use case for it.

You know that you can also transmit video with icecast...?

Silvia.

 There's also the
 option of tweaking the muxers so that all the streams are known up-front,
 even if there won't be any data arriving for them until half-way through the
 file.

 I also know nothing about MPEG or the use cases involved, so no opinions
 there.

 --
 Philip Jägenstedt
 Core Developer
 Opera Software



Re: [whatwg] Video feedback

2011-06-08 Thread Philip Jägenstedt
On Wed, 08 Jun 2011 13:38:18 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Wed, Jun 8, 2011 at 9:18 PM, Philip Jägenstedt phil...@opera.com  
wrote:

On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer
silviapfeiff...@gmail.com wrote:


On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt phil...@opera.com
wrote:


On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer
silviapfeiff...@gmail.com wrote:

That is all correct. However, because it is a sequence of Ogg streams,
there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. So, therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change. Or if it does change, the JS developer is
not informed of it because it is a single infinite audio (or video)
stream. Thus the question whether we need a new metadatachange event
to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.


Nothing exposed via the current API would change, AFAICT.


Thus, after a change mid-stream to, say,  a smaller video width and
height, would the video.videoWidth and video.videoHeight attributes
represent the width and height of the previous stream or the current
one?



I agree that if we
start exposing things like sampling rate or want to support arbitrary
chained Ogg, then there is a problem.


I think we already have a problem with width and height for chained
Ogg and we cannot stop people from putting chained Ogg into the @src.

I actually took this discussion away from MPEG PMT, which is where
Eric's question came from, because I don't understand how it works
with MPEG. But I can see that it's not just a problem of MPEG, but
also of Ogg (and possibly of WebM which can have multiple Segments).
So, I think we need a generic solution for it.


OK, I don't think we disagree. I'm just saying that for Icecast audio
streams, there is no problem.


Hmm.. because there is nothing in the API that actually exposes audio  
metadata?


Yes.


As for Ogg and WebM, I'm inclined to say that we just shouldn't support
that, unless there's some compelling use case for it.


You know that you can also transmit video with icecast...?


Nope :) I guess that invalidates everything I've said about Icecast.  
Practically, though, no one is using Icecast to mix audio tracks with  
audio+video tracks and getting upset that it doesn't work in browsers,  
right?


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-08 Thread Eric Carlson

On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote:

 Nothing exposed via the current API would change, AFAICT.
 
 Thus, after a change mid-stream to, say,  a smaller video width and
 height, would the video.videoWidth and video.videoHeight attributes
 represent the width and height of the previous stream or the current
 one?
 
 
 I agree that if we
 start exposing things like sampling rate or want to support arbitrary
 chained Ogg, then there is a problem.
 
 I think we already have a problem with width and height for chained
 Ogg and we cannot stop people from putting chained Ogg into the @src.
 
 I actually took this discussion away from MPEG PMT, which is where
 Eric's question came from, because I don't understand how it works
 with MPEG. But I can see that it's not just a problem of MPEG, but
 also of Ogg (and possibly of WebM which can have multiple Segments).
 So, I think we need a generic solution for it.
 
  The characteristics of an Apple HTTP live stream can change on the fly. For 
example if the user's bandwidth to the streaming server changes, the video 
width and height can change as the stream resolution is switched up or down, or 
the number of tracks can change when a stream switches from video+audio to 
audio only. In addition, a server can insert segments with different 
characteristics into a stream on the fly, eg. inserting an ad or emergency 
announcement.

  It is not possible to predict these changes before they occur.

eric



Re: [whatwg] Video feedback

2011-06-08 Thread Bob Lund


 -Original Message-
 From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-
 boun...@lists.whatwg.org] On Behalf Of Eric Carlson
 Sent: Wednesday, June 08, 2011 9:34 AM
 To: Silvia Pfeiffer; Philip Jägenstedt
 Cc: whatwg@lists.whatwg.org
 Subject: Re: [whatwg] Video feedback
 
 
 On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote:
 
  Nothing exposed via the current API would change, AFAICT.
 
  Thus, after a change mid-stream to, say,  a smaller video width and
  height, would the video.videoWidth and video.videoHeight attributes
  represent the width and height of the previous stream or the current
  one?
 
 
  I agree that if we
  start exposing things like sampling rate or want to support arbitrary
  chained Ogg, then there is a problem.
 
  I think we already have a problem with width and height for chained
  Ogg and we cannot stop people from putting chained Ogg into the @src.
 
  I actually took this discussion away from MPEG PMT, which is where
  Eric's question came from, because I don't understand how it works
  with MPEG. But I can see that it's not just a problem of MPEG, but
  also of Ogg (and possibly of WebM which can have multiple Segments).
  So, I think we need a generic solution for it.
 
   The characteristics of an Apple HTTP live stream can change on the
 fly. For example if the user's bandwidth to the streaming server
 changes, the video width and height can change as the stream resolution
 is switched up or down, or the number of tracks can change when a stream
 switches from video+audio to audio only. In addition, a server can
 insert segments with different characteristics into a stream on the fly,
 eg. inserting an ad or emergency announcement.
 
   It is not possible to predict these changes before they occur.
 
 eric

For commercial video providers, the tracks in a live stream change all the 
time; this is not limited to audio and video tracks but would include text 
tracks as well. 

Bob Lund



Re: [whatwg] Video feedback

2011-06-08 Thread Silvia Pfeiffer
On Thu, Jun 9, 2011 at 1:57 AM, Bob Lund b.l...@cablelabs.com wrote:


 -Original Message-
 From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-
 boun...@lists.whatwg.org] On Behalf Of Eric Carlson
 Sent: Wednesday, June 08, 2011 9:34 AM
 To: Silvia Pfeiffer; Philip Jägenstedt
 Cc: whatwg@lists.whatwg.org
 Subject: Re: [whatwg] Video feedback


 On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote:

  Nothing exposed via the current API would change, AFAICT.
 
  Thus, after a change mid-stream to, say,  a smaller video width and
  height, would the video.videoWidth and video.videoHeight attributes
  represent the width and height of the previous stream or the current
  one?
 
 
  I agree that if we
  start exposing things like sampling rate or want to support arbitrary
  chained Ogg, then there is a problem.
 
  I think we already have a problem with width and height for chained
  Ogg and we cannot stop people from putting chained Ogg into the @src.
 
  I actually took this discussion away from MPEG PMT, which is where
  Eric's question came from, because I don't understand how it works
  with MPEG. But I can see that it's not just a problem of MPEG, but
  also of Ogg (and possibly of WebM which can have multiple Segments).
  So, I think we need a generic solution for it.
 
   The characteristics of an Apple HTTP live stream can change on the
 fly. For example if the user's bandwidth to the streaming server
 changes, the video width and height can change as the stream resolution
 is switched up or down, or the number of tracks can change when a stream
 switches from video+audio to audio only. In addition, a server can
 insert segments with different characteristics into a stream on the fly,
 eg. inserting an ad or emergency announcement.

   It is not possible to predict these changes before they occur.

 eric

 For commercial video providers, the tracks in a live stream change all the 
 time; this is not limited to audio and video tracks but would include text 
 tracks as well.

OK, all this indicates to me that we probably want a metadatachanged
event to indicate there has been a change and that JS may need to
check some of its assumptions.

Silvia.


Re: [whatwg] Video feedback

2011-06-07 Thread Philip Jägenstedt
On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:




On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson i...@hixie.ch wrote:

On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:


I do not know how technically the change of stream composition works in
MPEG, but in Ogg we have to end a current stream and start a new one to
switch compositions. This has been called sequential multiplexing or
chaining. In this case, stream setup information is repeated, which
would probably lead to creating a new stream handler and possibly a new
firing of loadedmetadata. I am not sure how chaining is implemented in
browsers.


Per spec, chaining isn't currently supported. The closest thing I can find
in the spec to this situation is handling a non-fatal error, which causes
the unexpected content to be ignored.


On Fri, 17 Dec 2010, Eric Winkelman wrote:


The short answer for changing stream composition is that there is a
Program Map Table (PMT) that is repeated every 100 milliseconds and
describes the content of the stream.  Depending on the programming, the
stream's composition could change entering/exiting every advertisement.


If this is something that browser vendors want to support, I can specify
how to handle it. Anyone?


Icecast streams have chained files, so streaming Ogg to an audio
element would hit this problem. There is a bug in FF for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
also a webkit bug for icecast streaming, which is probably related
https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
is able to deal with icecast streams, but it seems to deal with it.

The thing is: you can implement playback and seeking without any
further changes to the spec. But then the browser-internal metadata
states will change depending on the chunk you're on. Should that also
update the exposed metadata in the API then? Probably yes, because
otherwise the JS developer may deal with contradictory information.
Maybe we need a metadatachange event for this?


An Icecast stream is conceptually just one infinite audio stream, even  
though at the container level it is several chained Ogg streams. duration  
will be Infinity and currentTime will be constantly increasing. This  
doesn't seem to be a case where any spec change is needed. Am I missing  
something?


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-03 Thread Philip Jägenstedt

On Fri, 03 Jun 2011 01:28:45 +0200, Ian Hickson i...@hixie.ch wrote:


 On Fri, 22 Oct 2010, Simon Pieters wrote:


Actually it was me, but that's OK :)


  There was also some discussion about metadata. Language is sometimes
  necessary for the font engine to pick the right glyph.

 Could you elaborate on this? My assumption was that we'd just use CSS,
 which doesn't rely on language for this.

It's not in any spec that I'm aware of, but some browsers (including
Opera) pick different glyphs depending on the language of the text,
which really helps when rendering CJK when you have several CJK fonts on
the system. Browsers will already know the language from track
srclang, so this would be for external players.


How is this problem solved in SRT players today?


Not at all, it seems. Both VLC and Totem allow setting the character  
encoding and font used for subtitles in the (global) preferences menu, so  
presumably you would change that if the default doesn't work. Font  
switching seems to mainly be an issue when your system has other default  
fonts than the text you're reading, and it appears that is rare enough  
that very little software does anything about it, browsers perhaps being  
an exception.





On Mon, 3 Jan 2011, Philip Jägenstedt wrote:


  * The bad cue handling is stricter than it should be. After
  collecting an id, the next line must be a timestamp line. Otherwise,
  we skip everything until a blank line, so in the following the
  parser would jump to bad cue on line 2 and skip the whole cue.
 
  1
  2
  00:00:00.000 --> 00:00:01.000
  Bla
 
  This doesn't match what most existing SRT parsers do, as they simply
  look for timing lines and ignore everything else. If we really need
  to collect the id instead of ignoring it like everyone else, this
  should be more robust, so that a valid timing line always begins a
  new cue. Personally, I'd prefer if it is simply ignored and that we
  use some form of in-cue markup for styling hooks.

 The IDs are useful for referencing cues from script, so I haven't
 removed them. I've also left the parsing as is for when neither the
 first nor second line is a timing line, since that gives us a lot of
 headroom for future extensions (we can do anything so long as the
 second line doesn't start with a timestamp and --> and another
 timestamp).

In the case of feeding future extensions to current parsers, it's way
better fallback behavior to simply ignore the unrecognized second line
than to discard the entire cue. The current behavior seems unnecessarily
strict and makes the parser more complicated than it needs to be. My
preference is just ignore anything preceding the timing line, but even
if we must have IDs it can still be made simpler and more robust than
what is currently spec'ed.


If we just ignore content until we hit a line that happens to look like a
timing line, then we are much more constrained in what we can do in the
future. For example, we couldn't introduce a comment block syntax, since
any comment containing a timing line wouldn't be ignored. On the other
hand if we keep the syntax as it is now, we can introduce a comment block
just by having its first line include a --> but not have it match the
timestamp syntax, e.g. by having it be --> COMMENT or some such.


One of us must be confused, do you mean something like this?

1
--> COMMENT
00:00.000 --> 00:01.000
Cue text

Adding this syntax would break the *current* parser, as it would fail in  
step 39 (Collect WebVTT cue timings and settings) and then skip the rest  
of the cue. If we want any room for extensions along these lines, then  
multiple lines preceding the timing line must be handled gracefully.



Looking at the parser more closely, I don't really see how doing anything
more complex than skipping the block entirely would be simpler than what
we have now, anyway.


I suggest:

 * Step 31: Try to collect WebVTT cue timings and settings instead of
checking for the substring -->. If it succeeds, jump to what is now step
40. If it fails, continue at what is now step 32. (This allows adding any
syntax as long as it doesn't exactly match a timing line, including
--> COMMENT. As a bonus, one can fail faster when trying to parse an
entire timing line rather than doing a substring search for -->.)


 * Step 32: Only set the id line if it's not already set. (Assuming we  
want the first line to be the id line in future extensions.)


 * Step 39: Jump to the new step 31.

In case not every detail is correct, the idea is to first try to match a  
timing line and to take the first line that is not a timing line (if any)  
as the id, leaving everything in between open for future syntax changes,  
even if they use -->.
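A rough sketch of the block handling suggested here (tryCollectTimingsAndSettings
is a hypothetical helper standing in for the "collect WebVTT cue timings and
settings" step; this is not the spec'ed parser):

  function parseCueBlock(lines) {
    var id = null;
    for (var i = 0; i < lines.length; i++) {
      var timings = tryCollectTimingsAndSettings(lines[i]);  // hypothetical helper
      if (timings) {
        return { id: id, timings: timings, text: lines.slice(i + 1).join('\n') };
      }
      if (id === null) {
        id = lines[i];  // the first non-timing line becomes the id
      }
      // Any further non-timing lines before the timings are ignored, which
      // leaves room for future block-level syntax such as a "--> COMMENT" line.
    }
    return null;  // no timing line at all: skip the block
  }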


I think it's fairly important that we handle this. Double id lines is an  
easy mistake to make when copying things around. Silently dropping those  
cues would be worse than what many existing (line-based, id-ignoring) SRT  
parsers do.





On Sat, 22 Jan 2011, 

Re: [whatwg] Video feedback

2011-06-03 Thread Silvia Pfeiffer
I'll be replying to WebVTT related stuff in a separate thread. Here
just feedback on the other stuff.

(Incidentally: why is there details element feedback in here with
video? I don't really understand the connection.)



On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson i...@hixie.ch wrote:
 On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:

 I do not know how technically the change of stream composition works in
 MPEG, but in Ogg we have to end a current stream and start a new one to
 switch compositions. This has been called sequential multiplexing or
 chaining. In this case, stream setup information is repeated, which
 would probably lead to creating a new steam handler and possibly a new
 firing of loadedmetadata. I am not sure how chaining is implemented in
 browsers.

 Per spec, chaining isn't currently supported. The closest thing I can find
 in the spec to this situation is handling a non-fatal error, which causes
 the unexpected content to be ignored.


 On Fri, 17 Dec 2010, Eric Winkelman wrote:

 The short answer for changing stream composition is that there is a
 Program Map Table (PMT) that is repeated every 100 milliseconds and
 describes the content of the stream.  Depending on the programming, the
 stream's composition could change entering/exiting every advertisement.

 If this is something that browser vendors want to support, I can specify
 how to handle it. Anyone?

Icecast streams have chained files, so streaming Ogg to an audio
element would hit this problem. There is a bug in FF for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
also a webkit bug for icecast streaming, which is probably related
https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
is able to deal with icecast streams, but it seems to deal with it.

The thing is: you can implement playback and seeking without any
further changes to the spec. But then the browser-internal metadata
states will change depending on the chunk you're on. Should that also
update the exposed metadata in the API then? Probably yes, because
otherwise the JS developer may deal with contradictory information.
Maybe we need a metadatachange event for this?



 On Tue, 24 May 2011, Silvia Pfeiffer wrote:

 Ian and I had a brief conversation recently where I mentioned a problem
 with extended text descriptions with screen readers (and worse still
 with braille devices) and the suggestion was that the paused for user
 interaction state of a media element may be the solution. I would like
 to pick this up and discuss in detail how that would work to confirm my
 sketchy understanding.

 *The use case:*

 In the specification for media elements we have a track kind of
 descriptions, which are:
 Textual descriptions of the video component of the media resource,
 intended for audio synthesis when the visual component is unavailable
 (e.g. because the user is interacting with the application without a
 screen while driving, or because the user is blind). Synthesized as a
 separate audio track.

 I'm for now assuming that the synthesis will be done through a screen
 reader and not through the browser itself, thus making the
 descriptions available to users as synthesized audio or as braille if
 the screen reader is set up for a braille device.

 The textual descriptions are provided as chunks of text with a start
 and a end time (so-called cues). The cues are processed during video
 playback as the video's playback time starts to fall within the time
 frame of the cue. Thus, it is expected the that cues are consumed
 during the cue's time frame and are not present any more when the end
 time of the cue is reached, so they don't conflict with the video's
 normal audio.

 However, on many occasions, it is not possible to consume the cue text
 in the given time frame. In particular not in the following
 situations:

 1. The screen reader takes longer to read out the cue text than the
 cue's time frame provides for. This is particularly the case with long
 cue text, but also when the screen reader's reading rate is slower
 than what the author of the cue text expected.

 2. The braille device is used for reading. Since reading braille is
 much slower than listening to read-out text, the cue time frame will
 invariably be too short.

 3. The user seeked right into the middle of a cue and thus the time
 frame that is available for reading out the cue text is shorter than
 the cue author calculated with.

 Correct me if I'm wrong, but it seems that what we need is a way for
 the screen reader to pause the video element from continuing to play
 while the screen reader is still busy delivering the cue text. (In
 a11y talk: what is required is a means to deal with extended
 descriptions, which extend the timeline of the video.) Once it's
 finished presenting, it can resume the video element's playback.

 Is it a requirement that the user be able to use the regular 

Re: [whatwg] Video feedback

2011-06-02 Thread Glenn Maynard
On Thu, Jun 2, 2011 at 7:28 PM, Ian Hickson i...@hixie.ch wrote:
 We can add comments pretty easily (e.g. we could say that <! starts a
 comment and > ends it -- that's already being ignored by the current
 parser), if people really need them. But are comments really that useful?
 Did SRT have problems due to not supporting inline comments? (Or did it
 support inline comments?)

I've only worked with SSA subtitles (fansubbing), where {text in
braces} effectively worked as a comment.  We used them a lot to
communicate between editors on a phrase-by-phrase basis.

But for that use case, using hidden spans makes more sense, since you
can toggle them on and off to view them inline, etc.

Given that, I'd be fine with a comment format that doesn't allow
mid-cue comments, if it makes the format simpler.

 The text on the left is a transcription, the top is a transliteration,
 and the bottom is a translation.

 Aren't these three separate text tracks?

They're all in the same track, in practice, since media players don't
play multiple subtitle tracks.

It's true that having them in separate tracks would be better, so they
can be disabled individually.  This is probably rare enough that it
should just be sorted out with scripts, at least to start.

 It's not clear to me that we need language information to apply proper
 font selection and word wrapping, since CSS doesn't do it.

But it doesn't have to, since HTML does this with @lang.

 Mixing one CJK language with one non-CJK language seems fine. That should
 always work, assuming you specify good fonts in the CSS.

The font is ultimately in the user's control.  I tell Firefox to
always use Tahoma for Western text and MS Gothic for Japanese text,
ignoring the often ugly site-specified fonts.  The only control sites
have over my fonts is the language they say the text is (or which the
whole page is detected as).  The same principle seems to apply for
captions.

(That's not to say that it's important enough to add yet and I'm fine
with punting on this, at least for now.  I just don't think specifying
fonts is the right solution.)

The most straightforward solution would seem to be having @lang be a
CSS property; I don't know the rationale for this being done by HTML
instead.

 I don't understand why we can't have good typography for CJK and non-CJK
 together. Surely there are fonts that get both right?

I've never seen a Japanese font that didn't look terrible for English
text.  Also, I don't want my font selection to be severely limited due
to the need to use a single font for both languages, instead of using
the right font for the right text.

 One example of how this can be tricky: at 0:17, a caption on the bottom
 wraps and takes two lines, which then pushes the line at 0:19 upward
 (that part's simple enough).  If instead the top part had appeared
 first, the renderer would need to figure out in advance to push it
 upwards, to make space for the two-line caption underneath it.
 Otherwise, the captions would be forced to switch places.

 Right, without lookahead I don't know how you'd solve it. With lookahead
 things get pretty dicey pretty quickly.

The problem is that, at least here, the whole scene is nearly
incomprehensible if the top/bottom arrangement isn't maintained.
Lacking anything better, I suspect authors would use similar brittle
hacks with WebVTT.

Anyway, I don't have a simple solution either.

 I think that, no matter what you do, people will insert line breaks in
 cues.  I'd follow the HTML model here: convert newlines to spaces and
 have a separate, explicit line break like <br> if needed, so people
 don't manually line-break unless they actually mean to.

 The line-breaks-are-line-breaks feature is one of the features that
 originally made SRT seem like a good idea. It still seems like the neatest
 way of having a line break.

But does this matter?  Line breaks within a cue are relatively
uncommon in my experience (perhaps it's different for other
languages), compared to how many people will insert line breaks in a
text editor simply to break lines while authoring.  If you do this
while testing on a large monitor, it's likely to look reasonable when
rendered; the brokenness won't show up until it's played in a smaller
window.  Anyone using a non-programmer's text editor that doesn't
handle long lines cleanly is likely to do this.

Wrapping lines manually in SRTs also appears to be common (even
standard) practice, perhaps due to inadequate line wrapping in SRT
renderers.  Making line breaks explicit should help keep people from
translating this habit to WebVTT.

 Related to line breaking, should there be an &nbsp; escape?  Inserting
 nbsp literally into files is somewhat annoying for authoring, since
 they're indistinguishable from regular spaces.

 How common would &nbsp; be?

I guess the main cases I've used nbsp for don't apply so much to
captions, eg. ©&nbsp;2011 (likely to come at the start of a caption,
so not likely to be wrapped anyway).

 We 

Re: [whatwg] video feedback

2010-02-10 Thread Brian Campbell
On Feb 9, 2010, at 9:03 PM, Ian Hickson wrote:

 On Sat, 31 Oct 2009, Brian Campbell wrote:
 
 As a multimedia developer, I am wondering about the purpose of the timeupdate
 event on media elements.
 
 Its primary use is keeping the UIs updated (specifically the timers and 
 the scrubber bars).
 
 
 On first glance, it would appear that this event would be useful for 
 synchronizing animations, bullets, captions, UI, and the like.
 
 Synchronising accompanying slides and animations won't work that well with 
 an event, since you can't guarantee the timing of the event or anything 
 like that. For anything where we want reliable synchronisation of multiple 
 media, I think we need a more serious solution -- either something like 
 SMIL, or the SMIL subset found in SVG, or some other solution.

Yes, but that doesn't exist at the moment, so our current choices are to use 
timeupdate and to use setInterval().
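For illustration, a minimal sketch of the setInterval() approach (the cue data
and the updateBulletList helper are invented): poll currentTime often enough
for smooth synchronisation, since timeupdate may fire only a few times per
second:

  var video = document.querySelector('video');
  var bullets = [ { time: 1.0, text: 'First point' },
                  { time: 3.5, text: 'Second point' } ];
  setInterval(function () {
    var t = video.currentTime;
    var visible = bullets.filter(function (b) { return b.time <= t; });
    updateBulletList(visible);   // hypothetical function that redraws the bullets
  }, 33);                        // roughly once per displayed frame at 30 fps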

 At 4 timeupdate events per second, it isn't all that useful. I can 
 replace it with setInterval, at whatever rate I want, query the time, 
 and get the synchronization I need, but that makes the timeupdate event 
 seem to be redundant.
 
 The important thing with timeupdate is that it also fires whenever the 
 time changes in a significant way, e.g. immediately after a seek, or when 
 reaching the end of the resource, etc. Also, the user agent can start 
 lowering the rate in the face of high CPU load, which makes it more 
 user-friendly than setInterval().

I agree, it is important to be able to reduce the rate in the face of high CPU 
load, but as currently implemented in WebKit, if you use timeupdate to keep 
anything in sync with the video, it feels fairly laggy and jerky. This means 
that for higher quality synchronization, you need to use setInterval, which 
defeats the purpose of making timeupdate more user friendly.

Perhaps this is just a bug I should file to WebKit, as they are choosing an 
update interval at the extreme end of the allowed range for their default 
behavior; but I figured that it might make sense to mention a reasonable 
default value (such as 30 times per second, or once per frame displayed) in the 
spec, to give some guidance to browser vendors about what authors will be 
expecting.

 On Thu, 5 Nov 2009, Brian Campbell wrote:
 
 Would something like video firing events for every frame rendered 
 help you out?  This would help also fix the canvas over/under 
 painting issue and improve synchronization.
 
 Yes, this would be considerably better than what is currently specced.
 
 There surely is a better solution than copying data from the video 
 element to a canvas on every frame for whatever the problem that that 
 solves is. What is the actual use case where you'd do that?

This was not my use case (my use case was just synchronizing bullets, slide 
transitions, and animations to video), but an example I can think of is using 
this to composite video. Most (if not all) video formats supported by video 
in the various browsers do not store alpha channel information. In order to 
composite video against a dynamic background, authors may copy video data to a 
canvas, then paint transparent to all pixels matching a given color.
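A rough sketch of that colour-keying technique (the element ids and the key
colour are invented; this only illustrates the kind of per-pixel work being
discussed, not production code):

  var video = document.getElementById('v');
  var canvas = document.getElementById('c');
  var ctx = canvas.getContext('2d');

  function paintFrame() {
    if (video.paused || video.ended) return;
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
    var d = frame.data;
    for (var i = 0; i < d.length; i += 4) {
      if (d[i] < 100 && d[i + 1] > 180 && d[i + 2] < 100) {
        d[i + 3] = 0;   // make sufficiently green pixels fully transparent
      }
    }
    ctx.putImageData(frame, 0, 0);
    setTimeout(paintFrame, 33);   // roughly once per frame at 30 fps
  }

  video.addEventListener('play', paintFrame);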

This use case would clearly be better served by video formats that include 
alpha information, and implementations that support compositing video over 
other content, but given that we're having trouble finding any video format at 
all that the browsers can agree on, this seems to be a long way off, so 
stop-gap measures may be useful in the interim.

Compositing video over dynamic content is actually an extremely important use 
case for rich, interactive multimedia, which I would like to encourage browser 
vendors to implement, but I'm not even sure where to start, given the situation 
on formats and codecs. I believe I've seen this discussed in Theora, but it 
never went anywhere, and I don't have any idea how I'd even start getting involved in 
the MPEG standardization process.

 On Thu, 5 Nov 2009, Andrew Scherkus wrote:
 
 I'll see if we can do something for WebKit based browsers, because today 
 it literally is hardcoded to 250ms for all ports. 
 http://trac.webkit.org/browser/trunk/WebCore/html/HTMLMediaElement.cpp#L1254
 
 Maybe we'll end up firing events based on frame updates for video, and 
 something arbitrary for audio (as it is today).
 
 I strongly recommend making the ontimeupdate rate be sensitive to system 
 load, and no faster than one frame per second.

I'm assuming that you mean no faster than once per frame?

 On Fri, 6 Nov 2009, Philip Jägenstedt wrote:
 
 We've considered firing it for each frame, but there is one problem. If 
 people expect that it fires once per frame they will probably write 
 scripts which do frame-based animations by moving things n pixels per 
 frame or similar. Some animations are just easier to do this way, so 
 there's no reason to think that people won't do it. This will break 
 horribly if a browser is 

Re: [whatwg] video feedback

2010-02-10 Thread Eric Carlson

On Feb 10, 2010, at 8:01 AM, Brian Campbell wrote:

 On Feb 9, 2010, at 9:03 PM, Ian Hickson wrote:
 
 On Sat, 31 Oct 2009, Brian Campbell wrote:
 
 At 4 timeupdate events per second, it isn't all that useful. I can 
 replace it with setInterval, at whatever rate I want, query the time, 
 and get the synchronization I need, but that makes the timeupdate event 
 seem to be redundant.
 
 The important thing with timeupdate is that it also fires whenever the 
 time changes in a significant way, e.g. immediately after a seek, or when 
 reaching the end of the resource, etc. Also, the user agent can start 
 lowering the rate in the face of high CPU load, which makes it more 
 user-friendly than setInterval().
 
 I agree, it is important to be able to reduce the rate in the face of high 
 CPU load, but as currently implemented in WebKit, if you use timeupdate to 
 keep anything in sync with the video, it feels fairly laggy and jerky. This 
 means that for higher quality synchronization, you need to use setInterval, 
 which defeats the purpose of making timeupdate more user friendly.
 
 Perhaps this is just a bug I should file to WebKit, as they are choosing an 
 update interval at the extreme end of the allowed range for their default 
 behavior; but I figured that it might make sense to mention a reasonable 
 default value (such as 30 times per second, or once per frame displayed) in 
 the spec, to give some guidance to browser vendors about what authors will be 
 expecting.
 
  I disagree that 30 times per second is a reasonable default. I understand 
that it would be useful for what you want to do, but your use case is not 
typical. I think most pages won't listen for 'timeupdate' events at all, so 
instead of making every page incur the extra overhead of waking up, allocating, 
queueing, and firing an event 30 times per second, WebKit sticks with the 
minimum frequency the spec mandates, figuring that people like you who need 
something more can roll their own.


 On Thu, 5 Nov 2009, Brian Campbell wrote:
 
 Would something like video firing events for every frame rendered 
 help you out?  This would help also fix the canvas over/under 
 painting issue and improve synchronization.
 
 Yes, this would be considerably better than what is currently specced.
 
 There surely is a better solution than copying data from the video 
 element to a canvas on every frame for whatever the problem that that 
 solves is. What is the actual use case where you'd do that?
 
 This was not my use case (my use case was just synchronizing bullets, slide 
 transitions, and animations to video), but an example I can think of is using 
 this to composite video. Most (if not all) video formats supported by video 
 in the various browsers do not store alpha channel information. In order to 
 composite video against a dynamic background, authors may copy video data to 
 a canvas, then paint transparent to all pixels matching a given color.
 
 This use case would clearly be better served by video formats that include 
 alpha information, and implementations that support compositing video over 
 other content, but given that we're having trouble finding any video format 
 at all that the browsers can agree on, this seems to be a long way off, so 
 stop-gap measures may be useful in the interim.
 
 Compositing video over dynamic content is actually an extremely important use 
 case for rich, interactive multimedia, which I would like to encourage 
 browser vendors to implement, but I'm not even sure where to start, given the 
 situation on formats and codecs. I believe I've seen this discussed in 
 Theora, but never went anywhere, and I don't have any idea how I'd even start 
 getting involved in the MPEG standardization process.
 
  Have you actually tried this? Rendering video frames to a canvas and 
processing every pixel from script is *extremely* processor intensive; you are 
unlikely to get a reasonable frame rate. 

  The H.264 does support alpha (see AVC spec 2nd edition, section 7.3.2.1.2 
Sequence parameter set extension), but we do not support it correctly in WebKit 
at the moment. *Please* file bugs against WebKit if you would like to see this 
properly supported. QuickTime movies support alpha for a number of video 
formats (eg. png, animation, lossless, etc), you might give that a try.

eric


Re: [whatwg] video feedback

2010-02-10 Thread Boris Zbarsky

On 2/10/10 1:37 PM, Eric Carlson wrote:

   Have you actually tried this? Rendering video frames to a canvas and 
processing every pixel from script is *extremely* processor intensive, you are 
unlikely to get reasonable frame rate.


There's a demo that does just this at 
http://people.mozilla.com/~prouget/demos/green/green.xhtml


-Boris


Re: [whatwg] video feedback

2010-02-10 Thread Brian Campbell
On Feb 10, 2010, at 1:37 PM, Eric Carlson wrote:

 
 On Feb 10, 2010, at 8:01 AM, Brian Campbell wrote:
 
 On Feb 9, 2010, at 9:03 PM, Ian Hickson wrote:
 
 On Sat, 31 Oct 2009, Brian Campbell wrote:
 
 At 4 timeupdate events per second, it isn't all that useful. I can 
 replace it with setInterval, at whatever rate I want, query the time, 
 and get the synchronization I need, but that makes the timeupdate event 
 seem to be redundant.
 
 The important thing with timeupdate is that it also fires whenever the 
 time changes in a significant way, e.g. immediately after a seek, or when 
 reaching the end of the resource, etc. Also, the user agent can start 
 lowering the rate in the face of high CPU load, which makes it more 
 user-friendly than setInterval().
 
 I agree, it is important to be able to reduce the rate in the face of high 
 CPU load, but as currently implemented in WebKit, if you use timeupdate to 
 keep anything in sync with the video, it feels fairly laggy and jerky. This 
 means that for higher quality synchronization, you need to use setInterval, 
 which defeats the purpose of making timeupdate more user friendly.
 
 Perhaps this is just a bug I should file to WebKit, as they are choosing an 
 update interval at the extreme end of the allowed range for their default 
 behavior; but I figured that it might make sense to mention a reasonable 
 default value (such as 30 times per second, or once per frame displayed) in 
 the spec, to give some guidance to browser vendors about what authors will 
 be expecting.
 
 I disagree that 30 times per second is a reasonable default. I understand 
 that it would be useful for what you want to do, but your use case is not a 
 typical. I think most pages won't listen for 'timeupdate' events at all so 
 instead of making every page incur the extra overhead of waking up, 
 allocating, queueing, and firing an event 30 times per second, WebKit sticks 
 with  the minimum frequency the spec mandates figuring that people like you 
 that need something more can roll their own.

Do browsers fire events for which there are no listeners? It seems like it 
would be easiest to just not fire these events if no one is listening to them.

And as Ian pointed out, just basic video UI can be better served by having at 
least 10 updates per second, if you want to show time at a resolution of tenths 
of a second.
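For that simpler UI case a timeupdate handler is enough; a minimal sketch
(the readout element id is invented):

  var video = document.querySelector('video');
  var readout = document.getElementById('time-readout');
  video.addEventListener('timeupdate', function () {
    // timeupdate also fires after seeks and when playback ends, so the
    // readout never sticks at a stale value.
    readout.textContent = video.currentTime.toFixed(1) + ' s';
  });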

 On Thu, 5 Nov 2009, Brian Campbell wrote:
 
 Would something like video firing events for every frame rendered 
 help you out?  This would help also fix the canvas over/under 
 painting issue and improve synchronization.
 
 Yes, this would be considerably better than what is currently specced.
 
 There surely is a better solution than copying data from the video 
  element to a canvas on every frame, whatever problem that is meant to solve. 
  What is the actual use case where you'd do that?
 
 This was not my use case (my use case was just synchronizing bullets, slide 
 transitions, and animations to video), but an example I can think of is 
 using this to composite video. Most (if not all) video formats supported by 
 video in the various browsers do not store alpha channel information. In 
 order to composite video against a dynamic background, authors may copy 
 video data to a canvas, then make all pixels matching a given color 
 transparent.
 
 This use case would clearly be better served by video formats that include 
 alpha information, and implementations that support compositing video over 
 other content, but given that we're having trouble finding any video format 
 at all that the browsers can agree on, this seems to be a long way off, so 
 stop-gap measures may be useful in the interim.
 
 Compositing video over dynamic content is actually an extremely important 
 use case for rich, interactive multimedia, which I would like to encourage 
 browser vendors to implement, but I'm not even sure where to start, given 
 the situation on formats and codecs. I believe I've seen this discussed for 
 Theora, but it never went anywhere, and I don't have any idea how I'd even 
 start getting involved in the MPEG standardization process.
 
 Have you actually tried this? Rendering video frames to a canvas and 
  processing every pixel from script is *extremely* processor intensive; you 
  are unlikely to get a reasonable frame rate. 

Mozilla has a demo of this working, in Firefox only:

https://developer.mozilla.org/samples/video/chroma-key/index.xhtml

But no, this isn't something I would consider to be production quality. But 
perhaps if the WebGL typed arrays catch on, and start being used in more 
places, you might be able to start doing this with reasonable performance.
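
For reference, the core of that approach boils down to something like the
sketch below; the element names and the keying threshold are illustrative,
not taken from the Mozilla demo:

    var video  = document.querySelector('video');
    var canvas = document.querySelector('canvas');
    var ctx    = canvas.getContext('2d');

    function paintFrame() {
      if (video.paused || video.ended) return;
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
      var d = frame.data;                      // flat RGBA byte array
      for (var i = 0; i < d.length; i += 4) {
        // Very crude key: treat "mostly green" pixels as background.
        if (d[i] < 100 && d[i + 1] > 180 && d[i + 2] < 100) {
          d[i + 3] = 0;                        // alpha = 0, i.e. transparent
        }
      }
      ctx.putImageData(frame, 0, 0);
      setTimeout(paintFrame, 40);              // ~25fps, every byte touched from script
    }
    video.addEventListener('play', paintFrame, false);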

  H.264 does support alpha (see AVC spec 2nd edition, section 7.3.2.1.2 
 Sequence parameter set extension), but we do not support it correctly in 
 WebKit at the moment. *Please* file bugs against WebKit if you would like to 
 see this properly supported. QuickTime movies support alpha for 

Re: [whatwg] video feedback

2010-02-10 Thread Boris Zbarsky

On 2/10/10 2:19 PM, Brian Campbell wrote:

Do browsers fire events for which there are no listeners?


It varies.  Gecko, for example, fires image load events no matter what 
but only fires mutation events if there are listeners.


-Boris


Re: [whatwg] video feedback

2010-02-10 Thread Jonas Sicking
On Wed, Feb 10, 2010 at 11:29 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 2/10/10 2:19 PM, Brian Campbell wrote:

 Do browsers fire events for which there are no listeners?

 It varies.  Gecko, for example, fires image load events no matter what but
 only fires mutation events if there are listeners.

However, checking for listeners has a non-trivial cost. You have to
walk the full parentNode chain and see if any of the parents has a
listener. This applies to both bubbling and non-bubbling events due to
the capture phase.
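
Roughly, before dispatching any event an engine would have to do something
like the sketch below, where hasListener() stands in for whatever per-node
bookkeeping it keeps; both names are purely illustrative:

    // Even for a non-bubbling event, a capturing listener on any ancestor
    // can observe it, so the whole ancestor chain has to be checked.
    function anyListenerFor(target, type) {
      for (var node = target; node; node = node.parentNode) {
        if (hasListener(node, type)) {   // hasListener: hypothetical internal check
          return true;
        }
      }
      return false;
    }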

Also, a feature which requires implementations to optimize for the
feature not being used seems like a questionable feature to me. We
want people to use the stuff we're creating; there's little point
otherwise.

/ Jonas


Re: [whatwg] video feedback

2010-02-10 Thread Robert O'Callahan
On Thu, Feb 11, 2010 at 8:19 AM, Brian Campbell lam...@continuation.org wrote:

 But no, this isn't something I would consider to be production quality. But
 perhaps if the WebGL typed arrays catch on, and start being used in more
 places, you might be able to start doing this with reasonable performance.


With WebGL you could do the chroma-key processing on the GPU, and
performance should be excellent. In fact you could probably prototype this
today in Firefox.
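
A rough sketch of what that prototype could look like, assuming a <video
id=vid> and a <canvas id=out> in the page; the ids, thresholds and key colour
are illustrative, and older builds expose the context as "experimental-webgl":

    var video  = document.getElementById('vid');
    var canvas = document.getElementById('out');
    var gl = canvas.getContext('webgl') || canvas.getContext('experimental-webgl');

    var vsSource =
      'attribute vec2 pos;' +
      'varying vec2 uv;' +
      'void main() {' +
      // Map clip space to image coordinates (origin at the top left).
      '  uv = vec2((pos.x + 1.0) * 0.5, (1.0 - pos.y) * 0.5);' +
      '  gl_Position = vec4(pos, 0.0, 1.0);' +
      '}';
    var fsSource =
      'precision mediump float;' +
      'varying vec2 uv;' +
      'uniform sampler2D frame;' +
      'void main() {' +
      '  vec4 c = texture2D(frame, uv);' +
      // Crude key: fade out pixels near pure green; premultiply for compositing.
      '  float a = smoothstep(0.3, 0.6, distance(c.rgb, vec3(0.0, 1.0, 0.0)));' +
      '  gl_FragColor = vec4(c.rgb * a, a);' +
      '}';

    function compile(type, source) {
      var shader = gl.createShader(type);
      gl.shaderSource(shader, source);
      gl.compileShader(shader);
      return shader;
    }
    var prog = gl.createProgram();
    gl.attachShader(prog, compile(gl.VERTEX_SHADER, vsSource));
    gl.attachShader(prog, compile(gl.FRAGMENT_SHADER, fsSource));
    gl.linkProgram(prog);
    gl.useProgram(prog);

    // A full-screen quad made of two triangles.
    gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
    gl.bufferData(gl.ARRAY_BUFFER,
      new Float32Array([-1, -1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1]),
      gl.STATIC_DRAW);
    var pos = gl.getAttribLocation(prog, 'pos');
    gl.enableVertexAttribArray(pos);
    gl.vertexAttribPointer(pos, 2, gl.FLOAT, false, 0, 0);

    // Texture that receives the current video frame each time we draw.
    gl.bindTexture(gl.TEXTURE_2D, gl.createTexture());
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

    function draw() {
      if (video.readyState >= video.HAVE_CURRENT_DATA) {
        // Upload the frame; the per-pixel keying runs in the fragment shader.
        gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
        gl.drawArrays(gl.TRIANGLES, 0, 6);
      }
      setTimeout(draw, 40);   // ~25fps
    }
    draw();

All the per-pixel work stays on the GPU, which is why this should be so much
faster than a getImageData() loop in script.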

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] video feedback

2010-02-10 Thread Gregory Maxwell
On Wed, Feb 10, 2010 at 4:37 PM, Robert O'Callahan rob...@ocallahan.org wrote:
 On Thu, Feb 11, 2010 at 8:19 AM, Brian Campbell lam...@continuation.org
 wrote:

 But no, this isn't something I would consider to be production quality.
 But perhaps if the WebGL typed arrays catch on, and start being used in more
 places, you might be able to start doing this with reasonable performance.

 With WebGL you could do the chroma-key processing on the GPU, and
 performance should be excellent. In fact you could probably prototype this
 today in Firefox.

You're not going to get solid professional quality keying results just
by depending on a client side keying algorithm, even a computationally
expensive one, without the ability to perform manual fixups.

Being able to manipulate video data on the client is a powerful tool,
but it's not necessarily the right tool for every purpose.


Re: [whatwg] video feedback

2010-02-10 Thread Silvia Pfeiffer
On Thu, Feb 11, 2010 at 3:01 AM, Brian Campbell lam...@continuation.org wrote:
 On Feb 9, 2010, at 9:03 PM, Ian Hickson wrote:

 On Sat, 7 Nov 2009, Silvia Pfeiffer wrote:

 I use timeupdate to register a callback that will update
 captions/subtitles.

 That's only a temporary situation, though, so it shouldn't inform our
 decision. We should in due course develop much better solutions for
 captions and time-synchronised animations.

 The problem is, due to the slow pace of standards and browser development, we 
 can sometimes be stuck with a temporary feature for many years. How long 
 until enough IE users support HTML6 (or whatever standard includes a 
 time-synchronization feature) for it to be usable? 10, 15 years?

Even when we have a standard means of associating captions/subtitles
with audio/video, we will still want to allow for overriding their default
presentation and doing it all in JavaScript ourselves.

I have just been pointed to a cool lyrics demo at
http://svg-wow.org/audio/animated-lyrics.html which uses an audio file
and essentially a caption file to display the lyrics in sync in SVG.
The problem is: they are using setInterval and setTimeout on the audio and
that breaks synchronisation for me - probably because loading the
audio over the network takes longer than no time.

Honestly, you cannot use setInterval for synchronising with a/v. You
really need timeupdate.
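
The structure I have in mind is roughly the following sketch; the cue data
and element names are made up for illustration, and this is not how the
svg-wow demo is written today:

    var audio = document.querySelector('audio');
    var lyricsEl = document.getElementById('lyrics');
    // Cues sorted by start time, in seconds; purely illustrative data.
    var cues = [
      { start: 0.0, text: '' },
      { start: 2.5, text: 'First line of the lyrics' },
      { start: 6.1, text: 'Second line of the lyrics' }
    ];
    audio.addEventListener('timeupdate', function () {
      var t = audio.currentTime;   // always read the clock, never count ticks
      var current = cues[0];
      for (var i = 0; i < cues.length && cues[i].start <= t; i++) {
        current = cues[i];
      }
      lyricsEl.textContent = current.text;
    }, false);

Because the display is driven by currentTime rather than by counting timer
ticks, a late or dropped event only delays an update; it cannot make the
lyrics drift.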

Maybe one option for pages that need a higher event firing rate than
the browser's default is to introduce a JavaScript API that lets
it be set to anything between once per frame (25Hz) and every 250ms
(4Hz)? I'm just wary of what it may do to the responsiveness of the
browser, and whether the browser could refuse if it knew it would kill
performance.
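
To be concrete, I am imagining something along these lines; the attribute is
entirely hypothetical and exists in no browser or spec today:

    // Entirely hypothetical attribute, only to illustrate the proposal above.
    var video = document.querySelector('video');
    video.timeupdateFrequency = 25;   // ask for up to 25 'timeupdate' events per second
    video.addEventListener('timeupdate', function () {
      // redraw captions/animations here
    }, false);
    // The browser would remain free to clamp the rate back towards 4Hz under load.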

Cheers,
Silvia.


Re: [whatwg] video feedback

2009-04-28 Thread Ian Hickson
On Thu, 26 Mar 2009, Matthew Gregan wrote:
 At 2009-03-25T10:16:32+, Ian Hickson wrote:
  On Fri, 13 Mar 2009, Matthew Gregan wrote:
   It's possible that neither a 'play' nor 'playing' event will be fired
   when a media element that has ended playback is played again.  When
   first played, paused is set to false.  When played again, playback has
   ended, so play() seeks to the beginning, but paused does not change (as
   it's already false), so the substeps that may fire play or playing are
   not run.
 
  'playing' should fire, though, since the readyState will have dropped down
  to HAVE_CURRENT_DATA when the clip is ended, and will drop back up to
  HAVE_FUTURE_DATA after seeking.
 
 Right, so your intention is to interpret it thusly: readyState becomes
 HAVE_CURRENT_DATA when playback ends because it's not possible for the
 playback position to advance any further, and thus it's not possible to have
 data beyond the current playback position (which HAVE_FUTURE_DATA is
 predicated upon).
 
 Makes sense, but can the spec be made clearer about the behaviour in this
 case?  HAVE_FUTURE_DATA talks about advancing *without reverting to
 HAVE_METADATA*, which doesn't apply in this case because we have all the
 data available locally.

Clarified.


 Based on that interpretation, when the user sets playbackRate to -1 
 after playback ends, the readyState would change from HAVE_CURRENT_DATA 
 to HAVE_FUTURE_DATA because the current playback position can now 
 advance.

I've made a bunch of changes to fix how things work when the direction of 
playback is backwards; there were some odd things in the way it was 
defined before (for example the previous definition actually had the 
playback position go infinitely negative and didn't stop at the start of 
the clip!).



 Following this logic, if playbackRate is set to 0 at any time, the 
 readyState becomes HAVE_ENOUGH_DATA, as advancing the playback position 
 by 0 units means the playback position can never overtake the available 
 data before playback ends.  Except this case seems to be specially 
 handled by:
 
The playbackRate can be 0.0, in which case the current playback
position doesn't move, despite playback not being paused (paused
doesn't become true, and the pause event doesn't fire).
 
 ...which uses the term move rather than advance, but suggests that 
 advancing the playback position by 0 isn't considered advancing, 
 which seems logical.

I've clarified the uses of advance that I could find.


Let me know if the spec is still ambiguous.

Thanks!

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] video feedback

2009-03-25 Thread Matthew Gregan
At 2009-03-25T10:16:32+, Ian Hickson wrote:
 On Fri, 13 Mar 2009, Matthew Gregan wrote:
  It's possible that neither a 'play' nor 'playing' event will be fired
  when a media element that has ended playback is played again.  When
  first played, paused is set to false.  When played again, playback has
  ended, so play() seeks to the beginning, but paused does not change (as
  it's already false), so the substeps that may fire play or playing are
  not run.

 'playing' should fire, though, since the readyState will have dropped down
 to HAVE_CURRENT_DATA when the clip is ended, and will drop back up to
 HAVE_FUTURE_DATA after seeking.

Right, so your intention is to interpret it thusly: readyState becomes
HAVE_CURRENT_DATA when playback ends because it's not possible for the
playback position to advance any further, and thus it's not possible to have
data beyond the current playback position (which HAVE_FUTURE_DATA is
predicated upon).

Makes sense, but can the spec be made clearer about the behaviour in this
case?  HAVE_FUTURE_DATA talks about advancing *without reverting to
HAVE_METADATA*, which doesn't apply in this case because we have all the
data available locally.

(Also, note that after the seek it'd return directly to HAVE_ENOUGH_DATA in
the case I'm talking about, since the media is fully cached.  That still
requires a 'playing' event to fire, so that's fine.)
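
(To make the expectation concrete, here is a small, illustrative script; with
a short, fully cached clip I would expect a 'playing' event on the first play
and again on every replay after 'ended':)

    var v = document.querySelector('video');
    v.addEventListener('playing', function () {
      // Should fire on the first play() and again on every replay after 'ended'.
      console.log('playing, readyState = ' + v.readyState);
    }, false);
    v.addEventListener('ended', function () {
      v.play();   // seeks back to the start; readyState dips and recovers
    }, false);
    v.play();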

Based on that interpretation, when the user sets playbackRate to -1 after
playback ends, the readyState would change from HAVE_CURRENT_DATA to
HAVE_FUTURE_DATA because the current playback position can now advance.

Following this logic, if playbackRate is set to 0 at any time, the
readyState becomes HAVE_ENOUGH_DATA, as advancing the playback position by 0
units means the playback position can never overtake the available data
before playback ends.  Except this case seems to be specially handled by:

   The playbackRate can be 0.0, in which case the current playback
   position doesn't move, despite playback not being paused (paused
   doesn't become true, and the pause event doesn't fire).

...which uses the term move rather than advance, but suggests that
advancing the playback position by 0 isn't considered advancing, which
seems logical.
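
(Or, concretely, as an illustrative snippet:)

    var v = document.querySelector('video');
    v.play();
    v.playbackRate = 0;       // the current playback position stops moving
    console.log(v.paused);    // still false, and no 'pause' event fires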

Cheers,
-mjg
-- 
Matthew Gregan |/
  /|kine...@flim.org