Re: [whatwg] VIDEO and pitchAdjustment
> On Nov 12, 2015, at 9:34 AM, Philip Jägenstedt <phil...@opera.com> wrote:
>
> On Thu, Nov 12, 2015 at 9:07 AM, Garrett Smith <dhtmlkitc...@gmail.com> wrote:
>>
>> On 10/19/15, Philip Jägenstedt <phil...@opera.com> wrote:
>>> On Tue, Sep 1, 2015 at 11:21 AM, Philip Jägenstedt <phil...@opera.com> wrote:
>>>> On Mon, Aug 31, 2015 at 9:48 PM, Domenic Denicola <d...@domenic.me> wrote:
>>>>> From: Eric Carlson [mailto:eric.carl...@apple.com]
>>>>>
>>
>> [...]
>>> I've filed a spec issue to make it so:
>>> https://github.com/whatwg/html/issues/262
>>>
>>> If there's any implementor interest in pitch control that goes beyond
>>> (independently) of that, please file a separate issue.
>>>
>>
>> They won't.
>>
>> You can hold the standard of "they need to come here and post up
>> cogent arguments in favor of feature X", but it ain't gonna happen
>> that way.
>>
>> There just isn't a whole lot of money in music education. How many
>> music education companies are W3C members?
>>
>> Unlike technology companies like Facebook, Google, Nokia, Opera, and
>> other companies that post here, small music education operations like
>> artistworks, jammit, licklibrary, etc. are more about their domain —
>> "music" — than they are about technology.
>>
>> Major music education websites are still using Flash; their developers
>> are busy fixing broken links, making the login feature, database, etc.
>> work. Flash is not nice, but they apparently were not funded or
>> motivated enough by the existing HTML5 HTMLMediaElement to use it
>> instead.
>>
>> Control over playbackRate has greater value than pitch control. But
>> those sites don't even allow students to change the playbackRate,
>> because they're still using Flash.
>>
>> You won't read posts here about what students have to say about the
>> value of having HTML5 vs Flash, or independent control over pitch and
>> playbackRate.
> Have you investigated whether you can achieve your use cases using the
> Web Audio API? If it isn't possible, is there a small addition to Web
> Audio that would solve the problem?
>
> It is unfortunately quite hard to get all browser vendors (or even
> one) interested in implementing support for something that is only
> expected to benefit a niche use case, but we should strive to make
> available the primitives that would make it possible to implement yourself.

I am actually quite ambivalent about this feature - it is currently broken on OS X and it has never been implemented on iOS, but as far as I can see we haven't received a single bug report about this.

eric
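A note on why Web Audio only partially covers the pitch use case: resampling (e.g. an AudioBufferSourceNode played at a non-unit playbackRate) changes pitch and tempo together, and truly independent pitch shifting would still need a time-stretching algorithm (such as a phase vocoder) implemented in script. The coupled relationship itself is simple arithmetic; the helper names below are illustrative, not part of any API:

```javascript
// When audio is resampled, playing at `rate`x shifts the pitch by
// 12 * log2(rate) semitones. These helpers convert between the two,
// e.g. for a "transpose up 2 semitones" control implemented by
// resampling (which speeds playback up by the same ratio).
function semitonesToRate(semitones) {
  return Math.pow(2, semitones / 12);
}

function rateToSemitones(rate) {
  return 12 * Math.log2(rate);
}
```

So a one-octave pitch shift via resampling doubles the playback rate, which is exactly the coupling music-education sites would want to avoid.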
Re: [whatwg] VIDEO and pitchAdjustment
> On Aug 31, 2015, at 12:04 PM, Domenic Denicola wrote:
>
> My subthread was more concerned with making the spec reflect current
> reality. If you can convince implementers to support backward videos, then
> that's separate, and we can change the spec again.

FWIW, Safari supports negative playback rates on the desktop and on iOS.

> On Aug 27, 2015, at 11:02 AM, Garrett Smith wrote:
>
> Negative playbackRate, to watch videos backwards, currently crashes
> Safari 8

The crash Garrett noted in Safari 8 is a bug that "only" happens with MSE content.

eric
Re: [whatwg] Proposal: Media element - add attributes for discovery of playback rate support
On Jul 18, 2013, at 1:13 PM, Brendan Long s...@brendanlong.com wrote:

> On 07/18/2013 06:54 AM, John Mellor wrote:
>> If the user is speeding up playback to improve their productivity (spend
>> less time watching e.g. a lecture), then they may well be willing to wait
>> until enough of the video is buffered, since they can do something else
>> in the meantime. For example by spending 30m buffering the first half of
>> a 1 hour live stream, the user could then watch the whole hour at double
>> speed.
>
> This is how DVRs work with live TV and people seem to like it (well, they
> like it more than not being able to fast-forward at all...).

And it works because a DVR has lots of disk space. This is not the case with all devices that support the media element.

Even a DVR, however, won't always let you change the playback speed. For example, it isn't possible to play at greater than 1x past the current time when watching a live stream. If I am watching a live stream and I try to play past the end of the buffered video, my DVR drops back to 1x and won't let me change the speed. It doesn't automatically pause and buffer for a while so it can play at a faster rate.

It isn't always possible to play a media stream at an arbitrary speed. It is foolish to pretend otherwise, as the current spec does.

eric
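John's buffering example generalizes to simple arithmetic: to watch a broadcast of total length D at rate r and reach the end exactly when the broadcast does, the viewer must delay their start by D * (1 - 1/r). A sketch (function name is my own):

```javascript
// How long must a viewer delay before starting a live stream of total
// length `durationMin` minutes so that, playing at `rate`x, they reach
// the end exactly when the broadcast does? While the viewer spends
// durationMin / rate minutes of wall-clock time watching, the broadcast
// produces the full durationMin, so the head start must cover the gap.
function requiredHeadStartMinutes(durationMin, rate) {
  return durationMin * (1 - 1 / rate);
}
```

For a 60-minute stream at 2x this gives the 30-minute wait from John's example; at 1.5x it would be 20 minutes.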
[whatwg] Forced subtitles
In working with real-world content with in-band subtitle tracks, I have realized that the spec doesn't accommodate forced subtitles.

Forced subtitles are used when a video has dialog or text in a language that is different from the main language. For example, in The Lord of the Rings, dialog in Elvish is subtitled so those of us that don't speak Elvish can understand.

This is only an issue for users that do not already have subtitles/captions enabled, because standard caption/subtitle tracks are expected to mix the translations into the other captions in the track. In other words, if I enable an English caption track I will get English captions for the dialog spoken in English and the dialog spoken in Elvish.

However, users that do not typically have subtitles enabled also need to have the Elvish dialog translated, so subtitle providers typically provide a second subtitle track with *only* the forced subtitles. UAs are expected to automatically enable a forced-only subtitle track when no other caption/subtitle track is visible and there is a forced-only track in the same language as the primary audio track. This means that when I watch a version of LOTR that has been dubbed into French and I do not have a subtitle or caption track enabled, the UA will automatically show French forced subtitles if they are available.

Because forced subtitles are meant to be enabled automatically by the UA, it is essential that the UA is able to differentiate between normal and forced subtitles. It is also important because forced subtitles are not typically listed in the caption menu, again because the captions in them are also in the normal subtitles/captions.

I therefore propose that we add a new @kind value for forced subtitles. "Forced" is a widely used term in the industry, so I think "forced" is the appropriate value.

eric
Re: [whatwg] Forced subtitles
On Apr 11, 2013, at 3:54 PM, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:

> I think Eric is right - we need a new @kind=forced or @kind=forcedSubtitles
> value on track elements, because they behave differently from the subtitle
> kind:
>
> * they are not listed in a track menu
> * they are turned on by the browser when no other subtitle or caption
> track is on
> * multiple forced subtitle tracks can be on at the same time (see
> discussion at https://www.w3.org/Bugs/Public/show_bug.cgi?id=21667)
>
> I only wonder how the browser is meant to identify for which language it
> needs to turn on the forced subtitles. If it should depend on the language
> of the audio track of the video rather than the browser's default language
> setting, maybe it will need to be left to the server to pick which tracks
> to list, and all forced tracks are on no matter what? Did you have any
> ideas on this, Eric?

I believe it should be the language of the video's primary audio track, because forced subtitles are enabled in a situation where the user can presumably understand the dialog being spoken in the track's language and has not indicated a preference for captions or subtitles.

eric

On Fri, Apr 12, 2013 at 4:08 AM, Eric Carlson eric.carl...@apple.com wrote:
> [...]
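The automatic-enabling rule described in this thread can be sketched as pure selection logic. This assumes a hypothetical @kind value of "forced" (the value proposed here, not one in the spec) and plain-object track descriptors with the `kind`, `language`, and `mode` fields the real TextTrack interface exposes:

```javascript
// Sketch of the UA rule Eric describes: enable a forced-only track
// matching the primary audio language, but only when no subtitle or
// caption track is already showing (those are expected to mix the
// forced translations in). "forced" is a proposed kind value here.
function pickForcedTrack(textTracks, primaryAudioLanguage) {
  const subtitleLike = t => t.kind === "subtitles" || t.kind === "captions";
  const anyVisible = textTracks.some(t => subtitleLike(t) && t.mode === "showing");
  if (anyVisible) return null; // user already sees the translations
  return textTracks.find(
    t => t.kind === "forced" && t.language === primaryAudioLanguage
  ) || null;
}
```

With the French-dubbed LOTR example: if no caption track is showing and a French forced track exists, it is returned; as soon as any subtitle/caption track is showing, nothing is auto-enabled.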
Re: [whatwg] HTML Audio Element removal from DOM
On Jan 17, 2012, at 1:32 PM, Andrew Scherkus wrote:

> On Tue, Jan 17, 2012 at 1:19 PM, Charles Pritchard ch...@jumis.com wrote:
>> When an audio element is removed from the DOM while playing, is that
>> element paused? That seems to be the behavior in Chrome. I'm looking for
>> clarification.
>
> I was able to repro this in both Safari 5.1.1 and Chrome 17.0.963.26 dev,
> so perhaps it's a bug in WebKit, as the spec states the following:
>
> "Media elements that are potentially playing while not in a Document must
> not play any video, but should play any audio component. Media elements
> must not stop playing just because all references to them have been
> removed; only once a media element is in a state where no further audio
> could ever be played by that element may the element be garbage collected."

That is for an element that is playing when it is not in the document. Look at the end of http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#playing-the-media-resource for the definition of what to do when an element is removed from the DOM:

"When a media element is removed from a Document, the user agent must run the following steps:

1. Asynchronously await a stable state, allowing the task that removed the media element from the Document to continue. The synchronous section consists of all the remaining steps of this algorithm. (Steps in the synchronous section are marked with ⌛.)
2. If the media element is in a Document, abort these steps.
3. If the media element's networkState attribute has the value NETWORK_EMPTY, abort these steps.
4. Pause the media element."

eric
Re: [whatwg] Fullscreen
On Oct 15, 2011, at 2:05 AM, Olli Pettay wrote:

> On 10/15/2011 07:27 AM, Anne van Kesteren wrote:
>> I went with "fullscreen" rather than "full screen" as that seemed cleaner
>> and easier to type. I also used "enter" and "exit" rather than "request"
>> and "cancel" as they seemed somewhat nicer too. I'm less attached to this
>> latter change though.
>
> To me enterFullscreen() sounds like something which couldn't fail.
> requestFullscreen() is closer to what actually happens: the script asks
> the UA to go to fullscreen mode, but it may fail if the user or UA for
> some reason denies the request.

I agree. requestFullscreen describes what happens much more accurately.

eric
Re: [whatwg] Video feedback
On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:

> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters sim...@opera.com wrote:
>> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
>> silviapfeiff...@gmail.com wrote:
>>> For commercial video providers, the tracks in a live stream change all
>>> the time; this is not limited to audio and video tracks but would
>>> include text tracks as well. OK, all this indicates to me that we
>>> probably want a "metadatachanged" event to indicate there has been a
>>> change and that JS may need to check some of its assumptions.
>>
>> We already have durationchange. Duration is metadata. If we want to
>> support changes to width/height, and the script is interested in when
>> that happens, maybe there should be a dimensionchange event (but what's
>> the use case for changing width/height mid-stream?). Does the spec
>> support changes to text tracks mid-stream?
>
> It's not about what the spec supports, but what real-world streams provide.
>
> I don't think it makes sense to put an event on every single type of
> metadata that can change. Most of the time, when you have a stream change,
> many variables will change together, so a single event means far fewer
> events to raise. It's an event that signifies that the media framework has
> reset the video/audio decoding pipeline and loaded a whole bunch of new
> stuff. You should imagine it as a concatenation of different media
> resources. And yes, they can have different track constitution and
> different audio sampling rate (which the audio API will care about) etc.

In addition, it is possible for a stream to lose or gain an audio track. In this case the dimensions won't change, but a script may want to react to the change in audioTracks.

I agree with Silvia, a more generic "metadata changed" event makes more sense.

eric
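A page reacting to the kind of generic "metadata changed" event discussed here would need to work out what actually changed. A minimal sketch, assuming the page snapshots plain descriptors of the element's tracks before and after the event (the event name and descriptor shape are illustrative, not from any spec):

```javascript
// Diff two snapshots of track descriptors (e.g. {kind, language, id})
// taken before and after a stream reconfiguration, so a script can
// tell which tracks appeared or disappeared across the change.
function diffTracks(before, after) {
  const key = t => `${t.kind}:${t.language}:${t.id || ""}`;
  const beforeKeys = new Set(before.map(key));
  const afterKeys = new Set(after.map(key));
  return {
    added: after.filter(t => !beforeKeys.has(key(t))),
    removed: before.filter(t => !afterKeys.has(key(t))),
  };
}
```

This matches Eric's audio-only switch example: when a video+audio stream drops to audio only, `removed` would contain the video track even though the reported dimensions might not change.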
Re: [whatwg] Video feedback
On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote:

>> Nothing exposed via the current API would change, AFAICT. Thus, after a
>> change mid-stream to, say, a smaller video width and height, would the
>> video.videoWidth and video.videoHeight attributes represent the width and
>> height of the previous stream or the current one?
>
> I agree that if we start exposing things like sampling rate or want to
> support arbitrary chained Ogg, then there is a problem. I think we already
> have a problem with width and height for chained Ogg, and we cannot stop
> people from putting chained Ogg into the @src. I actually took this
> discussion away from MPEG PTM, which is where Eric's question came from,
> because I don't understand how it works with MPEG. But I can see that it's
> not just a problem of MPEG, but also of Ogg (and possibly of WebM, which
> can have multiple Segments). So, I think we need a generic solution for it.

The characteristics of an Apple HTTP live stream can change on the fly. For example, if the user's bandwidth to the streaming server changes, the video width and height can change as the stream resolution is switched up or down, or the number of tracks can change when a stream switches from video+audio to audio only. In addition, a server can insert segments with different characteristics into a stream on the fly, e.g. inserting an ad or emergency announcement. It is not possible to predict these changes before they occur.

eric
Re: [whatwg] video ... script race condition
On May 13, 2011, at 4:35 AM, Philip Jägenstedt wrote:

> I wasn't asking how to work around the problem once you know it exists, I
> was wondering if any browser vendors have done anything to make this
> problem less likely to happen on pages like http://html5demos.com/video
> that don't do the right thing?

WebKit has not. It seems to me that the right way to fix the problem is to let people know it is sloppy code, not to figure out a way to work around it.

eric
Re: [whatwg] Full Screen API Feedback
On May 13, 2011, at 12:46 AM, Henri Sivonen wrote:

> On Thu, 2011-05-12 at 20:29 -0400, Aryeh Gregor wrote:
>> In particular, Flash has allowed this for years, with 95%+ penetration
>> rates, so we should already have a good idea of how this feature can be
>> exploited in practice.
>
> I don't know of exploits in the wild, but I've read about
> proof-of-concept exploits that overwhelmed the user's attention visually
> so that the user didn't notice the "Press ESC to exit full screen"
> message. This allowed subsequent UI spoofing. (I was unable to find the
> citation for this.)

Maybe you were thinking of this: http://www.bunnyhero.org/2008/05/10/scaring-people-with-fullscreen/

eric
Re: [whatwg] How to handle multitrack media resources in HTML
On Apr 11, 2011, at 5:26 PM, Ian Hickson wrote:

> On Mon, 11 Apr 2011, Jeroen Wijering wrote:
>> On Apr 8, 2011, at 8:54 AM, Ian Hickson wrote:
>>> There's a big difference between text tracks, audio tracks, and video
>>> tracks. While it makes sense, for instance, to have text tracks enabled
>>> but not showing, it makes no sense to do that with audio tracks.
>>
>> Audio and video tracks require more data, hence it's less preferred to
>> allow them being enabled but not showing. If data wasn't an issue, it
>> would be great if this were possible; it'd allow instant switching
>> between multiple audio dubs, or camera angles.
>
> I think we mean different things by "active" here. The "hidden" state for
> a text track is one where the UA isn't rendering the track but the UA is
> still firing all the events and so forth. I don't understand what the
> parallel would be for a video or audio track.
>
> Text tracks are discontinuous units of potentially overlapping textual
> data with position information and other metadata that can be styled with
> CSS and can be mutated from script. Audio and video tracks are continuous
> streams of immutable media data.

Video and audio tracks do not necessarily produce continuous output - it is perfectly legal to have gaps in either, e.g. segments that do not render.

Both audio and video tracks can have metadata that affects their rendering: an audio track has volume metadata that attenuates its contribution to the overall mix-down, and a video track has a matrix that controls its rendering. The only thing preventing us from styling a video track with CSS is the lack of a definition.

> I don't really see what they have in common other than us using the word
> "track" to refer to both of them, and that's mostly just an artefact of
> the language.

"Track" is more than an artifact of the language; it is the commonly used term in the digital media industry for an independent stream of media samples in a container file.

eric
Re: [whatwg] How to handle multitrack media resources in HTML
On Apr 10, 2011, at 12:36 PM, Mark Watson wrote:

> In the case of in-band tracks it may still be the case that they are
> retrieved independently over the network. This could happen two ways:
>
> - some file formats contain headers which enable precise navigation of
> the file, for example using HTTP byte ranges, so that the tracks could be
> retrieved independently. mp4 files would be an example. I don't know that
> anyone does this, though.

QuickTime has supported tracks with external media samples in .mov files for more than 15 years. This type of file is most commonly used during editing, but they are occasionally found on the net.

> - in the case of adaptive streaming based on a manifest, the different
> tracks may be in different files, even though they appear as in-band
> tracks from an HTML perspective. In these cases it *might* make sense to
> expose separate buffer and network states for the different in-band
> tracks in just the same way as out-of-band tracks.

I strongly disagree. Having different track APIs for different container formats would be extremely confusing for developers, and I don't think it would add anything. A UA that chooses to support non-self-contained media files should account for all samples when reporting readyState and networkState.

eric
Re: [whatwg] HTML5 Video - Issue and Request for improvment
On Jan 29, 2011, at 3:21 AM, Lubomir Toshev wrote:

> Another thing that I see as a limitation is that video should expose an
> API for currentFrame, so that when control developers want to add support
> for subtitles on their own, they are able to support formats that display
> the subtitles according to the current video frame. This is a limitation
> of the current design of the video tag.

I don't understand what you are suggesting - what would currentFrame return, and how exactly would you use it?

eric
Re: [whatwg] Value of media.currentTime immediately after setting
On Jan 20, 2011, at 12:46 AM, Philip Jägenstedt wrote:

> On Thu, 20 Jan 2011 04:20:09 +0100, Matthew Gregan kine...@flim.org wrote:
>> Hi,
>>
>> The media seek algorithm (4.8.10.9) states that the current playback
>> position should be set to the new playback position during the
>> asynchronous part of the algorithm, just before the seeking event is
>> fired. This implies the following behaviour:
>>
>> 0. Initial load state (currentTime reports 0)
>> 1. currentTime set to 20 by script
>> 2. currentTime continues to report 0
>> 3. Script returns to main loop
>> 4. seeking event raised
>> 5. currentTime reports 20 in seeking event handler
>>
>> This is the behaviour in Firefox 4. In every other browser I tested
>> (Chrome 10, Opera 11, Safari 5, and Internet Explorer 9), the following
>> behaviour is observed:
>>
>> 2. currentTime immediately reports 20
>>
>> This doesn't seem to be required by the current wording of the spec (in
>> fact, it seems to be incorrect behaviour), but I think this behaviour is
>> more intuitive, as it seems unusual that currentTime returns to the old
>> value immediately after being set and remains that way until the seeking
>> event fires.
>>
>> Does it make sense to update the seeking algorithm to reflect how
>> non-Firefox browsers are implementing this? My proposal is, effectively,
>> to take steps 5 through 8 and insert them before step 4.
>>
>> I've uploaded a testcase to http://flim.org/~kinetik/seek-627139.html if
>> anyone's curious.
>>
>> Thanks,
>> -mjg
>
> There have been two non-trivial changes to the seeking algorithm in the
> last year:
>
> Discussion at http://lists.w3.org/Archives/Public/public-html/2010Feb/0003.html
> led to http://html5.org/r/4868
>
> Discussion at http://lists.w3.org/Archives/Public/public-html/2010Jul/0217.html
> led to http://html5.org/r/5219
>
> At least we (Opera) just haven't gotten around to updating our
> implementation yet. With that said, it seems like there's nothing that
> guarantees that the asynchronous section doesn't start running while the
> script is still running.
>
> It's also odd that currentTime is updated before the seek has actually
> been completed, but the reason for this is that the UI should show the
> new position.

In WebKit this happens because currentTime isn't maintained in HTMLMediaElement (modulo the caching added in https://bugs.webkit.org/show_bug.cgi?id=49009); it is whatever the media engine (QuickTime, GStreamer, etc.) reports. When currentTime is set, the media engine is asked to seek immediately, so the asynchronous section may run in parallel with the script, and therefore the seek may actually have completed by the time you check currentTime.

eric
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Jan 12, 2011, at 12:42 AM, Philip Jägenstedt wrote:

> On Wed, 12 Jan 2011 09:16:59 +0100, Glenn Maynard gl...@zewt.org wrote:
>> On Wed, Jan 12, 2011 at 2:49 AM, Philip Jägenstedt phil...@opera.com wrote:
>>> (Also, it might be useful to be able to choose whether seeking should
>>> be fast or exact, as frame-accurate seeking is hardly necessary in most
>>> normal playback situations.)
>>
>> In an audio engine I worked on I had a seek hint like that, to indicate
>> whether the priority was accuracy or speed. It matters even more with
>> video: when seeking with a seek bar, you may want to snap to keyframes,
>> whereas bookmarks, "jump to chapter" features, etc. will often want to
>> jump precisely. A fast seek option would be particularly useful for
>> files with infrequent keyframes.
>
> For the record, this is the solution I've been imagining:
>
> * add HTMLMediaElement.seek(t, [exact]), where exact defaults to false if
> missing
> * make setting HTMLMediaElement.currentTime be a non-exact seek, as that
> seems to be the most common case

That is a very interesting idea! Precise seeking in some video files can be quite slow; greater than a second is not unlikely on some devices. FWIW, the media playback framework on iOS has a seek method with parameters for the tolerance allowed before and after the seek time [1], to allow the programmer to choose.

eric

[1] http://developer.apple.com/library/ios/#documentation/AVFoundation/Reference/AVPlayer_Class/Reference/Reference.html%23//apple_ref/occ/instm/AVPlayer/seekToTime:toleranceBefore:toleranceAfter:
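The fast/exact distinction can be sketched as pure logic. Assuming the decoder can only restart cleanly at keyframes, a fast seek snaps to the keyframe preceding the target, while an exact seek must decode forward from that keyframe to the requested time, which is exactly the work that makes precise seeks slow when keyframes are sparse. Function and field names here are illustrative, not from any spec:

```javascript
// Resolve a seek request against a sorted list of keyframe times
// (assumed to start at 0). `decodeFrom` is where decoding must begin
// in either case; `presentAt` is where playback resumes: the snapped
// keyframe for a fast seek, the requested time for an exact one.
function resolveSeek(targetTime, keyframeTimes, exact) {
  let snap = keyframeTimes[0];
  for (const t of keyframeTimes) {
    if (t <= targetTime) snap = t;
    else break;
  }
  return {
    decodeFrom: snap,
    presentAt: exact ? targetTime : snap,
  };
}
```

With keyframes every 5 seconds, a fast seek to 7.3 s lands at 5 s, while an exact seek still decodes from 5 s but presents at 7.3 s - the 2.3 s of extra decoding is the cost Eric describes.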
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Jan 12, 2011, at 4:04 PM, Robert O'Callahan wrote:

> On Wed, Jan 12, 2011 at 9:42 PM, Philip Jägenstedt phil...@opera.com wrote:
>> For the record, this is the solution I've been imagining:
>>
>> * add HTMLMediaElement.seek(t, [exact]), where exact defaults to false
>> if missing
>> * make setting HTMLMediaElement.currentTime be a non-exact seek, as that
>> seems to be the most common case
>
> I think setting currentTime should be exact, since a) exact seeking is
> simpler from the author's point of view and b) it would be unfortunate to
> set currentTime to value T and then discover that getting currentTime
> gives you a value other than T (assuming you're paused).

I agree that precise seeking follows the principle of least surprise, based partly on the bugs filed against the video element on iOS where this hasn't always been the behavior.

eric
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Jan 11, 2011, at 9:54 AM, Rob Coenen wrote:

> Just a follow-up question in relation to SMPTE / frame-accurate playback:
> as far as I can tell there is nothing specified in the HTML5 specs that
> will allow us to determine the actual frame rate (FPS) of a movie? In
> order to do proper time-code calculations it's essential to know both
> video.duration and video.fps - and all I can find in the specs is
> video.duration, nothing on video.fps.

What does "frames per second" mean for a digitally encoded video file, where frames can have arbitrary duration?

eric
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Jan 11, 2011, at 12:54 PM, Rob Coenen wrote:

> Eric, not sure if I understand what you mean. Are you referring to
> digitally encoded files where frame #1 has a different duration than
> frame #2?

Exactly, every frame can have an arbitrary duration, so "frame rate" may have no meaning. Even in the case of video captured from film, the original frame rate is often not stored in the digital file, so there is no way to programmatically determine the original frame rate.

eric

On Tue, Jan 11, 2011 at 6:10 PM, Eric Carlson eric.carl...@apple.com wrote:
> [...]
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Jan 11, 2011, at 5:43 PM, Chris Pearce wrote:

> On 12/01/2011 2:22 p.m., Dirk-Willem van Gulik wrote:
>> On 12 Jan 2011, at 01:17, Chris Pearce wrote:
>>>> I cannot think of a format where this would in fact be the case - but
>>>> for a few arcane ones like an animated push gif without a loop.
>>>
>>> WebM can be variable frame rate. At best the WebM container
>>> specification [http://www.webmproject.org/code/specs/container/#track]
>>> lists the FrameRate block as "Informational only", which presumably
>>> means the value stored in the container can't be trusted.
>>
>> Right - but is there a WebM decoder which is able to hand it off that
>> way? AFAIK they all use that value or select a default/measured rounded
>> heuristic to solve flicker?
>
> Firefox 4 doesn't use the frame rate stored in the container for WebM.
> Each frame is stored with its presentation time, and we request repaints
> as each frame falls due for painting. The prevention of flicker is
> handled by our graphics layer; video doesn't really participate in that,
> it just hands off frames downstream when they're due for painting. We
> have plans to schedule video frame painting more preemptively in future,
> but I imagine we'd still use the presentation time encoded with each
> frame when we do that.

Video in WebKit is handled in a similar way. Media engines that decode into a bitmap signal the graphics layer when a new frame is available, and the new frame is composited with the overlapping page content during the next paint. Media engines that render into a hardware layer do so completely asynchronously. In both cases, the media engine is free to decode at whatever rate is appropriate for the video file.

eric
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Jan 9, 2011, at 11:14 AM, Rob Coenen wrote:

> I have written a simple test using an H264 video with burned-in timecode
> (every frame is visually marked with the actual SMPTE timecode). WebKit
> is unable to seek to the correct timecode using 'currentTime'; it's
> always a whole bunch of frames off from the requested position. I reckon
> it simply seeks to the nearest keyframe?

WebKit's HTMLMediaElement implementation uses different media engines on different platforms (e.g. QuickTime, QTKit, GStreamer, etc.). Each media engine has somewhat different playback characteristics, so it is impossible to say what you are experiencing without more information. Please file a bug report at https://bugs.webkit.org/ with your test page and video file, and someone will look into it.

eric

On Fri, Jan 7, 2011 at 5:02 PM, Eric Carlson eric.carl...@apple.com wrote:
> [...]
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Jan 7, 2011, at 8:22 AM, Rob Coenen wrote:

> Are there any plans on adding frame accuracy and/or SMPTE support to
> HTML5 video? As far as I know it's currently impossible to play HTML5
> video frame-by-frame, or seek to a SMPTE-compliant (frame-accurate)
> time-code. The nearest seek seems to be precise to roughly 1 second (or
> the nearest keyframe perhaps, can't tell). Flash seems to be the only
> solution that I'm aware of that can access video on a frame-by-frame
> basis (even though you need the Flash Media Server to make it work).
> Seeking to a SMPTE time-code is completely impossible with any solution
> I have looked at. Very interested to learn what the community POV is,
> and why it isn't already implemented.

'currentTime' is a double, so you should be able to seek more accurately than one second - modulo the timescale of the video file and how the UA supports seeking to inter-frame times.

eric
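To illustrate the kind of time-code calculation Rob describes: SMPTE-style arithmetic only works under an assumed constant frame rate, which a file need not have. A minimal non-drop-frame sketch, mapping "HH:MM:SS:FF" to the seconds value one would assign to currentTime and back (the fps value and function names are assumptions of the example, not anything exposed by the media element):

```javascript
// Convert non-drop-frame SMPTE timecode to seconds at an assumed
// constant integer frame rate.
function smpteToSeconds(tc, fps) {
  const [hh, mm, ss, ff] = tc.split(":").map(Number);
  return hh * 3600 + mm * 60 + ss + ff / fps;
}

// Convert seconds back to "HH:MM:SS:FF" at the same assumed rate.
function secondsToSmpte(seconds, fps) {
  const totalFrames = Math.round(seconds * fps);
  const ff = totalFrames % fps;
  const totalSeconds = Math.floor(totalFrames / fps);
  const ss = totalSeconds % 60;
  const mm = Math.floor(totalSeconds / 60) % 60;
  const hh = Math.floor(totalSeconds / 3600);
  const pad = n => String(n).padStart(2, "0");
  return `${pad(hh)}:${pad(mm)}:${pad(ss)}:${pad(ff)}`;
}
```

For example, at 24 fps the timecode 00:01:00:12 corresponds to 60.5 seconds. Whether setting currentTime to that value actually lands on that frame is, per Eric's reply, up to the file's timescale and the UA's seeking behavior.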
Re: [whatwg] Html 5 video element's poster attribute
On Sep 21, 2010, at 11:17 AM, Shiv Kumar wrote:

> 1. The poster should stay visible until the video is played, rather than
> disappear as soon as the first frame is loaded. In addition, the poster
> should not show during buffering or any operation during video playback,
> or when switching video streams in mid-step.

This is a description of how the poster should behave in all browsers. Have you filed bugs against any browsers that do not behave this way?

eric
Re: [whatwg] Html 5 video element's poster attribute
On Sep 19, 2010, at 3:17 PM, Silvia Pfeiffer wrote:

> Not quite: this is an implementation decision that WebKit-based browsers
> made. Neither Opera nor Firefox work that way (haven't checked IE yet). I
> agree that this implementation of poster frames is practically useless
> and it really annoys me as a user. I've been considering registering a
> bug on WebKit.

On Sep 19, 2010, at 5:50 PM, Aryeh Gregor wrote:

> On Sun, Sep 19, 2010 at 4:53 PM, Shiv Kumar sku...@exposureroom.com wrote:
>> The poster frame should remain visible until the video is played.
>
> I agree with Silvia, this should be required by the spec. The alternative
> is clearly wrong. Someone should also file a bug with WebKit to ask them
> to change.

Someone might want to try a WebKit nightly build before filing a bug. This was changed in r64884. A poster is displayed until there is a movie frame to display and playback begins or the current time is changed.

eric
Re: [whatwg] Timed tracks: feedback compendium
On Sep 9, 2010, at 6:08 AM, Silvia Pfeiffer wrote: On Wed, Sep 8, 2010 at 9:19 AM, Ian Hickson i...@hixie.ch wrote: On Fri, 23 Jul 2010, Philip Jägenstedt wrote: I'm not a fan of pauseOnExit, though, mostly because it seems non-trivial to implement. Since it is last in the argument list of TimedTrackCue, it will be easy to just ignore when implementing. I still don't think the use cases for it are enough to motivate the implementation cost. Really? It seems like automatically pausing video half-way would be a very common thing to do; e.g. to play an interstitial ad, or to play a specific sound effect in a sound file containing multiple sound effects, or to play a video up to the point where the user has to make a choice or has to ask to move on to the next slide. There's basically no good way to do this kind of thing without this feature. Also, some text cues will be fairly long and thus certain users cannot read them within the allocated time for the cue. So, making a pauseOnExit() available is a good thing for accessibility. I have never been a huge fan of pauseOnExit, but on reflection I agree that it is important because event latency will make it difficult or impossible to replicate the functionality in script. On Fri, 31 Jul 2009, Silvia Pfeiffer wrote: * It is unclear, which of the given alternative text tracks in different languages should be displayed by default when loading an itext resource. A @default attribute has been added to the itext elements to allow for the Web content author to tell the browser which itext tracks he/she expects to be displayed by default. If the Web author does not specify such tracks, the display depends on the user agent (UA - generally the Web browser): for accessibility reasons, there should be a field that allows users to always turn display of certain itext categories on. Further, the UA is set to a default language and it is this default language that should be used to select which itext track should be displayed. 
It's not clear to me that we need a way to do this; by default presumably tracks would all be off unless the user wants them, in which case the user's preferences are paramount. That's what I've specced currently. However, it's easy to override this from script. It seems to me that this is much like video autoplay in that if we don't provide a markup solution, everyone will use scripts and it will be more difficult for the UA to override with user prefs. What would we need for this then? Just a way to say by the way, in addition to whatever the user said, also turn this track on? Or do we need something to say by default, override the user's preferences for this video and instead turn on this track and turn off all others? Or something else? It's not clear to me what the use case is where this would be useful declaratively. You have covered all the user requirements and that is good. They should dominate all other settings. But I think we have neglected the authors. What about tracks that the author has defined and wants activated by default for those users that don't have anything else specified in their user requirements? For example, if an author knows that the audio on their video is pretty poor and they want the subtitles to be on by default (because otherwise a user may miss that they are available and they may miss what is going on), then currently they have to activate it with script. A user whose preferences are not set will thus see this track. For a user whose preferences are set, the browser will turn on the appropriate tracks additionally or alternatively if there is a more appropriate track in the same language (e.g. a caption track over the default subtitle track). If we do this with script, will it not have the wrong effect and turn off what the browser has selected, so is not actually expressing author preferences, but is doing an author override? I agree. 
It is important for a page author to be able to mark the default in case none of the alternates match user preferences. FWIW this is the way that alternate track groups work in MPEG-4 and QuickTime files - one track in a group is enabled by default but is disabled if another track in the group is enabled. Alternatively, might it not be better to simply use the voice "sound" for this and let the default stylesheet hide those cues? When writing subtitles I don't want the maintenance overhead of 2 different versions that differ only by the inclusion of [doorbell rings] and similar. Honestly, it's more likely that I just wouldn't bother with accessibility for the HoH at all. If I could add it with <sound>doorbell rings</sound>, it's far more likely I would do that, as long as it isn't rendered by default. This is my preferred solution, then keeping only one of kind=subtitles and kind=captions. Enabling the
Re: [whatwg] VIDEO Timeupdate event frequency.
On Sep 10, 2010, at 8:06 PM, Biju wrote: On Fri, Sep 10, 2010 at 7:05 AM, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: Incidentally: What use case did you have in mind, Biju ? I was thinking about applications like https://developer.mozilla.org/samples/video/chroma-key/index.xhtml ( https://developer.mozilla.org/En/Manipulating_video_using_canvas ) Now it is using setTimeout so if the processor is fast it will be processing the same frame more than one time. Hence wasting system resources, which may affect other running processes. Perhaps, but it only burns cycles on those pages instead of burning cycles on *every* page that uses a video element. If we use the timeupdate event we may be missing some frames as the timeupdate event only happens every 200ms or 250ms, ie 4 or 5 frames per second. Even in a browser that fires 'timeupdate' every frame, you *will* miss frames on a heavily loaded machine because the event is fired asynchronously. And we know there are videos which have more than 5 frames per second. So use a timer if you know that you want updates more frequently. eric
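Biju's concern about processing the same frame twice can be addressed in script without changing the event model: compare frame indices between timer ticks and only repaint when the index changes. A sketch, assuming a known constant frame rate (the helper names are mine):

```javascript
// Map currentTime to a frame index so a repaint loop can skip frames it
// has already processed. The small epsilon guards against times that
// land just below a frame boundary due to floating-point rounding.
function frameIndex(currentTime, fps) {
  return Math.floor(currentTime * fps + 1e-6);
}

// Sketch of a repaint loop; processFrame is the page's own callback
// (e.g. the chroma-key canvas pass in the MDN demo).
function makeFramePump(video, fps, processFrame) {
  let lastFrame = -1;
  return function pump() {
    const f = frameIndex(video.currentTime, fps);
    if (f !== lastFrame) {      // a new frame since the last tick: repaint
      lastFrame = f;
      processFrame(f);
    }
    // In a real page: requestAnimationFrame(pump) or setTimeout(pump, ...).
  };
}
```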
Re: [whatwg] Video with MIME type application/octet-stream
On Aug 31, 2010, at 4:01 PM, Ian Hickson wrote: On Tue, 31 Aug 2010, Eric Carlson wrote: On Aug 31, 2010, at 12:36 AM, Ian Hickson wrote: Safari does crazy things right now that we won't go into; for the purposes of this discussion we'll assume Safari can change. What crazy things does Safari do that it should not? I forget the details, but IIRC one of the main problems was that it was based on the URL's file extension exclusively. No, I don't see how you came to that conclusion. QuickTime knows how to create a movie from a text file (to make it easy to create captions, chapters, etc), but it also assumes a file served as text/plain may be coming from a misconfigured server. Therefore, when it gets a file served as text/plain it first looks at the file content and/or the file extension to see if it is a movie file. It opens it as text only if it doesn't look like a movie. In your test page (http://hixie.ch/tests/adhoc/html/video/002.html), all four movies have correct extensions but are served as text/plain: <!DOCTYPE HTML> <title>text/plain video files</title> <p><video autoplay controls src="resources/text.txt"></video> <p><video autoplay controls src="resources/text.webm"></video> <p><video autoplay controls src="resources/text.m4v"></video> <p><video autoplay controls src="resources/text.ogv"></video> When the shipping version of Safari opens this page the MPEG-4 file opens correctly, and the other three open as text (if you wait long enough) because by default QuickTime doesn't know how to open the Ogg or WebM files. If you add QuickTime importers for WebM and Ogg, those files will be opened as movies instead of as text because of the file extensions, despite the fact that they are served as text. FWIW, in nightly builds we are now configuring QuickTime so it won't ever open files it identifies as text. eric
Re: [whatwg] Video with MIME type application/octet-stream
On Sep 1, 2010, at 9:07 AM, Zachary Ozer wrote: On Wed, Sep 1, 2010 at 10:51 AM, Adrian Sutton adrian.sut...@ephox.com wrote: Given that there is a very limited set of video formats that are supported anyway, wouldn't it be reasonable to just identify or define the standard file extensions then work with server vendors to update their standard file extension to mime type definitions to include that. While adoption and upgrading to the new versions would obviously take time, that applies to the video tag itself anyway and is just a temporary source of pain. At first glance, my eyes almost popped out of my sockets when I saw this suggestion. Using the file extension?! He must be mad! Then I remembered that our Flash player *has* to use file extension since the MIME type isn't available in Flash. Turns out that file extension is a pretty good indicator, but it doesn't work for custom server configurations where videos don't have extensions, à la YouTube. For that reason, we allow users to override whatever we detect with a type configuration parameter. Ultimately, the question is, "What are we trying to accomplish?" I think we're trying to make it easy for content creators to guarantee that their content is available to all viewers regardless of their browser. If that's the case, I'd actually suggest that the browsers *strictly* follow the MIME type, with the source type as an override, and eliminating all sniffing (assuming that the file container format contains the codec meta-data). If a publisher notices that their video isn't working, they can either update their server's MIME type mapping, or just hard code the type in the HTML. Hard coding the type is only possible if the page uses source elements, as @type isn't allowed on audio or video. Neither is that time consuming / difficult. It isn't hard to update a server if you control it, but it can be *very* difficult and time consuming if you don't (as is the case with most web developers, I assume). 
Moreover, as Adrian suggested, it's probably quite easy to get the big HTTP servers (Apache, IIS, nginx, lighttpd) to add the new extensions (if they haven't already), so this would gradually become less and less of an issue. Really? Your company specializes in web video and flv files have been around for years, but your own server still isn't configured for it: eric% curl -I http://content.longtailvideo.com/videos/flvplayer.flv; HTTP/1.1 200 OK Server-Status: load=0 Content-Type: application/octet-stream Accept-Ranges: bytes ETag: 4288394655 Last-Modified: Wed, 23 Jun 2010 20:42:28 GMT Content-Length: 2533148 Date: Wed, 01 Sep 2010 16:16:28 GMT Server: bit_asic/3.8/r8s1-bitcast-b eric
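For illustration, the extension-fallback strategy Zachary describes (trust a specific Content-Type, sniff the extension only for generic types like the application/octet-stream in Eric's curl output) might look like this. The mapping table and function names are my own sketch, not any shipping player's code:

```javascript
// Fallback extension -> MIME mapping for when the server sends a useless
// Content-Type like application/octet-stream. Illustrative table only.
const EXT_TO_MIME = {
  mp4: "video/mp4", m4v: "video/mp4",
  webm: "video/webm",
  ogv: "video/ogg",
  flv: "video/x-flv",
};

function guessType(url, contentType) {
  // Trust a specific Content-Type; only sniff the extension for the
  // generic types that typically indicate a misconfigured server.
  if (contentType && contentType !== "application/octet-stream" &&
      contentType !== "text/plain") {
    return contentType;
  }
  const m = /\.([a-z0-9]+)(?:[?#]|$)/i.exec(url);
  return (m && EXT_TO_MIME[m[1].toLowerCase()]) || contentType || null;
}
```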
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Aug 25, 2010, at 8:40 AM, Silvia Pfeiffer wrote: On Thu, Aug 26, 2010 at 12:39 AM, Philip Jägenstedt phil...@opera.com wrote: The results are hardly consistent, but at least one player exist for which it's not enough to change the file extension and add a header. If we want to make sure that no content is treated as SRT by any application, the format must be more incompatible. You misunderstand my intent. I am by no means suggesting that no WebSRT content is treated as SRT by any application. All I am asking for is a different file extension and a different mime type and possibly a magic identifier such that *authoring* applications (and authors) can clearly designate this to be a different format, in particular if they include new features. Then a *playback application* has the chance to identify them as a different format and provide a specific parser for it, instead of failing like Totem. They can also decide to extend their existing SRT parser to support both WebSRT and SRT. And I also have no issue with a user deciding to give a WebSRT file a go by renaming it to .srt. By keeping WebSRT and SRT as different formats we give the applications a choice to support either, or both in the same parser. If we don't, we force them to deal in a single parser with all the oddities of SRT formats as well as all the extra features and all the extensibility of WebSRT. I think we've made some interesting finds in this thread, but we're starting to go in circles by now. Perhaps we should give it a rest until we get input from a third party. A medal to anyone who has followed it this far :) FWIW, I agree with Silvia that a new file extension and MIME type make sense. Keeping them the same won't help applications that don't know about WebSRT, they will try to play the files and aren't likely to deal with the differences gracefully. Keeping them the same also won't help new applications that know about WebSRT, it won't make any difference if there is one MIME type or two. eric
Re: [whatwg] On implementing videos with multiple tracks in HTML5
On Aug 20, 2010, at 5:53 PM, Silvia Pfeiffer wrote: On Sat, Aug 21, 2010 at 10:03 AM, Eric Carlson eric.carl...@apple.com wrote: On Aug 19, 2010, at 5:23 PM, Silvia Pfeiffer wrote: * Whether to include a multiplexed download functionality in browsers for media resources, where the browser would do the multiplexing of the active media resource with all the active text, audio and video tracks? This could be a context menu functionality, so is probably not so much a need to include in the HTML5 spec, but it's something that browsers can consider to provide. And since muxing isn't quite as difficult a functionality as e.g. decoding video, it could actually be fairly cheap to implement. I don't understand what you mean here, can you explain? Sure. What I mean is: you get a video resource through the video element and a list of text resources through the track element. If I as a user want to take away (i.e. download and share with friends) the video file with the text tracks that I have activated and am currently watching, then I'd want a download feature that allows me to download a single multiplexed video file with all the text tracks inside. Something like an MPEG-4 file with the track resources encoded into, say, 3GPP-TT. Or a WebM with WebSRT encoded (if there will be such a mapping). Or an Ogg file with WebSRT - maybe encoded in Kate or natively. The simplest implementation of such a functionality is of course where the external text track totally matches the format used in the media resource for encoding text. Assuming WebM will have such a thing as a WebSRT track, the download functionality would then consist of multiplexing a new WebM resource by re-using the original WebM resource and including the WebSRT tracks into that. It wouldn't require new video and audio encoding, since it's just a matter of a different multiplexed container. 
If transcoding to the text format in the native container is required, then it's a bit more complex, but no less so than what we need to do for extracting such data into a Web page for the JavaScript API (it's in fact the inverse of that operation). So, I wouldn't think it's a very complex functionality, but it certainly seems to be outside the HTML spec and a browser feature, possibly at first even a browser plugin. Sorry if this is now off topic. :-) Even in the hypothetical case where the external text track is already in a format supported by the media container file, saving will require the UA to regenerate the movie's table of contents (eg. the 'moov' atom in MPEG-4 or QuickTime files, Meta Seek Information in a WebM file) as well as muxing the text track with the other media data. As you note transcoding is a bit more complex, especially in the case where a feature in the text track format is not supported by the text format of the native container. Further, what should a UA do in the case where the native container format doesn't support any form of text track - eg. mp3, WAVE, etc? I disagree that it is not a complex feature, but I do agree that it is outside of the scope of the HTML spec. eric
Re: [whatwg] On implementing videos with multiple tracks in HTML5
On Aug 19, 2010, at 5:23 PM, Silvia Pfeiffer wrote: * Whether to include a multiplexed download functionality in browsers for media resources, where the browser would do the multiplexing of the active media resource with all the active text, audio and video tracks? This could be a context menu functionality, so is probably not so much a need to include in the HTML5 spec, but it's something that browsers can consider to provide. And since muxing isn't quite as difficult a functionality as e.g. decoding video, it could actually be fairly cheap to implement. I don't understand what you mean here, can you explain? Thanks, eric
Re: [whatwg] HTML5 video source dimensions and bitrate
Hi Chris - On Aug 13, 2010, at 6:48 PM, Chris Double wrote: On Sat, Aug 14, 2010 at 4:05 AM, Zachary Ozer z...@longtailvideo.com wrote: It would still be nice if the video made dropped frame information available, but that's probably not in the cards. I have a work in progress bug with patch that adds this to the video implementation in Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=580531 It adds a 'mozDroppedFrames' as well as a couple of other stats people have queried about here (download rate, framerate, etc). I'd be keen to see something like this get discussed/added. I see the following additions: interface HTMLMediaElement { readonly attribute float mozDownloadRate; readonly attribute float mozPlaybackRate; }; interface HTMLVideoElement { readonly attribute unsigned long mozFrameCount; readonly attribute unsigned long mozDroppedFrames; }; A few questions: mozDownloadRate - What are the units, bits per second? mozPlaybackRate - Is this the movie's data rate (total bytes / duration)? mozFrameCount - What do you propose a UA report for a partially downloaded VBR movie, or for a movie in a container that doesn't have a header (ie. one where you don't know the frame count until you have examined every byte in the file)? eric
Re: [whatwg] Race condition in media load algorithm
On Aug 5, 2010, at 8:22 AM, Boris Zbarsky wrote: In practice, what Gecko would likely do here is to treat stable state as the event loop is spinning, just like we would for the other case. This means that while a modal dialog is up, or a sync XHR is running or whatnot is a stable state. FWIW this is what WebKit does now, assuming you mean *asynch* XHR. eric
Re: [whatwg] Introduction of media accessibility features
On Apr 13, 2010, at 12:28 AM, Jonas Sicking wrote: Will implementations want to do the rendering of the subtitles off the main thread? I believe many browsers are, or are planning to, render the actual video graphics using a separate thread. If that is correct, do we want to support rendering of the subtitles on a separate thread too? Or is it enough to do the rendering on the main thread, but composit using a separate thread? If rendering is expected to happen on a separate thread, then CSS is possibly not the right solution as most CSS engines are main-thread-only today. It seems to me that the thread subtitles are composed, rendered, and/or composited on is an implementation detail that we should not try to spec. People are extremely sensitive to audio/video sync but we don't mandate how that should be handled, why start with captions? eric
Re: [whatwg] Video source selection based on quality (was: video feedback)
On Feb 15, 2010, at 11:30 PM, Tim Hutt wrote: Anyway, with respect to the actual discussion. My vote is to add two optional tags to video [I assume you mean to add these to the source element rather than video?]: bitrate=800 (in kb/s) and If a UA is to use bitrate as a selection criterion, what data should it base the selection on? Would you have it ping the server where the resource is located? If so, how much data should it be required to read? eric
Re: [whatwg] video feedback
On Feb 10, 2010, at 8:01 AM, Brian Campbell wrote: On Feb 9, 2010, at 9:03 PM, Ian Hickson wrote: On Sat, 31 Oct 2009, Brian Campbell wrote: At 4 timeupdate events per second, it isn't all that useful. I can replace it with setInterval, at whatever rate I want, query the time, and get the synchronization I need, but that makes the timeupdate event seem to be redundant. The important thing with timeupdate is that it also fires whenever the time changes in a significant way, e.g. immediately after a seek, or when reaching the end of the resource, etc. Also, the user agent can start lowering the rate in the face of high CPU load, which makes it more user-friendly than setInterval(). I agree, it is important to be able to reduce the rate in the face of high CPU load, but as currently implemented in WebKit, if you use timeupdate to keep anything in sync with the video, it feels fairly laggy and jerky. This means that for higher quality synchronization, you need to use setInterval, which defeats the purpose of making timeupdate more user friendly. Perhaps this is just a bug I should file to WebKit, as they are choosing an update interval at the extreme end of the allowed range for their default behavior; but I figured that it might make sense to mention a reasonable default value (such as 30 times per second, or once per frame displayed) in the spec, to give some guidance to browser vendors about what authors will be expecting. I disagree that 30 times per second is a reasonable default. I understand that it would be useful for what you want to do, but your use case is not typical. I think most pages won't listen for 'timeupdate' events at all so instead of making every page incur the extra overhead of waking up, allocating, queueing, and firing an event 30 times per second, WebKit sticks with the minimum frequency the spec mandates, figuring that people like you who need something more can roll their own. 
On Thu, 5 Nov 2009, Brian Campbell wrote: Would something like video firing events for every frame rendered help you out? This would help also fix the canvas over/under painting issue and improve synchronization. Yes, this would be considerably better than what is currently specced. There surely is a better solution than copying data from the video element to a canvas on every frame for whatever the problem that that solves is. What is the actual use case where you'd do that? This was not my use case (my use case was just synchronizing bullets, slide transitions, and animations to video), but an example I can think of is using this to composite video. Most (if not all) video formats supported by video in the various browsers do not store alpha channel information. In order to composite video against a dynamic background, authors may copy video data to a canvas, then paint transparent to all pixels matching a given color. This use case would clearly be better served by video formats that include alpha information, and implementations that support compositing video over other content, but given that we're having trouble finding any video format at all that the browsers can agree on, this seems to be a long way off, so stop-gap measures may be useful in the interim. Compositing video over dynamic content is actually an extremely important use case for rich, interactive multimedia, which I would like to encourage browser vendors to implement, but I'm not even sure where to start, given the situation on formats and codecs. I believe I've seen this discussed in Theora, but never went anywhere, and I don't have any idea how I'd even start getting involved in the MPEG standardization process. Have you actually tried this? Rendering video frames to a canvas and processing every pixel from script is *extremely* processor intensive, you are unlikely to get reasonable frame rate. 
H.264 does support alpha (see the AVC spec, 2nd edition, section 7.3.2.1.2, Sequence parameter set extension), but we do not support it correctly in WebKit at the moment. *Please* file bugs against WebKit if you would like to see this properly supported. QuickTime movies support alpha for a number of video formats (eg. png, animation, lossless, etc), you might give that a try. eric
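Brian's copy-to-canvas compositing technique boils down to a per-pixel pass like the one below, operating on the RGBA array returned by getImageData(). It also makes Eric's performance warning concrete: this loop must run for every pixel of every frame. The function name and tolerance scheme are my own illustration:

```javascript
// One chroma-key pass over RGBA pixel data (e.g. ctx.getImageData(...).data):
// pixels within `tolerance` (Euclidean RGB distance) of the key color get
// zero alpha so the background shows through when the result is drawn.
function chromaKey(data, keyR, keyG, keyB, tolerance) {
  for (let i = 0; i < data.length; i += 4) {
    const dr = data[i] - keyR, dg = data[i + 1] - keyG, db = data[i + 2] - keyB;
    if (dr * dr + dg * dg + db * db <= tolerance * tolerance) {
      data[i + 3] = 0;      // make this pixel fully transparent
    }
  }
  return data;
}
```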
Re: [whatwg] Quality Values for Media Source Elements
On Dec 13, 2009, at 8:12 PM, Silvia Pfeiffer wrote: Oh! What are you doing with it? I mean - have the values in the media attribute any effect on the video element? Certainly! WebKit evaluates the query in the 'media' attribute if it believes it can handle the MIME type. If the query evaluates to true, it uses that source element. If it evaluates to false it skips it, even though it could (in theory) open the movie. For example, one of our layout tests [1] has the following: <video controls> <source src="content/error.mpeg" media="print"> <source src="content/error2.mpeg" media="screen and (min-device-width: 8px)"> <source src="content/test.mp4" media="screen and (min-device-width: 100px)"> </video> The test fails if the video element is instantiated with anything but test.mp4. I have seen 'media' used on real-world pages with something like the following to select different movies for the iphone and desktop: <video controls> <source src='desktop-video.mp4' media="screen and (min-device-width: 481px)"> <source src='iphone-video.mp4' media="screen and (min-device-width: 480px)"> </video> This works because the source elements are evaluated in order, so the first one is selected on the desktop where both queries will evaluate to true. eric [1] http://trac.webkit.org/browser/trunk/LayoutTests/media/video-source-media.html?format=txt Thanks, Silvia. On Mon, Dec 14, 2009 at 2:43 PM, Eric Carlson eric.carl...@apple.com wrote: On Dec 13, 2009, at 2:35 PM, Silvia Pfeiffer wrote: This is why the @media attribute hasn't been used/implemented anywhere yet Are you saying that nobody has implemented the media attribute on source? If so, you are incorrect as WebKit has had this for almost two years. eric
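The first-match selection order Eric describes can be sketched as a pure function; `matches` stands in for `window.matchMedia(q).matches` here so the logic is shown without a browser:

```javascript
// First-match source selection as Eric describes it: walk the <source>
// list in document order and take the first whose media query matches
// (a source with no media attribute always matches). The injected
// `matches` predicate plays the role of window.matchMedia(q).matches.
function selectSource(sources, matches) {
  for (const s of sources) {
    if (!s.media || matches(s.media)) return s.src;
  }
  return null;
}
```

On a desktop both of Eric's example queries are true, so the first (desktop) source wins; on a 480px-wide device only the second matches.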
Re: [whatwg] Restarting the media element resource fetch algorithm after load event
On Oct 8, 2009, at 5:32 AM, Philip Jägenstedt wrote: On Thu, 08 Oct 2009 12:10:01 +0200, Robert O'Callahan rob...@ocallahan.org wrote: Another issue is that it's not completely clear to me what is meant by "While the user agent might still need network access to obtain parts of the media resource" (http://www.whatwg.org/specs/web-apps/current-work/#media-resource) ... What if there is data in the resource that we don't need in order to play through normally, but which might be needed in some special situations (e.g., enabling subtitles, or seeking using an index), and we optimize to not load that data unless/until we need it? In that case would we never reach NETWORK_LOADED? As I understand it, NETWORK_LOADED means that all bytes of the resource have been loaded, regardless of whether they will be used or not. Are there any formats that would actually allow not downloading parts of the resource in a meaningful way? Yes. A disabled track in an MPEG-4 or QuickTime file is not rendered so the data is not used when presenting the movie. Media data for an enabled but invisible video track (eg. size 0x0, or not within the visible region) or an enabled but muted audio track isn't technically needed for the presentation either. Subtitles and indexes are too small to bother, and multiplexed audio/video tracks can hardly be skipped without zillions of HTTP Range requests. It seems to me that kind of thing would have to be done either with a server side media fragment request (using the 'track' dimension) or with an external audio/video track somehow synced to the master track (much like external subtitles). I don't agree that this is necessarily best done on a server. Some file formats include tables with the location of every sample, so a media engine that uses range requests anyway can easily read just the data needed. It might be wise for such an engine to optimize the size of chunks read from the server, but that is an implementation detail. 
Also remember that multiplexed is a relative term, different chunking/interleaving schemes make sense for different media types and use cases so not all multiplexed files interleave data in small chunks. In general NETWORK_LOADED and the load event seem rather useless and dangerous IMHO. If you're playing a resource that doesn't fit in your cache then you'll certainly never reach NETWORK_LOADED, and since authors can't know the cache size they can never rely on load firing. And if you allow the cache discarding behavior I described above, authors can't rely on data actually being present locally even after load has fired. If data can be evicted from the cache you can never reach NETWORK_LOADED because network connectivity could be lost without affecting the media playback. I suspect many authors will make invalid assumptions about load being sure to fire and about what load means if it does fire. Does anyone have any use cases that load actually solves? I also agree that the 'load' event and the NETWORK_LOADED state are not terribly useful and will likely cause a great deal of confusion for developers. We have seen a number of cases where experienced web developers have used the 'load' event when they should have used the 'canplaythrough' event, and I fear that this will be a common mistake. I agree, sites that depend on the load event will likely break randomly for file sizes that usually barely fit into the cache of the browser they were tested with. If browsers are conservative with bandwidth and only send the load event when it's true, I think we will have less of a problem however. I don't agree that it will be any less of a problem if browsers are conservative, users will still not *ever* be able to depend on the 'load' event firing (except perhaps for local files). Note that the load event isn't strictly needed, waiting for a progress event with loaded==total would achieve the same thing. 
Actually, a progress event with loaded==total tells you even less than the 'load' event because it doesn't guarantee that the data won't be evicted from the cache. Aesthetically, however, I think it would be strange to not have the load event. I am not worried about the aesthetics of not having the event. I am somewhat concerned about existing content that uses it (including many of the WebKit layout tests :-( ), but I think we will be better off in the long run if we get rid of the event and network state now. eric
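Eric's advice to wait for 'canplaythrough' rather than 'load' might be wrapped like this. The wrapper name is mine; `addEventListener` and `readyState` are the standard HTMLMediaElement API:

```javascript
// Resolve when the element reports it can play through without stalling.
// This is the event Eric says developers reach for 'load' by mistake.
function whenPlayable(video) {
  return new Promise(resolve => {
    // readyState 4 is HAVE_ENOUGH_DATA: already playable, resolve now.
    if (video.readyState >= 4) return resolve();
    video.addEventListener("canplaythrough", () => resolve(), { once: true });
  });
}
```

Unlike 'load', 'canplaythrough' can fire regardless of cache size, so pages built on it don't break for resources that never finish (or never stay) fully buffered.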
Re: [whatwg] Chipset support is a good argument
On Jul 6, 2009, at 3:00 AM, Lino Mastrodomenico wrote: (BTW, canPlayType in Safari 4.0 seems buggy: it always returns no, even with XiphQT installed). That was fixed just after Safari 4.0 shipped, it should work in WebKit nightly builds. See http://trac.webkit.org/changeset/43972. eric
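Per the spec, canPlayType() returns "probably", "maybe", or the empty string. A sketch of ranking candidate types with it; the helper takes canPlayType as a function (e.g. `t => video.canPlayType(t)`) so the ranking logic stands on its own:

```javascript
// Pick the best candidate MIME type by canPlayType() confidence:
// "probably" beats "maybe", and "" means unplayable. Returns null if
// nothing is playable. Helper name is illustrative, not a platform API.
function bestPlayableType(types, canPlay) {
  const rank = { probably: 2, maybe: 1, "": 0 };
  let best = null, bestRank = 0;
  for (const t of types) {
    const r = rank[canPlay(t)] || 0;
    if (r > bestRank) { bestRank = r; best = t; }
  }
  return best;
}
```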
Re: [whatwg] Start position of media resources
On Apr 6, 2009, at 9:11 PM, Chris Double wrote: On Tue, Apr 7, 2009 at 3:40 AM, Eric Carlson eric.carl...@apple.com wrote: Media time values are expressed in normal play time (NPT), the absolute position relative to the beginning of the presentation. I don't see mention of this in the spec which is one of the reasons I raised the question. Have I missed it? If not I'd like to see the spec clarified here. I thought this was explicit in the spec, but maybe I am thinking of the discussion of effective start in a previous revision? In any case, I agree the wording should be clarified. eric
Re: [whatwg] Start position of media resources
On Apr 6, 2009, at 3:08 AM, Silvia Pfeiffer wrote: On Mon, Apr 6, 2009 at 7:38 PM, Chris Double chris.dou...@double.co.nz wrote: On Mon, Apr 6, 2009 at 9:40 PM, Silvia Pfeiffer I doubt though we need another attribute on the element - the information is stored in the src URL, so should be retrieved from there IMHO. In this case it is not stored in the src URL in a way the author of the document can retrieve. An oggz-chopped file can be copied and served with a normal filename for example. The time is embedded in the Ogg file. There is no way for the author to retrieve it. Hence the need for an attribute. Ah, yes, in this case it can only come from the file directly into a DOM property. I see. I agree, there is a need for an explicit attribute. A media file with a non-zero initial time stamp is not new to oggz-chopped files (eg. an MPEG stream initial PTS can have any value, SMPTE time-codes do not necessarily start at zero, etc.), but I disagree that we need a new attribute to handle it. Media time values are expressed in normal play time (NPT), the absolute position relative to the beginning of the presentation. It is the responsibility of the UA to map time zero of the element to the starting time of the media resource, whatever it may be. eric
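The NPT mapping Eric describes is a simple offset between the resource's internal timestamps and the element's zero-based timeline. A sketch with illustrative names (no spec API is implied):

```javascript
// Map a timestamp from the media resource's own timeline (which may start
// at an arbitrary initial PTS, as in an oggz-chopped file) onto the
// element's NPT timeline, which always starts at zero -- and back again.
function mediaTimeToElementTime(mediaTimestamp, resourceStartTime) {
  return mediaTimestamp - resourceStartTime;
}

function elementTimeToMediaTime(currentTime, resourceStartTime) {
  return currentTime + resourceStartTime;
}
```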
Re: [whatwg] Captions, Subtitles and the Video Element
Greg - Interesting ideas! A few questions that occur to me on first read: On Feb 19, 2009, at 2:37 PM, Greg Millam wrote: HTML5 / Page Author: * Each video will have a list of zero or more Timed Text tracks. * A track has three variables for selection: Type, Language, Name. These can be null, except for name. I am confused by your terminology. Does Timed Text track refer to the caption elements, or the caption tracks in the media file, or both? The term Timed Text track has a very specific meaning in a media file, so unless that is what you mean I think another term would be preferable. * All timed text tracks encoded in the video file are added to the list, as an implicit caption element. When should the UA create the implicit caption element(s) from the tracks in the media file? What should it do about caption samples that are spread throughout the media file? * Caption tags, when displayed, count as <span class="caption">...</span> unless they have style associated with them (uncommon). So they can be tweaked via CSS, whether by the author or overridden by the user agent. So by default, all of the captions (along with number and time stamps) for the entire file are displayed at the same time? eric
Re: [whatwg] media elements: Relative seeking
On Nov 24, 2008, at 2:21 PM, Calogero Alex Baldacchino wrote: Well, the length attribute could be an indication about such limit and could accept a generic value, such as 'unknown' (or '0', with the same meaning - just to have only numerical values) to indicate an endless stream (i.e. a realtime iptv): in such a case, any seeking operation could be either prohibited or just related to the amount of yet played content which is eventually present in a local cache. It is a mistake to assume that media data is present in the local cache after it has been played. Some devices have very limited storage (eg. small handhelds) and choose to use a very limited non-persistent cache; live streams have essentially unbounded size and can't be cached; even downloaded content can be so large that clients on a desktop class machine may choose not to buffer the entire file. eric
Re: [whatwg] media elements: Relative seeking
Reporting the absolute time of the current sample won't help when the first sample of the file doesn't have a timestamp of zero. It will be even more confusing for files with portions removed or added without fixing time stamps - for example a movie created by concatenating different files. As I noted when this subject came up a few weeks ago, the right way to deal with media formats that don't store duration is to estimate the duration of the whole file by extrapolating from the known, exact, duration of the portion(s) that have been processed. While the initial estimate won't always be correct for variable bit-rate formats, the estimate will become more and more accurate as it is iteratively refined by processing more media data. The spec defines the durationchange event for exactly this scenario. I don't think it makes *any* sense at all to push this problem up so the user has to deal with it. It is a hard problem, but it is a problem for the User Agent. eric On Nov 23, 2008, at 8:08 AM, Maik Merten wrote: Currently seeking in the media elements is done by manipulating the currentTime attribute. This expects an absolute time offset in seconds. This works fine as long as the duration (in absolute time) of the media file is known and doesn't work at all in other cases. Some media formats don't store the duration of the media file anywhere. A client can only determine the duration of the media file by byte-seeking near the end of the file and finding a timestamp near/at the end. This isn't a problem whatsoever on local files, but in remote setups this puts additional load on the server and the connection. If one would like to avoid this, meaning no duration is known, seeking in absolute time cannot work. While getting the absolute duration is often a problem, retrieving the length of the media file is no problem. I propose seeking with relative positions, e.g. values between zero and one.
This way the client can determine whether to seek in absolute time (if the duration is known) or to just jump to a position of the bytestream (if the length in bytes is known).
- make currentTime readonly, still have it report playback position in absolute time. This information should be available in all media formats due to timestamps in the stream.
- introduce a seek() method, taking a relative value ranging from zero to one. This allows both accurate seeking if the duration is known and less precise seeking otherwise if only the length of the file is known in storage units. This is still way better than not being able to seek at all.
- make duration report either the duration in absolute time (if known) or the length of the file in storage units. This enables computation of a relative playback position even when no duration is known, if the byte position of the stream is known (low precision fallback - still better than nothing at all).
- introduce a readonly storagePosition attribute. Meant to compute a relative position if the duration is only known in storage units.
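Maik's proposed API can be sketched as follows. `relativeSeek` and `byteLength` are hypothetical names (nothing like this is in the spec), and the function only computes the seek target rather than performing the seek:

```javascript
// Sketch of the proposed relative seek() (hypothetical API, not in the spec).
// When the duration is known, map the fraction to absolute time; otherwise
// fall back to a low-precision jump into the byte stream.
function relativeSeek(media, fraction) {
  if (!(fraction >= 0 && fraction <= 1)) {
    throw new RangeError("fraction must be between zero and one");
  }
  if (isFinite(media.duration)) {
    return { mode: "time", target: fraction * media.duration };
  }
  // Low-precision fallback: jump into the byte stream by fraction.
  return { mode: "bytes", target: Math.round(fraction * media.byteLength) };
}
```

The `isFinite` test covers both the unknown-duration case (NaN) and the unbounded live-stream case (Infinity), which is where the byte-offset fallback would apply.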
Re: [whatwg] media elements: Relative seeking
On Nov 23, 2008, at 10:51 AM, Maik Merten wrote: Eric Carlson wrote: Reporting the absolute time of the current sample won't help when the first sample of the file doesn't have a timestamp of zero. It will be even more confusing for files with portions removed or added without fixing time stamps - for example a movie created by concatenating different files. Well, at least the timestamp-offset problem can be corrected. Whenever possible a corrected time should be reported - whatever that may be. As I noted when this subject came up a few weeks ago, the right way to deal with media formats that don't store duration is to estimate the duration of the whole file by extrapolating from the known, exact, duration of the portion(s) that have been processed. While the initial estimate won't always be correct for variable bit-rate formats, the estimate will become more and more accurate as it is iteratively refined by processing more media data. The spec defines the durationchange event for exactly this scenario. Personally I don't think extrapolating the duration will work at all. Yes, it gets better the more has been seen, but I assume we'll see a lot of position indicators in the UI bouncing back and forth if durations are to be extrapolated. QuickTime has used this method since it started supporting VBR MP3 in 2000, and in practice it works quite well. I am sure that there are degenerate cases where the initial estimate is way off, but generally it is accurate enough that it isn't a problem. An initial estimate is more likely to be wrong for a very long file, but each pixel represents a larger amount of time in the time slider with a long duration, so changes are less noticeable. eric
Re: [whatwg] Scripted querying of video capabilities
On Nov 13, 2008, at 10:52 AM, Jeremy Doig wrote: did this thread go anywhere ? See http://www.whatwg.org/specs/web-apps/current-work/multipage/browsers.html#dom-navigator-canplaytype . i'm concerned about the maybe case - looks way too much like: http://en.wikipedia.org/wiki/DShow#Codec_hell also - when you probe for mime type, do you mean the entire type parameter (including the codecs string) ? for example, there are too many cases where just passing video/mp4 would be insufficient. (fragmented index support ? base/main/high profile ? paff ? cabac ?) <source src="video.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'> My interpretation is that it does, and the vagueness of many MIME types is the reason for the maybe case. eric On Wed, Oct 15, 2008 at 11:14 PM, Maciej Stachowiak [EMAIL PROTECTED] wrote: On Oct 15, 2008, at 1:44 AM, Ian Hickson wrote: On Tue, 14 Oct 2008, Robert O'Callahan wrote: On Tue, Oct 14, 2008 at 12:13 PM, Maciej Stachowiak [EMAIL PROTECTED] wrote: While the underlying media frameworks can't necessarily answer "if I give you a file with this MIME type, can you play it?", they can at least give a yes/no/maybe answer, which can still be quite helpful, since the UA will know it does not need to check some media streams at all. I agree. If the API lets us answer maybe, there is not much need or temptation to lie, and we can still return information that could be useful to scripts. I have added window.navigator.canPlayType(mimeType). It returns 1, 0, or -1 to represent positive, neutral, and negative responses. This API would be tempting to treat as a boolean but would of course do completely the wrong thing. I think it would be better to either ensure that the positive and neutral responses are both values that JS would treat as true (for instance make the values true, maybe and false), or else make all of the return values something self-descriptive and symbolic (for instance the strings yes, maybe and no).
I think 1, 0, -1 are neither clear nor likely to be in any way beneficial for performance. Regards, Maciej
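For reference, the API as eventually specified took the self-descriptive-string route Maciej argues for here: canPlayType() returns the empty string, "maybe", or "probably". The stub below only illustrates the three-value shape — its matching rules are invented for illustration, not any real UA's logic:

```javascript
// canPlayType() as eventually specified returns "", "maybe", or "probably".
// The empty string is falsy, so a bare boolean test still means "unsupported".
// This stub's matching rules are invented; a real page would call
// document.createElement("video").canPlayType(type) instead.
const video = {
  canPlayType(type) {
    if (type.indexOf("codecs=") !== -1) return "probably"; // container + codecs known
    if (type.indexOf("video/mp4") === 0) return "maybe";   // vague container-only type
    return "";                                             // unsupported
  }
};
```

Choosing "" rather than "no" means the negative answer is still falsy, which partially defuses the treat-it-as-a-boolean trap discussed above.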
Re: [whatwg] Issue when Video currentTime used for seeking.
On Nov 11, 2008, at 11:24 PM, Chris Double wrote: On Wed, Nov 12, 2008 at 6:36 PM, Biju [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: toKeyFrame - optional, boolean, default false. if true indicates go to the nearest keyframe of the value provided in secondsToSeek. this is to improve performance while avoiding bug https://bugzilla.mozilla.org/show_bug.cgi?id=463358 Good question. Should seeks go to the previous keyframe to the requested time, the next keyframe after the time, the closest keyframe, or the exact frame requested? Seeks should end up as close to the requested time as possible; the behavior wrt keyframes should be an implementation detail. I say as close as possible because it is not always possible to know the file location of an exact time unless all of the media data up to that time has already been downloaded. Regarding that bug, I think it should be going to the last keyframe then decoding up to the point of the requested frame so it can display non-garbage data. But is there a requirement to be able to identify keyframes from JavaScript? I suspect not but don't know. I agree that it doesn't make sense to try to identify keyframes from JavaScript. I can't imagine that it will be commonly used, and with (at least some) streaming formats a media engine has no information about keyframe location or availability. eric
Re: [whatwg] Video element and duration attribute
Chris - On Oct 31, 2008, at 1:18 PM, Chris Double wrote: Some video formats don't make it easy to get the duration. For example, Ogg files can be concatenated to form a single playable file. To compute the duration you need to do multiple seeks to find the chains and total the durations of each chain. Even in the unchained case a seek is required to go to the end of the file and work backwards finding a packet with a timestamp. While this is not difficult to implement it can be expensive over HTTP, requiring multiple byte range requests. The most common player for Ogg files on the web is probably the Cortado Java applet, and it has an applet parameter to specify the duration. There have been requests in #theora from people wishing that video supported a duration attribute that could be set in the HTML. Would such an attribute be useful? It seems to be commonly used in current Ogg web solutions. Are there any other video formats that could benefit from this? There are other audio and video formats that require a file's duration to be computed, eg. an MP3 file without an MPEG audio frames packet or a muxed MPEG stream, but I don't think including a duration attribute is necessary. Instead of seeking to the end of the file to calculate an exact duration as you describe, it is much cheaper to estimate the duration by processing a fixed portion of the file and extrapolating to the duration based on the file size. QuickTime does this and it works quite well. An estimate may not be correct, but the spec requires a user agent to post a durationchange event for exactly this case. If we have a duration attribute a user agent will still have to deal with pages that don't include it, or that include a value that is wildly inaccurate (copy/paste editing?), so I think it makes more sense for the user agent/media engine to just figure it out. eric
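The extrapolation Eric describes amounts to simple arithmetic, assuming a roughly constant average bitrate. A sketch with hypothetical names (QuickTime's actual heuristics are surely more involved):

```javascript
// Estimate a file's duration from an exactly-known prefix, as described
// above: assume the remaining bytes play at the observed average bitrate.
// (Illustrative sketch; names are invented.)
function estimateDuration(bytesProcessed, secondsProcessed, totalBytes) {
  const bytesPerSecond = bytesProcessed / secondsProcessed;
  return totalBytes / bytesPerSecond;
}

// 1 MB of a 4 MB file decoded so far, covering 10 s of audio:
estimateDuration(1e6, 10, 4e6); // estimate: 40 seconds
```

For a VBR file the estimate is only approximate at first, which is exactly why the spec's durationchange event exists: the UA refines the number as more of the file is processed.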
Re: [whatwg] video tag : loop for ever
On Oct 15, 2008, at 8:31 PM, Chris Double wrote: On Thu, Oct 16, 2008 at 4:07 PM, Eric Carlson [EMAIL PROTECTED] wrote: However I also think that playing just a segment of a media file won't be a common use-case, so I don't think we need start and end either. How would you emulate end via JavaScript in a reasonably accurate manner? With a cue point. If I have a WAV audio file and I want to start and stop between specific points? For example a transcript of the audio may provide the ability to play a particular section of the transcript. If you use a script-based controller instead of the one provided by the UA, you can easily limit playback to whatever portion of the file you want: SetTime: function(time) { this.elem.currentTime = (time < this._minTime) ? this._minTime : (time > this._maxTime ? this._maxTime : time); } I agree that it is more work to implement a custom controller, but it seems a reasonable requirement given that this is likely to be a relatively infrequent usage pattern. Or do you think that people will frequently want to limit playback to a section of a media file? eric
Re: [whatwg] video tag : loop for ever
On Oct 16, 2008, at 7:32 AM, Nils Dagsson Moskopp wrote: On Wednesday, 15.10.2008, at 20:03 -0700, Eric Carlson wrote: After thinking about this, I'm not sure that limiting playback to a section of a media file will be used very often. Transcript anyone? If you want to embed a lecture, for example, it makes sense to be able to link to specific points. Certainly. A developer can easily script the same functionality as long as they don't use the default controller, so it seems to me that attributes for this aren't warranted. How? A script-based controller can easily control the section(s) that are playable as it provides the UI and thus controls the user's access to the media resource. You can have the controller's timeline show just the section you want to play, you can have the controller show the entire duration and limit seeking to just the section, etc. eric
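Such a script-based controller can be quite small. A sketch with invented names — in a real page, `media` would be a video or audio element and `onTimeUpdate` would be attached to its 'timeupdate' event:

```javascript
// Confine playback to the section [startTime, endTime] of a longer file,
// as described above. All names are illustrative; a real controller would
// also draw its own timeline and clamp seeks to the section.
function makeSectionController(media, startTime, endTime) {
  return {
    begin() { media.currentTime = startTime; media.play(); },
    onTimeUpdate() {              // attach to the element's 'timeupdate' event
      if (media.currentTime >= endTime) media.pause();
    }
  };
}
```

Because the script supplies the UI, it fully controls what portion of the resource the user can reach, which is the point Eric is making.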
Re: [whatwg] video tag : loop for ever
On Oct 16, 2008, at 8:17 AM, SA Alfonso Baqueiro wrote: playcount=1 only one time playcount=0 loop forever or playcount=-1 loop forever Or how about loop = loop forever, else play one time through? eric
Re: [whatwg] video tag : loop for ever
On Oct 16, 2008, at 9:24 AM, Dr. Markus Walther wrote: Eric Carlson wrote: I agree that it is more work to implement a custom controller, but it seems a reasonable requirement given that this is likely to be a relatively infrequent usage pattern. How do you know this will be infrequent? Of course I don't *know* that 'start' and 'end' attributes will be used infrequently, but I suspect it based on my experience helping developers with the QuickTime plug-in. It has had 'startTime' and 'endTime' attributes for almost ten years, but they are not commonly used. Or do you think that people will frequently want to limit playback to a section of a media file? Yes, I think so - if people include those folks working with professional audio/speech/music production. More specifically the innovative ones among those, who would like to see audio-related web apps appear. Imagine e.g. an audio editor in a browser and the task "play this selection of the oscillogram"... Why should such use cases be left to the Flash 10 crowd (http://www.adobe.com/devnet/flash/articles/dynamic_sound_generation.html)? I for one want to see them become possible with open web standards! I am anxious to see audio-related web apps appear too, I just don't think that including 'start' and 'end' attributes will make them significantly easier to write. In addition, cutting down on the number of HTTP transfers is generally advocated as a performance booster, so the ability to play sections of a larger media file using only client-side means might be of independent interest. The 'start' and 'end' attributes, as currently defined in the spec, only limit the portion of a file that is played - not the portion of a file that is downloaded. If you are interested in clients requesting and playing media fragments, you might want to look at the W3C Media Fragments Working Group [1] which is investigating this issue. eric [1] http://www.w3.org/2008/WebVideo/Fragments
Re: [whatwg] video tag: pixel aspect ratio
On Oct 15, 2008, at 1:04 PM, Ralph Giles wrote: On Wed, Oct 15, 2008 at 12:31 PM, Sander van Zoest [EMAIL PROTECTED] wrote: Following that logic, why add the attribute at all? Well, I like the pixelaspect attribute because incorrect aspect ratios drive me up the wall. Because the video and its embedding page are often served from different locations, it's nice to have a way to fix it that doesn't require editing the video file. I agree that incorrectly encoded videos are annoying, but I don't think we should have this attribute at all because I don't think it passes the "will it be commonly used" smell test. I am also afraid that it will be difficult to use correctly, since you frequently have to use clean aperture in conjunction with pixel aspect ratio to get the correct display size. For example (you probably know this, but for the benefit of others following the discussion) DV NTSC video is 720x480, has Rec.601 aspect ratio (10:11), and should be displayed at 640x480. Multiplying 720x480 by 10:11 doesn't give 640x480 however, you have to crop to clean aperture (704x480) first. We *definitely* don't want to expose CLASP. I don't think it should be included in the first version of the spec. eric
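The DV NTSC numbers Eric gives make the pitfall concrete: scale the coded width and you get the wrong answer; crop to the clean aperture first and the 10:11 ratio comes out exactly right. A one-line sketch of the arithmetic:

```javascript
// Display width = clean-aperture width x pixel aspect ratio.
// Crop first, then scale; scaling the coded width gives the wrong answer.
function displayWidth(cleanWidth, parNum, parDen) {
  return cleanWidth * parNum / parDen;
}

displayWidth(704, 10, 11); // 640 - correct for DV NTSC
// Naive 720 * 10/11 = 654.5..., which is wrong.
```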
Re: [whatwg] video tag : loop for ever
On Oct 15, 2008, at 3:52 PM, Chris Double wrote: On Thu, Oct 16, 2008 at 10:14 AM, Anne van Kesteren [EMAIL PROTECTED] wrote: That's not the question. The question is whether the looping attributes are needed at all. It seems that there's some desire for simple looping, e.g. background sounds. That does not require the five attributes the specification currently provides though. Rather, it requires one simple boolean attribute. I agree. I think the simple boolean attribute seems clearer and more useful. Which attributes exactly are being considered for removal? I'm assuming these ones: playCount loopStart loopEnd currentLoop start and end would remain, yes? After thinking about this, I'm not sure that limiting playback to a section of a media file will be used very often. A developer can easily script the same functionality as long as they don't use the default controller, so it seems to me that attributes for this aren't warranted. eric
Re: [whatwg] video tag : loop for ever
On Oct 15, 2008, at 4:13 PM, Silvia Pfeiffer wrote: I like the simple boolean loop attribute. I am not sure we need loopStart and loopEnd, since we have start and end to reduce the looping to a segment. I would like to avoid going down the SMIL path and creating markup that defines interactive presentation - rather it should just be a video file (or segment) that we do stuff to - not multiple segments that we need to synchronise and sequence etc. I don't think we need loopstart and loopend either; looping over a segment of a media file seems a very uncommon use case. However I also think that playing just a segment of a media file won't be a common use-case, so I don't think we need start and end either. As for playCount - I am unsure if aside from a boolean loop attribute we really need to enable the page author to specify how often a video/audio should be viewed/heard. Once, possibly with autoplay, and on loop should be enough for an author. I cannot see a use case for a fixed number of views, but am happy to be told otherwise. On Oct 15, 2008, at 4:57 PM, Antti Koivisto wrote: Would it be sufficient to have boolean attribute for enabling and disabling looping? Looping more than once but not forever seems like a pretty rare use case. I agree that looping a fixed number of times, and looping over a segment of a media file are likely to be very uncommon, so I think a simple boolean is enough. eric
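For the rare page that really does want play-N-times behavior, a few lines of script recover it on top of the boolean model — which is part of the argument for dropping playCount. A sketch with invented names, listening for the standard ended event:

```javascript
// Emulate playCount > 1 on a plain (non-looping) media element by
// restarting on 'ended' until the requested count is reached.
// (Illustrative sketch; playNTimes is not a real API.)
function playNTimes(media, count) {
  let completed = 0;
  media.addEventListener("ended", () => {
    completed += 1;
    if (completed < count) {
      media.currentTime = 0;
      media.play();
    }
  });
  media.play();
}
```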
Re: [whatwg] Video : Slow motion, fast forward effects
On Oct 13, 2008, at 3:41 PM, Ian Hickson wrote: On Thu, 7 Aug 2008, Chris Double wrote: On Thu, Aug 7, 2008 at 6:20 PM, Ian Hickson [EMAIL PROTECTED] wrote: On Thu, 7 Aug 2008, Biju [EMAIL PROTECTED] wrote: So can I assume HTML5 spec also allow playbackRate to be negative value. ie to support go backward at various speed Yes. Would you expect the audio to be played backwards too? Given that codecs are often highly optimized for forward playback the cost in memory can be excessive to go the other way. Could there be a possibility to say 'reverse playback not supported'? The spec now requires audio playback not to occur when going backwards, and allows the user agent to mute audio playback for rates less than or greater than 1.0 if desired. Some media formats and/or engines may not support reverse playback, but I think it is a mistake for the spec to mandate this behavior. Why is reverse playback different from other situations described in the spec where different UAs/media formats will result in different behavior, eg. pitch adjusted audio, negotiation with a server to achieve the appropriate playback rate, etc? I think the current sentence that talks about audio playback rate: When the playbackRate is so low or so high that the user agent cannot play audio usefully, the corresponding audio must not play. could be modified to include reverse playback as well: When the playbackRate is such that the user agent cannot play audio usefully (eg. too low, too high, negative when the format or engine does not support reverse playback), the corresponding audio must not play. Eric On Thu, 7 Aug 2008, Maik Merten wrote: An interesting case would also be how to handle playback speeds smaller than 1x in the streaming case, given that you cannot possibly have an infinite buffer of input data. Irrespective of whether the file is streaming or not, you'll always have problems like this to deal with. (Streaming makes them much more obvious though.)
Streaming mostly forces a playback speed of +1x in all cases. I don't think that's accurate. On Thu, 7 Aug 2008, Philip Jägenstedt wrote: I suggest that the spec allows raising the NOT_SUPPORTED_ERR exception in response to any playback rate which it cannot provide for the current configuration. With a netcast you couldn't support any playback rate except 1.0 without first buffering all the data you want to play at a faster rate, so changing the playback rate doesn't make sense. Throwing NOT_SUPPORTED_ERR must be better than just ignoring it, but the question is if script authors will remember to check for exceptions when setting the attribute... I think you should definitely be able to slow down or speed up locally, and I think it would make perfect sense for a UA to buffer the last N minutes of data, to allow pausing and seeking in the recent stream. This is what TiVo does, for instance, with live TV. I agree that we need to do something to stop seeking backwards past the start of the buffer, though. I've redefined effective start and company to make the UA seek when the buffer's earliest possible point moves. On Thu, 7 Aug 2008, Dave Singer wrote: Would you expect the audio to be played backwards too? I think that's extra credit and optional. It's now not allowed, though I suppose an author could always have two video elements and could make the hidden one seek back and play samples forwards as the other is playing the video backwards. I think that the spec. should say that degraded playback (e.g. I frames only) or no playback (non-reversed audio) may occur... I think that's a quality of implementation issue, I don't really see what the spec can say about it. On Thu, 7 Aug 2008, Dave Singer wrote: I'm sorry if I wasn't clear: I agree. If you want your implementation to shine, or be used heavily for audio scrubbing, or something, go ahead and implement. But it should not be required. 
(For extra credit) We don't want some to do it and some not to do it, because then we get all kinds of interoperability problems (e.g. someone who hides his video and rewinds it at a particular rate for some reason or other, and doesn't hear audio in one UA, deploys, and finds his users get audio on another UA). On Thu, 7 Aug 2008, Charles Iliya Krempeaux wrote: This feature would be used to implement scrubbing. Like what you see in Non-Linear Editors... for making movies, etc. (I.e., grabbing the position handle of the player, and moving it forwards and backwards through the video, at varying speeds, to find what you are looking for.) In those types of applications, the audio is on. And it is important for usability, for the video editor to hear the sound. I agree that on the long term we would want to provide this, but I think that is something we should offer as a separate feature (e.g. a flag that decides whether
Re: [whatwg] Scripted video query proposal
On Aug 21, 2008, at 8:56 PM, Robert O'Callahan wrote: On Fri, Aug 22, 2008 at 2:57 PM, Eric Carlson [EMAIL PROTECTED] wrote: It is possible to build a list of all types supported by QuickTime dynamically. WebKit does this, so Safari knows about both the built in types and those added by third party importers. You mean this http://trac.webkit.org/browser/trunk/WebCore/platform/graphics/mac/MediaPlayerPrivateQTKit.mm#L815 which calls this? http://developer.apple.com/documentation/QuickTime/Reference/QTKitFramework/Classes/QTMovie_Class/Reference/Reference.html#//apple_ref/occ/clm/QTMovie/movieFileTypes: Yes, and the Windows version is here: http://trac.webkit.org/browser/trunk/WebCore/platform/graphics/win/QTMovieWin.cpp#L695 Does that actually enumerate all supported codecs? Looking at the Webkit code and the Quicktime docs, it looks like it's just enumerating file/container types. Indeed the code enumerates movie importers and just builds a list of the MIME types supported by QuickTime, so it can not yet deal with a type string with an RFC4281 codecs parameter. We are working on that requirement, but the current approach is still useful because the codecs parameter is not yet widely used. eric
Re: [whatwg] Scripted video query proposal
On Aug 22, 2008, at 2:36 PM, Robert O'Callahan wrote: On Sat, Aug 23, 2008 at 1:46 AM, Eric Carlson [EMAIL PROTECTED] wrote: On Aug 21, 2008, at 8:56 PM, Robert O'Callahan wrote: Does that actually enumerate all supported codecs? Looking at the Webkit code and the Quicktime docs, it looks like it's just enumerating file/container types. Indeed the code enumerates movie importers and just builds a list of the MIME types supported by QuickTime, so it can not yet deal with a type string with an RFC4281 codecs parameter. We are working on that requirement, but the current approach is still useful because the codecs parameter is not yet widely used. That will require extensions to Quicktime, right? Correct. So using your current approach to implement Tim's proposed API, we can use this to answer yes or no if the MIME type contains no codec string, and if the MIME type does contain a codec string we can either answer no (if the container is not supported) or maybe. I suppose if Tim's willing to assume that anything supporting the Ogg container supports Theora and Vorbis, that's good enough for now ... for Quicktime. We'll have to look into whether something similar is possible with GStreamer and DirectShow. But I guess even if it isn't, a 3-value version of Tim's proposed API is better than nothing. A three-state return is an interesting idea, but wouldn't you then be required to return maybe for MIME types that can describe multiple formats? For example, video/mpeg can be used to describe a video elementary stream, an MPEG-1 system stream, an MPEG-2 program stream, or an MPEG-2 transport stream. application/ogg can include dirac, flac, theora, vorbis, speex, midi, cmml, png, mng, jng, celt, pcm, kate, and/or yuv4mpeg. And then there is video/quicktime... I think it makes more sense to leave it as a boolean, where no means the UA does not support the type, and yes means that the UA implements some support for the type but errors can occur during loading and/or decoding. eric
Re: [whatwg] Scripted video query proposal
On Aug 21, 2008, at 7:46 PM, Robert O'Callahan wrote: Any browser that supports integration with an extensible framework like GStreamer, Quicktime or Direct Show is going to have a hard time ever reporting false. Apparently there was a conversation today in #theora that you might have seen which explains why this is so, at least for GStreamer. With a three-value return, at least Firefox with both Ogg Theora and Quicktime support could return yes for Ogg and maybe for other types. But I think Safari is going to have to return maybe all the time --- except perhaps for codecs built into Quicktime. That doesn't help you. It is possible to build a list of all types supported by QuickTime dynamically. WebKit does this, so Safari knows about both the built in types and those added by third party importers. eric
Re: [whatwg] Accessibility and the Apple Proposal for Timed Media Elements
Benjamin - On Apr 4, 2007, at 11:44 PM, Benjamin Hawkes-Lewis wrote: Re: http://webkit.org/specs/HTML_Timed_Media_Elements.html There are three things I'd hope to see from a video element: 1) Ease of use compared to object (A common API contributes to this, and v2 might approach it with a default UI. Lack of agreement on a baseline format is a major problem here.) 2) Experiments in hyperfilm. 3) Better accessibility features than are provided by the current object or embed + plugin architecture. We are actively discussing accessibility features internally, and should have something to propose to this group shortly. eric
Re: [whatwg] video element feedback
On Mar 23, 2007, at 1:27 PM, Silvia Pfeiffer wrote: On 3/23/07, Nicholas Shanks [EMAIL PROTECTED] wrote: Can't we have all of: 1) A way for authors to match up timecodes with fragment identifiers in the fallback content 2) A way for UAs to skip to that time code if a fragment identifier is requested and it's contained within fallback the UA isn't displaying 3) And a way for users to link to timecodes that aren't marked up at all. I completely agree. Since we have to stick with the way that URIs are defined and the way that HTTP works, we can realise these in the following ways:
1. Either the fallback content has a means in which fragment identifiers are encoded into the video (as e.g. CMML/Annodex provides for Ogg Theora, or chapter markers in QuickTime), or the UA can request this information from the server, where it may be stored in a DB or in an XML file (such as CMML) and can be returned to the UA.
2. Again, there are two alternatives that are possible - either using the fragment (#) for identifiers or using queries (?) to provide the offset (or named anchor, or time section) in the URI. When using fragments, the UA first has to download the full file and then undertake the offset for playback itself. When using queries, a server module can do the offset and thus avoid potentially large amounts of binary data being downloaded to the UA which the users may never want to look at. The use of queries will be absolutely necessary for mobile phones for example, where you pay through the nose for bandwidth use.
3. Covered in my reply to 2.
I know of only one format that provides for all this functionality at the moment and that is Ogg Theora with the Annodex and CMML extensions. In particular the part where a server component is required to provide a valid file from a time offset (and - don't get me wrong - an anchor is nothing else but a named time offset) is unique. Annodex has an Apache module called mod_annodex to provide this functionality.
And it has python, php and perl bindings to provide this functionality through typical Web scripting languages. Even without a server component, #2 and #3 do not require the UA to download the full file if it can use byte range requests for random access and the file format has time to offset tables (eg. the 'moov' resource in a QuickTime movie or ISO-based file, the 'movi' LIST chunk in an AVI file, etc). eric
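A rough sketch (not from the thread) of the box walk Eric's scheme implies: ISO/QuickTime files are a sequence of boxes, each with a 4-byte big-endian size (which includes the 8-byte header) followed by a 4-byte ASCII type. A UA that has fetched the first bytes of a file via an HTTP Range request could scan top-level boxes like this to locate 'moov'; 64-bit and to-end-of-file box sizes are omitted for brevity.

```javascript
// Hypothetical helper: walk top-level boxes in an ArrayBuffer and find one
// by type. Assumes the simple 32-bit size + 4-char type box layout.
function findTopLevelBox(buffer, wanted) {
  const view = new DataView(buffer);
  let offset = 0;
  while (offset + 8 <= view.byteLength) {
    const size = view.getUint32(offset);          // big-endian box size
    const type = String.fromCharCode(
      view.getUint8(offset + 4), view.getUint8(offset + 5),
      view.getUint8(offset + 6), view.getUint8(offset + 7));
    if (type === wanted) return { offset, size };
    if (size < 8) break;                          // malformed box; give up
    offset += size;                               // jump to the next sibling box
  }
  return null;                                    // not within the bytes fetched so far
}
```

If the 'moov' box is at the end of the file (common for QuickTime movies), the scan tells the UA the offset of the next box it needs, so it can issue a further `Range: bytes=...` request rather than downloading everything in between.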
Re: [whatwg] video element feedback
On Mar 23, 2007, at 3:49 PM, Silvia Pfeiffer wrote:

> Hi Eric,
>
> On 3/24/07, Eric Carlson [EMAIL PROTECTED] wrote:
>> Even without a server component, #2 and #3 do not require the UA to download the full file if it can use byte range requests for random access and the file format has time-to-offset tables (eg. the 'moov' resource in a QuickTime movie or ISO-based file, the 'movi' LIST chunk in an AVI file, etc).
>
> I agree partially. You're right - it doesn't need to download everything. But there are two catches:
>
> 1) The UA doesn't know what byterange a timecode or timerange maps to. So it has to request this information from the server, which has access to the file. For QuickTime movies, the UA would need to request the offset table from the server, and for AVI it would need to request the chunking information.
>
> 2) Just streaming from an offset of a video file often breaks the file format. For nearly all video formats, there are headers at the beginning of a video file which determine how to decode it. Lacking this information, the video cannot be decoded, so a simple byterange request for a subpart of the video only yields undecodable content. The server actually has to be more intelligent and provide a re-assembled, correct video file if it is to stream from an offset.

Yes, the UA needs the offset/chunking table in order to calculate a file offset for a time, but this is efficient in the case of container formats in which the table is stored together with other information that's needed to play the file. This is not the case for all container formats, of course.

The UA would first use byte range requests to download the header. If the information is stored somewhere other than the beginning of the file, it may take several byterange requests to find it, but this is not much less efficient with ISO-derived or RIFF type formats. Once it has the headers, it will be able to calculate the offset for any time in the file, and it can request and play the media for *any* time range in the file. This scheme has the added benefit of not requiring the header to be downloaded again if the user requests another time range in the same file.

eric
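The seek step Eric describes can be sketched like this (hypothetical table format, not from the thread): once the UA has derived a time-to-offset table from the 'moov' box or AVI index, mapping a requested time to the byte range it should fetch is a simple lookup.

```javascript
// Sketch: map a seek time to an HTTP Range header value, given an ascending
// table of { time, offset } entries (e.g. one per keyframe or chunk) that a
// UA could build from the container's headers.
function rangeForSeek(table, seconds, fileLength) {
  let start = table[0].offset;
  for (const entry of table) {
    if (entry.time > seconds) break;
    start = entry.offset;   // last entry at or before the requested time
  }
  return `bytes=${start}-${fileLength - 1}`;  // value for a Range request header
}
```

Because the table is kept in memory, a second seek in the same file needs only one new Range request, which is the "header downloaded once" benefit noted above.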
Re: [whatwg] Apple Proposal for Timed Media Elements
On Mar 21, 2007, at 7:20 PM, Robert Brodrecht wrote:

> On Mar 21, 2007, at 5:08 PM, Maciej Stachowiak wrote:
>> The DOM attribute currentRate is the rate at which a media element is currently playing.
>
> I'm guessing this would be in frames per second? Is it the frames per second it is playing or the available frames per second encoded in the video?

No, it is a multiple of the file's intrinsic (or natural) rate. Frames per second loses meaning quickly with digital media files, where individual frames can have arbitrary duration (true even for animated GIF files).

>> The DOM attribute hasAudio returns a value that specifies whether the element has audio media.
>
> Does a video element hasAudio return true or false? Is this based only on the existence of some media or will it determine if the video actually has an audio track?

It is based on the presence or absence of an audio track, so a video element may or may not return true.

eric
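A minimal sketch of the rate semantics Eric describes (the proposed currentRate corresponds to what shipped as HTMLMediaElement's playbackRate): the rate is a unitless multiplier of the media's natural rate, so playback position advances as elapsed wall-clock time times the rate, independent of the file's frame timing.

```javascript
// Sketch (hypothetical helper, not part of any proposal): how far the
// playback position moves after some wall-clock time at a given rate.
// rate 1.0 = natural speed, 2.0 = double speed, 0.5 = half speed.
function advancePosition(currentTime, rate, elapsedSeconds) {
  return currentTime + rate * elapsedSeconds;
}
```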