Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Fri, Jan 21, 2011 at 1:12 PM, Michael Dale d...@ucsc.edu wrote:

On 01/13/2011 01:44 AM, Philip Jägenstedt wrote: Changing the default at this point wouldn't really hurt since not all browsers are doing exact seeking anyway, right? I think that keyframe seeking is more often what you want and it'd be better to let the few who want frame-exact seeking jump through hoops than having everyone who makes a simple seeking UI discover that seeking is a bit slow and 20% of them figuring out that it can be fixed by using seek().

Is accurate seeking *that* much slower than keyframe seeking? I would discourage changing the accurate seek default. If supported at all you should opt into fuzzy seeking. Assuming the client can buffer fast enough for real-time playback, a seek to keyframe + time in the worst case would take the time offset. But in most cases the data-to-time ratio after the keyframe is much more sparse than the keyframe itself, so the ranged request for key + data up to the target time will normally not be a lot more than keyframe + data for future playback when seeking to a keyframe. You can see seeking is pretty fast in the indexed seek Firefox demos.

It's usually the decoding, not the file access, that kills you. Firefox seeking on low resolution clips is very snappy, index or not. But try a 1080p clip encoded with a 10 second maximum keyframe interval... If your client is just barely capable of realtime playback then you can be waiting as much as ten seconds for an exact seek. Even if it's 2x realtime (a pretty reasonable multiple for 1080p) you're still looking at about 5 seconds in the worst case.

Basically, as the decoding speed approaches realtime the seeking time approaches what you'd get by seeking to the prior keyframe and playing up to the current point, except with the exact seeking you sit around twiddling your thumbs while the client looks broken.

I'm personally a fan of a heuristic approach where the first seek from a user goes to a keyframe while subsequent seeks soon after the first are all exact. But this is a user interface thing, and not really a question of what the API should provide.
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Fri, Jan 21, 2011 at 1:38 PM, Rob Coenen coenen@gmail.com wrote: I still want the API to support seeks to exact frames, e.g. when I build some GUI that allows the user to click on a button that says "explosion shot 1 at 00:31:02.15" then I want the player to seek to 00:31:02.15 exactly and not to, say, 00:31:02.01 simply b/c that's where a keyframe happens to be.

It's unthinkable to me that it wouldn't at least offer exact seeking as an option. There are too many interesting programmatic things that you can't do without exact seeking. Though I don't think that has anything to do with the default behavior. Since the default behavior is already inconsistent, perhaps it should just be formally declared to be at the implementer's whim.
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Fri, Jan 21, 2011 at 4:04 PM, Philip Jägenstedt phil...@opera.com wrote: Since, as you say, the behavior is currently inconsistent, there is still time to agree on something that makes sense and have everyone implement that. I think the best default is keyframe seeking and haven't seen any strong arguments for accurate seeking as the default yet. Concretely: Add seek(time, flags) where flags defaults to nothing. Accurate seeking would be done via seek(time, accurate) or some such. Setting currentTime is left as is and doesn't set any flags. Most of the plumbing is already in the spec: http://whatwg.org/html#dom-media-seek

I don't like keyframe seeking as the default. Keyframe seeking assumes things about the container, codec, and encoding which may not be constant or even applicable to all formats. For example, a file with rolling intra may have no keyframes, and yet be perfectly seekable. Or if for some reason a client can do exact seeking very cheaply for the request (e.g. seeking to the frame immediately after a keyframe) then that ought to be permitted too. I'd rather say that the default should be an implementation-defined accuracy, which may happen to be exact, may differ depending on the input or user preferences, etc.
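For illustration only, the shape of the proposal from script might look roughly like the following. The seek() method and the flag value are hypothetical, nothing like this is in the spec today:

    // Hypothetical sketch of the proposed API; names are invented.
    var video = document.getElementById("clip");

    // Default: accuracy left to the implementation (keyframe, exact, or otherwise).
    video.seek(31.5);

    // Opt in to exact positioning when the application actually needs it.
    video.seek(31.5, "accurate");

    // Setting currentTime keeps its existing, unflagged behavior.
    video.currentTime = 31.5;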
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Fri, Jan 21, 2011 at 4:42 PM, Roger Hågensen resca...@emsai.net wrote: Accurate seeking also assumes things about the codec/container/encoding. If a format does not have keyframes then it does have something equivalent. Formats without keyframes can probably (I might be wrong there) seek more accurately than those with keyframes.

You can _always_ seek accurately if you can seek at all, just not necessarily efficiently: if all else fails you decode the video from the start. Keyframes are orthogonal to this: I can construct Theora streams (and presumably VP8, and other interframe formats) where you can begin decoding at any point and after decoding no more than N frames (where N is some value of my choosing, perhaps 24) the decoder is completely synchronized and bit-accurate. A stream with keyframes is no more seekable than such a stream; keyframes just make seeking less computationally expensive, and only if you don't mind seeking solely to the keyframes. For seeking to arbitrary locations a rolling intra scheme and an exact recovery scheme are the same. (Which is why Firefox correctly decodes Theora files constructed in this manner, even though that was never a consideration.)

Exact should be exact. Consider a video editor application created using the video tag and canvas. A failure to operate exactly may cause data corruption. A stream which isn't incrementally seekable should be decoded from the front in the case of an exact request. The potentially high cost of an exact seek is the primary reason why I wouldn't want to make the default behavior mandate exact, but exact still needs to be available.

On Fri, Jan 21, 2011 at 4:57 PM, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: * the default is best effort

I fear that the best effort language is misleading. You can always do exact on a stream that you can seek to the beginning of, so the best would be exact. The language I'd prefer is fast. Fast may be exact, or it might just be to the nearest keyframe, or something in between. It might just start you over at the beginning of the stream. One question about inexact seeking is what should the client do when the current playtime is closer to the requested time than what the inexact seek would provide?

* KEYFRAME is keyframe-accurate seeking, so to the previous keyframe

What does this mean when a seekable stream doesn't have interior keyframes? Should the client always seek to the beginning? Why is this valuable over a fast option?
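To make the editor case concrete, here is a rough sketch using only today's currentTime seeking, and assuming the implementation honors it exactly; if the seek is only keyframe-accurate, the captured pixels are simply wrong:

    // Sketch: frame grab that silently corrupts data if seeking is inexact.
    var video = document.getElementById("src");
    var canvas = document.getElementById("out");
    var ctx = canvas.getContext("2d");

    function grabFrame(time, done) {
      video.addEventListener("seeked", function onSeeked() {
        video.removeEventListener("seeked", onSeeked, false);
        ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
        done(ctx.getImageData(0, 0, canvas.width, canvas.height));
      }, false);
      video.currentTime = time;  // an editor needs this to land on the exact frame
    }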
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Fri, Jan 21, 2011 at 5:11 PM, Glenn Maynard gl...@zewt.org wrote: Should there be any consistency requirements for fast seeking? [snip] This could have unexpected side-effects. Should this be allowed? I'd suggest that fast seeking should always be consistent with itself, at least for a particular video instance.

Good point.

On Fri, Jan 21, 2011 at 5:05 PM, Gregory Maxwell gmaxw...@gmail.com wrote: One question about inexact seeking is what should the client do when the current playtime is closer to the requested time than what the inexact seek would provide?

The above would also answer your question: seeking would be unaffected by the current play cursor.

It would, but it also results in some pretty perplexing and unfriendly behavior... where the user hits the 5-seconds-forward button and ends up going 4 seconds backwards, only to press the button again and repeatedly land on the same spot. It would be possible to require that the movement with respect to the current play time have the same sign as the request, to avoid this perplexing behavior (and this appears to be what standalone media players do with forward and backward buttons), but it runs into the consistency problems you've raised.

So I'm seeing three relevant dimensions: consistency, accuracy of the position, and accuracy of the direction. Consistency and accuracy of the direction are not compatible unless the seeking is exact.
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Fri, Jan 21, 2011 at 5:25 PM, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:

The language I'd prefer is fast. Fast may be exact, or it might just be to the nearest keyframe, or something in between. It might just start you over at the beginning of the stream.

That is putting a restriction on how the browser has to seek - something I'd rather leave to the browser in the general case where

No more than 'best' is, I suppose. I think you missed the argument I'm making: I'm saying that it's perfectly reasonable to assume that best effort means exact in any seekable stream, because exact is best and best is always possible. This is the same kind of reasoning that allows you to conclude that fast requires the browser to use the fastest method.

One question about inexact seeking is what should the client do when the current playtime is closer to the requested time than what the inexact seek would provide?

In the case of fastest, the browser must then not do a seek. In the case of don't care, it's up to the browser if it does the seek or not.

That was my thinking, but I find the consistency point raised by Glenn to be concerning.

* KEYFRAME is keyframe-accurate seeking, so to the previous keyframe

What does this mean when a seekable stream doesn't have interior keyframes? Should the client always seek to the beginning? Why is this valuable over a fast option?

Where no keyframes are available, this seek option simply doesn't do anything, since obviously there are no keyframes. The point is that where this concept exists and people want to take advantage of it, this seek should be possible.

I really feel that keyframe is far deeper into the codec internals than should be, or currently is, exposed by the rest of the video API. I've frequently seen content authors and application developers make incorrect assumptions about keyframes: that they always indicate scene changes, that they always occur at an exact interval, that two differently encoded files will have the keyframes in the same places, etc. That these things are sometimes true feeds the mistaken impressions. The frame types inside a codec are really deep internals that we ought not encourage people to mess with directly. It seems surprising to me that we'd want to expose something so deeply internal while the API fails to expose things like chapters and other metadata which can actually be used to reliably map times to meaningful high-level information about the video.
Re: [whatwg] HTML5 video: frame accuracy / SMPTE
On Fri, Jan 21, 2011 at 7:27 PM, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Sat, Jan 22, 2011 at 9:48 AM, Gregory Maxwell gmaxw...@gmail.com wrote: On Fri, Jan 21, 2011 at 5:25 PM, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:

The language I'd prefer is fast. Fast may be exact, or it might just be to the nearest keyframe, or something in between. It might just start you over at the beginning of the stream.

That is putting a restriction on how the browser has to seek - something I'd rather leave to the browser in the general case where

No more than 'best' is, I suppose. I think you missed the argument I'm making: I'm saying that it's perfectly reasonable to assume that best effort means exact in any seekable stream, because exact is best and best is always possible. This is the same kind of reasoning sequence that allows you to conclude that fast requires the browser to use the fastest.

Except that best effort has a tradition of meaning probably not the best possible way and inexact, while fast would imply to people that there is no faster way of doing it.

This is certainly not the implication it carries for me, but I don't see any point in arguing it further.

In the case of fastest, the browser must then not do a seek. In the case of don't care, it's up to the browser if it does the seek or not.

That was my thinking, but I find the consistency point raised by Glenn to be concerning.

If fast always means the fastest possible way, then it will be consistent. It may mean that we need to specify it a bit more to be consistent, but it would mean that. While default would be a best effort and therefore cannot be expected to be consistent.

That is not always true. If you're at frame 5 the _fastest_ possible way to seek to frame 6 is to simply decode frame 6 (and that's probably also the case for quite a few subsequent frames, assuming that the decode is fast and the data is in cache while the nearest keyframes are not). If you're at frame 800 then the fastest way to seek to frame 6 may be to start from frame three. It's a toy example, I know. But I think it makes the point that consistency has to be explicitly decided for any modes which are not implementation dependent. It might be reasonable enough to say if you need guaranteed consistency, use the exact mode.

I could live with a seeking method without keyframe seeking. But frame accurate seeking needs to be possible. So we end up with:
* default (best effort)
* TIME (or ACCURATE)
* FRAME
* FASTEST

I have no complaints about these. I still think we've left the inaccurate but right direction case uncovered. And perhaps it's over-design, but it's also functionality that exists in current media players. E.g. mplayer will do correct-direction inaccurate seeks, though this may just be an artifact of the tools not having decent exact seeking logic at all.
Re: [whatwg] Encrypted HTTP and related security concerns - make mixed content warnings accessible from JS?
On Sat, Nov 13, 2010 at 5:37 PM, Ingo Chao i4c...@googlemail.com wrote: 2010/11/13 timeless timel...@gmail.com: [snip] Good contracts with the component providers of a mashup are necessary, but not sufficient to resolve the mixed https/http issue in reality. Another ingredient for a secure mashup would be the event I am proposing, to alert the mashup's owner that something went wrong by mistake: that a component was loaded insecurely.

This sounds to me like the kind of reasoning which resulted in the CSP policy set stuff: https://developer.mozilla.org/en/Security/CSP (and, in particular, the violation reports)
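As a rough illustration only (the exact header and directive names have varied between the experimental Mozilla implementation and later drafts), a policy along these lines would both block insecurely loaded components and report the violation back to the mashup's owner:

    Content-Security-Policy: default-src https:; report-uri /csp-violation-report

The report-uri endpoint receives a POST describing each blocked load, which is essentially the "something was loaded insecurely" event being asked for, without new JS API surface.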
Re: [whatwg] Video with MIME type application/octet-stream
On Mon, Sep 6, 2010 at 3:19 PM, Aryeh Gregor simetrical+...@gmail.com wrote: On Mon, Sep 6, 2010 at 4:14 AM, Philip Jägenstedt phil...@opera.com wrote: The Ogg page begins with the 4 bytes OggS, which is what Opera (GStreamer) checks for. For additional safety, one could also check for the trailing version indicator, which ought to be a NULL byte for current Ogg. [1] [2] OggS\0 as the first five bytes seems safe to check for. It's rather short, I guess because it's repeated on every page, but five bytes is long enough that it should occur by random only negligibly often, in either text or binary files.

Um... If you do that you will fail to capture on files that most other ogg reading tools will happily capture on. Common software will read forward until it hits OggS, then it will check the page CRC (in total, 9 bytes of capture). For example, here is a file which begins with a kilobyte of \0: http://myrandomnode.dyndns.org:8080/~gmaxwell/test.ogg Everything I had handy played it.

This could fail to capture on a live stream that didn't ensure new listeners began at a page boundary. I don't know if any of these exist. I don't know if breaking these cases would matter much, but herein lies the danger of sniffing— everyone thinks they're an expert but no one really has a handle on the implications.
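For comparison, the more tolerant capture behavior looks roughly like this: scan forward for the page marker rather than requiring it at byte zero. This is only a sketch; real demuxers also verify the page CRC, which is elided here:

    // Sketch: lenient Ogg capture. Scan for "OggS" followed by stream
    // version 0 instead of insisting the file starts with it.
    function looksLikeOgg(bytes) {           // bytes: Uint8Array of the file head
      for (var i = 0; i + 5 <= bytes.length; i++) {
        if (bytes[i] === 0x4f && bytes[i + 1] === 0x67 &&    // 'O' 'g'
            bytes[i + 2] === 0x67 && bytes[i + 3] === 0x53 && // 'g' 'S'
            bytes[i + 4] === 0x00) {                          // version 0
          return true;  // a real demuxer would now check the page CRC
        }
      }
      return false;
    }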
Re: [whatwg] Video with MIME type application/octet-stream
On 8/31/10, Aryeh Gregor simetrical+...@gmail.com wrote: If you can't come up with any actual problems with what IE is doing, then why is anything else even being considered? There's a very clear-cut problem with relying on MIME types: MIME types are often wrong and hard for authors to configure, and this is not going to change anytime soon.

Aggressive sniffing can and has resulted in some pretty nasty security bugs. E.g. an attacker crafts an input that a website identifies as video and permits the upload, but which a browser sniffs out to be a Java jar which can then access the source URL with the permissions of the user. The sniffing rules, in some contexts and some browsers, can also end up causing surprising failures... e.g. I've seen older versions of some sniffing-heavy browsers automatically switch into UCS-2LE encoding at wrong and surprising times. Perhaps this is irrelevant in a video-specific discussion of sniffing— but it is a hazard with sniffing in general. Moreover, it'll never be consistent from implementation to implementation, which seems to me to be pretty antithetical to standardization in general.
Re: [whatwg] video
On Sun, Jun 20, 2010 at 8:23 PM, Nils Dagsson Moskopp nils-dagsson-mosk...@dieweltistgarnichtso.net wrote: AFAIK, at least Firefox shows a fullscreen option already in the context menu. What makes you think there is another attribute needed (besides @controls)?

So... an interesting bit of fun comes up when you use layout tricks to prevent the context menu in order to make save as impossible, and you eliminate the full screen option as an unwanted side effect. There is also the issue of the context menu not being an especially intuitive or discoverable way of activating it, especially if all the rest of the controls are buttons below the video.
Re: [whatwg] video
On Sun, Jun 20, 2010 at 9:17 PM, Ashley Sheridan a...@ashleysheridan.co.uk wrote: On Sun, 2010-06-20 at 21:06 -0400, Gregory Maxwell wrote: On Sun, Jun 20, 2010 at 8:23 PM, Nils Dagsson Moskopp nils-dagsson-mosk...@dieweltistgarnichtso.net wrote: AFAIK, at least Firefox shows a fullscreen option already in the context menu. What makes you think there is another attribute needed (besides @controls)? So... an interesting bit of fun comes up when you use layout tricks to prevent the context menu in order to make save as impossible, and you eliminate the full screen option as an unwanted side effect. There is also the issue of the context menu not being an especially intuitive or discoverable way of activating it, especially if all the rest of the controls are buttons below the video.

I think the context menu is a sensible place to find a fullscreen option short of a button existing as part of the video controls.

Short of. Yes. But I think that the gap between the two is pretty big from a usability perspective (though I have no data to back this up).

Also, if someone is daft enough to think that disabling the context menu will prevent people from saving their clips, then they deserve the pain of not being able to make the video play full-screen :p

I can agree that it's usually a bad move... and people who do it to prevent saving belong in the same circle of hell as the people that send HTML mail to mailing lists (ahem :) ), but I think you're incorrect to assume that it's always a daft move. For example, what if you have a video available in multiple qualities and you believe it would be beneficial for the viewers of your site to be corralled into your official download page where they can get the most appropriate file, rather than getting the 'thumbnail' that comes out of the save as? ... if the save as works, many people are going to use it without even bothering to look at the superior download link. Or you might want to direct the user to a lower quality/cost distribution service which isn't suitable for streaming but works fine for non-realtime use. Even if your interest is in preventing downloads— simply slowing down most people most of the time may adequately address your need, and overlaying a transparent div to hide access to the controls is a perfectly workable way of doing that.

In any case, I think the motivations of context-menu prohibitions are mostly off topic here. The simple usability issue is enough to justify doing something, if something can actually be done.
Re: [whatwg] Video Tag Proposal
On Sat, Mar 27, 2010 at 9:45 PM, Aaron Franco aa...@ngrinder.com wrote: I can see how it is counterproductive in the creation of the specification, but the fact that such licensing is being considered for what is supposed to be open and free is counterproductive to the advancement of web technologies. I feel we cannot allow companies like Microsoft and Apple to take advantage of such patents. Allowing H.264 to be a part of the spec without it being royalty free only gives those corporations more control [snip]

Ah! Now I understand. H.264 is not under consideration as part of the spec, and I don't believe that anyone has ever even tendered a serious proposal that it be considered as part of the specification, for exactly the reasons that you've enumerated. It wasn't clear to me that you were unaware of this; I thought you were attempting to propose a way— though, sadly, an unworkable one— in which it could be considered. Cheers!
[whatwg] Video tag in IE
I thought the list might appreciate this news regarding plugin-added <video/> support in Internet Explorer: http://cristianadam.blogspot.com/2010/02/ie-tag-take-two.html
Re: [whatwg] Video tag in IE
On Sun, Feb 21, 2010 at 8:58 PM, Ashley Sheridan a...@ashleysheridan.co.uk wrote: On Sun, 2010-02-21 at 20:26 -0500, Gregory Maxwell wrote: I thought the list might appreciate this news regarding plugin-added <video/> support in Internet Explorer: http://cristianadam.blogspot.com/2010/02/ie-tag-take-two.html

Isn't that just a plugin being used to display content that shouldn't require a plugin to play? Or am I missing something?

It's an ActiveX component providing the HTML5 video tag syntax to a browser which doesn't otherwise have it. The video is invoked with the regular HTML5 syntax, and while this is still very early and incomplete, the intent is to provide as large a portion of the API as possible. This isn't a replacement for real native support. It's an intermediate interoperability measure which, when complete, should lower the barriers to deploying HTML5 and abandoning legacy playback technology.
Re: [whatwg] Video source selection based on quality (was: video feedback)
On Thu, Feb 18, 2010 at 12:41 PM, Tim Hutt tdh...@gmail.com wrote: Good point. You mean something like a .ram file? I think both techniques should be supported -- a metadata file is extra hassle to set up if you /are/ the HTML and video author, and it involves an extra file download which will slow things down. Maybe something like: <video src="many_files.sources">, which is equivalent to <video> [the contents of many_files.sources (which optionally contains bitrate and resolution tags)] </video>. Obviously this shouldn't be implemented like #include because that would be insecure. Instead the UA would parse the file and make sure it only contained source tags.

But what happens when you need some custom UI to switch among sources... Or what about camera angles, languages, or any of the other logically connected things you'd like to bundle together as a single video object? Special logic for handling non-HTML5 video fallbacks? Logic for nearest-mirror selection? Advertising? All of this can be solved by using object/embed to embed a player pagelet into your pages, and it also works across sites. Why is special-case handling for just selecting among multiple rates interesting or important?
Re: [whatwg] Video source selection based on quality (was: video feedback)
On Tue, Feb 16, 2010 at 10:33 AM, Tim Hutt tdh...@gmail.com wrote: [snip] It's up to the UA. It can ping the server if it wants. If I were writing the UI for Firefox, for example, I would have it do the following: [snip] 3. If the default isn't the highest quality, show a little "Better quality available" tooltip similar to youtube's "Watch in HD". 4. If the video stutters a lot, and there is a lower quality video available, display a (non-modal) message along the lines of "Lower quality video is available, it may work better."

Imagine that you are a user-agent. Place these streams in order of quality:

1. 854x480 4:2:0 @ 1mbit/sec. average rate.
2. 1280x720 4:2:0 @ 1mbit/sec. average rate.
3. 640x360 4:4:4 @ 2mbit/sec. average rate.

Or these:

1. 640x360 4:2:0 @ 1mbit/sec. average rate peaking to 1.4mbit/sec (over 64 frames).
2. 640x360 4:2:0 @ 0.7mbit/sec. average rate peaking to 8mbit/sec (over 64 frames).

Or:

1. 640x360 simple profile @ 800kbit/sec average
2. 640x360 super-ultra mega profile requiring a water-cooled supercomputer to decode @ 700kbit/sec average.

I don't think it's hard to imagine that in each of these cases there exists a real quality ranking which the creator of the videos could be well aware of, but that no user-agent could determine automatically. Moreover, even switching to a lower rate when you are exhausting your buffer isn't necessarily a good strategy when the 'lower rate' stream is one which places more pressure on the buffer.
Re: [whatwg] Video source selection based on quality (was: video feedback)
On Mon, Feb 15, 2010 at 11:44 PM, Hugh Guiney hugh.gui...@gmail.com wrote: And when other established terms are used, like 480p—which, in virtually every other context, refers to 720x480, the most common of the acceptable resolutions for DVDs—yet the video is *854*x480, that's also confusing. Ted could download such a 480p clip and attempt to burn it, only to discover that his authoring program won't accept that format.

... Pixel aspect ratios. This whole discussion has been painful to watch.
Re: [whatwg] video feedback
On Wed, Feb 10, 2010 at 4:37 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Thu, Feb 11, 2010 at 8:19 AM, Brian Campbell lam...@continuation.org wrote: But no, this isn't something I would consider to be production quality. But perhaps if the WebGL typed arrays catch on, and start being used in more places, you might be able to start doing this with reasonable performance.

With WebGL you could do the chroma-key processing on the GPU, and performance should be excellent. In fact you could probably prototype this today in Firefox.

You're not going to get solid professional-quality keying results just by depending on a client-side keying algorithm, even a computationally expensive one, without the ability to perform manual fixups. Being able to manipulate video data on the client is a powerful tool, but it's not necessarily the right tool for every purpose.
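For readers unfamiliar with what is being discussed, the client-side keying in question amounts to roughly the following with the 2D canvas API. This is a naive green-screen threshold (the threshold numbers are arbitrary illustrations); a WebGL version would move the same per-pixel test into a fragment shader:

    // Naive chroma-key sketch: copy each video frame to a canvas and knock
    // out green-ish pixels by making them transparent.
    function keyFrame(video, ctx, w, h) {
      ctx.drawImage(video, 0, 0, w, h);
      var frame = ctx.getImageData(0, 0, w, h);
      var d = frame.data;
      for (var i = 0; i < d.length; i += 4) {
        var r = d[i], g = d[i + 1], b = d[i + 2];
        if (g > 100 && g > 1.5 * r && g > 1.5 * b) {
          d[i + 3] = 0;  // keyed pixel becomes transparent
        }
      }
      ctx.putImageData(frame, 0, 0);
    }

It works, but it is exactly the sort of automatic-only keying that falls apart on difficult footage without manual fixups.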
Re: [whatwg] HTML5 video element - default to fallback in cases where UA can't play format
On Tue, Oct 27, 2009 at 7:40 PM, Kit Grose k...@iqmultimedia.com.au wrote: [snip] I expected (incorrectly, in this case) that if I only produced one source element (an MP4), Firefox would drop down to use the fallback content, as it does if I include an object element for a format not supported (for example, if I include a QuickTime object and QT is not [snip]

Please see the list archives: http://www.mail-archive.com/whatwg@lists.whatwg.org/msg16092.html

In short, there are corner cases which make this untenable. E.g. what happens when you have JS doing brilliant canPlayType magic, but also provide tag-interior fallbacks for clients which may not even have JS? The fallbacks will fire off and start before your JS has a chance to do its canPlayType magic.

Of course, "standards-compliant HTML/XHTML using CSS and Javascript as required by the design", as your products are described, is pretty meaningless when it's applied to sites no more compatible than the bad old "Works best in IE" days, only it's now Apple™ and Adobe™. I urge you to consider the values of an open and interoperable web against the costs, and I hope you are informing your customer(s) about any long-term licensing fees for the formats chosen for their site.
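To make the conflict concrete, the "canPlayType magic" in question is script along these lines (a rough sketch; the ids and file names are made up). A client without JS never runs it, which is exactly why the tag-interior fallback has to be able to stand on its own rather than depending on the UA falling back for unplayable sources:

    // Sketch: script-driven source selection via canPlayType().
    var v = document.getElementById("player");
    if (v.canPlayType && v.canPlayType('video/ogg; codecs="theora,vorbis"') !== "") {
      v.src = "clip.ogv";
    } else if (v.canPlayType && v.canPlayType('video/mp4; codecs="avc1.42E01E"') !== "") {
      v.src = "clip.mp4";
    } else {
      // build a plugin-based or download-link UI from script instead
    }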
Re: [whatwg] a onlyreplace
On Fri, Oct 16, 2009 at 4:43 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: [snip] Isn't it inefficient to request the whole page and then throw most of it out? With proper AJAX you can just request the bits you want. == This is a valid complaint, but one which I don't think is much of a problem for several reasons. [snip] 3. Because this is a declarative mechanism (specifying WHAT you want, not HOW to get it), it has great potential for transparent optimizations behind the scenes. [snip]

Yes— an HTTP request header that gives the only-replace IDs requested, and the server is free to pare the page down to that (or not); a sketch of what such a header might look like follows below. I hit reply to point out this possibility but then saw you already basically thought of it, but a query parameter is not a good method: it'll break bookmarking ... and preserving bookmarking is one of the most attractive aspects of this proposal. [snip]

What about document.write()? What if the important fragment of the page is produced by document.write()? Then you're screwed. document.write()s contained in script blocks inside the target fragment will run when they get inserted into the page, but document.write()s outside of that won't. Producing the target fragment with document.write() is a no-go from the start. Don't do that anyway; it's a bad idea.

I'm guessing that in the rare case where you need to write into a replaced ID you can simply have a JS hook that fires on the load and fixes up the replaced sections as needed.
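For what it's worth, the request-header idea mentioned above might look something like this. The header name and the IDs are invented purely for illustration; nothing like this is specified anywhere:

    GET /photos/page2.html HTTP/1.1
    Host: example.com
    Only-Replace: content, sidebar

A server that understands the hypothetical header can send just those fragments; one that doesn't sends the whole page and the client throws the rest away, and either way the URL (and therefore the bookmark) is unchanged.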
Re: [whatwg] Alt attribute for video and audio
I have no opinion on the need being adequately covered by other attributes, but…

On Sun, Aug 9, 2009 at 11:05 PM, Remco remc...@gmail.com wrote: For an image this usually works well. An image usually doesn't convey a lot of meaning. It can be replaced by a simple sentence like "A young dog plays with a red ball on the grass." For video, audio, object, iframe, this is a little sparse. Shortening [snip]

For some videos a simple textual description is inadequate, just like it is a poor proxy for some still images. Yet for some other videos, it is completely accurate. I have no problem imagining a short video clip which fits your "A young dog plays with a red ball on the grass" just as accurately as a still image could fit that description. An argument that an attribute is inadequate to cover *all* cases shouldn't be used as a reason to exclude something which is useful in many cases.
Re: [whatwg] HTML 5 video tag questions
On Sat, Jul 11, 2009 at 3:24 PM, Maciej Stachowiak m...@apple.com wrote: On Jul 10, 2009, at 6:11 PM, Jeff Walden wrote: On 10.7.09 17:44, Ian Hickson wrote: The design is based around the assumption that we will eventually find a common codec so that fallback won't ever be needed in supporting UAs.

So has anyone ever actually pointed out the elephant in the room here, that we might never do so? I can't remember if so. Maybe HTML5: Galaxy Quest (cf. Captain Taggart's line) just isn't going to happen in the foreseeable future.

There is likely an upper bound set by the maximum possible expiration date of any patents applying to any of the viable candidates. It's just that we'd like to reach agreement well before then.

Not really: at some point well before then the argument will shift to a newer encumbered format that is clearly superior to H.264. I would expect that H.264 would spend a period of time as a non-consideration by almost everyone, since it would be inferior to something newer and yet would still require fees. You could counter that H.264 and AAC have reached some magic threshold of adoption or usability such that they will not fall to the great hamster-wheel of encumbered codec upgrades, but since I've never seen anyone state what those requirements are, I'm doubtful. Regardless— it's far from clear that simply waiting 15-ish years would resolve the problem, even if anyone found that to be desirable.
Re: [whatwg] Serving up Theora video in the real world
On Thu, Jul 9, 2009 at 5:59 PM, David Gerard dger...@gmail.com wrote: * In Safari with XiphQT, we can *probably* detect Theora's MIME type as being supported and it will Just Work (more or less).

I'm now being told that our workaround of checking for system MIME types stopped working and isn't working for users with Safari 4.0.2 and fresh XiphQT installs. We've been querying navigator.plugins to look for 'video/ogg', since canPlayType fails for Ogg/Theora even when XiphQT is installed on current Safari (I understand that this is fixed in development builds). I'm trying to figure out where the plugin MIME detection workaround stopped working, and why.

On Thu, Jul 9, 2009 at 6:00 PM, Benjamin M. Schwartz bmsch...@fas.harvard.edu wrote: David Gerard wrote: * In the one released browser that supports video and Theora, Firefox 3.5, this will Just Work. Two! Firefox and Chrome.

AFAIK Chrome support is only in the developer builds, not the full released stuff yet.
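The navigator.plugins workaround mentioned above is essentially the following (a simplified sketch of that kind of check, not our exact production code):

    // Sketch: look for a plugin registering the Ogg MIME type, since
    // canPlayType() in shipping Safari reports "" even with XiphQT installed.
    function pluginCanPlayOgg() {
      for (var i = 0; i < navigator.plugins.length; i++) {
        var plugin = navigator.plugins[i];
        for (var j = 0; j < plugin.length; j++) {   // each plugin lists its MIME types
          if (plugin[j].type === "video/ogg") return true;
        }
      }
      return false;
    }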
Re: [whatwg] Serving up Theora video in the real world
On Thu, Jul 9, 2009 at 7:14 PM, Maciej Stachowiak m...@apple.com wrote: I don't think we did anything intentional in 4.0.2 to break detection of XiphQT. If you have a solid reproducible case, please file a bug. On the other hand, I suspect the canPlayType fix will have shipped by the time we manage to do any fixes for the navigator.plugins technique.

It may even be something about the XiphQT install that changed. Or, more significantly, something the XiphQT install can fix. If so, we avoid the problem where we can't tell Safari users to install XiphQT (because even if they do, they'll just continue to get the message). At the moment we're just going to have to tell Safari users to install Firefox 3.5, though.
Re: [whatwg] Serving up Theora video in the real world
On Thu, Jul 9, 2009 at 7:24 PM, Maciej Stachowiak m...@apple.com wrote: I thought your plan was to use Cortado for plugins that don't have video+Theora. Why would you single out Safari users for a worse experience?

As David mentioned, Cortado is a worse experience. What we've been planning is that everyone without native support who appeared to be on a platform which could have native support was going to get a recommendation to upgrade/change software along with the Cortado. This would have resulted in Safari users being advised to install XiphQT, but since that doesn't appear to work anymore for the shipping software, it will just get treated like Firefox 3.0 and other video-less browsers.
Re: [whatwg] Serving up Theora video in the real world
On Thu, Jul 9, 2009 at 10:35 PM, Robert O'Callahan rob...@ocallahan.org wrote: 2009/7/10 Ian Fette (イアンフェッティ) ife...@google.com: To me, this seems like a great test of whether canPlayType actually works in practice. In the perfect world, it would be great to do getElementById('video'), createElement, and then canPlayType('video/whatever','theora'). If this simple use case doesn't work, I would ask if it's even worth keeping canPlayType in the spec.

    var v = document.getElementById("video");
    if (v.canPlayType && v.canPlayType('video/ogg; codecs="vorbis,theora"')) {
      ...
    } else {
      ...
    }

should work great. Certainly does in Firefox.

It works. Except where it doesn't. It's the "where it doesn't" that counts. At the moment Safari has issues. Out of two widely used production browsers with HTML5 support, one is broken. Not good odds, but I'm hopeful for the future. There is also the potential problem of "it technically supports format X, but the browser developer never bothered testing X and it's too buggy to be usable." My preference, however, is to start with the basic canPlayType test and then only eliminate known problems, and to make the problem filtering as specific as possible, with the assumption that future versions will get it right.
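In other words, something shaped like the following. The exclusion test here is a made-up placeholder, not a real compatibility table; the point is only that exceptions should be narrow and documented rather than replacing canPlayType wholesale:

    // Sketch: trust canPlayType() unless the browser is one with a
    // known-broken answer for this specific type.
    function canPlayOggTheora(v) {
      if (!v.canPlayType) return false;
      if (/known-broken UA string/.test(navigator.userAgent)) return false;  // narrow, documented exception
      return v.canPlayType('video/ogg; codecs="theora,vorbis"') !== "";
    }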
Re: [whatwg] Codecs for audio and video
On Wed, Jul 1, 2009 at 4:06 PM, Jonas Sicking jo...@sicking.cc wrote: [snip] I think the first bullet has been demonstrated to be false. The relative quality between theora and h.264 is still being debated, but the arguments are over a few percent here or there. Arguments that theora is simply not good enough seem based on poor or outdated information at this point.

I'm commenting here because I don't want my own posts to be a source of misinformation. Depending on how and what you compare, it's more than a few percent. It turns out that H.264 as used in many places on the web is within spitting distance of the newer Theora encoder, because encode-side and decode-side computational complexity, compatibility concerns, and the selection of encoder software all constrain what actually gets deployed. For these same reasons there are many 'older' formats still in wide use which Theora clearly outperforms. The reality of what people are using puts the lie to broad claims that Theora is generally unusable because it under-performs the best available H.264 encoders in the lab.

Different uses and organizations will have different requirements, which is a good reason why HTML5 never required solutions to support only one codec. I do not doubt that there are uses for which Theora is clearly inferior, because of the mixture of tolerance for licensing, computational load, intolerance for bitrate, requirements to operate at bits-per-pixel levels below the range that Theora operates well at, etc. But it is an enormous jump to go from "there are some uses" to applying the claim to the general case, or to go from "it needs some more bitrate to achieve equivalent subjective quality" to remarks that the bitrate inflation would endanger the Internet. It was this kind of over-generalization that my commentary on Theora quality was targeting. (And it should be absolutely unsurprising that at the limit Theora does somewhat worse than H.264 in terms of quality/bits— it's an older, less CPU-hungry design which is, from 50,000 ft, almost a strict subset of H.264.)

At the same time, we have clearly defined cases where H.264/AAC is absolutely unacceptable. Not merely inferior, but completely unworkable due to the licensing issues. Different uses and organizations will have different requirements. Different codecs will be superior depending on your requirements. Which is a good reason why HTML5 never required solutions to support only one codec.

But what I think is key is that the inclusion of Theora as a baseline should do nothing to inhibit the parties which are already invested in H.264, or which have particular requirements that make it especially attractive, from continuing to offer and use it. The advantage of a baseline isn't necessarily that it's the best at anything in particular, but that it's workable and mostly universal. If, when talking about a baseline, you find yourself debating details of efficiency versus the state of the art, you've completely missed the point.

This is a field which is still undergoing rapid development. Even if codec science were to see no improvements, we would still see the state of the art advance tremendously in the coming years simply due to increasing tolerance for CPU-hungry techniques invented many years ago but still under-used. Anything we use today is going to look pretty weak compared to the options available 10 years from now. It's important for a codec to be efficient, but the purpose of the baseline is to be compatible. As such, the relevant arguments should be largely limited to workability, of which efficiency is only one part.
It was suggested here that MJPEG be added as a baseline. I considered this as an option for Wikipedia video support some years ago, before we had the Theora-in-Java playback working. I quickly determined that it was unworkable for over-the-web use because of the bitrate: we're talking about on the order of 10x the required bitrate over Theora, before considering the audio (which would also be 10x the bitrate of Vorbis).

At least for general public web use I think the hard workability threshold could be fairly set as "can a typical consumer broadband connection stream a 'web resolution' (i.e. somewhat sub-standard-definition) video in real time with decent quality?" Even though that's a fairly vague criterion, it seems clear that Ogg/Theora is well inside this limit while MJPEG is well outside it. Obviously different parties will have different demands. As far as I'm concerned the spec might as well recommend a lossless codec as MJPEG— at least lossless has advantages for the set of applications which are completely insensitive to bitrate.
Re: [whatwg] Codecs for audio and video
On Tue, Jun 30, 2009 at 5:31 AM, Mikko Rantalainen mikko.rantalai...@peda.net wrote: [snip] Patent licensing issues aside, H.264 would be a better baseline codec than Theora.

I don't know that I necessarily agree there. H.264 achieves better efficiency (quality/bitrate) than Theora, but it does so with greater peak computational complexity and memory requirements on the decoder. This isn't really a fault in H.264; it's just a natural consequence of codec development: compression efficiency will always be strongly correlated with computational load. So I think there would be an argument today for including something else as a *baseline* even in the absence of licensing. (Though the growth of computational power will probably moot this in the 15-20 years it will take for H.264 to become licensing-clear.) Of course there are profiles, but they create a lot of confusion: people routinely put out files that others have a hard time playing. Of course, were it not for the licensing, Theora wouldn't exist, but there would likely be many other codec alternatives with differing CPU/bandwidth/quality tradeoffs. I just wanted to make the point that there are other considerations which have been ignored simply because the licensing issue is so overwhelmingly significant; if it weren't, we'd still have many things to discuss.

The subject does bring me to a minor nit on Ian's "decent state of affairs" message: one of the listed problems is lack of hardware support. I think Ian may unwittingly be falling for a common misconception. This is a real issue, but it's being misdescribed— it would be more accurately and clearly stated as lack of software support on embedded devices. Although people keep using the word hardware in this context, I believe that 999 times out of 1000 they are mistaken in doing so. As far as I, or anyone I've spoken to, can tell, no one is actually doing H.264 decode directly in silicon, at least no one with a web browser. The closest thing to that I see are small microcoded DSPs which you buy pre-packaged with codec software and ready to go. I'm sure someone can correct me if I'm mistaken.

There are a number of reasons for this, such as the rapid pace of codec development versus ASIC design horizons, and the mode-switching-heavy nature of modern codecs (H.264 supports many mixtures of block sizes, for example) simply requiring a lot of chip real estate if implemented directly in hardware. In some cases the DSP is proprietary and not sufficiently open for other software. But at least in the mobile device market it appears to be the norm to use an off-the-shelf general purpose DSP.

This is a very important point because "the hardware doesn't support it" sounds like an absolute deal breaker, while "no one has bothered porting Theora to the TMS320c64x DSP embedded in the OMAP3 CPU used in this handheld device" is an obviously surmountable problem. In the future, when someone says "no hardware support" it would be helpful to find out if they are talking about actual hardware support, or just something they're calling hardware because it's some mysterious DSP running a vendor blob that they themselves aren't personally responsible for programming... or if they are just regurgitating common wisdom. Cheers!
Re: [whatwg] Codecs for audio and video
On Tue, Jun 30, 2009 at 10:41 PM, Maciej Stachowiak m...@apple.com wrote: I looked into this question with the help of some experts on video decoding and embedded hardware. H.264 decoders are available in the form of ASICs, and many high volume devices use ASICs rather than general-purpose programmable DSPs. In particular this is very common for mobile phones and similar devices - it's not common to use the baseband processor for video decoding, for instance, as is implied by some material I have seen on this topic, or to use other fully general DSPs.

Can you please name some specific mobile products? Surely if it's common, doing so shouldn't be hard. I don't mean to argue that it isn't true or intend to debate you on the merits of any examples… But this is an area which has been subject to a lot of very vague claims which add a lot more confusion than insight.

The iPhone (of all vintages) and the Palm Pre have enough CPU power to do Theora decode for 'mobile resolutions' on the main CPU (no comment on battery life; but the Palm Pre is OMAP3, and support for that DSP is in the works, as mentioned). I can state this with confidence since the horribly slow 400MHz ARM4T-based SoC in the OpenMoko FreeRunner is able to (just barely) do it with the completely unoptimized (for ARM) reference libraries (on x86 the assembly optimizations are worth a 30-40% performance boost).

Another example I have is the WDTV, a set-top media box. It's often described as using a dedicated hardware H.264 decoder, but what it actually uses is a SMP8634, which is a hardware decode engine based on general-purpose processors and which appears to be format-flexible enough to decode other formats. (Although the programming details aren't freely available, so it's difficult to make concrete claims.)

[snip] As far as I know, there are currently no commercially available ASICs for Ogg Theora video decoding. (Searching Google for Theora ASIC finds some claims that technical aspects of the Theora codec would make it hard to implement in ASIC form and/or difficult to run on popular DSPs, but I do not have the technical expertise to evaluate the merit of these claims.)

There is, in fact, a synthesizable VHDL implementation of the Theora decoder backend available at http://svn.xiph.org/trunk/theora-fpga/ I'm not able to find the claims regarding Theora on DSPs which you are referring to; care to provide a link?

Not especially relevant but worth mentioning for completeness: Elphel also distributes a complete I-frame-only Theora encoder as synthesizable Verilog under the GPL, which is used on an FPGA in their prior generation camera products. (http://www3.elphel.com/xilinx/publications/xcellonline/xcell_53/xc_video53.htm) Existence trumps speculation. But I'm still of the impression that the hardware forms are not all that relevant.
Re: [whatwg] Codecs for audio and video
On Wed, Jul 1, 2009 at 12:35 AM, Maciej Stachowiak m...@apple.com wrote: For the mobile phones where I have specific knowledge regarding their components, I am not at liberty to disclose that information.

Unsurprising but unfortunate. There are other people trying to feel out the implications for themselves who are subject to different constraints than you are, for whom "take my word for it" is less than useless, since they can only guess that the same constraints may apply to their situation.

I can tell you that iPhone does not do H.264 decoding on the CPU.

The iPhone can decode Theora on the CPU. I don't think that's really even an open question at this point. It's an important distinction because there are devices which can decode Theora on the primary CPU but need additional help for H.264 (especially for things beyond base profile).

No one doubts that software implementations are available. However, they are not a substitute for hardware implementations, for many applications. I would expect a pure software implementation of video decoding on any mobile device would decimate battery life.

Then please don't characterize it as "it won't work" when the situation is "it would work, but would probably have unacceptable battery life on the hardware we are shipping." The battery life question is a serious and important one, but it's a categorically different one than "can it work at all." (In particular because many people wouldn't consider the battery life implications of a rarely used fallback format to be especially relevant to their own development.)

I would caution against extrapolating from a single example. But even here, this seems to be a case of a component that may in theory be programmable, but in practice can't be reprogrammed by the device vendor.

Yes, I provided it to balance the OMAP3 example I gave. It's a case which is not as well off as a widely used, user-programmable, general-purpose DSP-based device, but still not likely to be limited by fixed-function hardware. It's programmable, but only by the chip maker.

There is, in fact, a synthesizable VHDL implementation of the Theora decoder backend available at http://svn.xiph.org/trunk/theora-fpga/

I did not mention FPGAs because they are not cost-competitive for products that ship in volume.

Of course not, but the existence of code in a synthesizable hardware description language means that the non-existence of an ASIC version is just a question of market demand, not of some fundamental technical barrier, which you appeared to be implying exists.

Silvia implied that mass-market products just have general-purpose hardware that could easily be used to decode a variety of codecs rather than true hardware support for specific codecs, and to the best of my knowledge, that is not the case.

There are mass-market products that do this. Specifically, the Palm Pre is OMAP3 and the N810 is OMAP2. These have conventional DSPs with publicly available toolchains. It seems like in both cases broad vague claims are misleading. I'd still love to see some examples of a web browsing device on the market (obviously I don't expect anyone to comment on their unreleased products) which can decode H.264 but fundamentally can't decode Theora. The closest I'm still aware of is the WDTV I mentioned (which does not have enough general CPU power to decode pretty much any video format, and which uses a video engine which can only be programmed by its maker). I understand that you're not at liberty to discuss this point in more detail, but perhaps someone else is.
Likewise, I'm still curious to find out what webpages are claiming that implementation on common DSPs would be unusually difficult. Cheers,
Re: [whatwg] H.264-in-video vs plugin APIs
On Sat, Jun 13, 2009 at 3:06 PM, Frank Hellenkamp jo...@depagecms.net wrote: [snip] Well, the thing is (perhaps unfortunately because of patents and licensing) that you can use h264 with the video tag (in safari and chrome), but at the same time you can send the same video to every old browser with the flash player 9 or 10, because it also supports h264, which means IE 6/7/8, old Safari, Firefox, Opera etc. And there is the iPhone and Android. With Theora, you cannot do this. All in all it looks like it's very hard to beat h264 at the moment.

Although you *can* use the Theora file in every browser that has a working JVM and a little surplus of CPU cycles. For example: http://www.celt-codec.org/presentations/

A small bit of easily added JS automatically replaces the video tag with an alternative Theora player when video will not work. It's also theoretically possible to serve modern systems with Flash 10 in this manner, but no one has written the software yet. (The Vorbis support is done; the Theora support has not yet been done.) Unfortunately, though perhaps not surprisingly, the intersection of people who care about open video standards and competent low-level Flash applet developers appears to be the empty set.

Even old versions of Safari, IE, Firefox, etc. also gain the ability to play Ogg/Theora if the user downloads and installs a plugin (i.e. VLC, or a JVM for the above-mentioned Java approach). Far from ideal— but the user installing a plugin is exactly how those older systems got Flash, so we have an existence proof.

I don't disagree that the current solution set for Theora has gaps, but it's not as simple as "it can't do that" as you make it sound. Cheers
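The "small bit of easily added JS" amounts to something like the following simplified sketch (the real Cortado applet takes more parameters than shown here, and the jar location is an assumption for illustration):

    // Sketch: swap each <video> element for a Java applet player when the
    // browser can't play Ogg/Theora natively.
    function addTheoraFallback() {
      var videos = document.getElementsByTagName("video");
      for (var i = videos.length - 1; i >= 0; i--) {
        var v = videos[i];
        if (v.canPlayType && v.canPlayType('video/ogg; codecs="theora,vorbis"') !== "") {
          continue;  // native playback will work; leave the tag alone
        }
        var applet = document.createElement("applet");
        applet.setAttribute("code", "com.fluendo.player.Cortado.class");
        applet.setAttribute("archive", "cortado.jar");  // assumed local copy of the applet
        applet.setAttribute("width", v.width || 320);
        applet.setAttribute("height", v.height || 240);
        var param = document.createElement("param");
        param.setAttribute("name", "url");
        param.setAttribute("value", v.currentSrc || v.src);
        applet.appendChild(param);
        v.parentNode.replaceChild(applet, v);
      }
    }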
Re: [whatwg] H.264-in-video vs plugin APIs
On Sat, Jun 13, 2009 at 8:00 AM, Chris DiBona cdib...@gmail.com wrote: Comparing Daily Motion to Youtube is disingenuous. If yt were to switch to theora and maintain even a semblance of the current youtube quality it would take up most available bandwidth across the internet. [snip]

I'm not sure what mixture of misinformation and hyperbole inspired this remark, but I believe that it is misleading and to leave it stand without comment would be a disservice to this working group. I have prepared a detailed response: http://people.xiph.org/~greg/video/ytcompare/comparison.html

I understand that the selection and implementation of video, especially at the scale of YouTube, is worlds apart from such a simplistic comparison. But you didn't claim that Theora support would be inconvenient, that it would require yet-unjustified expenditure, or that the total cost would simply be somewhat higher than the H.264 solution. You basically claimed that Theora on YouTube would destroy the internet. I'd consider that too silly to respond to if I didn't know that many would take it as the literal truth.

Even though I wish Google were doing more to promote open video, I appreciate all that it has done so far. I hope that I'll soon be able to add a retraction or amendment of that claim to the list. Cheers, Greg Maxwell
Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome
On Sun, Jun 7, 2009 at 10:45 PM, Peter Kasting pkast...@google.com wrote: On Sun, Jun 7, 2009 at 7:43 PM, Gregory Maxwell gmaxw...@gmail.com wrote: I don't think the particular parallel you've drawn there is the appropriate one. And I think you failed to answer the line in my email that asked what the point of this tangent is. PK

I split the thread specifically because I agreed with your position that the encumbered codec angst was unrelated to the LGPLv2 licensing concerns. I apologize for not saying so directly. Much of my email was presenting a position as to why the concerns related to these formats keep arising within the WHATWG and why these concerns have practical implications for the standard. Frankly, the legality of Google's software, while interesting, is almost entirely off-topic, though of some interest to implementers. I felt guilty for the two messages I posted on the subject a few days ago. A switch in focus to the codec compatibility issues is a move back on topic, although that horse is already well beaten at this point. Cheers.
Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome Was: Re: MPEG-1 subset proposal for HTML5 video codec
On Tue, Jun 2, 2009 at 9:29 PM, Daniel Berlin dan...@google.com wrote: [snip] I would, however, get in trouble for not having paid patent fees for doing so. No more or less trouble than you would have gotten in had you gotten it from ffmpeg instead of us, which combined with the fact that we do

For the avoidance of doubt, are you stating that when an end user obtains Chrome from Google they do not receive any license to utilize the Google-distributed FFMPEG code to practice the patented activities essential to H.264 and/or AAC decoding, which Google licenses for itself?
Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome Was: Re: MPEG-1 subset proposal for HTML5 video codec
On Tue, Jun 2, 2009 at 10:18 PM, Daniel Berlin dan...@google.com wrote: On Tue, Jun 2, 2009 at 9:50 PM, Gregory Maxwell gmaxw...@gmail.com wrote: On Tue, Jun 2, 2009 at 9:29 PM, Daniel Berlin dan...@google.com wrote: [snip] I would, however, get in trouble for not having paid patent fees for doing so. No more or less trouble than you would have gotten in had you gotten it from ffmpeg instead of us, which combined with the fact that we do

For the avoidance of doubt, are you stating that when an end user obtains Chrome from Google they do not receive any license to utilize the Google-distributed FFMPEG code to practice the patented activities essential to H.264 and/or AAC decoding, which Google licenses for itself?

I'm not saying that at all. I'm simply saying that any patent license we may have does [not] cause our distribution of ffmpeg to violate the terms of the LGPL 2.1

I now understand that your statement was only that Google's distribution of FFMPEG is not in violation of the LGPL due to patent licenses. Thank you for clarifying what you have stated. I will ask no further questions on that point.

But I do have one further question: can you please tell me if, when I receive Chrome from you, I also receive the patent licensing sufficient to use the Chrome package to practice the patents listed in MPEG-LA's 'essential' patent list for the decoding of H.264? I wouldn't want to break any laws. I believe I know the answer, based on your statement "No more or less … than … ffmpeg", as ffmpeg explicitly does not provide any patent licensing, but it seems surprising, so I am asking for clarification. A simple yes or no will suffice. Thank you!