Re: [whatwg] default audio upload format (was Fwd: The Media Capture API Working Draft)
On 2010-08-31 22:11, James Salsman wrote:

Does anyone object to form input type=file accept=audio/*;capture=microphone using Speex as a default, as if it were specified accept=audio/x-speex;quality=7;bitrate=16000;capture=microphone or to allowing the requesting of different Speex qualities and bitrates using those MIME type parameters? Speex at quality=7 is a reasonable open source default audio vocoder. Runner-up alternatives would include audio/ogg, which is a higher bandwidth format appropriate for multiple speakers or polyphonic music; audio/mpeg, a popular but encumbered format; audio/wav, a union of several different sound file formats, some of which are encumbered; etc.

Actually, wouldn't accept=audio/*;capture=microphone basically indicate that the server wishes to accept anything as audio? That means it's up to the browser to decide what is best (most likely from a list of the most common audio formats).

The proper way, however, would be to do: accept=audio/flac, audio/wav, audio/ogg, audio/aac, audio/mp3;capture=microphone indicating all the audio formats the server can handle. audio/* basically tells the browser that it can send anything, even raw data or an image, as if it were audio, and the website would then have to respond to the user that this was unsupported or that they should submit one of the supported formats instead.

Although I guess that audio/* could be taken as a sign that the browser should negotiate directly with the server about the preferred format to use. (Is POST HEADER request supported even?)

-- Roger Rescator Hågensen. Freelancer - http://EmSai.net/
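The difference between an explicit accept list and audio/* can be sketched in a few lines of Python. This is only an illustration of the matching rule discussed above, not how any browser or server implements it; the function names parse_accept and is_accepted are hypothetical, and the sketch assumes a comma-separated list is permitted, which is the very point under discussion.

```python
from typing import List


def parse_accept(accept: str) -> List[str]:
    """Split a comma-separated accept value into lowercase MIME types.

    Parameters such as ;capture=microphone are dropped (assumption: they do
    not affect which types the server can handle).
    """
    types = []
    for item in accept.split(","):
        mime = item.split(";", 1)[0].strip().lower()
        if mime:
            types.append(mime)
    return types


def is_accepted(mime: str, accept: str) -> bool:
    """Return True if mime matches an entry of accept; 'audio/*' matches any audio subtype."""
    mime = mime.split(";", 1)[0].strip().lower()
    major = mime.split("/", 1)[0]
    for pattern in parse_accept(accept):
        if pattern == mime or pattern == "*/*":
            return True
        if pattern.endswith("/*") and pattern[:-2] == major:
            return True
    return False


# The explicit list proposed in the message above.
EXPLICIT = "audio/flac, audio/wav, audio/ogg, audio/aac, audio/mp3;capture=microphone"
```

With the explicit list, is_accepted("audio/x-speex", EXPLICIT) is false, whereas with "audio/*;capture=microphone" any audio subtype passes, which is the ambiguity raised above.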
Re: [whatwg] Video with MIME type application/octet-stream
On Tue, 31 Aug 2010 09:36:00 +0200, Ian Hickson i...@hixie.ch wrote: On Mon, 19 Jul 2010, Philip Jägenstedt wrote: I've tested Firefox 3.6.4, Firefox 4.0b1 and Chrome 5.0.375.99 and none return maybe for canPlayType(application/octet-stream). I couldn't get meaningful results from Safari on Windows (requires restart to detect QuickTime, perhaps?). It would appear that Opera is the only browser that supports application/octet-stream. At the time I added this, it was simply because it is true, maybe we can play it. However, I see no practical benefit of this spec-wise or implementation-wise. Since no other browsers have implemented it, I am going to remove it from Opera and hope that the spec will be changed to match this. Agreed. I've changed the spec to match. I never did make that change, instead waiting for the outcome of this discussion. Note that since Opera uses the same code path for checking the argument to canPlayType and for the Content-Type header, the change would also have meant that videos served as application/octet-stream would stop working, in violation of the spec. On Thu, 22 Jul 2010, Philip Jägenstedt wrote: Chrome and Safari ignore the MIME type altogether, in my opinion if we align with that we should do it full out, not just by adding text/plain to the whitelist, as that would either require (a) canPlayType(text/plain) to return maybe or (b) different code paths for checking the MIME type in Content-Type and for canPlayType. On Thu, 22 Jul 2010, Maciej Stachowiak wrote: I don't think canPlayType(text/plain) has to return maybe. It's not useful for a Web developer to test for the browser's ability to sniff to overcome a bad MIME type. canPlayType should be thought of as testing whether the browser could play a media resource that is really of a given type, rather than labeled with that type over HTTP. 
On Fri, 23 Jul 2010, Philip Jägenstedt wrote: Right, it certainly isn't useful, I'm just pointing out that this is what happens if one adds text/plain to the list of maybe codecs rather than ignoring Content-Type altogether, which is the only thing you can do within the bounds of the current spec to get text/plain to play. The only 3 serious options I know are still the ones I outlined in my earlier email. canPlayType() is now hardcoded as not supporting application/octet-stream even though that type is otherwise not considered one that isn't supported (i.e. is a type that sniffs). I'm not very happy with special-casing application/octet-stream only for canPlayType, especially as it only handles the exact string application/octet-stream, not e.g. application/octet-stream; which would instead be put through the same code path as Content-Type and return maybe. At this point the least complex solution seems to be to ignore the Content-Type header and unless the teams behind Chrome, Safari and IE9 have a sudden change of hearts it's the only realistic outcome. Perhaps we should also encourage authors to not send the Content-Type header at all, to remove any illusions of it having an effect. -- Philip Jägenstedt Core Developer Opera Software
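The objection to special-casing only the exact string can be shown with a toy model in Python. Everything here is hypothetical (SNIFFED_TYPES, the function names, and the idea that a shared code path simply strips parameters); it models the behaviour described in the message above and is not taken from the spec or from any browser.

```python
# Toy model only: these names are hypothetical and do not describe any real
# browser's canPlayType() implementation.
SNIFFED_TYPES = {"application/octet-stream"}  # types whose payload would be sniffed


def _shared_path(type_string: str) -> str:
    """Stand-in for the code path shared with Content-Type handling.

    Parameters after ';' are ignored, and a type that would be sniffed
    yields "maybe".
    """
    mime = type_string.split(";", 1)[0].strip().lower()
    return "maybe" if mime in SNIFFED_TYPES else ""


def can_play_type_exact(type_string: str) -> str:
    """Special-case only the exact string, as described in the message above."""
    if type_string == "application/octet-stream":
        return ""
    return _shared_path(type_string)


def can_play_type_normalized(type_string: str) -> str:
    """Special-case the parsed type, so a trailing ';' makes no difference."""
    mime = type_string.split(";", 1)[0].strip().lower()
    if mime == "application/octet-stream":
        return ""
    return _shared_path(type_string)
```

In this model the exact-match version answers "" for application/octet-stream but "maybe" for application/octet-stream; which is the inconsistency being complained about; the normalized version answers "" for both.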
Re: [whatwg] Video with MIME type application/octet-stream
On Wed, 01 Sep 2010 02:59:54 +0200, Andrew Scherkus scher...@chromium.org wrote: On Tue, Aug 31, 2010 at 12:59 PM, Aryeh Gregor simetrical+...@gmail.com wrote: On Tue, Aug 31, 2010 at 10:35 AM, Boris Zbarsky bzbar...@mit.edu wrote: You can't sniff in a toplevel browser window. Not the same way that people are sniffing in video. It would break the web. How so? For the sake of argument, suppose you sniff only for known binary video/audio types, and fall back to existing behavior if the type isn't one of those (e.g., not video or audio). Do people do things like link to MP3 files with incorrect MIME types and no Content-Disposition, and expect them to download? If so, don't people also link to MP3 files with correct MIME types and expect the same? I don't see how sniffing vs. using MIME type makes a compatibility difference here, since media support in browsers is so new -- surely whatever bad thing happens, sniffing will make it happen more often, at worst. What do Chrome and IE do here? We use the incoming MIME type to determine whether we render the audio/video in the browser versus download. We would never want to execute multimedia sniffing code in the trusted/browser process so implementing sniffing for a top level browser window would involve sending the bytes to a sandboxed process for inspection first. Can you elaborate on this? What would be the problem with sniffing in this context? This does have a side effect where a video may play fine on a page with a bogus MIME type (due to sniffing), but viewing the video URL in the browser itself would prompt a download. If we start ignoring the Content-Type I expect we would also add sniffing so that opening a video served with the wrong (or missing) Content-Type still works in a top-level browsing context, as it does for images (I think). -- Philip Jägenstedt Core Developer Opera Software
Re: [whatwg] time element feedback
On 31.08.2010 22:21, Martin Janecke wrote: On 31.08.10 21:40, Aryeh Gregor wrote: On Tue, Aug 31, 2010 at 5:25 AM, Martin Janecke whatwg@kaor.in wrote: Besides, <time>2010</time> in a British news article would allow users e.g. in Japan to have these dates displayed as 平22年. That's clearly an advantage over the number 2010 alone. I would say the opposite. If they can read the English news article, they'll necessarily know what 2010 means. But they might not be able to read Japanese. Maybe they're borrowing a Japanese person's computer, for example, or maybe the browser's idea of the user language is otherwise wrong. Also, content that behaves differently based on the browser settings of the viewer is confusing and can cause hard-to-debug problems. Users will think that the author of that British article actually wrote out a Japanese date, and be completely at a loss to explain why. Even if they can actually understand the date, the incongruity will look like a bug. It could be outright misleading if there are two year display formats that look the same but actually have different meaning. A plain year number in Arabic numerals like 2010 could refer to any number of totally different year-numbering conventions, and the only way to tell them apart currently is the page's context. Having the browser change the number to some convention that doesn't match its surroundings makes it impossible to guess the convention. And finally, it just looks weird. I would find it extremely strange to have all dates on pages I'm reading replaced with Hebrew dates, even though I understand those just fine. I wouldn't want that at all, and I find it hard to believe that many actual users do in real life. Basically, any kind of attempt to have browsers localize dates that are actually displayed in content is a terrible idea, and the spec should remove all mention of any such thing. I'm pretty sure I've said all this before, though.
I understand your point, the situation you describe would be unfavorable indeed. However, there's no need to make this unfavorable. The localized display of times and dates can be realized via tooltips for example, as it is often seen with abbreviations in texts. The localized date doesn't have to be a replacement for the original date string but can be a helpful, explaining addition.

This is a nice idea, but localisation should then be based on the language of the context of the time element, not based on the browser language:

    <html lang="de">
    <body>
    <p>Die Party ist <time datetime="2010-09-01">heute</time>.</p>
    </body>
    </html>

The tooltip might then display "1. September 2010" in German - independent from the computer or browser language. As Aryeh stated, displaying a Japanese or English date here on the computer in an internet café would be highly disturbing - even in a tooltip. Localisation should not mess around with the content unless explicitly triggered by the author. -- Markus
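A minimal Python sketch of that idea: the tooltip text is derived from the page's lang attribute and the time element's datetime value, never from the browser's settings. The function name tooltip_text and the hand-written month tables are assumptions made for illustration; only two languages are covered, and anything else falls back to the unmodified ISO date.

```python
from datetime import date

# Hand-written month names so the result does not depend on the locale
# installed on the machine running the code.
MONTHS = {
    "de": ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
           "August", "September", "Oktober", "November", "Dezember"],
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
}


def tooltip_text(datetime_attr: str, page_lang: str) -> str:
    """Format a YYYY-MM-DD datetime attribute in the language of the page."""
    d = date.fromisoformat(datetime_attr)
    primary = page_lang.split("-")[0].lower()  # "de-AT" -> "de"
    if primary == "de":
        return f"{d.day}. {MONTHS['de'][d.month - 1]} {d.year}"
    if primary == "en":
        return f"{d.day} {MONTHS['en'][d.month - 1]} {d.year}"
    return d.isoformat()  # unknown language: leave the date untouched
```

For the German example above, tooltip_text("2010-09-01", "de") gives "1. September 2010" regardless of where the page is viewed.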
Re: [whatwg] Video with MIME type application/octet-stream
On Aug 31, 2010, at 9:40 AM, Boris Zbarsky wrote: On 8/31/10 3:36 AM, Ian Hickson wrote: You might say "Hey, but aren't you content sniffing then to find the codecs" and you'd be right. But in this case we're respecting the MIME type sent by the server - it tells the browser to whatever level of detail it wants (including codecs if needed) what type it is sending. If the server sends 'text/plain' or 'video/x-matroska' I wouldn't expect a browser to sniff it for Ogg content. The Microsoft guys responded to my suggestion that they might want to implement something like this with "what's the benefit of doing that?". One obvious benefit is that videos with the wrong type will not work, and hence videos will be sent with the right type. What makes you say this? Even if they are sent with the right type initially, the correct types are at high risk of bitrotting. The big problem with MIME types is that they don't stick to files very well. So, while someone might get them working when they initially use video, if they move to a different web server, or upgrade their server, or someone mirrors their video, or any of a number of other things, they might lose the proper association of files and MIME types. The real problem is that there is no standard way of storing and transmitting file type metadata on the majority of filesystems and majority of internet protocols, meaning that people need to maintain separate databases of MIME types, which are extremely easy to lose when moving between web servers. Until this problem is fixed (and this is a pretty big problem, even Apple gave up on tracking file type metadata years ago due to its incompatibility with how other systems work), it will simply be too hard to maintain working Content-Type headers, and sniffing will be much more likely to produce the effects that the authors intended.
It seems that periodically, web standards bodies decide this time, if we're strict, people will just get the content right or it won't work (such as XHTML with XML parsing rules), and invariably, people manage to screw it up anyhow. Sure, when the author tests their page the first time it's fine, but a mistaken lack of quoting in a comments field breaks the whole page. This causes people to migrate to the browsers or technologies that are less strict, and actually show the user what they want to see, rather than just breaking due to something out of the user's control. -- Brian
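What sniffing means in practice can be sketched as a check of the first bytes of the resource. The Python below looks only at three well-known container signatures (the Ogg capture pattern, the EBML header shared by Matroska and WebM, and the ftyp box of MP4-family files); the function name sniff_container and its return labels are hypothetical, and real sniffing algorithms are considerably more careful than this.

```python
from typing import Optional


def sniff_container(head: bytes) -> Optional[str]:
    """Guess the container format from the first bytes of a file.

    Returns "ogg", "ebml" (Matroska or WebM; telling them apart would require
    reading the DocType element) or "mp4", or None if nothing matches.
    """
    if head.startswith(b"OggS"):
        return "ogg"
    if head.startswith(b"\x1a\x45\xdf\xa3"):
        return "ebml"
    # MP4-family files start with a box whose 4-byte size is followed by "ftyp".
    if len(head) >= 8 and head[4:8] == b"ftyp":
        return "mp4"
    return None
```

Because the answer comes from the bytes themselves, it survives a server move or a mirror, which is the property the message above says Content-Type headers lack.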
Re: [whatwg] Video with MIME type application/octet-stream
On 9/1/10 4:12 AM, Philip Jägenstedt wrote: If we start ignoring the Content-Type I expect we would also add sniffing so that opening a video served with the wrong (or missing) Content-Type still works in a top-level browsing context, as it does for images (I think). It can't possibly work for images. If I send a file as text/html, and you load it from an img then you will render it as an image (possibly a broken one). If you load it from a toplevel browsing context you will render it as text/html, even if it's image data (where "you" possibly excludes IE/Windows, which will do some sniffing in that situation). -Boris
Re: [whatwg] Video with MIME type application/octet-stream
On Wed, 01 Sep 2010 15:14:10 +0200, Boris Zbarsky bzbar...@mit.edu wrote: On 9/1/10 4:12 AM, Philip Jägenstedt wrote: If we start ignoring the Content-Type I expect we would also add sniffing so that opening a video served with the wrong (or missing) Content-Type still works in a top-level browsing context, as it does for images (I think). It can't possibly work for images. If I send a file as text/html, and you load it from an img then you will render it as an image (possibly a broken one). If you load it from a toplevel browsing context you will render it as text/html, even if it's image data (where you possibly excludes IE/Windows, which will do some sniffing in that situation). Huh, I guessed incorrectly, neither serving a PNG as text/plain or text/html makes it be sniffed and rendered in a top-level browsing context in Opera. However, both work in IE8. Why do you say that it can't possibly work? Are there any security risks with the browser potentially interpreting a plain text or HTML document and failing to decode it? Anything else? -- Philip Jägenstedt Core Developer Opera Software
Re: [whatwg] Video with MIME type application/octet-stream
On 9/1/10 10:23 AM, Philip Jägenstedt wrote: Huh, I guessed incorrectly, neither serving a PNG as text/plain or text/html makes it be sniffed and rendered in a top-level browsing context in Opera. However, both work in IE8. Why do you say that it can't possibly work? That was a statement about the current implementation state of Opera, not about future possibilities. Are there any security risks with the browser potentially interpreting a plain text or HTML document Yes, actually, if there's a filtering proxy trying to screen out video or image data that's trying to exploit known OS-level bugs, say. But I had assumed, based on the rest of this discussion, that people simply didn't care about that. -Boris
Re: [whatwg] Video with MIME type application/octet-stream
On 9/1/10 9:13 AM, Brian Campbell wrote: It seems that periodically, web standards bodies decide "this time, if we're strict, people will just get the content right or it won't work" (such as XHTML with XML parsing rules), and invariably, people manage to screw it up anyhow. Sure, when the author tests their page the first time it's fine, but a mistaken lack of quoting in a comments field breaks the whole page. This causes people to migrate to the browsers or technologies that are less strict, and actually show the user what they want to see, rather than just breaking due to something out of the user's control. It hasn't actually happened for MIME types in toplevel documents (modulo the one known workaround for a common server issue with text/plain). By and large, browsers don't sniff toplevel browsing contexts, and the one browser that does has been losing market share. -Boris
Re: [whatwg] Video with MIME type application/octet-stream
On 01.09.2010 10:12, Philip Jägenstedt wrote: ... If we start ignoring the Content-Type I expect we would also add sniffing so that opening a video served with the wrong (or missing) Content-Type still works in a top-level browsing context, as it does for images (I think). ... Sniffing in the *absence* of a content type is fine. The interesting question is what to do when it's present, but wrong. Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 01.09.2010 16:23, Philip Jägenstedt wrote: ... Huh, I guessed incorrectly, neither serving a PNG as text/plain or text/html makes it be sniffed and rendered in a top-level browsing context in Opera. However, both work in IE8. ... Please don't say "work" when talking about something that's not supposed to happen...
Re: [whatwg] Video with MIME type application/octet-stream
On 01.09.2010 15:13, Brian Campbell wrote: On Aug 31, 2010, at 9:40 AM, Boris Zbarsky wrote: On 8/31/10 3:36 AM, Ian Hickson wrote: You might say Hey, but aren't you content sniffing then to find the codecs and you'd be right. But in this case we're respecting the MIME type sent by the server - it tells the browser to whatever level of detail it wants (including codecs if needed) what type it is sending. If the server sends 'text/plain' or 'video/x-matroska' I wouldn't expect a browsers to sniff it for Ogg content. The Microsoft guys responded to my suggestion that they might want to implement something like this with what's the benefit of doing that?. One obvious benefit is that videos with the wrong type will not work, and hence videos will be sent with the right type. What makes you say this? Even if they are sent with the right type initially, the correct types are at high risk of bitrotting. The big problem with MIME types is that they don't stick to files very well. So, while someone might get them working when they initially use video, if they move to a different web server, or upgrade their server, or someone mirrors their video, or any of a number of other things, they might lose the proper association of files and MIME types. ... That's true, and the reason why people still use file extensions. That's not super elegant, but it works. Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 1 Sep 2010, at 15:45, Julian Reschke wrote: The big problem with MIME types is that they don't stick to files very well. So, while someone might get them working when they initially use video, if they move to a different web server, or upgrade their server, or someone mirrors their video, or any of a number of other things, they might lose the proper association of files and MIME types. ... That's true, and the reason why people still use file extensions. That's not super elegant, but it works. Given that there is a very limited set of video formats that are supported anyway, wouldn't it be reasonable to just identify or define the standard file extensions then work with server vendors to update their standard file extension to mime type definitions to include that. While adoption and upgrading to the new versions would obviously take time, that applies to the video tag itself anyway and is just a temporary source of pain. Regards, Adrian Sutton. __ Adrian Sutton, CTO UK: +44 1 628 353 032 US: +1 (650) 292 9659 x717 Ephox http://www.ephox.com/ Ephox Blogs http://people.ephox.com/, Personal Blog http://www.symphonious.net/
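The extension-based approach amounts to a small lookup table keyed on the URL's path. The Python sketch below is an illustration only: the table EXTENSION_TYPES is a hypothetical, deliberately short mapping, not a list any server vendor ships, and type_from_url is a made-up helper name.

```python
import posixpath
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical, deliberately short extension table for illustration.
EXTENSION_TYPES = {
    ".ogv": "video/ogg",
    ".oga": "audio/ogg",
    ".webm": "video/webm",
    ".mp4": "video/mp4",
}


def type_from_url(url: str) -> Optional[str]:
    """Map the extension of the URL's path to a MIME type, or None if unknown."""
    path = urlsplit(url).path  # drops the query string and fragment
    ext = posixpath.splitext(path)[1].lower()
    return EXTENSION_TYPES.get(ext)
```

A URL with no extension at all, such as http://example.com/watch?v=abc, yields None, so this approach still needs a fallback for such configurations.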
Re: [whatwg] default audio upload format (was Fwd: The Media Capture API Working Draft)
On Tue, Aug 31, 2010 at 11:24 PM, Roger Hågensen resca...@emsai.net wrote: On 2010-08-31 22:11, James Salsman wrote: Does anyone object to form input type=file accept=audio/*;capture=microphone using Speex as a default, as if it were specified accept=audio/x-speex;quality=7;bitrate=16000;capture=microphone or to allowing the requesting of different speex qualities and bitrates using those mime type parameters? Speex at quality=7 is a reasonable open source default audio vocoder. Runner-up alternatives would include audio/ogg, which is a higher bandwidth format appropriate for multiple speakers or polyphonic music; audio/mpeg, a popular but encumbered format; audio/wav, a union of several different sound file formats, some of which are encumbered; etc. Actually, wouldn't accept=audio/*;capture=microphone basically indicate that the server wish to accept anything as audio? Yes; that doesn't encourage implementors to implement. However, as it was agreed on by two different browser developer representatives, it's the best way forward. The proper way however would be to do: accept=audio/flac, audio/wav, audio/ogg, audio/aac, audio/mp3;capture=microphone indicating all the audio formats the server can handle. I agree that the specifications should allow a comma- or space-separated list of MIME types. I have no idea why that was specifically disallowed in the most recent HTML5 specification, and think that decision should be reversed before publication. I also think capture would work a lot better as an attribute of the input type=file element than a MIME type parameter. Although I guess that audio/* could be taken as a sign that the browser should negotiate directly with the server about the preferred format to use. (Is POST HEADER request supported even?) Not for multipart/form-encoded input type=file uploads, sadly. 
There's a way to do that specified in http://w3.org/TR/device-upload with the alternates form POST parameter which the browser would send back to the server when the requested type(s) were unavailable. I guess that is a content negotiation feature, but it seemed unlikely that people would need it for that because ideally the server would specify all the types it could accept, as you pointed out. Best regards, James Salsman
Re: [whatwg] time element feedback
Aryeh Gregor writes: On Tue, Aug 31, 2010 at 3:53 PM, Ashley Sheridan a...@ashleysheridan.co.uk wrote: I think localisation does have a valid use though. Consider a page written in English with the date 01/12/2010. Is that date the 1st December, or the 12th January? The only clue might be the spelling of certain words in the document, but even then, the most popular office software in use at the moment defaults to American spelling for its spell-check feature, even if bought in England, which leads to words being spelt wrong and giving the reader no good clue as to what the date might be. Localisation in this case would mean that I could read the document and easily figure out what the date was. What do you expect the browser to do in this case? Flip it to 12/01/2010 if appropriate, ... would make things much worse, because now rather than having to guess whether the *page* is using American or British convention (usually not too hard), you have to guess what convention your *browser* thinks is right (and it might be someone else's computer, a public computer, . . .). Even so, that still doesn't help. You _also_ have to know whether the author just wrote the date in text or used the time element, in order to know whether your browser has already localized the date for you. Which, in general, an author will have no way of knowing. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] Video with MIME type application/octet-stream
On Wed, Sep 1, 2010 at 10:51 AM, Adrian Sutton adrian.sut...@ephox.com wrote: Given that there is a very limited set of video formats that are supported anyway, wouldn't it be reasonable to just identify or define the standard file extensions then work with server vendors to update their standard file extension to mime type definitions to include that. While adoption and upgrading to the new versions would obviously take time, that applies to the video tag itself anyway and is just a temporary source of pain. At first glance, my eyes almost popped out of my sockets when I saw this suggestion. Using the file extension?! He must be mad! Then I remembered that our Flash player *has* to use file extension since the MIME type isn't available in Flash. Turns out that file extension is a pretty good indicator, but it doesn't work for custom server configurations where videos don't have extensions, ala YouTube. For that reason, we allow users to override whatever we detect with a type configuration parameter. Ultimately, the question is, What are we trying to accomplish? I think we're trying to make it easy for content creators to guarantee that their content is available to all viewers regardless of their browser. If that's the case, I'd actually suggest that the browsers *strictly* follow the MIME type, with the source type as a override, and eliminating all sniffing (assuming that the file container format contains the codec meta-data). If a publisher notices that their video isn't working, they can either update their server's MIME type mapping, or just hard code the type in the HTML. Neither is that time consuming / difficult. Moreover, as Adrian suggested, it's probably quite easy to get the big HTTP servers (Apache, IIS, nginx, lighttpd) to add the new extensions (if they haven't already), so this would gradually become less and less of an issue. 
Best, Zach -- Zachary Ozer Developer, LongTail Video w: longtailvideo.com • e: z...@longtailvideo.com • p: 212.244.0140 • f: 212.656.1335 JW Player | Bits on the Run | AdSolution
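The strict policy suggested above (follow the MIME type, let the author's type on the source element override it, and never sniff) reduces to a simple precedence rule. The Python sketch below is hypothetical: resolve_type is a made-up name, and it models only the precedence, not codec parameters or what a browser does when neither value is present.

```python
from typing import Optional


def resolve_type(source_type: Optional[str], content_type: Optional[str]) -> Optional[str]:
    """Pick the type to trust: the author's type attribute first, then Content-Type.

    Parameters such as codecs=... are stripped. Returns None when neither is
    usable; under the strict policy nothing is sniffed in that case.
    """
    for candidate in (source_type, content_type):
        if candidate:
            mime = candidate.split(";", 1)[0].strip().lower()
            if mime:
                return mime
    return None
```

For example, resolve_type("video/webm", "application/octet-stream") returns "video/webm", so a publisher who cannot fix the server's mapping can still hard-code the type in the HTML.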
Re: [whatwg] Video with MIME type application/octet-stream
On Aug 31, 2010, at 4:01 PM, Ian Hickson wrote: On Tue, 31 Aug 2010, Eric Carlson wrote: On Aug 31, 2010, at 12:36 AM, Ian Hickson wrote: Safari does crazy things right now that we won't go into; for the purposes of this discussion we'll assume Safari can change. What crazy things does Safari do that it should not? I forget the details, but IIRC one of the main problems was that it was based on the URL's file extension exclusively. No, I don't see how you came to that conclusion. QuickTime knows how to create a movie from a text file (to make it easy to create captions, chapters, etc), but it also assumes a file served as text/plain may be coming from a misconfigured server. Therefore, when it gets a file served as text/plain it first looks at the file content and/or the file extension to see if it is a movie file. It opens it as text only if it doesn't look like a movie. In your test page (http://hixie.ch/tests/adhoc/html/video/002.html), all four movies have correct extensions but are served as text/plain:

    <!DOCTYPE HTML>
    <title>text/plain video files</title>
    <p><video autoplay controls src="resources/text.txt"></video>
    <p><video autoplay controls src="resources/text.webm"></video>
    <p><video autoplay controls src="resources/text.m4v"></video>
    <p><video autoplay controls src="resources/text.ogv"></video>

When the shipping version of Safari opens this page, the MPEG-4 file opens correctly and the other three open as text (if you wait long enough), because by default QuickTime doesn't know how to open the Ogg or WebM files. If you add QuickTime importers for WebM and Ogg, those files will be opened as movies instead of as text because of the file extensions, despite the fact that they are served as text. FWIW, in nightly builds we are now configuring QuickTime so it won't ever open files it identifies as text. eric
Re: [whatwg] time element feedback
On Tue, Aug 31, 2010 at 4:19 PM, Ashley Sheridan a...@ashleysheridan.co.uk wrote: Because as I mentioned, content authors tend to be quite lazy, and leave default settings on. So lots of English people end up using American spelling, and American date formatting, because that's what their software does by default. I could find you 10 people who didn't know how to change this setting in MSWord for every one you found who did. However, I think readers should be given the choice still with this. If the content authors don't want their precious dates to be read as dates, then don't mark them up as such. A date should be something that can be understood by a variety of media, including search engines, screen readers, even as part of a web snippet that seems to be a popular thing at the moment. If it's in an ambiguous format then there's no point it even being marked up as a date at all. I don't follow what you're saying. How does this relate to auto-rewriting dates according to user preference? I don't think I said anything about marking up dates, beyond that they should not be rewritten according to user preference. On Tue, Aug 31, 2010 at 4:21 PM, Martin Janecke whatwg@kaor.in wrote: However, there's no need to make this unfavorable. The localized display of times and dates can be realized via tooltips for example, as it is often seen with abbreviations in texts. The localized date doesn't have to be a replacement for the original date string but can be a helpful, explaining addition. In principle, I guess that would be harmless. But returning to the original point of this discussion, the question is whether time2010/time should be allowed. I don't think the browser could add a localized tooltip is a good use-case here, because it's not obvious why users would actually want the browser to do so, and it's not obvious that any implementer would be interested in making their browser do so. Furthermore, browsers should not force a localized version upon their users. 
Users should be able to configure their prefered format, just as they can set a preferred language or a default charset. This assumes that users actually choose their browser settings. In practice, most users have no idea how to reconfigure their browser; they probably don't want to even if they do know how; they often use other people's browsers; and if they did reconfigure the browser, it might have been by mistake or long ago. So you can't assume that the browser configuration actually reflects what the user wants. (This is the basic idea behind programs that try to minimize configurability, like Chrome, or indeed most Google products.) On Wed, Sep 1, 2010 at 10:37 AM, Smylers smyl...@stripey.com wrote: Even so, that still doesn't help. You _also_ have to know whether the author just wrote the date in text or used the time element, in order to know whether your browser has already localized the date for you. Right, I forgot that point. So it's really just hopeless to try rewriting dates like 12/1/2010.
Re: [whatwg] Video with MIME type application/octet-stream
On Sep 1, 2010, at 9:07 AM, Zachary Ozer wrote: On Wed, Sep 1, 2010 at 10:51 AM, Adrian Sutton adrian.sut...@ephox.com wrote: Given that there is a very limited set of video formats that are supported anyway, wouldn't it be reasonable to just identify or define the standard file extensions then work with server vendors to update their standard file extension to mime type definitions to include that. While adoption and upgrading to the new versions would obviously take time, that applies to the video tag itself anyway and is just a temporary source of pain. At first glance, my eyes almost popped out of my sockets when I saw this suggestion. Using the file extension?! He must be mad! Then I remembered that our Flash player *has* to use file extension since the MIME type isn't available in Flash. Turns out that file extension is a pretty good indicator, but it doesn't work for custom server configurations where videos don't have extensions, ala YouTube. For that reason, we allow users to override whatever we detect with a type configuration parameter. Ultimately, the question is, What are we trying to accomplish? I think we're trying to make it easy for content creators to guarantee that their content is available to all viewers regardless of their browser. If that's the case, I'd actually suggest that the browsers *strictly* follow the MIME type, with the source type as a override, and eliminating all sniffing (assuming that the file container format contains the codec meta-data). If a publisher notices that their video isn't working, they can either update their server's MIME type mapping, or just hard code the type in the HTML. Hard coding the type is only possible if the element uses a source element, @type isn't allowed on audio or video. Neither is that time consuming / difficult. It isn't hard to update a server if you control it, but it can be *very* difficult and time consuming if you don't (as is the case with most web developers, I assume). 
Moreover, as Adrian suggested, it's probably quite easy to get the big HTTP servers (Apache, IIS, nginx, lighttpd) to add the new extensions (if they haven't already), so this would gradually become less and less of an issue.

Really? Your company specializes in web video and flv files have been around for years, but your own server still isn't configured for it:

  eric% curl -I http://content.longtailvideo.com/videos/flvplayer.flv;
  HTTP/1.1 200 OK
  Server-Status: load=0
  Content-Type: application/octet-stream
  Accept-Ranges: bytes
  ETag: 4288394655
  Last-Modified: Wed, 23 Jun 2010 20:42:28 GMT
  Content-Length: 2533148
  Date: Wed, 01 Sep 2010 16:16:28 GMT
  Server: bit_asic/3.8/r8s1-bitcast-b

eric
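The two approaches contrasted in this message can be sketched side by side. This is only an illustration under stated assumptions: the function names, the extension table and the video/x-flv label are hypothetical and are not taken from any player or browser discussed in the thread. The first function mirrors the Flash-player behaviour described above (explicit type override, otherwise the file extension); the second mirrors the strict proposal (source type as an override, otherwise the Content-Type header, no sniffing).

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical extension table; the thread does not define one.
EXTENSION_TYPES = {
    ".flv": "video/x-flv",
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".ogv": "video/ogg",
}


def player_type(url, type_override=None):
    """Flash-player style: an explicit override wins, else guess from the extension."""
    if type_override:
        return type_override
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return EXTENSION_TYPES.get(extension)  # None when there is no known extension


def strict_type(content_type, source_type=None):
    """Strict style: the source type wins, else the Content-Type header, never sniffing."""
    if source_type:
        return source_type
    # Drop any parameters such as "; codecs=..." and normalise case.
    return content_type.split(";")[0].strip().lower()


url = "http://content.longtailvideo.com/videos/flvplayer.flv"
print(player_type(url))                         # guessed from the ".flv" extension
print(strict_type("application/octet-stream"))  # header taken at face value
print(strict_type("application/octet-stream", "video/x-flv"))  # override applied
```

With the headers shown in the curl output, the extension-based guess and the strict reading disagree, which is the situation the hard-coded type is meant to cover.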
Re: [whatwg] Video with MIME type application/octet-stream
On Wed, Sep 1, 2010 at 12:29 PM, Eric Carlson eric.carl...@apple.com wrote: Hard coding the type is only possible if the element uses a source element, @type isn't allowed on audio or video. Why isn't type allowed for video and audio? I know it doesn't strictly make sense (since the tag doesn't have a type per-se), but perhaps it could be an alias for the current item's type, much in the same way src is the current source. It isn't hard to update a server if you control it, but it can be *very* difficult and time consuming if you don't (as is the case with most web developers, I assume). Correct - but being able to manually specify type should be fine for those situations, since that can be written into the HTML itself. Really? Your company specializes in web video and flv files have been around for years, but your own server still isn't configured for it: Thanks for the heads up on this. However, I think this reemphasizes my original point: The Flash platform *isn't* strict about MIME types, so we've never bothered to do anything about it.
Re: [whatwg] Video with MIME type application/octet-stream
On Wed, 1 Sep 2010, Julian Reschke wrote: On 01.09.2010 16:23, Philip Jägenstedt wrote: ... Huh, I guessed incorrectly, neither serving a PNG as text/plain nor text/html makes it be sniffed and rendered in a top-level browsing context in Opera. However, both work in IE8. Please don't say "work" when talking about something that's not supposed to happen... For the record, in the context of the WHATWG mailing list, saying "work" here is fine. What's important is the user experience, not strict adherence to specifications. In the case of the HTML spec, I'll change it to match what user agents implement. As mentioned earlier in the thread, for now I'm happy to give cover to Firefox and Opera (and hopefully Chrome and Safari) to more closely honour the Content-Type headers, but if the conclusion from implementors is to follow Microsoft's route towards simply ignoring Content-Type with video as we do with img, that's fine. As far as sniffing for top-level browsing contexts goes, my understanding is that Adam is still working on the relevant spec, and it would not be a problem to add common video formats to that algorithm so that we can get interoperable handling of mislabeled content. (Currently, text/html won't ever sniff as binary IIRC, but text/plain, in certain cases, will. We could also make text/html sniff as binary if it turns out that this would be particularly helpful for Web compat.) -- Ian Hickson http://ln.hixie.ch/ Things that are impossible just take longer.
Re: [whatwg] default audio upload format (was Fwd: The Media Capture API Working Draft)
On Tue, Aug 31, 2010 at 1:11 PM, James Salsman jsals...@gmail.com wrote: Does anyone object to form input type=file accept=audio/*;capture=microphone using Speex as a default, as if it were specified accept=audio/x-speex;quality=7;bitrate=16000;capture=microphone or to allowing the requesting of different speex qualities and bitrates using those mime type parameters? Speex at quality=7 is a reasonable open source default audio vocoder. Runner-up alternatives would include audio/ogg, which is a higher bandwidth format appropriate for multiple speakers or polyphonic music; audio/mpeg, a popular but encumbered format; audio/wav, a union of several different sound file formats, some of which are encumbered; etc. For what it's worth, I suspect it will be almost as hard to settle on an audio codec as it has been to settle on a video codec. Why Speex instead of Vorbis, for example? If WebM is a success it seems like Vorbis in a WebM container would make a lot of sense. But yes, if you can get all major browser vendors, one of which is not on this list, to agree on a codec and container format for audio, then that would be great. I suspect Mozilla would be willing to support it, though I can only speak for myself and not the rest of the Mozilla project. / Jonas
Re: [whatwg] default audio upload format (was Fwd: The Media Capture API Working Draft)
seems like a comma-separated list is the right way to go, and that audio/* should mean what it says -- any kind of audio (whether that is useful or not remains to be seen). I would suggest that this is likely to be used for short captures, and that uncompressed (such as a WAV file or AVI with PCM or u-law audio) should be the recommended format. If your usage is for longer captures or more specific situations, then indicate a suitable codec. Shouldn't there be statements about channels (mono, stereo, more), sampling rate (8 kHz speech, 16 kHz wideband speech, 44.1 kHz CD-quality, 96 kHz bat-quality) and so on? On Sep 1, 2010, at 8:21 , James Salsman wrote: On Tue, Aug 31, 2010 at 11:24 PM, Roger Hågensen resca...@emsai.net wrote: On 2010-08-31 22:11, James Salsman wrote: Does anyone object to form input type=file accept=audio/*;capture=microphone using Speex as a default, as if it were specified accept=audio/x-speex;quality=7;bitrate=16000;capture=microphone or to allowing the requesting of different speex qualities and bitrates using those mime type parameters? Speex at quality=7 is a reasonable open source default audio vocoder. Runner-up alternatives would include audio/ogg, which is a higher bandwidth format appropriate for multiple speakers or polyphonic music; audio/mpeg, a popular but encumbered format; audio/wav, a union of several different sound file formats, some of which are encumbered; etc. Actually, wouldn't accept=audio/*;capture=microphone basically indicate that the server wishes to accept anything as audio? Yes; that doesn't encourage implementors to implement. However, as it was agreed on by two different browser developer representatives, it's the best way forward. The proper way however would be to do: accept=audio/flac, audio/wav, audio/ogg, audio/aac, audio/mp3;capture=microphone indicating all the audio formats the server can handle. I agree that the specifications should allow a comma- or space-separated list of MIME types.
I have no idea why that was specifically disallowed in the most recent HTML5 specification, and think that decision should be reversed before publication. I also think capture would work a lot better as an attribute of the input type=file element than a MIME type parameter. Although I guess that audio/* could be taken as a sign that the browser should negotiate directly with the server about the preferred format to use. (Is POST HEADER request supported even?) Not for multipart/form-data input type=file uploads, sadly. There's a way to do that specified in http://w3.org/TR/device-upload with the alternates form POST parameter which the browser would send back to the server when the requested type(s) were unavailable. I guess that is a content negotiation feature, but it seemed unlikely that people would need it for that because ideally the server would specify all the types it could accept, as you pointed out. Best regards, James Salsman David Singer Multimedia and Software Standards, Apple Inc.
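To make the comma-separated accept list concrete, here is a rough sketch of how such a value could be split and matched, assuming the syntax used in this thread (types separated by commas, parameters such as capture, quality and bitrate attached with semicolons). The function names parse_accept and accepts are made up for illustration, and nothing here reflects how any browser or the Media Capture draft actually parses the attribute.

```python
def parse_accept(value):
    """Split an accept value into a list of (media_type, params) pairs."""
    entries = []
    for item in value.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue  # skip empty items such as a trailing comma
        params = {}
        for part in parts[1:]:
            if "=" in part:
                key, val = part.split("=", 1)
                params[key.strip().lower()] = val.strip()
        entries.append((media_type, params))
    return entries


def accepts(value, candidate):
    """True if candidate matches a listed type, treating 'audio/*' as a wildcard."""
    candidate = candidate.lower()
    for media_type, _params in parse_accept(value):
        if media_type == candidate:
            return True
        # "audio/*" matches anything that starts with "audio/".
        if media_type.endswith("/*") and candidate.startswith(media_type[:-1]):
            return True
    return False


explicit = "audio/flac, audio/wav, audio/ogg, audio/aac, audio/mp3;capture=microphone"
print(accepts(explicit, "audio/ogg"))
print(accepts(explicit, "audio/x-speex"))
print(accepts("audio/*;capture=microphone", "audio/x-speex"))
print(parse_accept("audio/x-speex;quality=7;bitrate=16000;capture=microphone"))
```

Note that in this reading a parameter written after the last type attaches only to that type, which is one reason capture may sit more comfortably as a separate attribute, as suggested above.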
Re: [whatwg] Video with MIME type application/octet-stream
On 9/1/10 2:51 PM, Ian Hickson wrote: (Currently, text/html won't ever sniff as binary IIRC, but text/plain, in certain cases, will. Will sniff as binary so as not to render as text but will NOT, last I checked, render as an image or whatnot (for good security reasons, imho). -Boris
Re: [whatwg] Video with MIME type application/octet-stream
On Tue, Aug 31, 2010 at 4:13 PM, Boris Zbarsky bzbar...@mit.edu wrote: The issue would be someone linking to text or HTML or a binary blob that happens to have some bits at the beginning that look like an audio/video type and expecting them to be rendered respectively as text or HTML or be downloaded. Is this realistically possible unless the author deliberately crafts the file? We're talking quite a few bytes that have to be exactly right, no? If the author does deliberately craft the file, is there any security risk in displaying it unexpectedly, given that media isn't scriptable? The big danger with sniffing, as always, is that the server will think one thing will happen and suddenly the browser will do something totally different. As long as what the browser is doing is almost certain to be closer to the author's/user's/webmaster's intent, that's not a problem. Sniffing is a problem if you risk false positives or security issues, but I can't see how that's an issue in this specific case. We have a lot of experience with the perils of sniffing -- have any issues ever been caused by this kind of sniffing problem? The only sniffing problems I know of are when 1) The sniffing is unreliable, so false identifications happen by accident. They're common with MIME types too, but at least with MIME they're more predictable. This will hold for pretty much any text format, if only because you might want to serve the file as text/plain to mean let the user view the source code instead of executing it. But with binary formats it doesn't have to be plausible, if the string you're sniffing for is reasonably long. 2) The MIME type is safe (e.g., not scriptable), and the type it's sniffed as is not safe (e.g., it's HTML or JAR). Then even if false identifications are overwhelmingly improbable by accident, they'll happen when people upload malicious files posing as an image or whatever to get code to execute from a domain they don't control.
Are there clear problems that have arisen in other cases? On Tue, Aug 31, 2010 at 8:59 PM, Andrew Scherkus scher...@chromium.org wrote: We use the incoming MIME type to determine whether we render the audio/video in the browser versus download. We would never want to execute multimedia sniffing code in the trusted/browser process so implementing sniffing for a top level browser window would involve sending the bytes to a sandboxed process for inspection first. Why can't you do media sniffing in the trusted process? It must be a lot simpler than parsing HTTP headers -- just a memcmp() or two per format, if the format is designed so it can be sniffed well. On Wed, Sep 1, 2010 at 12:27 AM, Gregory Maxwell gmaxw...@gmail.com wrote: Aggressive sniffing can and has resulted in some pretty nasty security bugs. E.g. an attacker crafts an input that a website identifies as video and permits the upload but which a browser sniffs out to be a java jar which can then access the source URL with the permissions of the user. This is problem (2) above. The solution is never to sniff for scriptable content. The problem can't plausibly arise with media files -- if you can execute a vulnerability via getting the user to view a media file, it's probably via arbitrary code execution. In that case you don't need to disguise yourself, just get the viewer to go to your own website and do whatever you want, since there are no same-domain restrictions. The sniffing rules, in some contexts and some browsers can also end up causing surprising failures... e.g. I've seen older versions of some sniffing heavy browsers automatically switch into UCS-2LE encoding at wrong and surprising times. Perhaps this is irrelevant in a video specific discussion of sniffing— but it is a hazard with sniffing in general. Is this plausible in practice for common media formats? 
I didn't find info on sniffing media by quick Googling, but for instance, GIF starts with GIF87a or GIF89a, and PNG has an eight-byte signature. Random binary data is going to hit these one time in 2^48 or 2^64, about 10^14 and 10^19 respectively. The actual figure is likely to be even lower, because most binary formats don't have arbitrary data in their first few bytes. Is this really something we should worry about, given how obviously hard it is to get MIME types right? Moreover, it'll never be consistent from implementation to implementation, which seems to me to be pretty antithetical to standardization in general. The exact sniffing algorithm needs to be precisely specced. In fact, there's work under way to do that right now, for other types of sniffing: http://tools.ietf.org/html/draft-abarth-mime-sniff-05 There's no reason it can't be perfectly consistent. The reason it's historically been inconsistent is because specs have tried to claim that no sniffing is allowed, so implementers had no spec to follow. Which is what's in the HTML5 spec now, and it's a mistake. On Wed, Sep 1, 2010 at 10:37
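The signature check and the odds quoted above can be illustrated with a short sketch. It uses the GIF and PNG signatures named in the message as stand-ins, since the thread gives no byte patterns for audio or video containers; the names SIGNATURES, sniff and one_in are hypothetical. For reference, 2^48 is 281,474,976,710,656 (about 2.8 x 10^14) and 2^64 is 18,446,744,073,709,551,616 (about 1.8 x 10^19), so the orders of magnitude given above hold; with two accepted GIF signatures the chance of a random match on either is twice 1 in 2^48.

```python
# Leading-byte signatures for the two formats mentioned in the message.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),  # eight-byte PNG signature
    (b"GIF87a", "image/gif"),             # six-byte GIF signatures
    (b"GIF89a", "image/gif"),
]


def sniff(head):
    """Return the type whose signature prefixes head, or None if nothing matches."""
    for magic, media_type in SIGNATURES:
        if head.startswith(magic):  # the "memcmp() or two per format" idea
            return media_type
    return None


def one_in(num_bytes):
    """Number of equally likely prefixes of num_bytes uniformly random bytes."""
    return 2 ** (8 * num_bytes)


print(sniff(b"GIF89a\x01\x00\x01\x00"))
print(sniff(b"<!DOCTYPE html>"))
print(one_in(6), one_in(8))
```

The uniform-random assumption is the message's own simplification; real files are not random, which is why the message expects accidental matches to be rarer still.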
Re: [whatwg] video resource selection algorithm and NETWORK_NO_SOURCE
On Wed, 28 Jul 2010 10:59:13 +0200, Philip Jägenstedt phil...@opera.com wrote: I'm just saying that it shouldn't work if implementations follow the spec. I don't think that it *not* working is a feature, but I also don't think that making it work is very important. In the example the markup was invalid, so if the spec doesn't make it work perfectly I'm OK with that. Furthermore, once it stops working in all browsers authors will learn to add a call to load() after they are done modifying the attributes of source elements. If it turns out that this is something many authors get stuck on then maybe we can add more triggers for src, type and media attributes on source elements. I think it might be good to run the media element load algorithm when setting or changing src on source (that has a media element as its parent), but not type and media (what's the use case for type and media?). However it would fire an 'emptied' event for each source that changed, which is kind of undesirable. Maybe the media element load algorithm should only be invoked if src is set or changed on a source that has no previous sibling source elements? -- Simon Pieters Opera Software
Re: [whatwg] Video with MIME type application/octet-stream
On Thu, Sep 2, 2010 at 12:38 AM, Boris Zbarsky bzbar...@mit.edu wrote: On 9/1/10 9:13 AM, Brian Campbell wrote: It seems that periodically, web standards bodies decide this time, if we're strict, people will just get the content right or it won't work (such as XHTML with XML parsing rules), and invariably, people manage to screw it up anyhow. Sure, when the author tests their page the first time it's fine, but a mistaken lack of quoting in a comments field breaks the whole page. This causes people to migrate to the browsers or technologies that are less strict, and actually show the user what they want to see, rather than just breaking due to something out of the user's control. It hasn't actually happened for MIME types in toplevel documents (modulo the one known workaround for a common server issue with text/plain). By and large, browsers don't sniff toplevel browsing contexts, and the one browser that does has been losing market share. surely that's not the reason it's losing market share ;-) S.
Re: [whatwg] Video with MIME type application/octet-stream
On 9/1/10 10:59 PM, Silvia Pfeiffer wrote: It hasn't actually happened for MIME types in toplevel documents (modulo the one known workaround for a common server issue with text/plain). By and large, browsers don't sniff toplevel browsing contexts, and the one browser that does has been losing market share. surely that's not the reason it's losing market share ;-) My point is that the "if you don't sniff, all your users will leave" argument is overly simplistic. -Boris
Re: [whatwg] Video with MIME type application/octet-stream
On 9/1/10 4:46 PM, Aryeh Gregor wrote: On Tue, Aug 31, 2010 at 4:13 PM, Boris Zbarsky bzbar...@mit.edu wrote: The issue would be someone linking to text or HTML or a binary blob that happens to have some bits at the beginning that look like an audio/video type and expecting them to be rendered respectively as text or HTML or be downloaded. Is this realistically possible unless the author deliberately crafts the file? I'm not an audio/video format expert; I have no idea. Does it matter? If the author does deliberately craft the file, is there any security risk in displaying it unexpectedly, given that media isn't scriptable? Yes; media codecs (including image decoders) are one of the most common sources of remote attacks on operating systems via web browsers. So showing some image or video is always a risk. If you have a known unpatched vulnerability and try to defend against it by blocking the relevant content, then sniffing can defeat the block, leading to exploits. Note that this means that only people/organizations that are proactive about their security will have their vulnerability window made bigger by sniffing; everyone else is already screwed in the above situation no matter whether browsers sniff. As long as what the browser is doing is almost certain to be closer to the author's/user's/webmaster's intent, that's not a problem. Why is it not a problem if there are suddenly use cases that are impossible because the browser will ignore the author's intent? Sniffing is a problem if you risk false positives or security issues, That's one case where it's a problem, yes. but I can't see how that's an issue in this specific case. See above. have any issues ever been caused by this kind of sniffing problem? As far as I know, yes (of the "remotely take control of the computer" kind). Are there clear problems that have arisen in other cases? See above.
The problem can't plausibly arise with media files -- if you can execute a vulnerability via getting the user to view a media file, it's probably via arbitrary code execution. In that case you don't need to disguise yourself, just get the viewer to go to your own website and do whatever you want, since there are no same-domain restrictions. See above about people who take steps to protect themselves when problems like this arise and would be screwed over by sniffing. Yes, actually, if there's a filtering proxy trying to screen out video or image data that's trying to exploit known OS-level bugs, say. It seems like such a proxy would be unreliable in any event, since you could do all sorts of things to obfuscate it, not least of all just using HTTPS. There's no reason such a proxy couldn't block https (and it has to, to work correctly, as you point out). Do any such proxies exist? In the past, yes. I haven't checked in the last year or two. On Wed, Sep 1, 2010 at 3:54 PM, Boris Zbarsky bzbar...@mit.edu wrote: Will sniff as binary so as not to render as text but will NOT, last I checked, render as an image or whatnot (for good security reasons, imho). What reasons are these? See above about not sneaking dangerous file formats past filtering software. -Boris
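The filtering argument in this message can be shown with a toy model. It assumes, purely for illustration, a filter that keys only on the declared Content-Type and a decoder with a known bug for PNG; the names BLOCKED_TYPES, proxy_allows and browser_decodes_as are hypothetical, and real proxies and browsers are far more involved than this.

```python
# Hypothetical block list: the site blocks a type whose decoder has a known bug.
BLOCKED_TYPES = {"image/png"}
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def proxy_allows(content_type):
    """A label-based filter: it sees only the declared Content-Type."""
    return content_type not in BLOCKED_TYPES


def browser_decodes_as(content_type, body, sniffing):
    """Return the type the browser would hand to a decoder."""
    if sniffing and body.startswith(PNG_MAGIC):
        return "image/png"  # the sniffed type replaces the declared label
    return content_type     # the declared label is honoured


payload = PNG_MAGIC + b"...crafted data..."
label = "text/plain"  # the attacker mislabels the file

print(proxy_allows(label))                                # the filter lets it through
print(browser_decodes_as(label, payload, sniffing=False))  # stays text/plain
print(browser_decodes_as(label, payload, sniffing=True))   # reaches the blocked decoder
```

In this model the block only protects users of the non-sniffing browser, which is the sense in which sniffing widens the vulnerability window for people who took the trouble to block the content.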