Re: [whatwg] How to determine content-type of file: protocol
On 07/28/2014 08:01 AM, duanyao wrote: On 07/28/2014 06:34, Gordon P. Hemsley wrote: Sorry for the delay in responding. Your message fell through the cracks in my e-mail filters. On 07/17/2014 08:26 AM, duanyao wrote: Hi, My first question is about a rule in MIME Sniffing specification (http://mimesniff.spec.whatwg.org): 5.1 Interpreting the resource metadata ... If the resource is retrieved directly from the file system, set supplied-type to the MIME type provided by the file system. As far as I know, no main-stream file systems record MIME type for files. Does the spec actually want to say "provided by the operating system" or "provided by the file name extension"? Yeah, you've hit a known (though apparently unrecorded) bug in the spec, originally pointed out to me by Boris Zbarsky via IRC many months ago. The intent here is basically just "whatever the computer says it is", whether that be via the file system, the operating system, or whatever, and whether it uses magic bytes, file extensions, or whatever. In other words, feel free to read that as "the correct behavior is undefined/unknown" at this point. Thanks for the explanation. Recently, the file: protocol has become more and more important due to the popularity of packaged web applications, including PhoneGap apps, Chrome apps, Firefox OS apps, Windows 8 HTML apps, etc. (not all of them use the file: protocol directly, but the underlying mechanisms are similar). So if we can't specify an interoperable way to determine a local file's MIME type, porting packaged web applications can be problematic in some situations (my team has actually already hit this). I know that there is currently no standard way to determine a local file's MIME type; this may be one of the reasons that the mimesniff spec has not defined a behavior here. Well, the most basic reason is because I never delved into how it actually works, because I was primarily concerned with HTTP connections. 
It's possible that there is no interoperable way to determine a local file's MIME type, but see below. I'd like to propose a simple way to resolve this problem: For MIME types that have already been standardized by IANA and used in web standards, determine a local file's supplied-type according to its file extension. This list could include htm, html, xhtml, xml, svg, css, js, jpeg, jpg, png, mp4, webm, woff, etc. Otherwise, UAs can determine supplied-type by any means. I think this rule should resolve most of the interoperability problems, and largely maintain compatibility with current UAs' implementations. There is already a "standard" in place to detect file types on the operating system level: http://www.freedesktop.org/wiki/Specifications/shared-mime-info-spec/ http://cgit.freedesktop.org/xdg/shared-mime-info/ I could just refer to that and be done with it. Do you think that would work? (That specification has complex rules for detecting files, including magic bytes and whatnot, and is already used on a number of Linux distros and probably other operating systems.) My second question is: does above rule apply equally to both fetching static resources (top level, iframe, img, etc) and XMLHttpRequest? It seems all browsers try to figure out actual type for local static resources, so that .htm and .xhtml files are rendered as HTML and XHTML respectively, so far so good. But when it comes to XHR, things are different. Firefox(31) set Content-Type header to 'application/xml' for local files of any type; and if setting xhr.responseType = 'document', response is parsed as XML; also if setting xhr.responseType = 'blob', blob.type is always 'application/xml'. This is significantly diverse from static fetching behavior. Chromium(34) set Content-Type header to null for local files of any type; but if setting xhr.responseType = 'document', response is parsed according to its actual type, i.e. 
.htm as HTML and .xhtml as XHTML; and if setting xhr.responseType = 'blob', blob.type is the file's actual type, i.e. 'text/html' for .htm and 'application/xhtml+xml' for .xhtml. This is similar to static fetching behavior, however Content-Type header is missing. I think rule 5.1 should be applied to both static fetching and XHR consistently. Browsers should set Content-Type header to local files' actual type for XHR, and interpret them accordingly. But firefox developers think this would break some existing codes that already rely on firefox's behavior (see https://bugzilla.mozilla.org/show_bug.cgi?id=1037762). What do you think? Regards, Duan Yao. Anne's the person to ask about XHR first, I think. I don't want to make any judgements or claims until I hear his view on the situation. That being said, I created the Contexts wiki article [1] and began splitting up the mime
Re: [whatwg] How to determine content-type of file: protocol
Sorry for the delay in responding. Your message fell through the cracks in my e-mail filters. On 07/17/2014 08:26 AM, duanyao wrote: Hi, My first question is about a rule in MIME Sniffing specification (http://mimesniff.spec.whatwg.org): 5.1 Interpreting the resource metadata ... If the resource is retrieved directly from the file system, set supplied-type to the MIME type provided by the file system. As far as I know, no main-stream file systems record MIME type for files. Does the spec actually want to say "provided by the operating system" or "provided by the file name extension"? Yeah, you've hit a known (though apparently unrecorded) bug in the spec, originally pointed out to me by Boris Zbarsky via IRC many months ago. The intent here is basically just "whatever the computer says it is"—whether that be via the file system, the operating system, or whatever, and whether it uses magic bytes, file extensions, or whatever. In other words, feel free to read that as "the correct behavior is undefined/unknown" at this point. My second question is: does above rule apply equally to both fetching static resources (top level, iframe, img, etc) and XMLHttpRequest? It seems all browsers try to figure out actual type for local static resources, so that .htm and .xhtml files are rendered as HTML and XHTML respectively, so far so good. But when it comes to XHR, things are different. Firefox(31) set Content-Type header to 'application/xml' for local files of any type; and if setting xhr.responseType = 'document', response is parsed as XML; also if setting xhr.responseType = 'blob', blob.type is always 'application/xml'. This is significantly diverse from static fetching behavior. Chromium(34) set Content-Type header to null for local files of any type; but if setting xhr.responseType = 'document', response is parsed according to its actual type, i.e. .htm as HTML and .xhtml as XHTML; and if setting xhr.responseType = 'blob', blob.type is the file's actual type, i.e. 
'text/html' for .htm and 'application/xhtml+xml' for .xhtml. This is similar to static fetching behavior, however Content-Type header is missing. I think rule 5.1 should be applied to both static fetching and XHR consistently. Browsers should set Content-Type header to local files' actual type for XHR, and interpret them accordingly. But firefox developers think this would break some existing codes that already rely on firefox's behavior (see https://bugzilla.mozilla.org/show_bug.cgi?id=1037762). What do you think? Regards, Duan Yao. Anne's the person to ask about XHR first, I think. I don't want to make any judgements or claims until I hear his view on the situation. That being said, I created the Contexts wiki article [1] and began splitting up the mimesniff spec according to contexts [2] in an effort to clarify this situation and make sure that all bases were covered. It's still a work in progress, awaiting feedback from implementers and other spec writers. I agree that there's a hole in how mimesniff, XHR, and Contexts intersect, and I'll be happy to update mimesniff to fill it, if that's determined to be the best course of action. HTH, Gordon [1] http://wiki.whatwg.org/wiki/Contexts [2] http://mimesniff.spec.whatwg.org/#context-specific-sniffing -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/
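The extension-based rule proposed in this thread (derive supplied-type from the file extension for types that web standards already use) can be sketched roughly as follows. This is only an illustration: the function name `supplied_type_for_local_file`, the table `EXTENSION_TYPES`, and the exact type strings chosen for each extension are assumptions, not something either the thread or the mimesniff spec defines.

```python
import os

# Hypothetical table covering the extensions listed in the proposal.
# The type strings are assumptions based on common registrations.
EXTENSION_TYPES = {
    ".htm": "text/html",
    ".html": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".xml": "application/xml",
    ".svg": "image/svg+xml",
    ".css": "text/css",
    ".js": "text/javascript",
    ".jpeg": "image/jpeg",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".woff": "font/woff",
}


def supplied_type_for_local_file(path, fallback=None):
    """Return a supplied-type for a local file based only on its extension.

    Unknown extensions yield `fallback`, standing in for the proposal's
    "UAs can determine supplied-type by any means".
    """
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_TYPES.get(extension, fallback)
```

Under this sketch the same lookup would serve both static fetching and XHR, which is the consistency the original poster asks for; whether browsers should actually do that is the open question in the thread.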
Re: [whatwg] [mimesniff] The Apache workaround should not sniff random types
On 08/27/2013 12:26 PM, Boris Zbarsky wrote: The current mimesniff spec says that when the Apache workaround is applied sniffing should still be able to detect the content as PostScript, images, videos, archives, audio formats, etc. I feel that this poses an unacceptable security risk due to allowing content through firewalls that is then interpreted differently by a UA. In particular, postscript and media formats can be used to attack viewers and decoders. Web compat does not require this behavior: Gecko only allows "text/plain" and "application/octet-stream" as output types when the Apache workaround is being applied, and we have been successfully shipping this for a while. I would strongly oppose changing the Gecko behavior here due to the security implications. Given the security risks and the lack of web compat issues, I believe the spec should not require the behavior it currently requires. -Boris I have finally made this change. Please confirm that this is what you had in mind: https://github.com/whatwg/mimesniff/commit/d7bafc16ee480a5dea4c27d60dd5272388e022ce http://mimesniff.spec.whatwg.org/#rules-for-text-or-binary -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/
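For readers following along, the restricted behavior Boris describes (only "text/plain" or "application/octet-stream" may come out when the Apache workaround applies) can be sketched as below. This is my reading of the spec's "rules for distinguishing if a resource is text or binary", not a copy of the linked commit; the function name and the constant name are assumptions.

```python
# Bytes treated as "binary data bytes": 0x00-0x08, 0x0B, 0x0E-0x1A, 0x1C-0x1F.
BINARY_DATA_BYTES = frozenset(
    list(range(0x00, 0x09)) + [0x0B] + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)

# Byte order marks for UTF-16BE, UTF-16LE, and UTF-8.
BYTE_ORDER_MARKS = (b"\xfe\xff", b"\xff\xfe", b"\xef\xbb\xbf")


def sniff_text_or_binary(resource_header: bytes) -> str:
    """Return only 'text/plain' or 'application/octet-stream'.

    No signature matching for images, media, PostScript, or archives is
    attempted, which is the point of the change discussed above.
    """
    if resource_header.startswith(BYTE_ORDER_MARKS):
        return "text/plain"
    if not any(byte in BINARY_DATA_BYTES for byte in resource_header):
        return "text/plain"
    return "application/octet-stream"
```

Note that a PostScript header such as `%!PS-Adobe-3.0` contains no binary data bytes, so under this sketch it stays "text/plain" rather than being upgraded to a PostScript type.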
Re: [whatwg] [mimesniff] The Apache workaround should not sniff random types
On 8/27/13 12:26 PM, Boris Zbarsky wrote: The current mimesniff spec says that when the Apache workaround is applied sniffing should still be able to detect the content as PostScript, images, videos, archives, audio formats, etc. I feel that this poses an unacceptable security risk due to allowing content through firewalls that is then interpreted differently by a UA. In particular, postscript and media formats can be used to attack viewers and decoders. Web compat does not require this behavior: Gecko only allows "text/plain" and "application/octet-stream" as output types when the Apache workaround is being applied, and we have been successfully shipping this for a while. I would strongly oppose changing the Gecko behavior here due to the security implications. Given the security risks and the lack of web compat issues, I believe the spec should not require the behavior it currently requires. -Boris I'm inclined to agree. Having heard no objection (or, indeed, any discussion whatsoever) in the last 3 months, I plan to move ahead with this proposed change. Anyone else have anything to say before I do? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/
Re: [whatwg] Zip archives as first-class citizens
On 8/28/13 9:32 AM, Anne van Kesteren wrote: We have thought of three approaches for zip URL design thus far: * Using a sub-scheme (zip) with a zip-path (after !): zip:http://www.example.org/zip!image.gif * Introducing a zip-path (after %!): http://www.example.org/zip%!image.gif * Using media fragments: http://www.example.org/zip#path=image.gif High-level drawbacks: * Sub-scheme: requires changing the URL syntax with both sub-scheme and zip-path. * Zip-path: requires changing the URL syntax. * Fragments: fail to work well for URLs relative to a zip archive. Fragments are conceptually the cleanest as the only part of a URL that's supposed to depend on the Content-Type is the fragment. However, if you want to link to an ID inside an HTML resource you'd have to do #path=test.html&id=test which would require adding knowledge to the HTML resource that it is contained in a zip archive and have special processing based on that. And not just HTML, same goes for CSS or JavaScript. I'm not sure we need to consider sub-scheme if zip-path can work as it's more complex and not very well thought out. E.g. imagine view-source:zip:http://www.example.org/zip!test.html. (I hope we never need to standardize view-source and that it can be restricted to the address bar in browsers.) zip-path makes zip archive packaging by far the easiest. If we use %! as separator that would cause a network error in some existing browsers (due to an illegal %), which means it's extensible there, though not backwards compatible. We'd adjust the URL parser to build a zip-path once %! is encountered. And relative URLs would first look if there's a zip-path and work against that, and use path otherwise. Fetching would always use the path. If there's a zip-path and the returned resource is not a zip archive it would cause a network error. As for nested zip archives. Andrea suggested we should support this, but that would require zip-path to be a sequence of paths. 
I think we never want to allow relative URLs to escape the top-most zip archive. But I suppose we could support it in a way that %!test.zip!test.html goes one level deeper. And "../image.gif" in test.html looks in the enclosing zip. And "../../image.gif" in test.html looks in the enclosing zip as well because it cannot ever be relative to the path, only the zip-path. As the following URLs suggest, the %! (or %-anything) will likely not work for ZIP files generated by a script using the query portion of the URL, as the path information will be subsumed into the last value without causing a network error: http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1%!example.png http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1%/example.png http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1?example.png (And feel free to use that script to try out any other combos.) However, since fragments (i.e. anything beginning with '#') are already not sent to the server, what if you modified the URL parser to use a special hash-prefix combo that indicates the path? Then you could avoid the problem of having to make documents aware of the fact that they're in a ZIP because the hash-prefix combo would come before the plain hash which holds the ID. So, for example: http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1#/example.html#middle Then you could also take the opportunity to spec the #! prefix (and other hash-combo prefixes) that is used by a lot of sites nowadays. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/
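To make the hash-prefix idea concrete, here is a minimal sketch of how a parser might split such a URL. Everything in it is hypothetical: no URL parser treats "#/" specially, and the function name `split_zip_url` and the returned tuple shape are assumptions made for illustration.

```python
def split_zip_url(url: str):
    """Split a URL under the hypothetical '#/' zip-path convention.

    Returns (resource_url, zip_path, fragment_id). For an ordinary URL,
    zip_path is None and fragment_id is the plain fragment (or None).
    """
    resource_url, has_hash, fragment = url.partition("#")
    if not has_hash:
        return resource_url, None, None
    if not fragment.startswith("/"):
        # Ordinary fragment: nothing zip-related to extract.
        return resource_url, None, fragment
    # "#/path#id": the first hash introduces the path inside the archive,
    # the second (optional) hash holds the usual fragment identifier.
    zip_path, has_id, fragment_id = fragment[1:].partition("#")
    return resource_url, zip_path, (fragment_id if has_id else None)
```

Because everything after the first "#" is never sent to the server, the query string reaches the script intact, which avoids the "subsumed into the last value" problem shown with the `%!` test URLs.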
Re: [whatwg] [mimesniff] More issues on the MIME Sniffing spec
On Thu, Jun 6, 2013 at 5:42 AM, Peter Occil wrote: > I want to respond to the following issues in the MIME Sniffing spec: > > Resources > > I suggest the following wording for the issue box starting with "A resource > is..." > >A resource is a data item or message, such as a file or an HTTP response. > > I believe this covers the cases that would normally be associated with a > MIME type. I already have an idea about how to define "resource". The reason it's not currently in the spec is because I recall Hixie expressing some concern about complexity beyond "bag of bits" and I'm waiting on feedback from him. > Contexts > > I don't think the word "context" needs to be specially defined. The start > of section 8 > could be rewritten to remove the definition: > > [[ > In certain cases, it is only useful to identify resources that belong to a > certain subset of MIME types. In these cases, it is appropriate to use a > context-specific sniffing algorithm in place of the MIME type sniffing > algorithm in order to determine the sniffed MIME type of a resource. > > This specification defines the following context-specific sniffing > algorithms. > ]] On the contrary, I think it may be important to define "context", as it is the only lens through which to see fetching and sniffing and the like. Currently, the HTML spec only defines "(nested) browsing context", so I put together a wiki page that lists all the other ones that exist implicitly: http://wiki.whatwg.org/wiki/Contexts I plan to rewrite the whole second half of the spec to be in terms of contexts soon. > Apache Bug > > As for the Apache bug flag, would it be useful to additionally check the > HTTP > headers for a Server header and check if it contains "Apache/"? I don't > know which > version of Apache the bug involved was fixed in, so I can't suggest a more > accurate > string check. 
That thought had crossed my mind, but the handling of the situation mostly predates my editing of the spec, so I haven't given much thought into whether the current method is the ideal one. > MP3 Sniffing > > Finally, the Firefox team has recently included a patch to support sniffing > MP3 > files better [1] and would like to document it and add it to the MIME > Sniffing > spec. [2] The disadvantage, though, is that more than 512 bytes > are required for an accurate detection. > > --Peter > > [1]: https://bugzilla.mozilla.org/show_bug.cgi?id=862088 > [2]: https://bugzilla.mozilla.org/show_bug.cgi?id=879429 > I'm aware of this. I was told that a proposal would be made in due course, so I'm waiting on that. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Review request: Parsing a MIME type
(Re-added the list; I hope that's OK.) The canPlayType method (and similar mechanisms) are only approximations of what the browser can support. The "codecs" parameter is generally not strictly necessary when the UA goes to actually play the file; if the "codecs" parameter is missing, it can generally be recovered by parsing/processing the file. Thus, it is not an especially reliable testing method. On Sat, Jun 1, 2013 at 8:17 PM, Peter Occil wrote: >> However, in order to test parameters, I have been >> using 'charset' (because that's the only one I'm aware of that has a >> Web-visible effect), and certain implementations may be sniffing >> specifically for the string "charset=", which would cloud the results >> of my testing. >> > > There are other parameters that are significant in MIME types, such as > "codecs", > which is used in certain newer HTML5 APIs. For example, some very > recent browsers support the canPlayType method of the element, > which takes a MIME type as a parameter (though it doesn't work well in OS X > versions of > Firefox 21, apparently [1]). The parameters, especially the "codecs" > parameter, > can make a difference in what value is returned by the API. > > [1]: https://bugzilla.mozilla.org/show_bug.cgi?id=875385 > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Review request: Parsing a MIME type
On Sat, Jun 1, 2013 at 11:41 AM, Gordon P. Hemsley wrote: > On Fri, May 31, 2013 at 11:50 PM, Peter Occil wrote: >> * The word "base64" can only appear at the end of the MIME type, so that a >> data URL like >> "data:application/example;base64;foo=bar,AA==" will not be encoded in >> base64, strictly speaking. A parameter name (base64 or otherwise) >> cannot otherwise appear without a parameter value. > > As I mentioned, "strictly speaking" doesn't matter, as all browsers do > the same thing, according to the resource you linked: base64 > parameters with values are fine; base64 boolean parameters in other > than last place are warnings. (Not sure what the reasoning behind that > distinction is, but that's what reality is.) It seems I read the purpose of the test wrong for base64 parameters with values: They're fine insofar as they're allowed, but they don't trigger base64 decoding (except in Safari?), unlike if the boolean base64 parameter is in a non-last position. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Review request: Parsing a MIME type
On Fri, May 31, 2013 at 11:50 PM, Peter Occil wrote: > >> * Another important point to notice is the fact that this algorithm >> allows parameter names to appear without values. This is useful in >> situations such as the "base64" option in data: URLs that use the mere >> presence or absence of a parameter to set its boolean value. > > > Since you mention data URLs I should note that data URLs can be percent > encoded, which HTTP > and MIME headers can't be. This raises additional considerations when > parsing a data URL's MIME type correctly; > see reference [1] for test cases. In particular: > > [1]: http://greenbytes.de/tech/tc/datauri/ This is a very useful resource; thank you for pointing it out to me. I realize now that that's the only thing that matters: What do the browsers do? (And percent encoding doesn't matter, as that gets handled before the parsing begins.) > * A data URL that begins with "data:," or "data:;base64," (with no MIME > type) is assumed to have the MIME type > "text/plain;charset=us-ascii" under RFC2397. > * A data URL that begins with "data:;" (with no type or subtype, but with > parameters) is assumed to have the MIME type > "text/plain" under RFC2397. An empty or invalid MIME type will get treated as unknown and will eventually be sniffed (if it isn't already). I'll have to consider what to do with the base64 and other parameter parts, though. > * The word "base64" can only appear at the end of the MIME type, so that a > data URL like > "data:application/example;base64;foo=bar,AA==" will not be encoded in > base64, strictly speaking. A parameter name (base64 or otherwise) > cannot otherwise appear without a parameter value. As I mentioned, "strictly speaking" doesn't matter, as all browsers do the same thing, according to the resource you linked: base64 parameters with values are fine; base64 boolean parameters in other than last place are warnings. (Not sure what the reasoning behind that distinction is, but that's what reality is.) 
So it seems the only issue I have to worry about is what to do with MIME types which only have parameters. Regards, Gordon -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
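The RFC 2397 reading that Peter describes in the quoted points can be sketched as follows, mainly to show where a MIME type with only parameters comes from. This is the "strictly speaking" interpretation, not what browsers do (the thread notes they differ), and it makes simplifying assumptions: percent-decoding is assumed to have happened already, quoted parameter values containing ";" or "," are not handled, and the name `split_data_url` is hypothetical.

```python
def split_data_url(url: str):
    """Return (mime_type_string, is_base64, payload) for a data: URL.

    Literal RFC 2397 reading: ';base64' counts only as the final token
    before the comma, and an omitted type defaults to text/plain.
    """
    if not url.startswith("data:"):
        raise ValueError("not a data: URL")
    header, has_comma, payload = url[len("data:"):].partition(",")
    if not has_comma:
        raise ValueError("data: URL has no comma")

    tokens = header.split(";")
    is_base64 = len(tokens) > 1 and tokens[-1] == "base64"
    if is_base64:
        tokens = tokens[:-1]

    mime_type = ";".join(tokens)
    if mime_type == "":
        # "data:," and "data:;base64," carry no MIME type at all.
        mime_type = "text/plain;charset=US-ASCII"
    elif mime_type.startswith(";"):
        # Only parameters were given: text/plain is implied.
        mime_type = "text/plain" + mime_type
    return mime_type, is_base64, payload
```

The third branch is the case left open at the end of the message: a header such as `;charset=utf-8` has parameters but no type or subtype, so the parser has to decide whether to imply "text/plain" or treat the type as unknown and sniff.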
[whatwg] [mimesniff] Review request: Parsing a MIME type
Hello all, This is a request seeking feedback and review on the MIME Sniffing algorithm to "parse a MIME type": http://mimesniff.spec.whatwg.org/#parse-a-mime-type After numerous iterations, I think it is in a state that accurately reflects the best current practices for interoperability. As is common with such things, there are numerous points in this algorithm where implementations do not agree. In general, Firefox and Chrome tend to pattern together, as do IE and Opera. Safari often patterns on its own, in favor of a more literal interpretation of the various RFCs on the matter. At times, I have had to make a decision as to which was the best approach. This usually results in half of the implementations being in violation of the spec; I hope, in those instances, the implementations in question can be updated to become interoperable with the rest. With that being said, there are two specific points I want to raise: (1) The more recent RFCs on the matter restrict type, subtype, and parameter names to 127 characters. No implementation actually enforces this limit, but I have included it in the algorithm (relevant points appear in red) because I think it would be better and safer for both the user and the user agent to do so. (2) Based on my analysis of existing implementations, anything that occurs between the semicolon (and any first whitespace) and the equals sign is treated as the parameter name, including any whitespace before the equals sign. However, in order to test parameters, I have been using 'charset' (because that's the only one I'm aware of that has a Web-visible effect), and certain implementations may be sniffing specifically for the string "charset=", which would cloud the results of my testing. Any enlightenment into this issue would be much appreciated. I also have a few general points: * You may notice in the algorithm that I am using hybrid terminology, sometimes talking about bytes and sometimes talking about characters. 
This is mostly because I haven't decided/determined whether to treat a MIME type as ASCII or as UTF-8. I think there are arguments on both sides of the issue, but I'm eager to hear your opinions and advice (especially about how I might phrase the algorithm if it were written in terms of characters instead of bytes). * One of the most controversial parts of this algorithm might be the issue of what to do when a parameter appears more than once. (The RFCs suggest that the MIME type should be treated as invalid in such a case, but no implementation actually treats it that way.) I have opted to make a later appearance of a parameter override and replace an earlier appearance of a parameter. Modulo caveat (2) above, this is only done in half the implementations; in particular, IE and Opera appear to use the first instance of the parameter as the canonical value. * Another important point to notice is the fact that this algorithm allows parameter names to appear without values. This is useful in situations such as the "base64" option in data: URLs that use the mere presence or absence of a parameter to set its boolean value. Note, however, that a parameter that has been given an explicit value (even if that value is the empty string) does not get overridden by the later appearance of a boolean parameter of the same name. I think those are the important points of background information you need to know in order to evaluate this algorithm. I look forward to your response. Regards, Gordon -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
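The parameter-handling choices described in this message (later duplicates win, valueless parameters are allowed, and an explicit value is not displaced by a later valueless one) can be sketched as below. This is not the spec algorithm: quoted values and the 127-character limit are left out, lowercasing of names and using `None` to mean "present without a value" are assumptions, and `collect_parameters` is a hypothetical name.

```python
def collect_parameters(mime_type: str) -> dict:
    """Collect parameters from a MIME type string using the policies above.

    A value of None marks a parameter that appeared without '=' (a
    boolean parameter such as 'base64').
    """
    parameters = {}
    for segment in mime_type.split(";")[1:]:
        # Skip whitespace after the semicolon; whitespace before '=' is
        # deliberately kept as part of the name, per point (2).
        segment = segment.lstrip(" \t")
        if not segment:
            continue
        name, has_equals, value = segment.partition("=")
        name = name.lower()
        if has_equals:
            # A later explicit value overrides any earlier appearance.
            parameters[name] = value
        else:
            # A valueless appearance never replaces an explicit value.
            parameters.setdefault(name, None)
    return parameters
```

With this policy `charset=iso-8859-1; charset=utf-8` yields utf-8, which matches the behavior attributed to half of the implementations; the first-wins behavior attributed to IE and Opera would replace the assignment in the `has_equals` branch with `setdefault`.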
Re: [whatwg] [mimesniff] Complete MIME type parsing algorithm for section 5
Peter, The main reason I haven't yet responded to your e-mails is because I'm still actively working on improving and testing the algorithm. But I do want you to know that your comments are valuable to me, because they point out the areas I need to consider and test. And while you should continue to bring inconsistencies with RFCs to my attention, you should keep in mind that some of these inconsistencies may be "willful violations". The IETF has the power to restrict the format of the MIME types that are formally registered, but they have little power over what winds up deployed in the wild. Browsers, on the other hand, need to know how to handle all sorts of things that the IETF would consider invalid—and in many cases existing browsers do things in violation of the RFCs. Since one of the main goals of this spec, and the WHATWG as a whole, is to improve interoperability, making the spec consistent with a majority of browsers overrides making the spec consistent with existing RFCs. One specific comment I have about your latest e-mail: I think you should read the algorithm again, because I'm fairly sure that it does guard against empty values for type, subtype, and parameter names. (But I'll check again.) Regards, Gordon On Tue, May 28, 2013 at 4:25 PM, Peter Occil wrote: > > I see you've updated the MIME sniffing algorithm in response to my feedback. > Here > I'll go over the difference and I want you to comment on these. > > 1. I assume the term "whitespace character" means the same as a "whitespace > byte" under > the MIME Sniffing spec. As such the use of that term is inadequate for the > following reasons. > > * A whitespace character includes 0x0C, form feed (FF), which is not > considered whitespace > in either HTTP or the Internet Message Format (IMF, RFC5322). > > For example, the following would not be well-formed under HTTP or IMF: > > text/plain{FF}; charset=utf-8 > > But the current algorithm would consider that string well-formed > anyway. 
> > * All steps in the document that are the same as step 7 skip all > whitespace characters, even > if the whitespace isn't well formed under HTTP or IMF. For example, a > bare carriage > return (CR) or line feed character (LF) is not allowed, and a CR-LF > pair not followed by either > SPACE or TAB is also not allowed. IMF also allows comments within > whitespace. > > For example, the following would not be well-formed under HTTP or IMF: > > text/plain;{CR} charset=utf-8 > text/plain;{LF} charset=utf-8 > text/plain;{CR}{LF}charset=utf-8 > > (Note the lack of space in the last example. Note also that folding > whitespace is deprecated > under the current HTTP draft.) > > And the following examples would be allowed under IMF, but not HTTP: > > (comment) text/plain; charset=utf-8 > text/plain; (comment) charset=utf-8 > text/plain; (comment (nested)) charset=utf-8 > text/plain; charset=utf-8 (comment) > text/plain; {CR}{LF} (comment) charset=utf-8 > > 2. While the type, subtype, and parameter name are checked for their length, > the other rules > for wellformedness are not checked in your version, namely, that they must > not be empty, > contain a byte that isn't a MIME type byte (see my original message), or > begin with a byte that > isn't an ASCII alphanumeric. > > For example, the following would not be well-formed under RFC6838: > > te*xt/plain;charset=utf-8 > text/pl*ain;charset=utf-8 > text/plain;ch*arset=utf-8 > text/plain;=utf-8 > text/;charset=utf-8 > /plain;charset=utf-8 > > The first three examples are because "*" isn't a MIME type byte. > > > 3. Unquoted parameter values are not checked to ensure that they are not > empty and do > not contain a byte that isn't a parameter value byte (see my original > message). > > For example, the following would not be well-formed under HTTP or MIME: > > text/plain;charset=ut?f-8 > text/plain;charset=utf=8 > > 4. 
Quoted parameter values are not checked to ensure that they do not > contain a 0x7F byte > or a byte other than TAB (0x09) that is less than 0x20. > > For example, the following would not be well-formed under HTTP or MIME: > > text/plain;charset="utf{LF}-8" > text/plain;charset="utf{0x7F}-8" > text/plain;charset="utf\{LF}-8" > text/plain;charset="utf\{0x7F}-8" > > Please give your comments. > > --Peter > > > -Original Message- From: Gordon P. Hemsley > Sent: Saturday, May 25, 2013 1:26 PM > > To: Peter Occil > Cc: WHATWG > Subject: Re:
Re: [whatwg] [mimesniff] Complete MIME type parsing algorithm for section 5
On Sat, May 25, 2013 at 12:46 PM, Peter Occil wrote: > My algorithm skips only SPACE and TAB instead of all whitespace characters > because it assumes that the field value was already extracted from > Content-Type according to the HTTP/HTTPbis spec (0x0C, form feed, is never > considered whitespace in HTTP headers). In particular, it assumes that > folding whitespace (obs-fold) was replaced with spaces (or the message with > obs-fold rejected) before the Content-Type value was interpreted. Thanks for your detailed explanation. It'll take me a little while to evaluate what you've proposed here, but in the meantime: Keep in mind that the Content-Type header is not the only source for a MIME type. This algorithm needs to consider MIME types from all possible sources. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
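The difference between the two whitespace definitions under discussion is small enough to show directly. In this sketch the HTTP set is SPACE and TAB, as Peter describes; the five-byte set is my recollection of the mimesniff "whitespace byte" definition (TAB, LF, FF, CR, SPACE), and the helper name `skip` is hypothetical.

```python
# What Peter's algorithm skips: SPACE and TAB only.
HTTP_WHITESPACE = b" \t"

# Assumed mimesniff "whitespace bytes": HT, LF, FF, CR, SP.
SNIFF_WHITESPACE = b"\t\n\x0c\r "


def skip(data: bytes, pointer: int, skippable: bytes) -> int:
    """Advance pointer past any run of bytes found in `skippable`."""
    while pointer < len(data) and data[pointer] in skippable:
        pointer += 1
    return pointer
```

For `b"text/plain\x0c; charset=utf-8"` with the pointer just after `plain`, the HTTP set leaves the pointer on the form feed (so a strict parser rejects the value), while the wider set steps over it and reaches the semicolon. Which is appropriate depends on whether the input is guaranteed to have come from an already-processed Content-Type header, which is the point Gordon raises.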
Re: [whatwg] [mimesniff] Complete MIME type parsing algorithm for section 5
Peter,

The burden is on you to describe your proposals and what their purpose and benefit would be. How does this proposed algorithm differ from what is already in the spec? How is it better?

Regards,
Gordon

On Sat, May 25, 2013 at 3:58 AM, Peter Occil wrote:
> I present this draft of the complete algorithm for parsing a MIME type. I
> would appreciate comments.
>
> --Peter
>
> An ASCII alphanumeric is a byte or character in the ranges 0x41-0x5A,
> 0x61-0x7A, and 0x30-0x39.
> A MIME type byte is an ASCII alphanumeric or one of the following bytes:
> ! # $ & ^ _ . + -
> A parameter value byte is a MIME type byte or one of the following bytes:
> % ' * ` | ~
>
> To parse a MIME type, run the following steps:
>
> 1. Let length be the length of the byte sequence of the MIME type.
> 2. If length is less than 1, return undefined.
> 3. Let pointer be 0. Pointer is a zero-based index to the current byte in
>    the byte sequence.
> 4. Advance pointer to the next byte other than 0x20 (SPACE) or 0x09 (TAB).
> 5. Let type be the byte string from the current byte up to but not including
>    the next "/" byte. Advance pointer to the next "/" byte.
> 6. If the current byte isn't "/", return undefined.
> 7. Increment pointer by 1.
> 8. Let subtype be the byte string from the current byte up to but not
>    including the next 0x20 (SPACE), 0x09 (TAB), or ";" byte. Advance pointer
>    to the next 0x20 (SPACE), 0x09 (TAB), or ";" byte.
> 9. If type is empty, contains a byte that isn't a MIME type byte, or doesn't
>    begin with an ASCII alphanumeric, or is longer than 127 bytes, return
>    undefined.
> 10. If subtype is empty, contains a byte that isn't a MIME type byte, or
>     doesn't begin with an ASCII alphanumeric, or is longer than 127 bytes,
>     return undefined.
> 11. Convert type and subtype to ASCII lowercase.
> 12. Let parameters be an empty dictionary.
> 13. Run the following substeps in a loop.
>     1. Advance pointer to the next byte other than 0x20 (SPACE) or 0x09
>        (TAB).
>     2. If pointer is equal to length, return type, subtype, and parameters.
>     3. If the current byte isn't ";", return undefined.
>     4. Increment pointer by 1.
>     5. If pointer is equal to length, return type, subtype, and parameters.
>     6. Let parameter be the byte string from the current byte up to but not
>        including the next "=" byte. Advance pointer to the next "=" byte.
>     7. If parameter is empty, contains a byte that isn't a MIME type byte,
>        or doesn't begin with an ASCII alphanumeric, or is longer than 127
>        bytes, return undefined.
>     8. If parameters contains a mapping for parameter, return undefined.
>     9. Convert parameter to ASCII lowercase.
>     10. If the current byte isn't "=", return undefined.
>     11. Increment pointer by 1.
>     12. If the current byte equals 0x22 (quotation mark), run the following
>         substeps:
>         1. Let value be an empty byte string.
>         2. Increment pointer by 1.
>         3. Run these substeps in a loop.
>            1. If pointer is equal to length, return type, subtype, and
>               parameters.
>            2. If the current byte equals 0x7F or is less than 0x20, and the
>               current byte isn't TAB (0x09), return type, subtype, and
>               parameters.
>            3. If the current byte equals 0x22 (quotation mark), increment
>               pointer by 1 and terminate this loop.
>            4. Otherwise, if the current byte is "\", increment pointer by 1.
>               Then, if there is a current byte, append that byte to value.
>            5. Otherwise, append the current byte to value.
>            6. Increment pointer by 1.
>         4. Add the mapping of parameter to value to the parameters
>            dictionary.
>     13. Otherwise, run these substeps:
>         1. Let value be the byte string from the current byte up to but not
>            including the next 0x20 (SPACE), 0x09 (TAB), or ";" byte. Advance
>            pointer to the next 0x20 (SPACE), 0x09 (TAB), or ";" byte.
>         2. If value is empty or contains a byte that isn't a parameter
>            value byte, return undefined.
>         3. Add the mapping of parameter to value to the parameters
>            dictionary.
>
> ---

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
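
For readers who want to experiment with the quoted draft, here is a minimal Python sketch that follows Peter's steps as literally as I can read them. It is not the algorithm in the mimesniff spec. The function and helper names are made up for illustration, `None` stands in for "undefined", and the end-of-input checks use `>=` so that a trailing backslash inside a quoted value cannot run past the input.

```python
# Byte classes from the quoted draft.
ALNUM = frozenset(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789")
MIME_TYPE_BYTES = ALNUM | frozenset(b"!#$&^_.+-")
PARAM_VALUE_BYTES = MIME_TYPE_BYTES | frozenset(b"%'*`|~")


def _is_token(token):
    """Non-empty, at most 127 bytes, starts alphanumeric, only MIME type bytes."""
    return (0 < len(token) <= 127
            and token[0] in ALNUM
            and all(b in MIME_TYPE_BYTES for b in token))


def _skip_ws(data, pos):
    """Advance past SPACE (0x20) and TAB (0x09)."""
    while pos < len(data) and data[pos] in b" \t":
        pos += 1
    return pos


def _collect_until(data, pos, stops):
    """Return the bytes from pos up to (not including) the next stop byte."""
    start = pos
    while pos < len(data) and data[pos] not in stops:
        pos += 1
    return data[start:pos], pos


def parse_mime_type(data):
    """Return (type, subtype, parameters), or None for 'undefined'."""
    length = len(data)
    if length < 1:                                       # step 2
        return None
    pos = _skip_ws(data, 0)                              # step 4
    type_, pos = _collect_until(data, pos, b"/")         # step 5
    if pos >= length:                                    # step 6: no "/" found
        return None
    pos += 1                                             # step 7
    subtype, pos = _collect_until(data, pos, b" \t;")    # step 8
    if not _is_token(type_) or not _is_token(subtype):   # steps 9-10
        return None
    type_, subtype = type_.lower(), subtype.lower()      # step 11
    params = {}                                          # step 12
    while True:                                          # step 13
        pos = _skip_ws(data, pos)
        if pos >= length:
            return type_, subtype, params
        if data[pos] != 0x3B:                            # not ";"
            return None
        pos += 1
        if pos >= length:
            return type_, subtype, params
        name, pos = _collect_until(data, pos, b"=")
        if not _is_token(name):
            return None
        if name in params:        # the draft checks before lowercasing
            return None
        name = name.lower()
        if pos >= length:         # current byte isn't "="
            return None
        pos += 1
        if pos < length and data[pos] == 0x22:           # quoted value
            value = bytearray()
            pos += 1
            while True:
                if pos >= length:
                    return type_, subtype, params
                byte = data[pos]
                if (byte == 0x7F or byte < 0x20) and byte != 0x09:
                    return type_, subtype, params
                if byte == 0x22:
                    pos += 1
                    break
                if byte == 0x5C:                         # backslash escape
                    pos += 1
                    if pos < length:
                        value.append(data[pos])
                else:
                    value.append(byte)
                pos += 1
            params[name] = bytes(value)
        else:                                            # unquoted value
            value, pos = _collect_until(data, pos, b" \t;")
            if not value or any(b not in PARAM_VALUE_BYTES for b in value):
                return None
            params[name] = value
```

One thing the sketch makes visible: read literally, the draft does not skip whitespace between ";" and the parameter name, so `parse_mime_type(b"text/html; charset=utf-8")` comes back undefined while `parse_mime_type(b"text/html;charset=utf-8")` parses. Whether that is intended is a question for the draft's author.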
Re: [whatwg] [mimesniff] An alternative approach to section 9 of Mime Sniffing
Section 5 is highlighted with all that red warning stuff precisely because it is known to be incomplete and insufficient. I haven't yet decided how I'm going to go about writing that up (and it isn't inherently obvious that what is there now is bad). So that's not the best example; and it certainly doesn't have anything to do with section 9 (at least, not with regard to formatting). I still don't understand what problem you're trying to solve (and if I don't understand the problem, I can't come up with a solution). Are you just having trouble reading and understanding what's there? MIME Sniffing and WebVTT have very different usecases and, in some ways, very different audiences. I don't think you can directly compare the two. Gordon On Sat, May 25, 2013 at 1:58 AM, Peter Occil wrote: > What I think is that even if an ABNF won't be the normative definition of a > syntax format, it can help put the format's syntax into a higher-level > perspective and aid understanding of its syntax: once we understand, for > example, what the Content-Type header field value ought to contain, in the > form of an ABNF or in some other way, it will be easier to write processing > rules for that field value in the spec. (Right now I'm in the process of > rewriting section 5 of the MIME sniffing spec.) > > Take the WebVTT spec for example. For each part of the WebVTT format > there's a definition of what that part contains in terms of characters, and > the actual processing rules for parsing that part. For example, the > definition for "WebVTT cue timings" and the algorithm to "collect WebVTT cue > timings and settings." The definition aids understanding of the syntax for > WebVTT cue timings and informs how the rules for collecting WebVTT cue > timings are written in the WebVTT spec. 
> > > --Peter > > -Original Message- From: Anne van Kesteren > Sent: Friday, May 24, 2013 1:28 AM > > To: Peter Occil > Cc: WHATWG > Subject: Re: [whatwg] An alternative approach to section 9 of Mime Sniffing > > On Thu, May 23, 2013 at 2:49 PM, Peter Occil wrote: >> >> Explain further why you don't recommend ABNF for this case. > > > We don't recommend ABNF in general because often ABNF results in a > mismatch between prescribed and actual processing. E.g. Content-Type > is defined as an ABNF and technically "text/html;" does not match that > ABNF, but everyone (logically) processes that as "text/html" without > parameters. > > It's much better to define the actual processing so implementers are > less inclined to take shortcuts when implementing (test suites also > help, but they're typically written way-after-the-fact). > > >> You should also explain whether another change to make section 9 more >> readable is >> appropriate (though it currently is relatively readable as is). > > > I'll leave that to Gordon. > > > -- > http://annevankesteren.nl/ -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] An alternative approach to section 9 of Mime Sniffing
The pattern matching algorithm is used because certain patterns require other-than-exact matching. That is why the "pattern mask" exists. This is particularly important for the "rules for identifying an unknown MIME type" (defined in 10.1), which matches ASCII characters case-insensitively; it is also important for a number of patterns that contain unimportant bytes that should be ignored (like WebP, in your example).

The algorithm lays out the information in tabular form because that makes clearer the separation between the important bytes and the unimportant (or case-insensitive) bytes. Keep in mind that implementations may read one byte at a time; using ABNF would give them no benefit, and would likely make things more confusing.

I wonder: What problem are you trying to solve with this proposal?

(In the future, please add "[mimesniff]" to the beginning of your subject line for MIME Sniffing discussions; this will ensure that I see them and pay attention to them more quickly.)

Regards,
Gordon

On Thu, May 23, 2013 at 2:10 AM, Peter Occil wrote:
> I propose rewriting section 9 and parts of section 10 in a different way, to
> use the ABNF format in RFC 5234. (Note that ABNFs are already used in the
> current Fetch specification.) With this approach, the definitions for "byte
> pattern", "pattern mask", and the "pattern matching algorithm" can be
> eliminated (all of which are found before section 9.1).
>
> An example for the image pattern matching algorithm is given below.
>
> ---
>
> 9.1 Matching an image type pattern
>
> The image pattern matching algorithm takes a byte sequence as input. The
> algorithm goes through the following image types in the order given. For
> each image MIME type given below, if the start of the byte sequence matches
> its ABNF, return the concatenation of "image/" and the name of the ABNF (in
> lowercase), and terminate the image pattern matching algorithm.
>
> vnd.microsoft.icon = %x00.00.01.00
>   ; A Windows Icon signature.
> bmp = %x42.4D
>   ; The string "BM", a BMP signature.
> gif = %x47.49.46.38 (%x37 / %x39) %x61
>   ; The string "GIF87a" or "GIF89a", a GIF signature.
> webp = %x52.49.46.46 4OCTET %x57.45.42.50.56.50
>   ; The string "RIFF" followed by four bytes followed by the string "WEBPVP".
> png = %x89.50.4E.47.0D.0A.1A.0A
>   ; The byte 0x89 followed by the string "PNG"
>   ; followed by CR LF SUB LF, the PNG signature.
> jpeg = %xFF.D8.FF
>   ; The JPEG Start of Image marker followed by the indicator
>   ; byte of another marker.
>
> If the start of the byte sequence doesn't match any ABNF given above, return
> undefined.
>
> ---
>
> I would appreciate comments.
>
> --Peter

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
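
To make the "pattern mask" idea concrete, here is a rough Python sketch of masked matching. It is an illustration under assumptions, not the spec's normative text: the names `matches`, `IMAGE_PATTERNS`, and `sniff_image` are invented here, the signatures are taken from the ABNF comments in the quoted proposal, and the spec's handling of ignorable leading bytes is left out.

```python
def matches(data, pattern, mask):
    """True if the start of data equals pattern once each byte is ANDed with mask."""
    if len(pattern) != len(mask):
        raise ValueError("pattern and mask must have the same length")
    if len(data) < len(pattern):
        return False
    # A 0xFF mask byte demands an exact match; 0x00 ignores the input byte;
    # 0xDF folds ASCII lowercase letters onto uppercase.
    return all((d & m) == p for d, p, m in zip(data, pattern, mask))


def _exact(n):
    """Mask of n bytes that must all match exactly."""
    return b"\xff" * n


# (pattern, mask, MIME type), in the order given in the quoted proposal.
IMAGE_PATTERNS = [
    (b"\x00\x00\x01\x00", _exact(4), "image/vnd.microsoft.icon"),
    (b"BM", _exact(2), "image/bmp"),
    (b"GIF87a", _exact(6), "image/gif"),
    (b"GIF89a", _exact(6), "image/gif"),
    # "RIFF", four ignored bytes, then "WEBPVP".
    (b"RIFF\x00\x00\x00\x00WEBPVP", _exact(4) + b"\x00" * 4 + _exact(6), "image/webp"),
    (b"\x89PNG\r\n\x1a\n", _exact(8), "image/png"),
    (b"\xff\xd8\xff", _exact(3), "image/jpeg"),
]


def sniff_image(data):
    """Return the first matching image MIME type, or None."""
    for pattern, mask, mime_type in IMAGE_PATTERNS:
        if matches(data, pattern, mask):
            return mime_type
    return None
```

The same `matches` function covers the case-insensitive patterns: `matches(b"<html", b"<HTML", b"\xff\xdf\xdf\xdf\xdf")` is true because the 0xDF mask clears the bit that distinguishes ASCII lowercase from uppercase. That is the kind of other-than-exact matching the table layout is meant to express in a single row.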
Re: [whatwg] Priority between <a download> and content-disposition
On Wed, May 8, 2013 at 12:21 PM, Boris Zbarsky wrote: > On 5/8/13 12:15 PM, Gordon P. Hemsley wrote: >> >> Perhaps. But maybe I'm not clear on what exactly the alternate >> proposal is. Are you suggesting not supporting the @download >> attribute? Or just ignoring it when Content-Disposition specifies a >> filename? (I would suggest that neither is the appropriate response.) > > > What Gecko implements right now is: > > 1) @download is ignored for non-same-origin links. > 2) If Content-Disposition specifies a filename, that filename is used > no matter what @download says. I understand now the motivation for this, but I would think that it would remove a lot of the usefulness of the @download attribute: If you have the same origin, you probably already have access to (a) name the file appropriately in the first place, or (b) set the Content-Disposition header to send the appropriate filename. No? >>> This is not trivial, since sniffing can easily fail on files that are >>> both >>> HTML and png or both HTML and exe at the same time. There's a good bit >>> of >>> research on things like this. >> >> >> Yes, and that research has already gone into creating the mimesniff >> standard, has it not? I'm suggesting use the existing algorithm(s) in >> an additional arena, not creating a new, separate algorithm. > > > The mimesniff standard doesn't try to sniff for types UAs don't render > natively, which is what would be needed here. I'm not so sure about that, but I'll leave it to someone else to argue. (If you determine a file to be a PNG, then you suggest a .png extension, regardless of whether there might be an embedded executable; if you don't support the file format, then how do you know that it isn't supposed to be an executable in the first place? And what is it doing on the Web?) -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
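
For comparison, the two rules Boris lists can be written down as a small Python sketch. This is only a reading of his two sentences, not Gecko's actual code: the function name and parameters are hypothetical, and the origin check is simplified (it compares scheme, host, and explicit port, without normalizing default ports).

```python
import posixpath
from urllib.parse import urlsplit


def _origin(url):
    """Simplified origin tuple; default ports are not normalized."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def gecko_style_filename(page_url, link_url, download_value=None, cd_filename=None):
    """Pick a suggested filename under the two rules quoted above."""
    # Rule 2: a Content-Disposition filename wins no matter what @download says.
    if cd_filename:
        return cd_filename
    # Rule 1: @download only counts for same-origin links.
    if download_value and _origin(page_url) == _origin(link_url):
        return download_value
    # Otherwise fall back to the last path segment of the URL.
    return posixpath.basename(urlsplit(link_url).path)
```

Under these rules the attribute only changes the outcome for same-origin links whose response carries no Content-Disposition filename, which is the narrow window the reply above is questioning.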
Re: [whatwg] Priority between <a download> and content-disposition
On Wed, May 8, 2013 at 12:01 PM, Boris Zbarsky wrote: > On 5/8/13 10:45 AM, Gordon P. Hemsley wrote: >> >> I still think @download takes priority. >> >> The Content-Disposition header says, "Nevermind what filename the URL >> shows; this is really file B.txt." >> >> The @download attribute says, "Nevermind what filename this link would >> normally be; let's just consider it A.txt." > > > OK, that's at least a reasonable argument for the behavior. ;) > > >> That seems like quite a sophisticated attack that relies on a lot of >> things falling into place all at once. > > > Uh... yes. Like most browser exploits. Perhaps. But maybe I'm not clear on what exactly the alternate proposal is. Are you suggesting not supporting the @download attribute? Or just ignoring it when Content-Disposition specifies a filename? (I would suggest that neither is the appropriate response.) >> Then I think it is the responsibility of the UA to sniff the file and >> protect the user from such attempts to mislead. > > > This is not trivial, since sniffing can easily fail on files that are both > HTML and png or both HTML and exe at the same time. There's a good bit of > research on things like this. Yes, and that research has already gone into creating the mimesniff standard, has it not? I'm suggesting use the existing algorithm(s) in an additional arena, not creating a new, separate algorithm. If a file from an image sharing site is served as (or determined to be, via the sniffing algorithms) image/png, for example, then the UA should suggest a filename with a .png extension, ignoring any suggestion by the author for a .exe extension. (Whether you want to change it to "A.png" or "A.exe.png" is debatable, I suppose.) 
>> I'm not sure I have the resources to do extensive real-world testing >> of this (and that documentation suggests it has been superseded in >> more modern OSes), but I don't think it would be unreasonable for the >> UA to override or augment the filename suggested by the @download >> attribute if it determines that it would not be in the best interest >> of the user to use the suggested filename unchanged. > > > Phrased that way, using the Content-Disposition filename is a perfectly > valid "override if not in the best interest of the user" behavior, fwiw. > > -Boris > True. But doesn't that imply a rejection of my aforementioned "reasonable argument"? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
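
The "A.png" versus "A.exe.png" question from this message can be sketched in a few lines of Python. Everything here is hypothetical: the `EXTENSIONS` table is a tiny stand-in for whatever mapping a UA would really use, and `adjust_filename` is an invented name. The sniffed type is assumed to come from elsewhere (the server's Content-Type or a sniffing algorithm).

```python
import os.path

# Hypothetical, deliberately tiny MIME-type-to-extension table.
EXTENSIONS = {
    "image/png": ".png",
    "image/gif": ".gif",
    "text/html": ".html",
}


def adjust_filename(suggested, sniffed_type, keep_original=True):
    """Make the suggested filename end in the extension the sniffed type calls for."""
    expected = EXTENSIONS.get(sniffed_type)
    if expected is None:
        return suggested                 # unknown type: leave the name alone
    root, ext = os.path.splitext(suggested)
    if ext.lower() == expected:
        return suggested                 # already consistent
    # Either append the expected extension ("A.exe.png") or swap it in ("A.png").
    return suggested + expected if keep_original else root + expected
```

With these assumptions, `adjust_filename("A.exe", "image/png")` yields "A.exe.png" and `adjust_filename("A.exe", "image/png", keep_original=False)` yields "A.png". Which of the two a UA should prefer is exactly the debatable part.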
Re: [whatwg] Priority between <a download> and content-disposition
On Wed, May 8, 2013 at 9:43 AM, Boris Zbarsky wrote: > On 5/8/13 6:53 AM, Gordon P. Hemsley wrote: >> >> It's not clear to me which of the two factors you take issue with. > > > The question of which filename takes priority. > > >> The second sentence very clearly suggests >> that "A.txt" would be the filename presented to the user by default in >> the save dialog. > > > No, it suggests that A.txt is what the page author recommends. > > If, at the same time, B.txt is what the server author recommends, what > should happen? I still think @download takes priority. The Content-Disposition header says, "Nevermind what filename the URL shows; this is really file B.txt." The @download attribute says, "Nevermind what filename this link would normally be; let's just consider it A.txt." >>> There is if you allow cross-origin @download. >>> >>> There is if you allow untrusted markup on your server and don't sanitize >>> away @download (should it be sanitized away? Unclear). >> >> >> I'm still not seeing what the problem is. All this does is make the >> browser treat the link as if the user followed it and then went File > >> Save Page As > > > No, because in that case the browser will definitely use the > Content-Disposition filename, not the one from @download. OK, technically, the way I phrased it, yes. But what I meant was that it rolls a bunch of steps into one, telling the browser that the link should be downloaded and named per suggestion. >> What are the security concerns, cross-origin or otherwise? > > > One concern is being able to do this: > > href="http://some-bank/statement.pdf";> > > cross-site and combining it with something that lets you read > known-location.pdf (e.g. a file://-specific privacy hole that only applies > to some filenames, or an that the user has already filled > in). That seems like quite a sophisticated attack that relies on a lot of things falling into place all at once. 
I'm not sure that should block the use of the attribute in and of itself. > Another concern is if you upload a file to an image-sharing site, but it > happens to be a Windows executable. Then you link to it with: > > http://image-sharing-site/whatever";> > > and wait for the user to download and double-click. This relies on the user > thinking the file came from image-sharing-site so must be an image. UAs may > do mitigations here by changing the suggested filename, of course. Then I think it is the responsibility of the UA to sniff the file and protect the user from such attempts to mislead. At the very least, the download UI could specify the actual type of the file that is being downloaded. (More on how to protect users who don't read that below.) > Generally, allowing this sort of thing opens up several new phishing and > social engineering attack vectors, and it's not clear that we want that. There is a price to freedom, as they say. We shouldn't let a few rotten apples spoil the whole bunch. >> Well, what I should have said is, there is no content sniffing beyond >> what is already done for regular page saves. (The UI can show the MIME >> type or format of the file in the download box, as it would for any >> file it doesn't handle natively.) > > > It can, and users routinely ignore that. > > >> Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a >> while since I used Windows. But I'd be surprised to find out that the >> browser (Firefox, in the case I have in mind) changes the extension in >> the suggested filename (e.g. "example.php" for an HTML file) on >> Windows but not on Mac > > > It sure used to in some cases, partially in concert with the Windows > filepicker. 
See the (scant) documentation for lpstrDefExt at > http://msdn.microsoft.com/en-us/library/windows/desktop/ms646839%28v=vs.85%29.aspx > and I suggest actually doing some experimentation across the different save > variants (save image, save link as, save page as, click on something with > content-disposition:attachment) on several OSes to see the behavior. There > is certainly a good bit of code in the various file-saving codepaths in > Firefox that attempts to ensure extensions match MIME types, to forbid > saving things with certain extensions, etc. > > Also note that Chrome will change extensions on at least @download filenames > to match the MIME type; I haven't experimented in detail with its behavior > for other cases. And I haven't experimented much with other browsers in > this area, though I expect all have some interesting behavior.
Re: [whatwg] Priority between <a download> and content-disposition
On Tue, May 7, 2013 at 10:18 PM, Boris Zbarsky wrote: > On 5/7/13 5:54 PM, Gordon P. Hemsley wrote: >> >> A @download attribute with a value would override both factors, like so: >> (1) Download it. >> (2) "A.txt" > > Why? > > You say this as if it were obvious, but it's not obvious to me at all... > What's the reasoning that makes this the desirable behavior? It's not clear to me which of the two factors you take issue with. Here's what the spec says: "The download attribute, if present, indicates that the author intends the hyperlink to be used for downloading a resource. The attribute may have a value; the value, if any, specifies the default file name that the author recommends for use in labeling the resource in a local file system." I interpret that first sentence to mean that the file should be downloaded (disposition type = attachment) rather than displayed (disposition type = inline). The second sentence very clearly suggests that "A.txt" would be the filename presented to the user by default in the save dialog. >> I don't see what the security concerns might be: There is no >> difference here than what is already available > > There is if you allow cross-origin @download. > > There is if you allow untrusted markup on your server and don't sanitize > away @download (should it be sanitized away? Unclear). I'm still not seeing what the problem is. All this does is make the browser treat the link as if the user followed it and then went File > Save Page As What are the security concerns, cross-origin or otherwise? >> AFAICT, there are no content >> sniffing or cross-domain issues at play. > > But there are; see above. Well, what I should have said is, there is no content sniffing beyond what is already done for regular page saves. (The UI can show the MIME type or format of the file in the download box, as it would for any file it doesn't handle natively.) >> results when saving a file; they don't do any file extension vs. file >> format checking. > > Uh... 
that depends on exactly how you save and your OS. Browsers commonly > do file extension vs MIME type checking on Windows. Behavior on other OSes > varies, and varies across browsers. > > -Boris Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a while since I used Windows. But I'd be surprised to find out that the browser (Firefox, in the case I have in mind) changes the extension in the suggested filename (e.g. "example.php" for an HTML file) on Windows but not on Mac, and I would argue that that perhaps should not be the case. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] Priority between <a download> and content-disposition
I realize this is an old thread, so apologies if this has already been resolved. The discussion that originally followed seemed to have gotten off track, so I wanted to try to clarify things.

First off, there are two factors to consider:
(1) Whether to download the file or display it.
(2) What filename to suggest for the file when it is downloaded.

In the general case, with a normal <a> and no Content-Disposition header (or the plain 'Content-Disposition: inline' header, listed as (1) originally), the answers are:
(1) Display it.
(2) Whatever the filename on the server is (e.g. "page.txt" or "example.php"), modulo OS restrictions.

In the case of a normal <a> and a 'Content-Disposition: inline; filename="B.txt"' header (listed as (2) originally), the answers are:
(1) Display it.
(2) "B.txt"

Changing the disposition type doesn't change much, with a normal <a> and a 'Content-Disposition: attachment; filename="B.txt"' header (listed as (3) originally):
(1) Download it.
(2) "B.txt"

So now, the question is, what effect does a @download attribute have? Nothing too surprising. An empty @download attribute would override the first factor above so that it is always "Download it." A @download attribute with a value would override both factors, like so:
(1) Download it.
(2) "A.txt"

Thus, the @download attribute acts to override the Content-Disposition header, giving the following hierarchy:

    @download > Content-Disposition > URL

Or, in pseudocode (with the assumption that if X has Y, then X is also present):

    disposition_type =
        ( @download is present ) ? "attachment" :
        ( ( Content-Disposition header is present ) ? Content-Disposition disposition type : "inline" );

    suggested_filename =
        ( @download has a value ) ? value of @download :
        ( ( Content-Disposition has filename parameter ) ? Content-Disposition filename value : filename from URL );

I don't see what the security concerns might be: There is no difference here than what is already available, except that there's now an additional way to specify it. AFAICT, there are no content sniffing or cross-domain issues at play. Browsers already give strange results when saving a file; they don't do any file extension vs. file format checking. (For example, the output of a .php or .cgi or .py file on a server is usually HTML, yet browsers don't generally make any attempt to change the file extension to .html when saving the file, IME.)

Does this make sense? Am I missing anything?

Regards,
Gordon

On Sat, Mar 16, 2013 at 9:49 PM, Jonas Sicking wrote:
> It's currently unclear what to do if a page contains markup like
> <a href="page.txt" download="A.txt"> if the resource at audio.wav
> responds with either
>
> 1) Content-Disposition: inline
> 2) Content-Disposition: inline; filename="B.txt"
> 3) Content-Disposition: attachment; filename="B.txt"
>
> People generally seem to have a harder time with getting header data
> right, than getting markup right, and so I think that in all cases we
> should display the "save as" dialog (or display equivalent download
> UI) and suggest the filename "A.txt".
>
> The spec is currently defining something else at least for 3.
>
> Potentially there are reasons to do something different in the case
> when the linked resource lives off of a different origin since in that
> case there might be security reasons to use the filename or
> disposition of the server that is actually serving up the content.
> However I don't think we can expect people to indicate
> "Content-Disposition: inline" in order to protect resources. Nor do I
> think that simply using a different filename is going to meaningfully
> protect downloaded content. So I think a stronger UI warning is needed
> in this scenario.
>
> Firefox currently doesn't support cross-origin @download references,
> so I don't have any meaningful implementation experience to share
> regarding that scenario.
>
> / Jonas

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
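
The hierarchy proposed in this message (@download, then Content-Disposition, then the URL) can also be written as runnable Python. This is a sketch of the message's pseudocode only, not what any browser or the spec does. The function name and parameter names are made up; `download=None` means the attribute is absent and `download=""` means it is present without a value.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_download(url, download=None, cd_type=None, cd_filename=None):
    """Return (disposition_type, suggested_filename) under the proposed hierarchy."""
    # Factor 1: download or display.
    if download is not None:
        disposition = "attachment"
    elif cd_type is not None:
        disposition = cd_type
    else:
        disposition = "inline"

    # Factor 2: which filename to suggest.
    if download:                       # attribute present and non-empty
        filename = download
    elif cd_filename:
        filename = cd_filename
    else:
        filename = posixpath.basename(urlsplit(url).path)
    return disposition, filename
```

Running Jonas's three header cases through it with `download="A.txt"` gives `("attachment", "A.txt")` every time, which matches the outcome he argues for. Without the attribute, the three cases give `("inline", "page.txt")`, `("inline", "B.txt")`, and `("attachment", "B.txt")`, as listed in the message.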
Re: [whatwg] HTML differences from HTML4 document updated
Simon, I think it would be good to consider the target audiences, of which there are probably many: You have the audience who is worried that HTML5 is some grand departure from the HTML 4.01 they (think they) know and love. For them, you'll want to describe what exactly has been removed and why, instilling the idea of a separation between semantic and presentational markup. Then you have the audience that is excited to see what they can do now with HTML5 that they couldn't do with HTML 4.01. For them, you'd list the new elements and attributes and such. Then you probably have some other incidentals such as things that were removed or changed just because they were never implemented or people never used them. These probably don't fall into either of the two categories above. But you also have another issue to consider: For this document, the difference between the W3C's concept of specification snapshots and WHATWG's concept of a living standard is not trivial. For the former, you can have snapshot documents detailing the differences between each snapshot specification; for the latter, you need a living document that is anchored by a fixed point at one end (HTML 4.01). This raises the question of the purpose of this document: Is it to simplify the transition from HTML 4.01 to HTML5+? Or is it to act as an HTML changelog from here on out? Because I think attempting to do both within a single document will become unwieldy as time goes on. Regards, Gordon On Tue, May 7, 2013 at 5:00 AM, Simon Pieters wrote: > On Mon, 06 May 2013 16:50:03 +0200, Jukka K. Korpela > wrote: > >>> I don't think this is of particular importance. >> >> >> If it isn't, why not use the correct spelling? > > > Mostly to be consistent with "HTML5". > > >> When referring to specifications, it is usually a good idea to use their >> own spelling, even when it is odd and confusing. >> >>> HTML 4.01 is intended. The differences between revisions of HTML4 is out >>> of scope. 
>> >> >> Then the heading should say "HTML 4.01". > > > It's longer, and it's not clear to me that people are actually confused > about what "HTML4" refers to. > > >>> "Modern HTML differences from HTML4"? I'm not convinced that's a win. >>> "Near-future" seems wrong since it's more like "current". >> >> >> The difficulty here directly reflects the vague nature of HTML5: it partly >> tries to describe HTML as actually implemented and partly specifies features >> that should (or "shall") be implemented. Hence it is both modern and >> (intended to be) near-future. >> >> But the fundamental difficulty is that you are trying to describe a >> specific version, or set of versions, of HTML without giving it a proper >> name or version number. >> >> Since WHATWG does not use a proper name for its version (the title is just >> "HTML"), I think the only way to refer to it properly is to prefix it with >> "WHATWG". This would lead to the title >> >> "Differences of HTML5 and WHATWG HTML from HTML 4.01" > > > Here "HTML5" is supposed to refer to "W3C HTML5 and W3C HTML5.1"? > > How about I go back to the original title "Differences from HTML4"? > http://wiki.whatwg.org/wiki/Differences_from_HTML4 > > > >>> Such a document would be useful, but it's not this document. The primary >>> focus for this document is what is different from HTML4. >> >> >> But why? What is the purpose of this document? This is relevant to naming >> it, and to the content too, of course. Now it is neither a reliable >> comparison with links the relevant clauses nor an overview - it has too many >> details, to begin with. > > > It's more intended to be an overview. Can you give an example of something > that is too detailed and suggest the level of detail that would be more > appropriate? > > >> Is this for authors who consider moving from HTML 4.01 to HTML 5? > > > Yes. > > >> Then I think it should primarily specify what HTML 4.01 features are >> forbidden in HTML 5, then the extensions. 
> > > Thanks, that's useful feedback. > > > -- > Simon Pieters > Opera Software -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] HTML differences from HTML4 document updated
It is my understanding that the W3C version lists "HTML5" and the WHATWG version uses "HTML". That was what I intended by "HTML(5)". I didn't mean the parentheses were included literally. Gordon On Fri, May 3, 2013 at 2:19 PM, Xaxio Brandish wrote: > Ah. The document scope [1] explains why it uses "HTML" in the title as > opposed to HTML5 or HTML(5). > > --Xaxio > > References: > [1] http://html-differences.whatwg.org/#scope > > > > On Fri, May 3, 2013 at 11:16 AM, Gordon P. Hemsley > wrote: >> >> The way I interpreted it, Jukka meant that the title could be >> something more flowing, like "Differences between HTML4 and HTML(5)". >> >> Gordon >> >> On Fri, May 3, 2013 at 2:10 PM, Xaxio Brandish >> wrote: >> > Good day, >> > >> > Let us start with a definition: >> > >> > es·o·ter·ic >> > /ˌesəˈterik/ >> > Adjective >> > Intended for or likely to be understood by only a small number of people >> > with a specialized knowledge or interest. >> > >> > The document Simon delivered and formatted is useful to a wide range of >> > audiences interested in HTML and how it differs from a previous named >> > release of the HTML roadmap, so I'm not sure calling the title of the >> > document "esoteric" is accurate. >> > >> > Regardless of that, if the title is obscure, could you please offer up >> > title suggestions so that your posting becomes more constructive? Keep >> > in >> > mind that an existing document [1] on the whatwg.org site references >> > HTML >> > version 4 as "HTML4" already, so there is a precedent set for this. I >> > do >> > not think this will confuse anybody, and it would have to be changed >> > throughout documents on the entire site to be consistent. I'd like to >> > propose that both nomenclatures are valid when referring to the entire >> > HTML >> > 4 specification. >> > >> > The important thing (IMHO) to remember here regarding the title is that >> > HTML released two subversions of HTML 4, HTML 4.0 [2] and HTML 4.01 [3]. 
>> > The document must be intended as a differentiation between the entire >> > version of HTML4, since it does not specify a specific subversion to >> > diff? >> > However, it links to the HTML 4.01 specification in the "References" >> > section. If this is *only* a diff between HTML 4.01 and the living >> > standard, perhaps the title should then be "HTML differences from HTML >> > 4.01" so that the document has additional meaning. If there are >> > differences between HTML 4.0, HTML 4.01, *and* HTML5 in the same section >> > of >> > the document, those should probably be appropriately marked. >> > >> > --Xaxio >> > >> > References: >> > [1] >> > >> > http://www.whatwg.org/specs/web-apps/current-work/multipage/introduction.html#history-1 >> > [2] http://www.w3.org/TR/1998/REC-html40-19980424/ >> > [3] http://www.w3.org/TR/REC-html40/ >> > >> > >> > On Fri, May 3, 2013 at 9:20 AM, Jukka K. Korpela >> > wrote: >> > >> >> 2013-05-03 18:37, Simon Pieters wrote: >> >> >> >> The past few days I've been working on updating the HTML differences >> >>> from HTML4 document, which is a deliverable of the W3C HTML WG but is >> >>> now also available as a version with the WHATWG style sheet: >> >>> >> >>> >> >>> http://html-differences.whatwg.org/ >> >>> >> >> >> >> I think you should start from making the title sensible. "HTML >> >> differences >> >> from HTML4" is too esoteric even in this context. >> >> >> >> Think about a heading "FOO differences from FOO9". Wouldn't you say >> >> that >> >> some FOOist is writing very obscurely? >> >> >> >> Besides, the spelling is "HTML 4". Especially if you think HTML 4 is >> >> ancient history, retain the historical spelling. >> >> >> >> Yucca >> >> >> >> >> >> >> >> >> >> -- >> Gordon P. Hemsley >> m...@gphemsley.org >> http://gphemsley.org/ • http://gphemsley.org/blog/ > > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] HTML differences from HTML4 document updated
The way I interpreted it, Jukka meant that the title could be something more flowing, like "Differences between HTML4 and HTML(5)". Gordon On Fri, May 3, 2013 at 2:10 PM, Xaxio Brandish wrote: > Good day, > > Let us start with a definition: > > es·o·ter·ic > /ˌesəˈterik/ > Adjective > Intended for or likely to be understood by only a small number of people > with a specialized knowledge or interest. > > The document Simon delivered and formatted is useful to a wide range of > audiences interested in HTML and how it differs from a previous named > release of the HTML roadmap, so I'm not sure calling the title of the > document "esoteric" is accurate. > > Regardless of that, if the title is obscure, could you please offer up > title suggestions so that your posting becomes more constructive? Keep in > mind that an existing document [1] on the whatwg.org site references HTML > version 4 as "HTML4" already, so there is a precedent set for this. I do > not think this will confuse anybody, and it would have to be changed > throughout documents on the entire site to be consistent. I'd like to > propose that both nomenclatures are valid when referring to the entire HTML > 4 specification. > > The important thing (IMHO) to remember here regarding the title is that > HTML released two subversions of HTML 4, HTML 4.0 [2] and HTML 4.01 [3]. > The document must be intended as a differentiation between the entire > version of HTML4, since it does not specify a specific subversion to diff? > However, it links to the HTML 4.01 specification in the "References" > section. If this is *only* a diff between HTML 4.01 and the living > standard, perhaps the title should then be "HTML differences from HTML > 4.01" so that the document has additional meaning. If there are > differences between HTML 4.0, HTML 4.01, *and* HTML5 in the same section of > the document, those should probably be appropriately marked. 
> > --Xaxio > > References: > [1] > http://www.whatwg.org/specs/web-apps/current-work/multipage/introduction.html#history-1 > [2] http://www.w3.org/TR/1998/REC-html40-19980424/ > [3] http://www.w3.org/TR/REC-html40/ > > > On Fri, May 3, 2013 at 9:20 AM, Jukka K. Korpela wrote: > >> 2013-05-03 18:37, Simon Pieters wrote: >> >> The past few days I've been working on updating the HTML differences >>> from HTML4 document, which is a deliverable of the W3C HTML WG but is >>> now also available as a version with the WHATWG style sheet: >>> >>> http://html-differences.whatwg.org/ >>> >> >> I think you should start from making the title sensible. "HTML differences >> from HTML4" is too esoteric even in this context. >> >> Think about a heading "FOO differences from FOO9". Wouldn't you say that >> some FOOist is writing very obscurely? >> >> Besides, the spelling is "HTML 4". Especially if you think HTML 4 is >> ancient history, retain the historical spelling. >> >> Yucca >> >> >> -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] use of article to markup comments
List elements and sectioning elements both represent hierarchical relationships. They differ in how they emphasize that relationship: lists emphasize the hierarchy outside the content, while sectioning emphasizes the hierarchy within the content. If the question is specifically about how to mark up comments on a blog post or something, there's no reason you can't combine the two methods: Each comment is a self-contained , with relationships between comments represented by . One example: http://jsbin.com/edewoy/1 That example presumes you consider blog post comments (or replies to comments) as a section within the content that is being commented on (or replied to). You could also modify the markup to have two s (one for the blog post and one for the comments) packaged within a single parent , but the principle is the same. Note that the key here is that there is no restriction on combining lists and sectioning elements, and thereby no need to modify the semantics of or (as proposed in [2] in the root message). Gordon On Mon, Jan 28, 2013 at 12:13 PM, Steve Faulkner wrote: >> Brucel wrote: >> >> On Sat, 26 Jan 2013 10:56:10 -, Steve Faulkner >> wrote: >> >> >> > Lists are appropriate for indicating nested tree structures. The use >> > of lists to markup comments is a common mark up pattern used in >> > blogging software such as wordpress. The code verbosity is not >> > dissimilar to the use of article, less so even option end tags >> > are omitted. Besides comments are generated code not hand authored so >> > I don't see a problem with code verbosity >> >> [...] >> >> > >> >> (It makes some sense, I suppose, to think of comments as a "list", but >> >> *unordered*? If you're going to group them at all, wouldn't the order >> >> be important? 
Bruce Lawson ( >> >> http://lists.w3.org/Archives/Public/public-html/2013Jan/0111.html)'s >> >> observation that comments are "heavily dependent on context" would seem >> >> to support the idea that it *is* important, especially since some >> >> comments are responses to others.) >> > >> > agreed it would be better to use order lists. >> > >> >> Wordpress blogs, for example, have comments like >> >> "Bob Smith said at 9.55 on 31 Febtember: LOL" >> >> Thus, every comment has a link that a UA can use to jump from comment to >> comment. The order is implied via the timestamp. So what's wrong with >> >> >> Witty blogpost >> lorem ipsum >> >> >> 35 erudite and well-reasoned comments >> Bob Smith said at 9.55 on 31 Febtember: Can >> I use DRM in Polyglot documents? >> Hixie said at 9.57 on 1 June: What's your >> use case? >> ... >> >> >> >> >> In short, why should the spec suggest any specific method of marking up >> comments? > > Good question, in the case of recommended tomarkup comments > it seems like it's an element in search of a use case. > > For users who consume article semantics it appear to cause issues when > used for any piece of content ranging from a one sentence comment to > an article containing thousands of words or an interactive widget. > > > regards > SteveF -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Sniffing archives
(It seems I somehow managed to not send this to the list the first time around. Addendum included.) On Tue, Dec 4, 2012 at 2:40 AM, Adam Barth wrote: > On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke wrote: >> On 2012-11-29 20:25, Adam Barth wrote: >>> These are supported in Chrome. That's what causes the download. From >> >> Can you elaborate about what you mean by "supported"? Chrome sniffs for the >> type, and then offers to download as a result of that sniffing? How is that >> different from not sniffing in the first place? > > They might otherwise be treated as a type that can be displayed > (rather than downloaded). But isn't the whole point of the spec to eliminate such accidental sniffing? Anything not explicitly sniffed based on the first bytes of the file will be assumed to be either 'application/octet-stream' or 'text/plain', depending on whether there are binary bytes present. The old IE behavior that you were investigating in your 2009 paper, where you sniff beyond the first few bytes to find embedded HTML, is eliminated with this sniffing algorithm. There is no case where you would accidentally sniff something as scriptable, if you were following the algorithm correctly. Or am I missing something? P.S. Note also that I have previously defined what it means to be "supported by the user agent": "A valid media type is supported by the user agent if the user agent has the capability to interpret a resource of that media type and present it to the user." http://mimesniff.spec.whatwg.org/#supported-by-the-user-agent -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
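The fallback described in the message above (anything not matched by an explicit signature ends up as either 'text/plain' or 'application/octet-stream') can be sketched roughly as follows. This is only an illustration, not the spec's algorithm: `fallback_type` is a hypothetical name, the set of "binary data bytes" is my reading of the mimesniff spec and should be treated as an assumption, and the real rules also check for byte order marks first.

```python
# Bytes assumed to count as "binary data bytes" (an assumption based on
# my reading of the mimesniff spec): 0x00-0x08, 0x0B, 0x0E-0x1A, 0x1C-0x1F.
BINARY_DATA_BYTES = frozenset(
    list(range(0x00, 0x09)) + [0x0B] + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)


def fallback_type(resource_header: bytes) -> str:
    """Hypothetical helper: pick the fallback type for an unmatched resource.

    Only the bytes passed in are inspected, so embedded markup further
    into the resource cannot change the result.
    """
    if any(byte in BINARY_DATA_BYTES for byte in resource_header):
        return "application/octet-stream"
    return "text/plain"
```

Note that neither possible result is scriptable, which is the point being made in the message: following the fallback never produces 'text/html'.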
Re: [whatwg] [mimesniff] Sniffing archives
On Tue, Dec 4, 2012 at 11:07 AM, Adam Barth wrote: > On Mon, Dec 3, 2012 at 11:59 PM, Julian Reschke wrote: >> On 2012-12-04 08:40, Adam Barth wrote: >>> They might otherwise be treated as a type that can be displayed >>> (rather than downloaded). Also, some user agents treat downloads of >> >> Do you have an example for that case? >> >>> ZIP archives differently than other sorts of download (e.g., they >>> might offer to unzip them). >> >> Out of curiosity: which? > > Safari. > > Adam To be more specific: (1) Safari doesn't appear to prompt the user for any downloads. It just automatically downloads any file it can't handle. (2) If you allow Safari to open "safe" files that it downloads, ZIP appears to be one of them. Gzip and RAR, however, do not. So this isn't the most convincing argument. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing
On Thu, Nov 29, 2012 at 2:30 PM, Adam Barth wrote: > On Wed, Nov 28, 2012 at 10:30 PM, Gordon P. Hemsley > wrote: >> Based on my reading of the source code, it seems that Gecko treats a >> resource served as 'application/octet-stream' as an unknown type which >> is sniffed as if no Content-Type was specified. >> >> Are there security implications with doing this? > > Yes, there are very large security consequences. I'm sorry that I > don't have time to respond to all of these threads in detail, but I'm > worried that you don't understand the consequences of the changes > you're proposing to this specification. > > I'm not sure how to help you succeed here, but tweaking things in the > spec without a compelling reason for doing so is not likely to lead to > a useful specification. I spent a great deal of time and effort > studying the behaviors of many user agents and of a massive amount of > content on the web. I'm certainly willing to believe that the spec > can be improved, but if you don't understand these sorts of basic > things about content sniffing, I worry that changes that you make to > the spec won't be improvements. > > Adam I and others have already made clear that I was misreading the Mozilla source code. I'm aware of the security implications of interpreting a resource as something other than what the Content-Type header says. The whole reason I sent the original e-mail was because I thought Mozilla was sniffing "application/octet-stream" in a way that it shouldn't, and I wanted to clarify whether there was something I was missing. I think you need to tone down your worry about my changes to the spec. If I didn't have concern for the security implications for a change, I wouldn't be sending an e-mail to the list about them, would I? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Sniffing archives
To be clear, I'm asking this because I would like to remove the sniffing of archive types from the mimesniff spec if there aren't any valid usecases. On Wed, Nov 28, 2012 at 12:18 PM, Gordon P. Hemsley wrote: > The mimesniff spec currently includes signatures for ZIP, gzip, and > RAR archive formats. However, no major browser seems to support them > natively (they all prompt for download), and it's not clear whether > the type detection is a product of the browser code or the OS, or > whether it is used beyond choosing an appropriate file extension for > the download. > > Are there any valid usecases for explicitly sniffing archive formats > instead of letting them default to application/octet-stream like other > binary files would? Note that Henri Sivonen has previously raised the > issue that ZIP-based formats (like office suite documents), for > example, would be misleadingly sniffed as ZIP files, and there is no > easy way around that. > > -- > Gordon P. Hemsley > m...@gphemsley.org > http://gphemsley.org/ • http://gphemsley.org/blog/ -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
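For readers unfamiliar with what "signatures for ZIP, gzip, and RAR" means in practice, here is a minimal sketch of prefix matching on the leading bytes. The byte sequences and media type strings are the well-known magic numbers as I understand them, not a copy of the spec's table, and `sniff_archive` is a hypothetical name.

```python
# (signature prefix, media type) pairs; treat the exact strings as assumptions.
ARCHIVE_SIGNATURES = [
    (b"\x1f\x8b\x08", "application/x-gzip"),
    (b"PK\x03\x04", "application/zip"),
    (b"Rar \x1a\x07\x00", "application/x-rar-compressed"),
]


def sniff_archive(resource_header: bytes):
    """Hypothetical helper: return an archive media type, or None if no match."""
    for signature, media_type in ARCHIVE_SIGNATURES:
        if resource_header.startswith(signature):
            return media_type
    return None
```

The sketch also makes the concern attributed to Henri Sivonen easy to see: a ZIP-based document format (an office suite file, for example) starts with the same `PK\x03\x04` bytes, so a prefix match alone reports it as 'application/zip'.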
Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing
On Thu, Nov 29, 2012 at 12:57 PM, Boris Zbarsky wrote: > canPlayType is not called "against a file". It's called with a single > argument which is a string MIME type. If you pass > "application/octet-stream", it will return "". Its behavior does not depend > on any state of the element it's called on (like what it's actually pointing > to, etc); only on the string passed in. Oh, I see. My mistake. (One should never attempt to understand something after 2 AM.) So... are there any additional places where "application/octet-stream" should be treated as if the media type was undefined? Or is this conversation moot now? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing
On Thu, Nov 29, 2012 at 3:02 AM, Boris Zbarsky wrote: > On 11/29/12 2:53 AM, Gordon P. Hemsley wrote: >> >> At one point it says, "The MIME type "application/octet-stream" with >> no parameters is never a type that the user agent knows it cannot >> render. User agents must treat that type as equivalent to the lack of >> any explicit Content-Type metadata when it is used to label a >> potential media resource." >> >> But later it says, "The canPlayType(type) method must return the empty >> string if type is a type that the user agent knows it cannot render or >> is the type "application/octet-stream";" > > > What's the contradiction? We have set S = { types the user agent knows it > cannot render }. We have set T = S union { application/octet-stream } > > What the above statements tell us so far is: > > 1) T != S > 2) canPlayType(type) must return empty string for all types in T. > > But later on in the resource selection algorithm there are certain actions > taken for elements of S only. > > >> This seems to me to be unclear as to when sniffing of the audio/video >> resource occurs, and what it is used for. > > > It's used for actually showing the video even if it's sent as > application/octet-stream. The apparent contradiction occurs when, e.g., an Opus file is tagged as "application/octet-stream". If I understand correctly, a UA would return "" when canPlayType() is called against such a file—but then the file would actually play because it is later sniffed as "application/ogg". Am I missing something? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
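The two statements being compared in this exchange can be modelled as two separate checks, which may help show why they do not contradict each other. This is a sketch under assumptions: the contents of set S are invented for illustration, the function names are hypothetical, and the real canPlayType() can also return "probably" and deals with parameters, which is ignored here.

```python
OCTET_STREAM = "application/octet-stream"

# Set S: types the user agent knows it cannot render (hypothetical contents).
KNOWN_UNRENDERABLE = {"video/x-hypothetical-codec"}


def can_play_type(mime_type: str) -> str:
    """Answers a question about a MIME type string only, never about a file."""
    if mime_type in KNOWN_UNRENDERABLE or mime_type == OCTET_STREAM:
        return ""  # set T = S union { application/octet-stream }
    return "maybe"


def treat_as_unlabelled(content_type) -> bool:
    """During a media load, octet-stream is handled like a missing Content-Type."""
    return content_type is None or content_type == OCTET_STREAM
```

Under this model an Opus file served as 'application/octet-stream' gets `can_play_type(...) == ""` for the label, while `treat_as_unlabelled(...)` is true for the load itself, so the resource is sniffed and can still play.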
Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing
On Thu, Nov 29, 2012 at 2:32 AM, Boris Zbarsky wrote: > On 11/29/12 2:07 AM, Gordon P. Hemsley wrote: >> >> So perhaps a more useful question would be what to do in situations >> like that—should mimesniff treat "application/octet-stream" as a type >> "supported by the browser" for the purposes of sniffing images, audio >> or video, fonts, or other media types? > > > The way it works right now is that > http://www.whatwg.org/specs/web-apps/current-work/#mime-types says: > > The MIME type "application/octet-stream" with no parameters is never > a type that the user agent knows it cannot render. User agents must > treat that type as equivalent to the lack of any explicit > Content-Type metadata when it is used to label a potential media > resource. > > So for the purpose of sniffing media loads specifically, that type is > treated just like no type at all. > > But first you have to know it's a media load. Oh, this is probably the location where the HTML spec doesn't currently, but eventually should, reference the "rules for sniffing audio and video specifically" in mimesniff. (Is this where Opera implements such rules?) Is it just me (and my late-night reading), or is that section contradictory on how to treat "application/octet-stream"? At one point it says, "The MIME type "application/octet-stream" with no parameters is never a type that the user agent knows it cannot render. User agents must treat that type as equivalent to the lack of any explicit Content-Type metadata when it is used to label a potential media resource." But later it says, "The canPlayType(type) method must return the empty string if type is a type that the user agent knows it cannot render or is the type "application/octet-stream";" This seems to me to be unclear as to when sniffing of the audio/video resource occurs, and what it is used for. 
>> I imagine this ties in, too, to the issues with sniffing CSS files >> that have been raised elsewhere: >> >> https://bugzilla.mozilla.org/show_bug.cgi?id=560388 >> https://bugzilla.mozilla.org/show_bug.cgi?id=562377 > > Neither one of those has anything to do with application/octet-stream as far > as I can tell. Those cover cases in which data is sent with either no > Content-Type header or with such a header which can't even be parsed as > "major/minor". Neither of which is true if the data says > "application/octet-stream". I was grouping them together because they both rely on context clues for modifying the sniffing (fallback) behavior, but we can discuss them separately if that's easier. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing
On Thu, Nov 29, 2012 at 1:30 AM, Gordon P. Hemsley wrote: > Based on my reading of the source code, it seems that Gecko treats a > resource served as 'application/octet-stream' as an unknown type which > is sniffed as if no Content-Type was specified. Oh, wait, I forgot what I was reading: Gecko does this specifically in the context of sniffing for an audio or video resource. So, if a resource tagged as 'application/octet-stream' is included in <audio> or <video>, for example, it will be treated as unknown for the purposes of identifying its true nature. This never follows a path of scriptable privilege escalation, AFAICT. So perhaps a more useful question would be what to do in situations like that: should mimesniff treat "application/octet-stream" as a type "supported by the browser" for the purposes of sniffing images, audio or video, fonts, or other media types? I imagine this ties in, too, to the issues with sniffing CSS files that have been raised elsewhere: https://bugzilla.mozilla.org/show_bug.cgi?id=560388 https://bugzilla.mozilla.org/show_bug.cgi?id=562377 https://bugzilla.mozilla.org/show_bug.cgi?id=808593 -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
[whatwg] [mimesniff] Treating application/octet-stream as unknown for sniffing
Based on my reading of the source code, it seems that Gecko treats a resource served as 'application/octet-stream' as an unknown type which is sniffed as if no Content-Type was specified. Are there security implications with doing this? Or should I add 'application/octet-stream' to the list of unknown types that currently includes 'unknown/unknown', 'application/unknown', and '*/*' (step 2 of the "media type sniffing algorithm")? Or, given that that step calls the "rules for identifying an unknown media type" with the sniff-scriptable flag set, should it get its own call, with the sniff-scriptable flag unset? Are there other options here? I haven't checked what UAs actually do in practice, but I don't believe the spec currently allows anything but leaving resources tagged as 'application/octet-stream' as they are. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
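The options raised in this message can be laid side by side in a small sketch. Everything here is illustrative: `plan_sniffing` and the policy names are hypothetical, and the function only reports which path would be taken (whether to run the "rules for identifying an unknown media type", and with which value of the sniff-scriptable flag) rather than doing any sniffing.

```python
# The unknown types named in the message (step 2 of the algorithm).
UNKNOWN_TYPES = {"unknown/unknown", "application/unknown", "*/*"}


def plan_sniffing(supplied_type, octet_stream_policy="as-is"):
    """Return (identify_as_unknown, sniff_scriptable) for a supplied type.

    octet_stream_policy (hypothetical names for the options discussed):
      "as-is"                  leave application/octet-stream alone
      "unknown"                add it to the list of unknown types
      "unknown-nonscriptable"  give it its own call, flag unset
    """
    if supplied_type is None or supplied_type in UNKNOWN_TYPES:
        return (True, True)
    if supplied_type == "application/octet-stream":
        if octet_stream_policy == "unknown":
            return (True, True)
        if octet_stream_policy == "unknown-nonscriptable":
            return (True, False)
    return (False, False)
```

Written this way, the security question in the message is about the second element of the tuple: only the "unknown" policy lets a resource labelled 'application/octet-stream' reach scriptable sniffing.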
Re: [whatwg] [mimesniff] Audio and video sniffing
Done: https://github.com/whatwg/mimesniff/commit/77ee676c8852f4e76facd7d6c1174ac0ec41696e Note that this also affects the "media type sniffing algorithm" and the "rules for identifying an unknown media type". On Tue, Nov 27, 2012 at 12:51 AM, Simon Pieters wrote: > On Mon, 26 Nov 2012 23:38:02 +0100, Gordon P. Hemsley > wrote: > >> Upon looking through the code for Gecko's media sniffing, I noticed >> that they seem to combine sniffing for audio and video elements. Given >> that Opera has said that it uses the specific sniffing algorithms, and >> that some media containers (like Ogg) can be used for either audio or >> video, I wonder if it would make sense to combine audio and video >> sniffing under a single audiovisual category? This would affect the >> "matching audio/video type pattern" sections and the "sniffing >> audio/video specifically" sections. >> >> Any objections? Other thoughts? > > > Yes, I think it makes sense to have the same sniffing for both. <audio> is > like <video> without the rendering area. > > -- > Simon Pieters > Opera Software -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Handling container formats like Ogg
On Tue, Nov 27, 2012 at 4:39 AM, Henri Sivonen wrote: > On Tue, Nov 27, 2012 at 12:59 AM, Gordon P. Hemsley > wrote: >> Would this be something UAs would prefer to handle in their Ogg >> library, or should I spec it as part of sniffing? > > What would be the use case for handling it as part of sniffing layer? I don't know; that's why I'm asking! :) Is it sufficient to sniff just for "application/ogg" and then let the UA's Ogg library determine whether or not the contents of the file can be handled? (I'm sensing the consensus is yes.) -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
[whatwg] [mimesniff] Handling container formats like Ogg
Container formats like Ogg can be used to store many different audio and video formats, all of which can be identified generically as "application/ogg". Determining which individual format to use (which can be identified interchangeably as the slightly-less-generic "audio/ogg" or "video/ogg", or using a 'codecs' parameter, or using a dedicated media type) is much more complex, because they all use the same "OggS" signature. It would require actually attempting to parse the Ogg container to determine which audio or video format it is using (perhaps not dissimilar to what is done for MP4 video and what might have to be done with MP3 files without ID3 tags). Would this be something UAs would prefer to handle in their Ogg library, or should I spec it as part of sniffing? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
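To make the difference between the two levels of detection concrete, here is a rough sketch. The first function is the cheap signature check that yields the generic type; the second peeks into the first Ogg page to guess the codec, which is the kind of container parsing the message is asking about. The page layout (27-byte header followed by a segment table) and the codec identification strings are from my understanding of the Ogg, Vorbis, Opus, and Theora formats and should be treated as assumptions; both function names are hypothetical, and a real Ogg library does far more validation.

```python
# Leading bytes of the first packet for a few codecs (assumed values).
CODEC_MAGICS = [
    (b"OpusHead", "opus"),
    (b"\x01vorbis", "vorbis"),
    (b"\x80theora", "theora"),
]

OGG_PAGE_HEADER_SIZE = 27  # fixed part of the page header, before the segment table


def sniff_ogg(resource_header: bytes):
    """Generic detection: every Ogg stream starts with the same signature."""
    return "application/ogg" if resource_header.startswith(b"OggS") else None


def guess_ogg_codec(data: bytes):
    """Peek at the first page's payload to guess the codec, or return None."""
    if sniff_ogg(data) is None or len(data) < OGG_PAGE_HEADER_SIZE:
        return None
    segment_count = data[26]  # last byte of the fixed header
    payload = data[OGG_PAGE_HEADER_SIZE + segment_count:]
    for magic, codec in CODEC_MAGICS:
        if payload.startswith(magic):
            return codec
    return None
```

The second function already needs knowledge of the container layout, which suggests why leaving it to the UA's Ogg library, and sniffing only for "application/ogg", is the simpler division of labour.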
[whatwg] [mimesniff] Audio and video sniffing
Upon looking through the code for Gecko's media sniffing, I noticed that they seem to combine sniffing for audio and video elements. Given that Opera has said that it uses the specific sniffing algorithms, and that some media containers (like Ogg) can be used for either audio or video, I wonder if it would make sense to combine audio and video sniffing under a single audiovisual category? This would affect the "matching audio/video type pattern" sections and the "sniffing audio/video specifically" sections. Any objections? Other thoughts? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
[whatwg] [mimesniff] The X-Content-Type-Options header
https://www.w3.org/Bugs/Public/show_bug.cgi?id=19865 Microsoft introduced the X-Content-Type-Options header in IE8 back in 2008: http://blogs.msdn.com/b/ie/archive/2008/09/02/ie8-security-part-vi-beta-2-update.aspx I would like to integrate the header into mimesniff and describe its proper usage. Right now, it allows one parameter: 'nosniff'. I would like to allow the presence of this parameter to set the 'no-sniff flag' that I just introduced into mimesniff (in addition to that flag's existing duties): http://mimesniff.spec.whatwg.org/#no-sniff-flag But I would also like to fully spec the header, while leaving open the possibility that other values may be added in the future. In addition, I would like to, if I could, also allow the header to be specified without the 'X-' prefix (so as 'Content-Type-Options'), for that reason (and because of best current practice). Does anyone have any questions, comments, or objections about this issue? -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
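A minimal sketch of how a user agent might derive the no-sniff flag from response headers, under the proposal in this message. The unprefixed 'Content-Type-Options' name is only the suggestion made above, not an existing header, and `no_sniff_flag` is a hypothetical helper; the case-insensitive comparison of the value is also an assumption.

```python
# Header names to check; the second one is only proposed in the message above.
NOSNIFF_HEADER_NAMES = ("x-content-type-options", "content-type-options")


def no_sniff_flag(headers: dict) -> bool:
    """Return True if any recognised header carries the 'nosniff' value."""
    for name, value in headers.items():
        if name.strip().lower() in NOSNIFF_HEADER_NAMES:
            if value.strip().lower() == "nosniff":
                return True
    return False
```

Unrecognised values are simply ignored here, which leaves room for other values to be defined later without changing the behaviour for 'nosniff'.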
Re: [whatwg] Proposal for a debugging information API
Recent blog posts that coincidentally may be useful in this discussion: http://vocamus.net/dave/?p=1532 http://www.twobraids.com/2012/11/socorro-as-service.html On Thu, Nov 15, 2012 at 12:07 AM, David Barrett-Kahn wrote: > Hi whatwg. I have a proposal for a new web standard, and would value your > feedback. This is based on my experiences working on Google Docs, which > has a well developed ability to send crash reports back to the server for > analysis. We often find these crash reports to be lacking in crucial > information though, because that information is not available on the JS > APIs. > > My proposal is to have a class of information which can be made available > to an app only after the display of a generic 'this application has > crashed' dialog, which could be drilled into to show what is being > disclosed, and which of course can be denied. > > Good examples of the information in question are the system's precise > hardware and network configuration, what Chrome extensions it has > installed, and perhaps a screenshot of the failed application. > > I've fleshed this out in the following document, and would value opinions > on the value of a feature of this kind, and the merits of this particular > approach. > > > https://docs.google.com/document/pub?id=1pw2Bzvy6OEn8YY3fAcZiReJPmgB79swkx-NJAdcemPk > > Thanks! > > -Dave > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Review requested on MIME Sniffing Standard
On Mon, Nov 12, 2012 at 6:08 PM, Ian Hickson wrote: > On Mon, 12 Nov 2012, Gordon P. Hemsley wrote: >> But if everyone vows to just wait for 512 bytes (or EOF), then that's >> fine with me. > > I don't think we should require tools to wait for 512 bytes. This is an > area where if we have the requirement, some user agents are just going to > have a timeout anyway and ignore the spec; we gain nothing by making it > non-conforming to have a timeout. I'm inclined to agree with you, but I'm curious what other implementers have to say on the issue. >> > What are the use cases for ‘Sniffing archives specifically’? >> >> No idea. I only included it for completeness. > > Please don't spec things for completeness without use cases. :-) In that case, I need to know which you think you might want for HTML and which you know you won't. (I don't know of any other specs reliant on mimesniff.) -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
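The trade-off under discussion (wait for 512 bytes or EOF, with or without a timeout) can be sketched as a read loop. This is an illustration only: `read_resource_header` and its parameters are hypothetical, and the deadline is checked between reads, so it cannot interrupt a read call that is itself blocking.

```python
import io
import time


def read_resource_header(stream, limit=512, chunk_size=64, deadline=None):
    """Collect up to `limit` bytes, stopping early at EOF or at the deadline.

    `deadline` is a time.monotonic() value, or None to wait indefinitely
    (the "just wait for 512 bytes or EOF" behaviour).
    """
    header = bytearray()
    while len(header) < limit:
        if deadline is not None and time.monotonic() >= deadline:
            break  # "a reasonable amount of time has elapsed"
        data = stream.read(min(chunk_size, limit - len(header)))
        if not data:
            break  # EOF
        header += data
    return bytes(header)


# Usage: a short resource yields everything it has; a long one is capped.
short_header = read_resource_header(io.BytesIO(b"GIF89a"))
long_header = read_resource_header(io.BytesIO(b"x" * 2000))
```

With `deadline=None` the result depends only on the resource, which is the interoperability argument for dropping the timeout; with a deadline, two user agents can sniff different prefixes of the same slow response.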
Re: [whatwg] [mimesniff] Review requested on MIME Sniffing Standard
On Mon, Nov 12, 2012 at 10:06 AM, Henri Sivonen wrote: > Resending feedback previously written at > https://bugzilla.mozilla.org/show_bug.cgi?id=808593#c10 : > > I think the bits ‘type is equal to "font" or’ and ‘type is equal to > "archive" or’ are highly questionable. The most popular font types are > in the process of getting application/ types and the most popular > archives already have application/ types. Buzzkill. ;( > I suspect the ‘a reasonable amount of time has elapsed, as determined > by the user agent.’ is unnecessary. The HTML spec has the same > provision for the prescan. Firefox didn’t implement it, a > couple of people complained, then fixed their code, and the sky didn’t > fall. This line was present in a previous draft of the spec, as was the seeming allowance to begin matching the resource header before it had finished loading. For simplicity in the algorithm, I removed the latter, so I left the former in as an escape hatch for those who wanted to emulate that behavior. But if everyone vows to just wait for 512 bytes (or EOF), then that's fine with me. > What are the use cases for ‘Sniffing archives specifically’? No idea. I only included it for completeness. The 'rules for sniffing * specifically' are intended as hooks for other specs to tie into. If no spec requires you to implement it, then you have no need to implement it. HTML uses 'rules for sniffing images specifically' (and 'rules for distinguishing if a resource is text or binary'), and I imagine it could also find uses for 'rules for sniffing audio specifically' and 'rules for sniffing video specifically' (and maybe even 'rules for sniffing fonts specifically'). > It > appears that it sniffs ODF-style files > (http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part3.html#__RefHeading__752809_826425813 > ; EPUB, ODF, InDesign, etc.) and Open Packaging Conventions-based > files (https://en.wikipedia.org/wiki/Open_Packaging_Conventions ; > OOXML, XPS, etc.) 
files as zip archives. Is that intended and a > desirable outcome in the light of use cases? (In general, it would be > easier to review if the spec makes sense if the use cases and callers > of various sniffing functions were known.) I don't think that's intended, but I don't know. The selection of which bytes to sniff predates me, and I don't know what the use cases are. > Otherwise, looks good to me. Thanks for the review! -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
[whatwg] [mimesniff] Review requested on MIME Sniffing Standard
Hey all, As you might have heard, I have taken over editorship of the MIME Sniffing Standard from Adam Barth. As a first step in my editorship, I have taken the opportunity to rewrite the document in a more procedural and modular way (IMO). The content and meaning itself is not supposed to have changed, and I need your help to verify that that is the case: http://mimesniff.spec.whatwg.org/ In addition, this now means that I am open to hearing your suggestions about how to improve the document beyond its current (i.e. former) semantics. You can file bugs here: https://www.w3.org/Bugs/Public/enter_bug.cgi?product=WHATWG&component=MIME As this document was originally an IETF document, there are also old issues here: http://trac.tools.ietf.org/wg/websec/trac/query?component=mime-sniff It's not clear to me which of those remain outstanding on the current version of the document, and it would be helpful to me if individuals with a vested interest in them could migrate them to Bugzilla (with updated descriptions that reflect the current state of the document). This will ensure that I address them in a timely manner. Also, it would be helpful if you could mark them as blocking the general bug here: https://www.w3.org/Bugs/Public/show_bug.cgi?id=19746 And if you want to follow the commits as they happen, you can follow @mimesniff on Twitter: https://twitter.com/mimesniff Thanks! Gordon -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
[whatwg] [wiki] The WHATWG Wiki has been upgraded
For those who missed the announcement on IRC and Twitter last week: The WHATWG Wiki has been upgraded to MediaWiki 1.19.2: http://wiki.whatwg.org/ This update brings with it a lot of the changes you're probably already used to from Wikipedia, including the new Vector theme. Over the many years since the WHATWG Wiki was first set up, a lot of cruft has accumulated in its configuration files. I have attempted to remove a lot of that, in order to allow the modern default values to come through. I don't know if this will have much effect on the everyday use of the wiki, but I thought I'd let you know. In addition to the primary software update, I have also installed a number of extensions, and these will have an effect on your use of the wiki. There are three extensions that I want to bring your attention to specifically. The first one is ParserFunctions, which allows you to use some logical functions in pages to (for example) create conditional output. This is most useful, IMO, in templates, so you can condition the display of the template based on the presence, absence, or value of template parameters. See [[Template:Obsolete]] for an example: http://wiki.whatwg.org/wiki/Template:Obsolete The second extension I want to bring your attention to is SyntaxHighlight. This allows you to use the element in a wiki page to automatically highlight whatever source code you include. Given who we are, I've set it up to assume the language you are highlighting is 'html5', but you can also specify another language using the 'lang' attribute. (Note: This is not the same 'lang' attribute that you would normally find in HTML. It's looking for a programming language, not a BCP47 language tag.) And the third, and potentially the most useful, extension is Gadgets. This extension allows any administrator to install JavaScript and CSS gadgets directly onto the wiki, for use by all. I've installed a subset of the gadgets installed on Wikipedia which I think are the most useful. 
I've also turned many of them on by default; you can see the full list of available gadgets (and edit your personal gadget availability) by going to My preferences > Gadgets. To see the full list of installed extensions, go to [[Special:Version]]: http://wiki.whatwg.org/wiki/Special:Version If you know of any useful extensions or gadgets that you think are missing from the WHATWG Wiki, let me know and I'll be happy to install them. And, as I am now the caretaker of the wiki (taking over, I believe, from AryehGregor), let me know about any other wiki issues you might have. By the way, I think the wiki is a particularly useful place to store information that might otherwise get lost in the shuffle of IRC logs and e-mail archives, so if you have any such tidbits, head over to the wiki and write them down! If you don't yet have a wiki account, you'll have to ask someone for help, as we've had some issues with spam accounts. But don't worry, it's very simple to get help, as I've set it up so that any autoconfirmed user can register an account. All they need is your e-mail address and your desired username. If you just let out a cry on IRC, someone should be able to help you, or you can contact one of the permanent autoconfirmed users listed here: http://wiki.whatwg.org/index.php?title=Special:ListUsers&group=autoconfirmed Happy wikiing! Gordon P.S. If you think you should be a permanent autoconfirmed member (and you're not), ping me on IRC or drop me a line off-list and I'll see what I can do. ;) -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] base64 entities
On Fri, Aug 27, 2010 at 2:44 PM, Aryeh Gregor wrote: > > PHP offers no JS-string-literal-escape function. `addslashes` is very close, > > but won't handle some cases with non-ASCII characters correctly. Better to > > use `json_encode` to transfer the string, then write as text: > > > > elmt.textContent = > JSON_HEX_TAG); ?> > > > > (assuming innerText or Text Node backup for IE/older browsers.) > > Interesting, that's useful. Too bad it only works in PHP 5.2 or higher. PHP 5.2.0 came out in 2006. I don't see anything "too bad" about using PHP 5.2 or higher with new technology.[1] Regards, Gordon [1] See also: http://gophp5.org/ -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
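The escaping technique discussed in this message (JSON-encode the string, and additionally hex-escape angle brackets, which is what PHP's JSON_HEX_TAG flag does) can be sketched outside PHP as well. The following is only an illustration in Python, not the code from the thread; the helper name js_string_literal and the sample input are made up for the example.

```python
import json


def js_string_literal(text: str) -> str:
    """Return a JavaScript string literal that is safe inside an inline script block.

    Hypothetical helper for illustration; it is not part of any library.
    """
    # json.dumps adds the surrounding quotes and escapes quotes, backslashes,
    # control characters and (by default) all non-ASCII characters.
    literal = json.dumps(text)
    # Hex-escape angle brackets, as PHP's JSON_HEX_TAG does, so that a
    # "</script>" inside the data cannot terminate the script block early.
    return literal.replace("<", "\\u003C").replace(">", "\\u003E")


# Example: build the assignment that would be written into the page.
print("elmt.textContent = " + js_string_literal('</script> "quoted" text') + ";")
```

Because the result is still valid JSON, the browser decodes the \u003C and \u003E escapes back to the original characters when it evaluates the literal, so the text assigned to textContent is unchanged.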
Re: [whatwg] Proposal: @srctype or @type on <iframe>
On Tue, Jul 13, 2010 at 3:26 AM, Boris Zbarsky wrote: > On 7/12/10 11:31 PM, Gordon P. Hemsley wrote: >> >> The particular use case that prompted me to think about this is >> including a PDF via <iframe>. In Firefox (last I checked), one is >> required to install a separate add-on in order to support in-browser >> display of PDF files on Mac OS X, since there is no native or integrated >> Adobe Reader support available. > > I'm pretty sure you can install the Adobe Reader plug-in on Mac if you want > to. Perhaps now, but that wasn't always the case, at least not for Firefox. I admit that my experience is somewhat outdated. Installing the third-party PDF viewer add-on is one of the first things I did, in a "set it and forget it" kind of way. (Plus, I'm still on Tiger.) But, again, the PDF example was just one possible use case. I'm sure there are plenty of other file types that cause similar situations, including the TIFF issue that I mentioned. >> Without the add-on, the user will be prompted to download the PDF file > > Which is exactly what would happen for a type="application/pdf" iframe, no? > Silently not showing the content doesn't seem acceptable. > > -Boris > Well, the idea is to have the browser operate more intelligently than that. The page in the iframe is (by definition) not the primary document that the user is trying to load, so it shouldn't have the power to steal the user's attention immediately upon page load. It would be very disorienting, and would likely cause the user to lose their train of thought. I was thinking more along the lines of what Flashblock does or what happens when the window in an <iframe> can't load: The content would be replaced somehow by a message and a button/link to allow the user to manually download the contents of the iframe, if they so choose. It shouldn't make that decision for the user, as it's not the user's fault that their browser does not support the format of some ancillary document. At least, that's how I see it. Gordon -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] Proposal: @srctype or @type on <iframe>
Nils, I don't hate the HTTP Content-Type header. In fact, I like it very much. But this proposal was intended to guide the user agent before they ever receive the HTTP Content-Type header. ;) Cheers, Gordon On Tue, Jul 13, 2010 at 2:48 AM, Nils Dagsson Moskopp wrote: > "Gordon P. Hemsley" schrieb am Tue, 13 Jul 2010 > 02:31:19 -0400: > >> It should not be assumed that whatever resource included via >> is going to be of type 'text/html' or another easily parsable type. >> Thus, it could be helpful for the author to give the user agent a >> hint as to what type of document it is requesting be displayed >> inline, and allow the user agent to choose not to display the >> contents of the if it feels it cannot support it. > > Have you thought of using HTTP Content-Type headers and classic MIME > type handling to determine compatibility ? > >> […] >> >> Now, I'm not a spec implementor by any means, but I am a web author >> and a web user, so I've been on both sides of this issue. And it >> doesn't appear that it would be too complicated to extend the >> existing support of @type. > > AFAIK, implementors could use HTTP Content-Type headers for the given > purpose. > >> Thoughts? > > Why do you hate HTTP Content-Type headers ? ;) > > > Cheers, > Nils > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
[whatwg] Proposal: @srctype or @type on <iframe>
Hello all. There are a number of attributes that are designed to give the user agent a preview of what MIME type to expect for a referenced resource. (And there are also attributes like @hreflang that preview other things.) And yet, <iframe>, which has to load a full document, has no ability to allow the user agent to determine compatibility. Thus, I propose doing one of the following: (1) add @srctype to <iframe> (2) extend the meaning of @type, which already applies to several other elements, to apply to <iframe>, as well I'm more inclined to believe that option (2) is the better option. But now for the reasoning. It should not be assumed that whatever resource is included via <iframe> is going to be of type 'text/html' or another easily parsable type. Thus, it could be helpful for the author to give the user agent a hint as to what type of document it is requesting be displayed inline, and allow the user agent to choose not to display the contents of the <iframe> if it feels it cannot support it. The particular use case that prompted me to think about this is including a PDF via <iframe>. In Firefox (last I checked), one is required to install a separate add-on in order to support in-browser display of PDF files on Mac OS X, since there is no native or integrated Adobe Reader support available. Without the add-on, the user will be prompted to download the PDF file, which can be very disconcerting if the user wasn't even expecting a PDF file. And I'm sure there are plenty of other instances where this same situation occurs. (TIFF files, perhaps? Like on the U.S. Patent Office's website?) Now, I'm not a spec implementor by any means, but I am a web author and a web user, so I've been on both sides of this issue. And it doesn't appear that it would be too complicated to extend the existing support of @type. Thoughts? Gordon -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
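To make the proposal above more concrete, here is a rough sketch in Python of the decision a user agent could make if an author-supplied type hint were available on an iframe. This is purely illustrative: no browser implements this, the function name plan_iframe_load and the INLINE_TYPES set are hypothetical, and the placeholder behavior follows the Flashblock-style idea raised in the replies rather than any specification.

```python
from typing import Optional

# Hypothetical set of MIME types this imaginary user agent can display inline.
INLINE_TYPES = {"text/html", "text/plain", "image/png", "image/jpeg"}


def plan_iframe_load(declared_type: Optional[str]) -> str:
    """Decide what to do with an iframe before any request is made.

    Returns "load" to fetch and display the resource as usual, or
    "placeholder" to show a message with a manual download link instead.
    """
    if declared_type is None:
        # No hint from the author: behave exactly as browsers already do.
        return "load"
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = declared_type.split(";", 1)[0].strip().lower()
    if essence in INLINE_TYPES:
        return "load"
    # Unsupported type: do not interrupt the user with a download prompt.
    return "placeholder"


print(plan_iframe_load("application/pdf"))
print(plan_iframe_load("Text/HTML; charset=utf-8"))
```

The hint is only advisory in this sketch; the Content-Type header received once a request is actually made would still decide how the response is handled.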
Re: [whatwg] select element should have a required attribute
I'm not sure how you interpreted it, but I wanted to clarify, in case it wasn't clear. I'm pretty sure this person is asking why @required isn't allowed on <select> elements. As in: http://dev.w3.org/html5/markup/forms-attributes.html#shared-form.attrs.required I don't know what the exact reasoning is for it not being on there, nor do I know exactly how @required is supposed to be enforced, but I do think that the method suggested in the bug is a bad one. Sometimes, authors will include an empty <option> on purpose in order to allow for an empty option to be selected. Thus, as you've said, Ash, there will always be some sort of value sent from a <select> element. And, including the option of an empty string, I can't think of any way that there wouldn't be a value sent. Gordon On Fri, Jun 18, 2010 at 7:04 AM, Ashley Sheridan wrote: > On Fri, 2010-06-18 at 11:35 +0200, Mounir Lamouri wrote: > > Hi, > > I'm wondering why select element do not have a required attribute. It > seems to be perfectly suitable. With the required attribute, select > element would be able to suffer from being missing and the :required > pseudo-class could apply. > > Is there a reason why the select element has no required attribute or > it's only an omission? > > Related bug:http://www.w3.org/Bugs/Public/show_bug.cgi?id=9625 > > Thanks, > -- > Mounir > > > Required as in it should always have a value sent? If so, then it always > does. The default value for a select element is not an empty string as an > <option> is always there (unless someone has been stupid enough to create an > empty select list.) > > As such, some sort of value will always be sent. > > Thanks, > Ash > http://www.ashleysheridan.co.uk > > > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] Is there a way to stop scrolling when pressing directional arrows?
For what it's worth, I am actually of the opposite opinion, Ash. I like it when Flash steals the focus of the keyboard, and here's why: Besides the arrow keys, which are available to everyone, I also use the "Find As You Type" feature in Firefox. However, that usually means that I can't play any HTML5 games that use letters as play keys. Because the HTML5 game usually doesn't steal the focus of the keyboard, typing a letter key activates the FAYT feature and distracts me from the game. With that being said, Bespin (from Mozilla Labs) uses <canvas>, and it has no problem stealing the keyboard focus (with JavaScript) for most keypresses. Gordon 2010/6/14 Ashley Sheridan > On Mon, 2010-06-14 at 13:38 -0600, Carlos Andrés Solís wrote: > > Hello! I've been noticing a problem in many HTML5 test apps, very > especially games. When the directional arrow buttons are pressed, the screen > scrolls. This is a problem that, as far as I know, Flash had solved by > changing the focus of the application to the app. Is this doable in HTML5? > - Carlos Solís > > > I don't think it's something that was 'solved' by Flash. To be honest, I'm > often annoyed at the way Flash steals the focus of all my key presses making > it almost impossible to navigate using only the keyboard. > > You could use Javascript to put the focus onto an object, capture all the > key presses on that and return false for them all maybe. > > Thanks, > Ash > http://www.ashleysheridan.co.uk > > > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] <% text %> and in corporate intranet html content
On Tue, Feb 9, 2010 at 10:05 PM, Biju wrote: > What should a user agent display when html content is... > > > <%@ page language="java" %> > > > At present IE and Safari display blank > > Firefox display <%@ page language="java" %> > > And for document.body.innerHTML browsers give > Firefox --> <%@ page language="java" %> > IE --><%@ page language="java" %> > and Safari gives blank > > Also for > > > > > Firefox gives blank > > But for > > abc " ?> xyz > > > Firefox display... > abc " ?> xyz > > ie, all the contents after first ">" > with .innerHTML --> abc " ?> xyz > > IE in this case again hide all content till "?>" > as well as preserve content including the white space in innerHTML > > Due to these problems browsing corporate intranet with Firefox is > little irritating. > Calling help desk and asking to provide fix will get a reply that > company has standardized on IE6, so please use IE. > > > So per HTML standard in both case what should user agent display and > as well as content of .innerHTML > > Thanks > Biju > For what it's worth, I filed a Mozilla bug on a similar issue, and it was marked INVALID. https://bugzilla.mozilla.org/show_bug.cgi?id=477455 Parser does not wait for "?>" to close blocks that begin with "http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] the cite element
On Tue, Oct 6, 2009 at 4:15 PM, Erik Vorhes wrote: > On Tue, Oct 6, 2009 at 2:52 PM, Gordon P. Hemsley wrote: >> I also propose allowing parenthetical citations and footnote markers >> (as is used in the various W3C/WHATWG specifications) to also be >> marked up with , though I'm not sure if TabAtkins agrees with me >> on that point. > > I suppose allows for more functionality in current UAs, but this > is an interesting proposition, especially if there were a way to > crosslink used in this way to the original source (or whatever > it would point to). Would it be something along the lines of for="aside-id">, or did you have something else in mind? > > Erik Hmm... I hadn't given much thought to the implementation of that, as I was more worried about the other part of the debate, but I think treating as analogous to in that situation is indeed a good idea. -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] the cite element
(I'm ignoring all of the unproductive back-and-forth that has occurred thus far. This is meant to start the discussion off fresh.) I was discussing the <cite> element with TabAtkins on IRC and I proposed analyzing the actual word 'cite'. Using it as a verb, the definition of 'cite' applies to quotes/quotations, titles, and people, depending on the context. TabAtkins noted that the first use case is so far off of legacy implementations, that it wouldn't even be worth considering for <cite> (especially because we have other elements that function as such). That leaves usages of 'cite' for both titles of works and authors of works. Putting aside the issue of styling for a moment, these two pieces of data both fall under the semantic meaning of 'cite'. Thus, they should fall under the semantic meaning of <cite>. If an author should have the need to differentiate between the two, I propose that they use and . Thus, I propose the following (which TabAtkins generally agrees with): Leave the default styling of <cite> to be italicized for legacy implementations and allow any reference to any work or author, with the granularity decided by the individual web developer. I also propose allowing parenthetical citations and footnote markers (as is used in the various W3C/WHATWG specifications) to also be marked up with <cite>, though I'm not sure if TabAtkins agrees with me on that point. I hope this message can help bring the discussion back to a neutral zone that will lead to an amicable resolution of this long debate. Regards, Gordon -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] [html5] r4029 - [e] (0) Example of use without .
Ah. I was afraid you might say that. On Tue, Sep 29, 2009 at 6:54 PM, Ian Hickson wrote: > On Tue, 29 Sep 2009, Gordon P. Hemsley wrote: > > > > s/Html/html/ > > Actually that was intentional in that example. I like to show a variety of > syntaxes so that people can see that they can do whichever one they > prefer. > > -- > Ian Hickson U+1047E)\._.,--,'``.fL > http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. > Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.' > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] [html5] r4029 - [e] (0) Example of use without .
s/Html/html/ On Tue, Sep 29, 2009 at 4:30 AM, Simon Pieters wrote: > On Tue, 29 Sep 2009 07:57:21 +0200, wrote: > > Author: ianh >> Date: 2009-09-28 22:57:20 -0700 (Mon, 28 Sep 2009) >> New Revision: 4029 >> >> Modified: >> index >> source >> Log: >> [e] (0) Example of use without . >> >> Modified: index >> === >> --- index 2009-09-29 02:41:23 UTC (rev 4028) >> +++ index 2009-09-29 05:57:20 UTC (rev 4029) >> @@ -13031,7 +13031,60 @@ >> >> + >> + Here is a graduation programme with two sections, one for the >> + list of people graduating, and one for the description of the >> + ceremony. >> + >> + <!DOCTPE Html> >> > > s/DOCTPE/DOCTYPE/ > > -- > Simon Pieters > Opera Software > -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/
Re: [whatwg] article/section/details naming/definition problems
I'd sent this earlier, but it got caught in the message queue that apparently nobody checks. Let's see if it works this time. -- Forwarded message ------ From: Gordon P. Hemsley Date: Tue, Sep 15, 2009 at 11:31 PM Subject: Re: [whatwg] article/section/details naming/definition problems To: whatwg List On Tue, Sep 15, 2009 at 9:08 PM, Ian Hickson wrote: > On Tue, 15 Sep 2009, Jeremy Keith wrote: > > In that blog post, I point out that and were once > more > > divergent but have converged over time (since the @cite and @pubdate > > attributes were dropped from ). > > > > I've also seen a lot of confusion from authors wondering when to use > > > and when to use . Bruce wrote an article on HTML5 doctor > recently to > > address this: > > http://html5doctor.com/the-section-element/ > > > > Probably the best tutorial I've seen on this issue is from Ted: > > http://edward.oconnor.cx/2009/09/using-the-html5-sectioning-elements > > > > ...but even so, the confusion remains. The very fact that tutorials are > > required for what should be intuitive structural elements is worrying — I > > don't see the same issues around , or (now that > the > > content model has been changed) ...although there is continuing confusion > > around . > > I'd like to rename , if someone can come up with a better word > that means "blog post, blog comment, forum post, or widget". I do think > there is an important difference between a subpart of a page that is > a potential candidate for syndication, and a subsection of a page that > only makes sense with the rest of the page. > What about ? (Directly, it's a coincidence that RSS happens to have the same-named element, as I just used a thesaurus. But perhaps [indirectly] there's a reason RSS uses to begin with. And, after all, it's supposed to be used as a hint that it could be syndicated content, right?) -- Gordon P. 
Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/ -- Gordon P. Hemsley m...@gphemsley.org http://gphemsley.org/ • http://gphemsley.org/blog/ http://sasha.sourceforge.net/ • http://www.yoursasha.com/