Re: [whatwg] [MIME Sniffing] Editorial feedback
Comparisons between media types, as defined by MIME specifications, are done in an ASCII case-insensitive manner. [RFC2046]

so, the problem is that your `note` here is ambiguous. it's hard to understand that you're just saying `mime rfc says that mime comparisons are insensitive`, v. `this specification wants mime comparisons to be insensitive`. you want the former; but `note:` doesn't cause that result; nor does the `[rfc]` at the end

I'm tempted to just rename them to be less semantic. They're just symbols that don't mean anything, really.

please do :)

That's a lot of editing! I'm not sure that buys us much.

i ask, because it actually was useful when i was dealing w/ someone else's spec. they had hex digits and some of them were wrong. it was much easier to read when the hex digits were in tt

That is intentional. Sniffing SWF is bad times.

i think it might be worth an actual NOTE in the spec explaining that SWF is intentionally not sniffed, and what that means for untyped SWF files (actually explaining how it flows and to which resulting sniff type)

Thanks to Alfred Hönes Boris Zbarsky David Singer Mark Pilgrim, and Russ Cox.

you need some punctuation before `Boris`, `David`, and `Mark` :)

If RDF-flag is 1 and RSS-flag is 1, then let the sniffed-type be application/rss+xml and abort these steps.

could you change that to: If both RDF-flag and RSS-flag are 1, then ...?
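For illustration, here is a minimal Python sketch of the ASCII case-insensitive comparison quoted at the start of this message. The helper names `ascii_lower` and `media_types_equal` are hypothetical and do not come from the spec. The sketch folds only A-Z, on the assumption that "ASCII case-insensitive" is meant to exclude Unicode case folding.

```python
def ascii_lower(s: str) -> str:
    # Lowercase only the ASCII letters A-Z; every other character is kept as-is.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def media_types_equal(a: str, b: str) -> bool:
    # Two media types match if they are identical after ASCII-only lowercasing.
    return ascii_lower(a) == ascii_lower(b)
```

Using `str.lower()` instead would also fold non-ASCII characters (for example U+212A KELVIN SIGN to `k`), which is why the sketch avoids it.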
Re: [whatwg] [MIME Sniffing] Editorial feedback
I've taken all of your suggestions, except as noted below. Thanks for your detailed feedback.

Adam

On Mon, Sep 26, 2011 at 2:27 PM, timeless timel...@gmail.com wrote:

Otherwise, if the octets in s starting at pos match any of the sequences of octets in the first column of the following table, then the user agent MUST follow the steps given in the corresponding cell in the second column of the same row. |

What's the stray `|` character at the end of that doing?

The ToC feels double spaced, is that normal?

Would you mind quoting your attributes in source? Things like class=no-num or href=#web-data scare me. It's easier if you just quote all attributes :)

Also, I generally recommend `<span ...>x</span> ` over `<span ...>x </span>` - i.e. trailing space outside of span (see toc)

<p>Many web servers supply incorrect Content-Type header fields with their HTTP

Can you mark up `Content-Type` in something which results in roughly typewriter font?

s/user agents/User Agents/ as in:

responses. In order to be compatible with these servers, user agents consider

Without a clear specification of how to sniff the media type, each user agent implementor was forced to reverse engineer the behavior of the other user agents and to develop

s/the other/other/ -- there are some UAs who were ignored when the sniffing of a given UA was developed :)

their own algorithm

I'm not sure if `algorithm` here belongs in singular or plural, I got distracted :)

an HTTP response to be interpreted as one media type but some user agents interpret the responses as another media type.

s/responses/response/ (agreement with first part)

However, if a user agent does interpret a low-privilege media type, such as image/gif, as a high-privilege media type, such as text/html, the user agent has created a privilege escalation vulnerability in the server.

s/, the user agent/, then the user agent/

I believe abarth has addressed the above.
This document describes a content sniffing algorithm that carefully balances the compatibility needs of user agent implementors with the security constraints.

`the security constraints` is problematic, I don't think `the` references anything so either drop `the`, or provide a reference :/

and metrics collected from implementations deployed to a sizable number of users .

s/ ././

There's actually a reference that goes there. I just haven't figured out how to do references yet.

(such as strip any leading space characters or return false and abort these steps) are to be interpreted with the meaning of the key word (MUST, SHOULD, MAY, etc)

s/etc/etc./g

official-type should probably be given some styling -- preferably not the same styling as Content-Type

(Such messages are invalid according to RFC2616.

s/./.)/

The rfcs should be href references of some sort :)

Yeah, I need to crack the references problem at some point. :)

For octets received via HTTP, the Content-Type HTTP header field, if present, indicates the media type. Let the official-type be the media type indicted by the HTTP Content-Type header field, if present. If the Content-Type header field is absent or if its value cannot be interpreted as a media type (e.g. because its value doesn't contain a U+002F SOLIDUS ('/') character), then there is no official-type. (Such messages are invalid according to RFC2616.

If an HTTP response contains multiple Content-Type header fields, the User Agent MUST use the textually last Content-Type header field to the official-type. For example, if the last Content-Type header field contains the value foo, then there is no official media type because foo cannot be interpreted as a media type (even if the HTTP response contains another Content-Type header field that could be interpreted as a media type).

The for example part here applies to the previous paragraph, the sentence needs to be moved to the paragraph before the instruction for multiple header fields.
It's an example that combines both rules.

FTP RFC0959

Is there a reason for the leading 0?

Comparisons between media types, as defined by MIME specifications, are done in an ASCII case-insensitive manner. [RFC2046]

You need to somehow note that this is merely a note about mime equivalence and doesn't relate to how the spec works.

I'm not sure I understand. It's in green and labeled as a note.

If the official-type ends in +xml, or if it is either text/xml or application/xml, then let the sniffed-type be the official-type and abort these steps.

Please mark up `sniffed-type` and `official-type`

If the official-type is an image type supported by the User Agent (e.g., image/png, image/gif, image/jpeg, etc), then jump to the images section below.

s/etc//

If none of the first n octets are binary data octets then let the sniffed-type be text/plain and abort these steps.

Binary Data Byte Ranges

You don't actually define a `binary
[whatwg] [MIME Sniffing] Editorial feedback
Otherwise, if the octets in s starting at pos match any of the sequences of octets in the first column of the following table, then the user agent MUST follow the steps given in the corresponding cell in the second column of the same row. |

What's the stray `|` character at the end of that doing?

The ToC feels double spaced, is that normal?

Would you mind quoting your attributes in source? Things like class=no-num or href=#web-data scare me. It's easier if you just quote all attributes :)

Also, I generally recommend `<span ...>x</span> ` over `<span ...>x </span>` - i.e. trailing space outside of span (see toc)

<p>Many web servers supply incorrect Content-Type header fields with their HTTP

Can you mark up `Content-Type` in something which results in roughly typewriter font?

s/user agents/User Agents/ as in:

responses. In order to be compatible with these servers, user agents consider

Without a clear specification of how to sniff the media type, each user agent implementor was forced to reverse engineer the behavior of the other user agents and to develop

s/the other/other/ -- there are some UAs who were ignored when the sniffing of a given UA was developed :)

their own algorithm

I'm not sure if `algorithm` here belongs in singular or plural, I got distracted :)

an HTTP response to be interpreted as one media type but some user agents interpret the responses as another media type.

s/responses/response/ (agreement with first part)

However, if a user agent does interpret a low-privilege media type, such as image/gif, as a high-privilege media type, such as text/html, the user agent has created a privilege escalation vulnerability in the server.

s/, the user agent/, then the user agent/

I believe abarth has addressed the above.

This document describes a content sniffing algorithm that carefully balances the compatibility needs of user agent implementors with the security constraints.
`the security constraints` is problematic, I don't think `the` references anything so either drop `the`, or provide a reference :/

and metrics collected from implementations deployed to a sizable number of users .

s/ ././

(such as strip any leading space characters or return false and abort these steps) are to be interpreted with the meaning of the key word (MUST, SHOULD, MAY, etc)

s/etc/etc./g

official-type should probably be given some styling -- preferably not the same styling as Content-Type

(Such messages are invalid according to RFC2616.

s/./.)/

The rfcs should be href references of some sort :)

For octets received via HTTP, the Content-Type HTTP header field, if present, indicates the media type. Let the official-type be the media type indicted by the HTTP Content-Type header field, if present. If the Content-Type header field is absent or if its value cannot be interpreted as a media type (e.g. because its value doesn't contain a U+002F SOLIDUS ('/') character), then there is no official-type. (Such messages are invalid according to RFC2616.

If an HTTP response contains multiple Content-Type header fields, the User Agent MUST use the textually last Content-Type header field to the official-type. For example, if the last Content-Type header field contains the value foo, then there is no official media type because foo cannot be interpreted as a media type (even if the HTTP response contains another Content-Type header field that could be interpreted as a media type).

The for example part here applies to the previous paragraph, the sentence needs to be moved to the paragraph before the instruction for multiple header fields.

FTP RFC0959

Is there a reason for the leading 0?

Comparisons between media types, as defined by MIME specifications, are done in an ASCII case-insensitive manner. [RFC2046]

You need to somehow note that this is merely a note about mime equivalence and doesn't relate to how the spec works.
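For illustration, the official-type rules quoted above (use the textually last Content-Type header field; no official-type if the field is absent or its value has no U+002F SOLIDUS) could be sketched in Python roughly as follows. The function name `official_type` and the list-of-strings input are hypothetical. Dropping parameters after `;` is an assumption, and the solidus test is only the one failure case the quoted text names, not a full media type parser.

```python
from typing import Optional, Sequence


def official_type(content_type_values: Sequence[str]) -> Optional[str]:
    """Return the official-type, or None if there is none.

    content_type_values holds the values of every Content-Type header
    field in the response, in the order they appeared.
    """
    if not content_type_values:
        # Header field absent: no official-type.
        return None
    # Only the textually last Content-Type header field is considered.
    last = content_type_values[-1]
    # Assumption: parameters such as "; charset=utf-8" are not part of the type.
    media_type = last.split(";", 1)[0].strip()
    if "/" not in media_type:
        # Cannot be interpreted as a media type: no official-type.
        return None
    return media_type
```

With this sketch, `official_type(["text/html", "foo"])` returns `None`, which matches the combined example in the quoted text: the last field wins even though an earlier field held a usable media type.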
If the official-type ends in +xml, or if it is either text/xml or application/xml, then let the sniffed-type be the official-type and abort these steps.

Please mark up `sniffed-type` and `official-type`

If the official-type is an image type supported by the User Agent (e.g., image/png, image/gif, image/jpeg, etc), then jump to the images section below.

s/etc//

If none of the first n octets are binary data octets then let the sniffed-type be text/plain and abort these steps.

Binary Data Byte Ranges

You don't actually define a `binary data octet` as any item within the ranges defined in the `binary data byte ranges`.

If the first octets match one of the octet sequences in the pattern column of the table in the unknown type section below, ignoring any rows whose cell in the security column says scriptable (or n/a), then let the sniffed-type be the type given in the corresponding cell in the sniffed type column on that row and abort these steps.

If you could make `unknown type section` a link to the