Re: [whatwg] [MIME Sniffing] Editorial feedback

2011-10-04 Thread timeless
 Comparisons between media types, as defined by MIME specifications, are done 
 in an ASCII case-insensitive manner. [RFC2046]

so, the problem is that your `note` here is ambiguous

it's hard to understand that you're just saying `mime rfc says that
mime comparisons are insensitive`,
v. `this specification wants mime comparisons to be insensitive`

you want the former; but `note:` doesn't cause that result; nor does
the `[rfc]` at the end

 I'm tempted to just rename them to be less semantic.  They're just symbols 
 that don't mean anything, really.

please do :)

 That's a lot of editing!  I'm not sure that buys us much.

i ask, because it actually was useful when i was dealing w/ someone else's spec
they had hex digits and some of them were wrong
it was much easier to read when the hex digits were in tt

 That is intentional.  Sniffing SWF is bad times.

i think it might be worth an actual NOTE in the spec explaining that
SWF is intentionally not sniffed, and what that means for untyped SWF
files (actually explaining how it flows and to which resulting sniff
type)

 Thanks to Alfred Hönes Boris Zbarsky David Singer Mark Pilgrim, and Russ Cox.

you need some punctuation before `Boris`, `David`, and `Mark` :)

 If RDF-flag is 1 and RSS-flag is 1, then let the sniffed-type be 
 application/rss+xml and abort these steps.

could you change that to: If both RDF-flag and RSS-flag are 1, then ...?


Re: [whatwg] [MIME Sniffing] Editorial feedback

2011-09-28 Thread Adam Barth
I've taken all of your suggestions, except as noted below.  Thanks for
your detailed feedback.

Adam


On Mon, Sep 26, 2011 at 2:27 PM, timeless timel...@gmail.com wrote:
 Otherwise, if the octets in s starting at pos match any of the sequences of 
 octets in the first column of the following table, then the user agent MUST 
 follow the steps given in the corresponding cell in the second column of the 
 same row. |

 What's the stray `|` character at the end of that doing?

 The ToC feels double spaced, is that normal?

 Would you mind quoting your attributes in source? Things like
 class=no-num or href=#web-data scare me. It's easier if you just quote
 all attributes :)

 Also, I generally recommend `<span ...>x</span> ` over
 `<span ...>x </span>` - i.e. trailing space outside of span (see toc)

 <p>Many web servers supply incorrect Content-Type header fields with their 
 HTTP

 Can you mark up `Content-Type` in something which results in roughly
 typewriter font?

 s/user agents/User Agents/ as in:
 responses.  In order to be compatible with these servers, user agents 
 consider

 Without a clear specification of how to sniff the media type, each user 
 agent implementor was forced to reverse engineer the behavior of the other 
 user agents and to develop

 s/the other/other/ -- there are some UAs who were ignored when the
 sniffing of a given UA was developed :)

 their own algorithm

 I'm not sure if `algorithm` here belongs in singular or plural, I got
 distracted :)

 an HTTP response to be interpreted as one media type but some user agents 
 interpret the responses as another media type.

 s/responses/response/ (agreement with first part)

 However, if a user agent does interpret a low-privilege media type, such as 
 image/gif, as a high-privilege media type, such as text/html, the user agent 
 has created a privilege escalation vulnerability in the server.

 s/, the user agent/, then the user agent/



 I believe abarth has addressed the above.

 This document describes a content sniffing algorithm that carefully balances 
 the compatibility needs of user agent implementors with the security 
 constraints.

 `the security constraints` is problematic, I don't think `the`
 references anything
 so either drop `the`, or provide a reference :/

 and metrics collected from implementations deployed to a sizable number of 
 users .

 s/ ././

There's actually a reference that goes there.  I just haven't figured
out how to do references yet.

 (such as strip any leading space characters or return false and abort 
 these steps) are to be interpreted with the meaning of the key word 
 (MUST, SHOULD, MAY, etc)

 s/etc/etc./g

 official-type should probably be given some styling -- preferably
 not the same styling as Content-Type

 (Such messages are invalid according to RFC2616.

 s/./.)/

 The rfcs should be href references of some sort :)

Yeah, I need to crack the references problem at some point.  :)

 For octets received via HTTP, the Content-Type HTTP header field, if 
 present, indicates the media type. Let the official-type be the media type 
 indicted by the HTTP Content-Type header field, if present. If the 
 Content-Type header field is absent or if its value cannot be interpreted as 
 a media type (e.g. because its value doesn't contain a U+002F SOLIDUS ('/') 
 character), then there is no official-type. (Such messages are invalid 
 according to RFC2616.

 If an HTTP response contains multiple Content-Type header fields, the User 
 Agent MUST use the textually last Content-Type header field to the 
 official-type. For example, if the last Content-Type header field contains 
 the value foo, then there is no official media type because foo cannot 
 be interpreted as a media type (even if the HTTP response contains another 
 Content-Type header field that could be interpreted as a media type).

 The for example part here applies to the previous paragraph, the
 sentence needs to be moved to the paragraph before the instruction for
 multiple header fields.

It's an example that combines both rules.
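
For readers following the thread, here is a minimal Python sketch of how the two quoted rules combine (use the textually last Content-Type field; no official-type if that value has no '/'). The function name `official_type` and the parameter handling are assumptions made for illustration, not text from the spec, and the '/' test stands in for the fuller "cannot be interpreted as a media type" check.

```python
from typing import Optional, Sequence


def official_type(content_type_values: Sequence[str]) -> Optional[str]:
    """Return the official-type, or None when there is none.

    `content_type_values` holds the values of every Content-Type header
    field in the response, in the order they appeared (hypothetical input
    shape chosen for this sketch).
    """
    if not content_type_values:
        return None  # header field absent

    # Rule from the quoted text: only the textually last field counts.
    last = content_type_values[-1]

    # Assumption: parameters such as "; charset=utf-8" are ignored here.
    media_type = last.split(";", 1)[0].strip()

    # Simplified validity check: the value must contain U+002F SOLIDUS.
    if "/" not in media_type:
        return None
    return media_type
```

With this sketch, `official_type(["text/html", "foo"])` yields no official-type even though an earlier field was usable, which is the combined case the quoted example describes.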

 FTP RFC0959

 Is there a reason for the leading 0?

 Comparisons between media types, as defined by MIME specifications, are done 
 in an ASCII case-insensitive manner. [RFC2046]

 You need to somehow note that this is merely a note about mime
 equivalence and doesn't relate to how the spec works.

I'm not sure I understand.  It's in green and labeled as a note.

 If the official-type ends in +xml, or if it is either text/xml or 
 application/xml, then let the sniffed-type be the official-type and abort 
 these steps.

 Please mark up `sniffed-type` and `official-type`

 If the official-type is an image type supported by the User Agent (e.g., 
 image/png, image/gif, image/jpeg, etc), then jump to the images 
 section below.

 s/etc//

 If none of the first n octets are binary data octets then let the 
 sniffed-type be text/plain and abort these steps.
 Binary Data Byte Ranges

 You don't actually define a `binary 

[whatwg] [MIME Sniffing] Editorial feedback

2011-09-26 Thread timeless
 Otherwise, if the octets in s starting at pos match any of the sequences of 
 octets in the first column of the following table, then the user agent MUST 
 follow the steps given in the corresponding cell in the second column of the 
 same row. |

What's the stray `|` character at the end of that doing?

The ToC feels double spaced, is that normal?

Would you mind quoting your attributes in source? Things like
class=no-num or href=#web-data scare me. It's easier if you just quote
all attributes :)

Also, I generally recommend `<span ...>x</span> ` over
`<span ...>x </span>` - i.e. trailing space outside of span (see toc)

 <p>Many web servers supply incorrect Content-Type header fields with their 
 HTTP

Can you mark up `Content-Type` in something which results in roughly
typewriter font?

s/user agents/User Agents/ as in:
 responses.  In order to be compatible with these servers, user agents consider

 Without a clear specification of how to sniff the media type, each user 
 agent implementor was forced to reverse engineer the behavior of the other 
 user agents and to develop

s/the other/other/ -- there are some UAs who were ignored when the
sniffing of a given UA was developed :)

 their own algorithm

I'm not sure if `algorithm` here belongs in singular or plural, I got
distracted :)

 an HTTP response to be interpreted as one media type but some user agents 
 interpret the responses as another media type.

s/responses/response/ (agreement with first part)

 However, if a user agent does interpret a low-privilege media type, such as 
 image/gif, as a high-privilege media type, such as text/html, the user agent 
 has created a privilege escalation vulnerability in the server.

s/, the user agent/, then the user agent/



I believe abarth has addressed the above.

 This document describes a content sniffing algorithm that carefully balances 
 the compatibility needs of user agent implementors with the security 
 constraints.

`the security constraints` is problematic, I don't think `the`
references anything
so either drop `the`, or provide a reference :/

 and metrics collected from implementations deployed to a sizable number of 
 users .

s/ ././

 (such as strip any leading space characters or return false and abort 
 these steps) are to be interpreted with the meaning of the key word (MUST, 
 SHOULD, MAY, etc)

s/etc/etc./g

official-type should probably be given some styling -- preferably
not the same styling as Content-Type

 (Such messages are invalid according to RFC2616.

s/./.)/

The rfcs should be href references of some sort :)

 For octets received via HTTP, the Content-Type HTTP header field, if present, 
 indicates the media type. Let the official-type be the media type indicted by 
 the HTTP Content-Type header field, if present. If the Content-Type header 
 field is absent or if its value cannot be interpreted as a media type (e.g. 
 because its value doesn't contain a U+002F SOLIDUS ('/') character), then 
 there is no official-type. (Such messages are invalid according to RFC2616.

 If an HTTP response contains multiple Content-Type header fields, the User 
 Agent MUST use the textually last Content-Type header field to the 
 official-type. For example, if the last Content-Type header field contains 
 the value foo, then there is no official media type because foo cannot be 
 interpreted as a media type (even if the HTTP response contains another 
 Content-Type header field that could be interpreted as a media type).

The for example part here applies to the previous paragraph, the
sentence needs to be moved to the paragraph before the instruction for
multiple header fields.

 FTP RFC0959

Is there a reason for the leading 0?

 Comparisons between media types, as defined by MIME specifications, are done 
 in an ASCII case-insensitive manner. [RFC2046]

You need to somehow note that this is merely a note about mime
equivalence and doesn't relate to how the spec works.

 If the official-type ends in +xml, or if it is either text/xml or 
 application/xml, then let the sniffed-type be the official-type and abort 
 these steps.

Please mark up `sniffed-type` and `official-type`
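
As an illustration of the quoted step together with the note about ASCII case-insensitive comparison, a small Python sketch follows. The helper names `ascii_lower` and `keeps_official_type` are hypothetical; only the three conditions (ends in +xml, text/xml, application/xml) come from the quoted text.

```python
import string

# Map only A-Z to a-z, so the comparison is ASCII case-insensitive
# rather than Unicode case-insensitive.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(value: str) -> str:
    return value.translate(_ASCII_LOWER)


def keeps_official_type(official_type: str) -> bool:
    """True when the quoted step would set sniffed-type to official-type."""
    t = ascii_lower(official_type)
    return t.endswith("+xml") or t in ("text/xml", "application/xml")
```

Under these assumptions `keeps_official_type("Application/RSS+XML")` is true and `keeps_official_type("text/html")` is false.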

 If the official-type is an image type supported by the User Agent (e.g., 
 image/png, image/gif, image/jpeg, etc), then jump to the images 
 section below.

s/etc//

 If none of the first n octets are binary data octets then let the 
 sniffed-type be text/plain and abort these steps.
 Binary Data Byte Ranges

You don't actually define a `binary data octet` as any item within the
ranges defined in the `binary data byte ranges`.
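
To make the missing definition concrete, here is a hedged Python sketch in which a "binary data octet" is simply any octet that falls inside one of the binary data byte ranges. The ranges in `ASSUMED_BINARY_RANGES` are not quoted anywhere in this thread; they are an assumption supplied for illustration and should be replaced with whatever the spec's table actually lists. The function names are likewise hypothetical.

```python
from typing import Iterable, Tuple

# ASSUMPTION: illustrative ranges only, not taken from this thread.
ASSUMED_BINARY_RANGES: Tuple[Tuple[int, int], ...] = (
    (0x00, 0x08),
    (0x0B, 0x0B),
    (0x0E, 0x1A),
    (0x1C, 0x1F),
)


def is_binary_data_octet(
    octet: int,
    ranges: Iterable[Tuple[int, int]] = ASSUMED_BINARY_RANGES,
) -> bool:
    """An octet is 'binary data' if it lies in any inclusive range."""
    return any(low <= octet <= high for low, high in ranges)


def sniffs_as_text_plain(data: bytes, n: int) -> bool:
    """True when none of the first n octets are binary data octets."""
    return not any(is_binary_data_octet(b) for b in data[:n])
```

Only the first `n` octets are inspected, so `sniffs_as_text_plain(b"he\x00llo", 2)` is true while `sniffs_as_text_plain(b"he\x00llo", 512)` is false.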

 If the first octets match one of the octet sequences in the pattern column 
 of the table in the unknown type section below, ignoring any rows whose 
 cell in the security column says scriptable (or n/a), then let the 
 sniffed-type be the type given in the corresponding cell in the sniffed 
 type column on that row and abort these steps.

If you could make `unknown type section` a link to the