I have mixed feelings about this document and I am still thinking about larger issues with it, so expect a follow-up message on the subject.
In the meantime, below are the more obvious issues:

RFC 2119 is not a Normative Reference, but it should be, as the document is using MUSTs.

2.  Metadata

  For octets received via HTTP, the Content-Type HTTP header field, if
  present, indicates the media type.  Let the official-type be the
  media type indicted by the HTTP Content-Type header field, if
  present.  If the Content-Type header field is absent or if its value
  cannot be interpreted as a media type (e.g. because its value doesn't
  contain a U+002F SOLIDUS ('/') character), then there is no official-
  type.

I would prefer this text to make it clearer that the last sentence deals with messages that are invalid according to RFC 2616.
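
As an illustration of the quoted rule, here is a minimal sketch in Python. The function name official_type and the "type/subtype with both parts non-empty" check are assumptions made for the example, not text from the draft; the draft only says that the value "cannot be interpreted as a media type". Lowercasing the result is likewise an assumption, based on the quoted note that media types compare case-insensitively.

```python
from typing import Optional


def official_type(content_type: Optional[str]) -> Optional[str]:
    """Return the media type named by a Content-Type value, or None.

    Hypothetical helper: None stands for "there is no official-type".
    """
    if content_type is None:
        # Header field absent: no official-type.
        return None
    # Drop any parameters (e.g. "; charset=UTF-8") and surrounding whitespace.
    media_type = content_type.split(";", 1)[0].strip()
    # Assumption: a value is treated as uninterpretable when it is not of
    # the form type/subtype with both parts non-empty.
    type_, slash, subtype = media_type.partition("/")
    if not slash or not type_ or not subtype:
        return None
    return media_type.lower()
```

Under these assumptions, a value such as "text" (no SOLIDUS) yields no official-type, which is the invalid-message case the last quoted sentence covers.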


  For octets fetched over some other protocols, e.g.  FTP, there is no

FTP needs an Informative reference.

  type information.

  Note: Comparisons between media types, as defined by MIME
  specifications, are done in an ASCII case-insensitive manner.
  [RFC2046]

This should be a reference. Probably Normative?


3.  Web Pages

  2.  If the octets were fetched via HTTP and there is an HTTP Content-
      Type header field and the value of the last such header field has
      octets that *exactly* match the octets contained in one of the

Aren't you supposed to match media types case insensitively?

      following lines:
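
To make the question concrete, the following sketch contrasts an exact octet match with an ASCII case-insensitive match. The list of lines is elided in the quote above, so LISTED below is a hypothetical single-entry stand-in, and both function names are invented for the example.

```python
from typing import Sequence

# Hypothetical stand-in for the list of lines elided from the quote.
LISTED = [b"text/plain"]


def exact_octet_match(header_value: bytes, listed: Sequence[bytes]) -> bool:
    """True only if the header octets are identical to a listed line."""
    return header_value in listed


def ascii_case_insensitive_match(header_value: bytes,
                                 listed: Sequence[bytes]) -> bool:
    """True if the header matches a listed line, ignoring ASCII case.

    bytes.lower() only changes ASCII letters, so this is an ASCII
    case-insensitive comparison.
    """
    lowered = header_value.lower()
    return any(lowered == line.lower() for line in listed)
```

With this stand-in list, b"Text/Plain" fails the exact match but passes the case-insensitive one, so the two readings of step 2 send such a response down different paths.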


  4.  If the official-type is "unknown/unknown", "application/unknown",

I am very curious to know which products are generating "unknown".

Does application/unknown need to be registered in the media type registry?

      or "*/*", jump to the "unknown type" section below.


4.  Text or Binary

  This section defines the *rules for distinguishing if a resource is
  text or binary*.

  1.  The user agent MAY wait for 512 or more octets be to arrive.

Drop "be"?

         Note: Waiting for 512 octets octets to arrive causes the text-
         or-binary algorithm to be deterministic for a given sequence

Does it? I think this section describes heuristics that are likely (but not guaranteed) to produce the correct result.

         of octets.


  4.  If none of the first n octets are binary data octets then let the
      sniffed-type be "text/plain" and abort these steps.

                        +-------------------------+
                        | Binary Data Byte Ranges |
                        +-------------------------+
                        | 0x00 -- 0x08            |
                        | 0x0B                    |
                        | 0x0E -- 0x1A            |
                        | 0x1C -- 0x1F            |
                        +-------------------------+

Does this table try to cover all UTF encodings: UTF-8, UTF-16LE and UTF-16BE?
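
One way to probe this is to apply only the quoted step 4 and the table to the same text in each encoding. This is a sketch, not the draft's full algorithm: the steps omitted from the quote may treat UTF-16 differently (for example via a byte order mark), and the names below, as well as the default limit of 512 octets taken from step 1, are assumptions for the example.

```python
# Ranges copied from the "Binary Data Byte Ranges" table quoted above
# (inclusive on both ends).
BINARY_DATA_RANGES = [(0x00, 0x08), (0x0B, 0x0B), (0x0E, 0x1A), (0x1C, 0x1F)]


def is_binary_data_octet(octet: int) -> bool:
    """True if the octet falls in one of the table's ranges."""
    return any(lo <= octet <= hi for lo, hi in BINARY_DATA_RANGES)


def looks_like_text(octets: bytes, n: int = 512) -> bool:
    """Quoted step 4 only: none of the first n octets is a binary data octet."""
    return not any(is_binary_data_octet(b) for b in octets[:n])


sample = "Hello, world\n"
utf8_result = looks_like_text(sample.encode("utf-8"))
# The "utf-16-le" and "utf-16-be" codecs emit no byte order mark.
utf16le_result = looks_like_text(sample.encode("utf-16-le"))
utf16be_result = looks_like_text(sample.encode("utf-16-be"))
```

Taken on its own, the table accepts UTF-8 text (all octets of non-ASCII characters are 0x80 or above), but BOM-less UTF-16 text with ASCII characters contains 0x00 octets and is therefore classed as binary.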


5.  Unknown Type

I think some of the media types listed in this section need registering, e.g.
image/bmp and image/webp. But the chairs can help out with this.


5.1.  Signature for H.264

I think this algorithm needs to be verified by somebody with experience in H.264. I can ask for a RAI Area review.


7.  Video

  This section defines the *rules for sniffing videos specifically*.

  If the first octets match one of the signatures in Section 5 for one
  of the following media types, then let the sniffed-type be the
  corresponding media type and abort these steps:

  o  video/H264

Note that Section 5 currently doesn't define video/H264.

  o  video/webm

This media type needs registering as well.

