Re: [whatwg] An alternative approach to section 9 of Mime Sniffing
On Thu, May 23, 2013 at 2:49 PM, Peter Occil wrote:
> Explain further why you don't recommend ABNF for this case.

We don't recommend ABNF in general because ABNF often results in a mismatch between prescribed and actual processing. E.g. Content-Type is defined using ABNF and technically "text/html;" does not match that ABNF, but everyone (logically) processes that as "text/html" without parameters. It's much better to define the actual processing so implementers are less inclined to take shortcuts when implementing (test suites also help, but they're typically written well after the fact).

> You should also explain whether another change to make section 9 more
> readable is appropriate (though it currently is relatively readable as is).

I'll leave that to Gordon.

-- 
http://annevankesteren.nl/
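The mismatch described in the message above can be illustrated with a minimal Python sketch. The `STRICT` regular expression is a simplified, hypothetical stand-in for a grammar-style Content-Type definition (it is not the full HTTP grammar), and `lenient_parse` is an assumed example of "actual processing", not text from any specification.

```python
import re

# Simplified token rule (assumption: a rough approximation, not the full RFC grammar).
TOKEN = r"[!#$%&'*+.^_`|~0-9A-Za-z-]+"

# Grammar-style definition: type "/" subtype *( ";" name "=" value )
STRICT = re.compile(rf"{TOKEN}/{TOKEN}(?:\s*;\s*{TOKEN}={TOKEN})*")


def strict_matches(value: str) -> bool:
    """Return True only if the whole value fits the simplified grammar."""
    return STRICT.fullmatch(value) is not None


def lenient_parse(value: str):
    """Split on ';' and silently skip empty or malformed parameters."""
    essence, *rest = value.split(";")
    params = {}
    for item in rest:
        name, sep, val = item.partition("=")
        if sep and name.strip():
            params[name.strip().lower()] = val.strip()
    return essence.strip().lower(), params


print(strict_matches("text/html;"))   # the grammar rejects the trailing ';'
print(lenient_parse("text/html;"))    # the processing model still yields text/html
```

Here the grammar rejects a value that the step-by-step processing accepts, which is the kind of gap between prescribed and actual behavior the message refers to.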
Re: [whatwg] An alternative approach to section 9 of Mime Sniffing
The pattern matching algorithm is used because certain patterns require other-than-exact matching. That is why the "pattern mask" exists. This is particularly important for the "rules for identifying an unknown MIME type" (defined in 10.1), which match ASCII characters case-insensitively; it is also important for a number of patterns that contain unimportant bytes that should be ignored (like WebP, in your example).

The algorithm lays out the information in tabular form because that makes clearer the separation between the important bytes and the unimportant (or case-insensitive) bytes. Keep in mind that implementations may read one byte at a time; using ABNF would give them no benefit, and would likely make things more confusing.

I wonder: What problem are you trying to solve with this proposal?

(In the future, please add "[mimesniff]" to the beginning of your subject line for MIME Sniffing discussions; this will ensure that I see them and pay attention to them more quickly.)

Regards,
Gordon

On Thu, May 23, 2013 at 2:10 AM, Peter Occil wrote:
> I propose rewriting section 9 and parts of section 10 in a different way, to
> use the ABNF format in RFC 5234. (Note that ABNFs are already used in the
> current Fetch specification.) With this approach, the definitions for "byte
> pattern", "pattern mask", and the "pattern matching algorithm" can be
> eliminated (all of which are found before section 9.1).
>
> An example for the image pattern matching algorithm is given below.
>
> ---
>
> 9.1 Matching an image type pattern
>
> The image pattern matching algorithm takes a byte sequence as input. The
> algorithm goes through the following image types in the order given. For
> each image MIME type given below, if the start of the byte sequence matches
> its ABNF, return the concatenation of "image/" and the name of the ABNF (in
> lowercase), and terminate the image pattern matching algorithm.
>
> vnd.microsoft.icon = %x00.00.01.00
> ; A Windows Icon signature.
> bmp = %x42.4D
> ; The string "BM", a BMP signature.
> gif = %x47.49.46.38 (%x37 / %x39) %x61
> ; The string "GIF87a" or "GIF89a", a GIF signature.
> webp = %x52.49.46.46 4OCTET %x57.45.42.50.56.50
> ; The string "RIFF" followed by four bytes followed by the string "WEBPVP".
> png = %x89.50.4E.47.0D.0A.1A.0A
> ; The byte 0x89 followed by the string "PNG"
> ; followed by CR LF SUB LF, the PNG signature.
> jpeg = %xFF.D8.FF
> ; The JPEG Start of Image marker followed by the indicator
> ; byte of another marker.
>
> If the start of the byte sequence doesn't match any ABNF given above, return
> undefined.
>
> ---
>
> I would appreciate comments.
>
> --Peter

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
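The role of the pattern mask described in the reply above can be sketched in Python as follows. This is a simplified illustration, not the specification's algorithm text: it omits the handling of leading bytes to be ignored, and the function name `pattern_matches` and the `<HTML` pattern are assumptions chosen for the example. The WebP pattern uses zero mask bytes for the four length bytes that do not matter.

```python
def pattern_matches(data: bytes, pattern: bytes, mask: bytes) -> bool:
    """Compare the start of data against pattern, byte by byte, under mask."""
    if len(pattern) != len(mask):
        raise ValueError("pattern and mask must have the same length")
    if len(data) < len(pattern):
        return False
    for i in range(len(pattern)):
        # A mask byte of 0x00 ignores the input byte entirely;
        # 0xDF clears the ASCII case bit; 0xFF requires an exact match.
        if (data[i] & mask[i]) != pattern[i]:
            return False
    return True


# "RIFF", four ignored bytes, then "WEBPVP".
WEBP_PATTERN = b"RIFF\x00\x00\x00\x00WEBPVP"
WEBP_MASK = b"\xff\xff\xff\xff\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff"

# Hypothetical case-insensitive pattern: "<" exactly, then "HTML" in any case.
HTML_PATTERN = b"<HTML"
HTML_MASK = b"\xff\xdf\xdf\xdf\xdf"

print(pattern_matches(b"RIFF\x24\x00\x00\x00WEBPVP8 ", WEBP_PATTERN, WEBP_MASK))
print(pattern_matches(b"<html>", HTML_PATTERN, HTML_MASK))
```

Because the loop touches one input byte per step, the same comparison works for an implementation that reads the resource one byte at a time.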
Re: [whatwg] An alternative approach to section 9 of Mime Sniffing
Explain further why you don't recommend ABNF for this case. You should also explain whether another change to make section 9 more readable is appropriate (though it currently is relatively readable as is).

--Peter

-----Original Message-----
From: Anne van Kesteren
Sent: Thursday, May 23, 2013 2:15 AM
To: Peter Occil
Cc: WHATWG
Subject: Re: [whatwg] An alternative approach to section 9 of Mime Sniffing

On Thu, May 23, 2013 at 7:10 AM, Peter Occil wrote:
> I would appreciate comments.

The only reason Fetch uses ABNF is to match HTTP(-bis) conventions. It's not a practice I recommend copying for non-header value definitions.

-- 
http://annevankesteren.nl/
Re: [whatwg] An alternative approach to section 9 of Mime Sniffing
On Thu, May 23, 2013 at 7:10 AM, Peter Occil wrote:
> I would appreciate comments.

The only reason Fetch uses ABNF is to match HTTP(-bis) conventions. It's not a practice I recommend copying for non-header value definitions.

-- 
http://annevankesteren.nl/
[whatwg] An alternative approach to section 9 of Mime Sniffing
I propose rewriting section 9 and parts of section 10 in a different way, to use the ABNF format in RFC 5234. (Note that ABNFs are already used in the current Fetch specification.) With this approach, the definitions for "byte pattern", "pattern mask", and the "pattern matching algorithm" can be eliminated (all of which are found before section 9.1).

An example for the image pattern matching algorithm is given below.

---

9.1 Matching an image type pattern

The image pattern matching algorithm takes a byte sequence as input. The algorithm goes through the following image types in the order given. For each image MIME type given below, if the start of the byte sequence matches its ABNF, return the concatenation of "image/" and the name of the ABNF (in lowercase), and terminate the image pattern matching algorithm.

vnd.microsoft.icon = %x00.00.01.00
; A Windows Icon signature.
bmp = %x42.4D
; The string "BM", a BMP signature.
gif = %x47.49.46.38 (%x37 / %x39) %x61
; The string "GIF87a" or "GIF89a", a GIF signature.
webp = %x52.49.46.46 4OCTET %x57.45.42.50.56.50
; The string "RIFF" followed by four bytes followed by the string "WEBPVP".
png = %x89.50.4E.47.0D.0A.1A.0A
; The byte 0x89 followed by the string "PNG"
; followed by CR LF SUB LF, the PNG signature.
jpeg = %xFF.D8.FF
; The JPEG Start of Image marker followed by the indicator
; byte of another marker.

If the start of the byte sequence doesn't match any ABNF given above, return undefined.

---

I would appreciate comments.

--Peter
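The ordered prefix matching proposed above could be sketched in Python roughly as follows. Python byte regular expressions stand in for the ABNF rules here, which is an assumption made for illustration (`.{4}` plays the role of `4OCTET`); the names `IMAGE_RULES` and `sniff_image` are hypothetical, and `None` stands in for "undefined".

```python
import re

# Rules are tried in the order given, mirroring the proposal's list.
IMAGE_RULES = [
    ("vnd.microsoft.icon", rb"\x00\x00\x01\x00"),   # Windows Icon signature
    ("bmp", rb"BM"),                                # "BM"
    ("gif", rb"GIF8[79]a"),                         # "GIF87a" or "GIF89a"
    ("webp", rb"RIFF.{4}WEBPVP"),                   # "RIFF", any four bytes, "WEBPVP"
    ("png", rb"\x89PNG\r\n\x1a\n"),                 # PNG signature
    ("jpeg", rb"\xff\xd8\xff"),                     # SOI marker plus next marker byte
]

# re.DOTALL lets "." match every byte value, including 0x0A.
COMPILED = [(name, re.compile(rule, re.DOTALL)) for name, rule in IMAGE_RULES]


def sniff_image(data: bytes):
    """Return "image/<name>" for the first rule matching the start of data."""
    for name, rule in COMPILED:
        if rule.match(data):  # match() anchors at the start of the input
            return "image/" + name
    return None


print(sniff_image(b"GIF89a\x01\x00\x01\x00"))
print(sniff_image(b"plain text"))
```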