Re: [whatwg] An alternative approach to section 9 of Mime Sniffing

2013-05-23 Thread Anne van Kesteren
On Thu, May 23, 2013 at 2:49 PM, Peter Occil  wrote:
> Explain further why you don't recommend ABNF for this case.

We don't recommend ABNF in general because often ABNF results in a
mismatch between prescribed and actual processing. E.g. Content-Type
is defined as an ABNF and technically "text/html;" does not match that
ABNF, but everyone (logically) processes that as "text/html" without
parameters.

It's much better to define the actual processing so implementers are
less inclined to take shortcuts when implementing (test suites also
help, but they're typically written way-after-the-fact).
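
As a rough illustration of the "text/html;" case, here is a minimal sketch of the lenient processing described above. It is not the algorithm from any specification: the function name parse_content_type and its simplified splitting rules are assumptions made for this example only.

```python
def parse_content_type(value: str):
    """Split a Content-Type value into (type, parameters).

    Hypothetical, simplified sketch: empty segments such as the one
    produced by a trailing ";" are skipped instead of rejected.
    """
    parts = value.split(";")
    mime_type = parts[0].strip().lower()
    params = {}
    for part in parts[1:]:
        part = part.strip()
        if not part:
            # "text/html;" yields one empty segment; ignore it.
            continue
        name, _, val = part.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return mime_type, params
```

A strict grammar-based matcher would reject "text/html;" outright, whereas this sketch returns the type with an empty parameter map.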


> You should also explain whether another change to make section 9 more
> readable is appropriate (though it currently is relatively readable as is).

I'll leave that to Gordon.


--
http://annevankesteren.nl/


Re: [whatwg] [mimesniff] An alternative approach to section 9 of Mime Sniffing

2013-05-23 Thread Peter Occil
The pattern mask DF is currently only used in the algorithm for identifying 
an unknown MIME type, and even here for identifying only one MIME type, 
namely text/html.  This can be succinctly covered with the following ABNF:


WHITESPACE = *( %x09 / %x0A / %x0C / %x0D / %x20 )
  ; any number of whitespace bytes
TAGTERM = %x20 / %x3E ; a tag-terminating byte: space or ">"
html = WHITESPACE (
"Note also that the notes in the example (in my previous message) are 
retained as comments in the ABNF, since they clarify what the byte pattern 
matches and help eliminate some of the confusion.


What problem am I trying to solve?

For one thing, look at section 5, parsing a MIME type.  It's currently an 
incomplete and unwieldy list of steps that don't clearly state what a MIME 
type should consist of.  Showing an ABNF next to the rules will help in this 
respect.


-Original Message- 
From: Gordon P. Hemsley

Sent: Thursday, May 23, 2013 11:14 AM
To: Peter Occil
Cc: WHATWG
Subject: Re: [whatwg] An alternative approach to section 9 of Mime Sniffing

The pattern matching algorithm is used because certain patterns
require other-than-exact matching. That is why the "pattern mask"
exists. This is particularly important for the "rules for identifying
an unknown MIME type" (defined in 10.1), which matches ASCII
characters case-insensitively; it is also important for a number of
patterns that contain unimportant bytes that should be ignored (like
WebP, in your example).

The algorithm lays out the information in tabular form because that
makes clearer the separation between the important bytes and the
unimportant (or case-insensitive) bytes. Keep in mind that
implementations may read one byte at a time; using ABNF would give
them no benefit, and would likely make things more confusing.

I wonder: What problem are you trying to solve with this proposal?

(In the future, please add "[mimesniff]" to the beginning of your
subject line for MIME Sniffing discussions; this will ensure that I
see them and pay attention to them more quickly.)

Regards,
Gordon

On Thu, May 23, 2013 at 2:10 AM, Peter Occil  wrote:
I propose rewriting section 9 and parts of section 10 in a different way, 
to use the ABNF format in RFC 5234. (Note that ABNFs are already  used in 
the current Fetch specification.) With this approach, the definitions for 
"byte pattern",  "pattern mask", and the "pattern matching algorithm" can 
be eliminated (all of which are found before section 9.1).


An example for the image pattern matching algorithm is given below.

---

9.1  Matching an image type pattern

The image pattern matching algorithm takes a byte sequence as input.  The 
algorithm goes through the following image types in the order given.  For 
each image MIME type given below, if the start of the byte sequence 
matches its ABNF, return the concatenation of "image/" and the name of the 
ABNF (in lowercase), and terminate the image pattern matching algorithm.


vnd.microsoft.icon = %x00.00.01.00
   ; A Windows Icon signature.
bmp = %x42.4D
   ; The string "BM", a BMP signature.
gif = %x47.49.46.38 (%x37 / %x39) %x61
   ; The string "GIF87a" or "GIF89a", a GIF signature.
webp = %x52.49.46.46 4OCTET %x57.45.42.50.56.50
   ; The string "RIFF" followed by four bytes followed by the string
   ; "WEBPVP".

png = %x89.50.4E.47.0D.0A.1A.0A
   ; The byte 0x89 followed by the string "PNG"
   ; followed by CR LF SUB LF, the PNG signature.
jpeg = %xFF.D8.FF
   ; The JPEG Start of Image marker followed by the indicator
   ; byte of another marker.

If the start of the byte sequence doesn't match any ABNF given above, 
return undefined.
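
One way to read the proposal above is sketched below, with regular expressions standing in for the ABNF rules. This is only an illustration under stated assumptions: the names IMAGE_SIGNATURES and sniff_image are hypothetical, None stands in for "undefined", and the byte values are copied from the rules listed above.

```python
import re

# (rule name, compiled pattern) pairs, in the order given in the proposal.
IMAGE_SIGNATURES = [
    ("vnd.microsoft.icon", re.compile(rb"\x00\x00\x01\x00")),
    ("bmp", re.compile(rb"BM")),
    ("gif", re.compile(rb"GIF8[79]a")),
    # ".{4}" with DOTALL plays the role of 4OCTET: any four bytes.
    ("webp", re.compile(rb"RIFF.{4}WEBPVP", re.DOTALL)),
    ("png", re.compile(rb"\x89PNG\r\n\x1a\n")),
    ("jpeg", re.compile(rb"\xff\xd8\xff")),
]


def sniff_image(data: bytes):
    """Return "image/<name>" for the first rule matching the start of data."""
    for name, pattern in IMAGE_SIGNATURES:
        # re.match only matches at the beginning of the byte sequence.
        if pattern.match(data):
            return "image/" + name
    return None  # stands in for "undefined"
```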


---

I would appreciate comments.

--Peter




--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/ 



Re: [whatwg] An alternative approach to section 9 of Mime Sniffing

2013-05-23 Thread Gordon P. Hemsley
The pattern matching algorithm is used because certain patterns
require other-than-exact matching. That is why the "pattern mask"
exists. This is particularly important for the "rules for identifying
an unknown MIME type" (defined in 10.1), which matches ASCII
characters case-insensitively; it is also important for a number of
patterns that contain unimportant bytes that should be ignored (like
WebP, in your example).

The algorithm lays out the information in tabular form because that
makes clearer the separation between the important bytes and the
unimportant (or case-insensitive) bytes. Keep in mind that
implementations may read one byte at a time; using ABNF would give
them no benefit, and would likely make things more confusing.
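
To make the role of the pattern mask concrete, here is a simplified sketch of masked, byte-at-a-time matching. It is not the specification's algorithm (which has further details not covered here); the function name matches_pattern and the constants are assumptions for this example. A mask byte of 0xFF requires an exact match, 0x00 ignores the byte, and 0xDF folds ASCII letters to uppercase.

```python
def matches_pattern(data: bytes, pattern: bytes, mask: bytes) -> bool:
    """Return True if the start of data matches pattern under mask."""
    if len(pattern) != len(mask):
        raise ValueError("pattern and mask must have the same length")
    if len(data) < len(pattern):
        return False
    for d, p, m in zip(data, pattern, mask):
        # Only the bits selected by the mask take part in the comparison.
        if (d & m) != p:
            return False
    return True


# "RIFF", four ignored bytes, then "WEBPVP".
WEBP_PATTERN = b"RIFF\x00\x00\x00\x00WEBPVP"
WEBP_MASK = b"\xff\xff\xff\xff\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff"

# "<HTML" matched case-insensitively via the 0xDF mask on the letters.
HTML_PATTERN = b"<HTML"
HTML_MASK = b"\xff\xdf\xdf\xdf\xdf"
```

With these tables, both b"<html" and b"<HTML" match the same pattern, and the four size bytes of a RIFF container are skipped without any special-casing in the loop.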

I wonder: What problem are you trying to solve with this proposal?

(In the future, please add "[mimesniff]" to the beginning of your
subject line for MIME Sniffing discussions; this will ensure that I
see them and pay attention to them more quickly.)

Regards,
Gordon

On Thu, May 23, 2013 at 2:10 AM, Peter Occil  wrote:
> I propose rewriting section 9 and parts of section 10 in a different way, to 
> use the ABNF format in RFC 5234. (Note that ABNFs are already  used in the 
> current Fetch specification.) With this approach, the definitions for "byte 
> pattern",  "pattern mask", and the "pattern matching algorithm" can be 
> eliminated (all of which are found before section 9.1).
>
> An example for the image pattern matching algorithm is given below.
>
> ---
>
> 9.1  Matching an image type pattern
>
> The image pattern matching algorithm takes a byte sequence as input.  The 
> algorithm goes through the following image types in the order given.  For 
> each image MIME type given below, if the start of the byte sequence matches 
> its ABNF, return the concatenation of "image/" and the name of the ABNF (in 
> lowercase), and terminate the image pattern matching algorithm.
>
> vnd.microsoft.icon = %x00.00.01.00
>    ; A Windows Icon signature.
> bmp = %x42.4D
>    ; The string "BM", a BMP signature.
> gif = %x47.49.46.38 (%x37 / %x39) %x61
>    ; The string "GIF87a" or "GIF89a", a GIF signature.
> webp = %x52.49.46.46 4OCTET %x57.45.42.50.56.50
>    ; The string "RIFF" followed by four bytes followed by the string "WEBPVP".
> png = %x89.50.4E.47.0D.0A.1A.0A
>    ; The byte 0x89 followed by the string "PNG"
>    ; followed by CR LF SUB LF, the PNG signature.
> jpeg = %xFF.D8.FF
>    ; The JPEG Start of Image marker followed by the indicator
>    ; byte of another marker.
>
> If the start of the byte sequence doesn't match any ABNF given above, return 
> undefined.
>
> ---
>
> I would appreciate comments.
>
> --Peter



-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] An alternative approach to section 9 of Mime Sniffing

2013-05-23 Thread Peter Occil

Explain further why you don't recommend ABNF for this case.  You should also
explain whether another change to make section 9 more readable is appropriate
(though it currently is relatively readable as is).

--Peter

-Original Message- 
From: Anne van Kesteren

Sent: Thursday, May 23, 2013 2:15 AM
To: Peter Occil
Cc: WHATWG
Subject: Re: [whatwg] An alternative approach to section 9 of Mime Sniffing

On Thu, May 23, 2013 at 7:10 AM, Peter Occil  wrote:
> I would appreciate comments.

The only reason Fetch uses ABNF is to match HTTP(-bis) conventions.
It's not a practice I recommend copying for non-header value
definitions.


--
http://annevankesteren.nl/ 



Re: [whatwg] Inert nodes and element.click()

2013-05-23 Thread Boris Zbarsky

On 5/23/13 5:48 AM, Matt Falkenhagen wrote:
> 1. For an inert element, what happens on element.click() or
> element.dispatchEvent(new Event('click'))?

What would make the most sense to me is to have these work as normal but 
for the node to not have any default activation behavior.


-Boris


[whatwg] Inert nodes and element.click()

2013-05-23 Thread Matt Falkenhagen
I have some questions about these concepts.

1. For an inert element, what happens on element.click() or
element.dispatchEvent(new Event('click'))? The spec says an inert node is
treated as absent "for the purposes of targeting user interaction events" [1].
My interpretation is that the element receives the 'click' event as usual; the
intent is to block actual user interaction, e.g., if the user physically clicks
on the element.

2. The definition of element.click() seems ambiguous. The spec says: "The
click() method must run synthetic click activation steps on the element." [2]
There is a 6-step algorithm for "synthetic click activation steps" followed
by a separate 6-step algorithm for "when a pointing device is clicked". Below
that is a note which seems to say "the above" happens when the click() method
is called [3]. It's ambiguous what "the above" refers to, and if it's the second
algorithm, that seems to contradict the click() definition text.

[1] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/editing.html#inert-subtrees
[2] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/editing.html#activation
[3] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#run-synthetic-click-activation-steps


Re: [whatwg] Fetch: please review!

2013-05-23 Thread Simon Pieters
On Thu, 23 May 2013 07:11:45 +0200, Anne van Kesteren wrote:
> On Wed, May 22, 2013 at 12:20 PM, Janusz Majnert wrote:
>> I have a few notes to make on the use of "byte string" notion.
>> First of all, let's look at the definition of "byte string":
>> "A byte string is a byte sequence written down as a string."
>> Where "byte" and "string" are:
>> "A byte is a sequence of eight bits, represented as a double-digit
>> hexadecimal number in the range 0x00 to 0xFF."
>> "A string is a sequence of code points." and later "A code point is a
>> Unicode code point and is represented as a four-to-six digit hexadecimal
>> number, typically prefixed with "U+"."
>>
>> So, just by looking at the definition, I would expect a byte string to be a
>> sequence of hex numbers. That is of course not what is put in the examples
>> and not what this definition aimed for.
>
> If you have a better way to do this, please do suggest. This problem
> has been introduced by HTTP and I think it's important to make sure we
> carefully distinguish between what are actually bytes and what are
> strings, while still maintaining the readability of Content-Type over
> expressing that as a sequence of hex numbers.


Maybe say that for readability, byte strings are not written as hex
numbers but as strings encoded as ASCII.

Also, instead of distinguishing between the two by including or omitting
quotes, which seems subtle (it is hard to remember which is which), call out
explicitly when something is a byte string rather than a string.


Example (using backticks for ):

[[
↪ `about`

If request's url's scheme data is `blank`, return a response whose headers  
consist of a single header whose name is the byte string `Content-Type`  
and value is the byte string `text/html;charset=utf-8`, and body is the  
empty string.


Otherwise, return a network error.
]]

(BTW should body be the empty byte string above?)
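
The distinction being discussed can be illustrated with Python's separate bytes and str types. This is only an analogy and not anything taken from the Fetch text; the helper name as_hex is hypothetical.

```python
def as_hex(byte_string: bytes) -> str:
    """Render a byte string as the hex numbers it actually consists of."""
    # Iterating over bytes yields integers in the range 0x00 to 0xFF.
    return " ".join(f"0x{b:02X}" for b in byte_string)


# A byte string written down readably, as ASCII-encoded text.
header_name = b"Content-Type"
header_value = b"text/html;charset=utf-8"

# The same characters as a string (a sequence of code points).
header_name_text = header_name.decode("ascii")
```

Here as_hex(b"Co") gives "0x43 0x6F", which is the form the definition literally describes, while the b"..." literal is the readable form used in the examples. The two types never compare equal, so the byte/string distinction stays explicit.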

--
Simon Pieters
Opera Software