Re: Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-07 Thread Hallvord Reiar Michaelsen Steen
  It seems strange the spec would require a case-sensitive value for
  Content-Type in certain circumstances.

 There's only two things that seem to work well over a long period of
 time given multiple implementations and developers coding toward the
 dominant implementation (this describes the web).

 1. Require the same from everyone.

 2. Require randomness.


We're discussing the case of a MIME type parameter sent from a client to a 
server, the question is basically where to draw the line between what we spec 
and what we leave up to the implementation. 


Currently, according to the spec the charset param is expected to be sent in 
lower case if the charset the JS sets matches (case insensitively!) the charset 
the implementation sends data in, and the JS used lower case (i.e. 
text/plain;charset=utf-8 will send charset=utf-8), in upper case if the 
implementation rewrites any charset parameter (text/plain;charset=foo => 
text/plain;charset=UTF-8 and perhaps least expected 
text/plain;charset=utf-8;charset=foo => 
text/plain;charset=UTF-8;charset=UTF-8). So per the spec itself the value may 
sometimes be lower cased, sometimes upper cased, and it may sometimes be 
transformed to upper case even if it was originally given in lower case.
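
The rewriting rule described above can be sketched roughly as follows. This is only one reading of the spec text quoted in this thread, not an implementation taken from any browser. The function name rewriteCharset is hypothetical, and the naive split on ";" assumes no quoted-string parameter values.

```typescript
// Hypothetical helper (not a name from the XHR spec): if any charset
// parameter is not a case-insensitive match for `encoding`, set ALL
// charset parameters to `encoding`; otherwise leave the value untouched.
// Assumption: parameters are split on ";" and no value is a quoted-string.
function rewriteCharset(contentType: string, encoding: string | null): string {
  if (encoding === null) return contentType;
  const enc: string = encoding;
  // Note: toLowerCase() is Unicode-aware, so this is only an approximation
  // of an ASCII (byte) case-insensitive comparison.
  const wanted = enc.toLowerCase();
  const charsetParam = /^(\s*charset=)(.*)$/i;
  const parts = contentType.split(";");

  // Is there at least one charset parameter that does not match?
  const mismatch = parts.slice(1).some((part) => {
    const m = charsetParam.exec(part);
    return m !== null && m[2].toLowerCase() !== wanted;
  });
  if (!mismatch) return contentType;

  // Rewrite every charset parameter, including ones that did match.
  return parts
    .map((part, index) => {
      const m = index > 0 ? charsetParam.exec(part) : null;
      return m === null ? part : m[1] + enc;
    })
    .join(";");
}

console.log(rewriteCharset("text/plain;charset=utf-8", "UTF-8"));
console.log(rewriteCharset("text/plain;charset=foo", "UTF-8"));
console.log(rewriteCharset("text/plain;charset=utf-8;charset=foo", "UTF-8"));
```

The third call shows the surprising case: the lower-case utf-8 that the script supplied is upper-cased only because a second, non-matching charset parameter is present.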


We have no evidence that servers require or prefer a certain case. Servers 
(like Apache, IIS and Nginx) are generally written by professionals who 
understand case insensitivity. Server-side scripting, on the other hand, is not 
necessarily of high quality and might end up requiring a certain case. If such 
scripts exist, and if it's not documented what case is expected, we will end up 
with one of those small gotchas that are so harmful to cross-implementation 
compat. (On the other hand, if we already have a state where a variety of input 
is accepted and narrow down what is considered legal, content may well follow - 
this risks creating one of those backwards incompatibilities that annoy users 
with older devices and versions. IMO as spec authors we should also keep 
backwards compatibility in mind and not diverge from existing implementations 
unless we have good reasons.)


TL;DR: I'm not aware of evidence that spec'ing this is required for compat, I 
do buy the argument that precision might cause better future compat, I'm 
however concerned about back compat and find it surprising that a strictly 
spec'ed implementation detail will sometimes transform the case the script 
actually used.



'HR
 Anything else is likely to lead some subset of developers to depend on
 certain things they really should not depend on and will force
 everyone to match the conventions of what they depend on (if you're in
 bad luck you'll get mutual exclusive dependencies; the web has those
 too). E.g. the ordering of the members of the canvas element is one
 such thing (trivial bad luck example is User-Agent).
 
 
 --
 http://annevankesteren.nl/

-- 
Hallvord R. M. Steen
Core tester, Opera Software








Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-07 Thread Julian Reschke

On 2013-05-07 00:44, Julian Aubourg wrote:

Hey Anne,

I don't quite get why you're saying HTTP is irrelevant.

As an example, regarding the content-type /request/ header, the XHR spec
clearly states:

If a |Content-Type| header is in author request headers
http://www.w3.org/TR/XMLHttpRequest/#author-request-headers and
its value is a valid MIME type
http://dev.w3.org/html5/spec/infrastructure.html#valid-mime-type that
has a |charset| parameter whose value is not a case-insensitive
match for encoding, and encoding is not null, set all
the |charset| parameters of that |Content-Type| header to encoding.


So, at least, the encoding in the request content-type is clearly stated
as being case-insensitive.

BTW, Valid MIME type leads to (HTML 5.1):

A string is a valid MIME type if it matches the |media-type| rule
defined in section 3.7 Media Types of RFC 2616. In particular, a
valid MIME type

http://www.w3.org/html/wg/drafts/html/master/infrastructure.html#valid-mime-type
 may
include MIME type parameters. [HTTP]
http://www.w3.org/html/wg/drafts/html/master/iana.html#refsHTTP


Of course, nothing is explicitly specified regarding the /response/
content-type, because it is implicitly covered by HTTP (seeing as the
value is generated outside of the client -- except when using
overrideMimeType).

Its usage as defined by the XHR spec is irrelevant to the fact it is to
be considered case-insensitively: any software or hardware along the
network path is perfectly entitled to change the case of the
Content-Type header because HTTP clearly states case does not matter.

So, testing for a response Content-Type case-sensitively is /not/ correct.

Things are less clear to me when it comes to white spaces. I find HTTP
quite evasive on the matter.
...


RFC 2616 is pretty clear if and only if you understand how implied 
linear whitespace works in its version of ABNF.


In HTTPbis, we removed implied whitespace rules, so you may want to look at


http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p2-semantics-latest.html#media.type

instead (note that this is past WGLC and will be in IETF Last Call soonish).
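
As a rough illustration of the whitespace question, the sketch below checks a value against a media-type grammar that allows optional whitespace around each ";" but none around "/" or "=". That is my understanding of the HTTPbis grammar, so treat it as an assumption rather than a quotation of the draft. The name isMediaType is hypothetical.

```typescript
// Assumed grammar (my reading of HTTPbis):
//   media-type = type "/" subtype *( OWS ";" OWS parameter )
//   parameter  = token "=" ( token / quoted-string )
const token = "[!#$%&'*+\\-.^_`|~0-9A-Za-z]+";
const quotedString = '"(?:[^"\\\\]|\\\\.)*"';
const ows = "[ \\t]*"; // optional whitespace: spaces and tabs only

const mediaTypePattern = new RegExp(
  "^" + token + "/" + token +
  "(?:" + ows + ";" + ows + token + "=(?:" + token + "|" + quotedString + "))*$"
);

// Hypothetical helper: true if `value` matches the grammar sketched above.
function isMediaType(value: string): boolean {
  return mediaTypePattern.test(value);
}

console.log(isMediaType("text/html;charset=windows-1252"));
console.log(isMediaType("text/html; charset=windows-1252"));
console.log(isMediaType("text/html; charset = windows-1252"));
```

Under this grammar both spellings from the start of the thread are accepted, while whitespace around "=" is not.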

Best regards, Julian



Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-07 Thread Julian Reschke

On 2013-05-07 11:39, Hallvord Reiar Michaelsen Steen wrote:

It seems strange the spec would require a case-sensitive value for
Content-Type in certain circumstances.

There's only two things that seem to work well over a long period of
time given multiple implementations and developers coding toward the
dominant implementation (this describes the web).

1. Require the same from everyone.

2. Require randomness.



We're discussing the case of a MIME type parameter sent from a client to a 
server, the question is basically where to draw the line between what we spec 
and what we leave up to the implementation.


Currently, according to the spec the charset param is expected to be sent in lower 
case if the charset the JS sets matches (case insensitively!) the charset the 
implementation sends data in, and the JS used lower case (i.e. 
text/plain;charset=utf-8 will send charset=utf-8), in upper case if the 
implementation rewrites any charset parameter (text/plain;charset=foo => 
text/plain;charset=UTF-8 and perhaps least expected 
text/plain;charset=utf-8;charset=foo => text/plain;charset=UTF-8;charset=UTF-8). 
So per the spec itself the value may sometimes be lower cased, sometimes upper cased, 
and it may sometimes be transformed to upper case even if it was originally given in 
lower case.


We have no evidence that servers require or prefer a certain case. Servers 
(like Apache, IIS and Nginx) are generally written by professionals who 
understand case insensitivity. Server-side scripting, on the other hand, is not 
necessarily of high quality and might end up requiring a certain case. If such 
scripts exist, and if it's not documented what case is expected, we will end up 
with one of those small gotchas that are so harmful to cross-implementation 
compat. (On the other hand, if we already have a state where a variety of input 
is accepted and narrow down what is considered legal, content may well follow - 
this risks creating one of those backwards incompatibilities that annoy users 
with older devices and versions. IMO as spec authors we should also keep 
backwards compatibility in mind and not diverge from existing implementations 
unless we have good reasons.)


TL;DR: I'm not aware of evidence that spec'ing this is required for compat, I 
do buy the argument that precision might cause better future compat, I'm 
however concerned about back compat and find it surprising that a strictly 
spec'ed implementation detail will sometimes transform the case the script 
actually used.
...


Indeed. See also 
http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/0955.html about 
the requirement to rewrite charset parameters in-place, and - slightly 
related - https://www.w3.org/Bugs/Public/show_bug.cgi?id=15312 about 
the requirement to lowercase header field names in CORS.


Best regards, Julian



Re: Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-07 Thread Anne van Kesteren
On Tue, May 7, 2013 at 2:39 AM, Hallvord Reiar Michaelsen Steen
hallv...@opera.com wrote:
 TL;DR: I'm not aware of evidence that spec'ing this is required for compat, I 
 do buy the argument that precision might cause better future compat, I'm 
 however concerned about back compat and find it surprising that a strictly 
 spec'ed implementation detail will sometimes transform the case the script 
 actually used.

Yeah, we might have to preserve casing of the encoding label if
there's a (byte case-insensitive) match to begin with.
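
One possible reading of this suggestion is a per-parameter rewrite: a charset parameter that already matches keeps the casing the script used, and only non-matching parameters are replaced. The sketch below is an assumption about what such a rule could look like, not spec text. The name rewriteMismatchedCharsets is hypothetical.

```typescript
// Hypothetical per-parameter variant: only charset parameters that are NOT
// a case-insensitive match for `encoding` are replaced; matching ones keep
// the casing the script supplied.
// Assumption: parameters are split on ";" and no value is a quoted-string.
function rewriteMismatchedCharsets(contentType: string, encoding: string): string {
  // toLowerCase() is Unicode-aware; a strict byte-wise comparison would
  // fold only ASCII letters.
  const wanted = encoding.toLowerCase();
  const charsetParam = /^(\s*charset=)(.*)$/i;
  return contentType
    .split(";")
    .map((part, index) => {
      const m = index > 0 ? charsetParam.exec(part) : null;
      if (m === null) return part;
      return m[2].toLowerCase() === wanted ? part : m[1] + encoding;
    })
    .join(";");
}

console.log(rewriteMismatchedCharsets("text/plain;charset=utf-8;charset=foo", "UTF-8"));
```

With this variant the lower-case utf-8 survives and only charset=foo is replaced.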


--
http://annevankesteren.nl/



Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Julian Aubourg
 Aren't both text/html;charset=windows-1252 and text/html;
charset=windows-1252 valid MIME types? Should we make the tests a bit more
accepting?

Reading http://www.w3.org/Protocols/rfc1341/4_Content-Type.html it's not
crystal clear if spaces are accepted, although white spaces and space
are clearly cited in the grammar as forbidden in tokens. My understanding
is that the intent is for white spaces to be ignored but I could be wrong.
Truth is the spec could use some consistency and precision.

  test script sets charset=utf-8 and charset=UTF-8 on the wire is
considered a failure

Those tests must ignore case. The type, subtype, and parameter names are
not case sensitive.
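
A test that wants to ignore case could compare normalized values instead of raw strings. The sketch below is a hypothetical helper, not code from the test suite. It lower-cases the type, subtype, parameter names and the charset value, trims whitespace around parameters, and leaves other parameter values alone because they may be case-sensitive.

```typescript
// Hypothetical normalizer for comparing Content-Type values in a test.
// Assumption: parameters are split on ";" and no value is a quoted-string.
function normalizeContentType(value: string): string {
  const parts = value.split(";").map((part) => part.trim());
  const typeAndSubtype = parts[0].toLowerCase();
  const params = parts.slice(1).map((param) => {
    const eq = param.indexOf("=");
    if (eq < 0) return param.toLowerCase();
    const name = param.slice(0, eq).toLowerCase();
    const val = param.slice(eq + 1);
    // Charset names are case-insensitive; other values are left untouched.
    return name + "=" + (name === "charset" ? val.toLowerCase() : val);
  });
  return [typeAndSubtype].concat(params).join(";");
}

// Hypothetical comparison helper built on the normalizer above.
function sameContentType(a: string, b: string): boolean {
  return normalizeContentType(a) === normalizeContentType(b);
}

console.log(sameContentType("text/plain;charset=utf-8", "text/plain;charset=UTF-8"));
console.log(sameContentType("text/html;charset=windows-1252", "Text/HTML; Charset=windows-1252"));
```

Whether the tests should be this lenient is exactly what the rest of the thread debates; the helper only shows what a case-insensitive check would involve.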



On 6 May 2013 18:31, Hallvord Reiar Michaelsen Steen hallv...@opera.com wrote:

 Two of the tests in
 http://w3c-test.org/web-platform-tests/master/XMLHttpRequest/send-content-type-string.htm
 fail in Firefox just because there is a space before the word charset.



 Aren't both text/html;charset=windows-1252 and text/html;
 charset=windows-1252 valid MIME types? Should we make the tests a bit more
 accepting?



 Also, there's a test in
 http://w3c-test.org/web-platform-tests/master/XMLHttpRequest/send-content-type-charset.htm
 that fails in Chrome because it asserts charset must be lower case, i.e.
 test script sets charset=utf-8 and charset=UTF-8 on the wire is considered
 a failure. Does that make sense?



 --
 Hallvord R. M. Steen
 Core tester, Opera Software









Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Anne van Kesteren
On Mon, May 6, 2013 at 9:31 AM, Hallvord Reiar Michaelsen Steen
hallv...@opera.com wrote:
 ...

The reason the tests test that is because the specification requires
exactly that. If you want to change the tests, you'd first have to
change the specification. (What HTTP says on the matter is not
relevant.)


--
http://annevankesteren.nl/



Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Julian Aubourg
Hey Anne,

I don't quite get why you're saying HTTP is irrelevant.

As an example, regarding the content-type *request* header, the XHR spec
clearly states:

If a Content-Type header is in author request headers
http://www.w3.org/TR/XMLHttpRequest/#author-request-headers and
 its value is a valid MIME type
 http://dev.w3.org/html5/spec/infrastructure.html#valid-mime-type that
 has a charset parameter whose value is not a case-insensitive match for
 encoding, and encoding is not null, set all the charset parameters of that
 Content-Type header to encoding.


So, at least, the encoding in the request content-type is clearly stated as
being case-insensitive.

BTW, Valid MIME type leads to (HTML 5.1):

A string is a valid MIME type if it matches the media-type rule defined in
 section 3.7 Media Types of RFC 2616. In particular, a valid MIME type
 http://www.w3.org/html/wg/drafts/html/master/infrastructure.html#valid-mime-type
 may include MIME type parameters. [HTTP]
 http://www.w3.org/html/wg/drafts/html/master/iana.html#refsHTTP


Of course, nothing is explicitly specified regarding the *response*
content-type, because it is implicitly covered by HTTP (seeing as the value
is generated outside of the client -- except when using overrideMimeType).

Its usage as defined by the XHR spec is irrelevant to the fact it is to be
considered case-insensitively: any software or hardware along the network
path is perfectly entitled to change the case of the Content-Type header
because HTTP clearly states case does not matter.

So, testing for a response Content-Type case-sensitively is *not* correct.

Things are less clear to me when it comes to white spaces. I find HTTP
quite evasive on the matter.

Please, correct me if I'm wrong and feel free to point me to the exact
sentences in the XHR spec that calls for an exception regarding
case-insensitivity of MIME types (as defined in HTTP which XHR references
through HTML 5.1). I may very well have missed those.

Cheers,

-- Julian



On 6 May 2013 19:22, Anne van Kesteren ann...@annevk.nl wrote:

 On Mon, May 6, 2013 at 9:31 AM, Hallvord Reiar Michaelsen Steen
 hallv...@opera.com wrote:
  ...

 The reason the tests test that is because the specification requires
 exactly that. If you want to change the tests, you'd first have to
 change the specification. (What HTTP says on the matter is not
 relevant.)


 --
 http://annevankesteren.nl/




Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Anne van Kesteren
On Mon, May 6, 2013 at 3:44 PM, Julian Aubourg j...@ubourg.net wrote:
 I don't quite get why you're saying HTTP is irrelevant.

For the requirements where the XMLHttpRequest says to put a certain
byte string as a value of a header, that's what the implementation has
to do, and nothing else. We could make the XMLHttpRequest talk about
the value in a more abstract manner rather than any particular
serialization and leave the serialization undefined, but it's not
clear we should do that.


 As an example, regarding the content-type request header, the XHR spec
 clearly states:

 If a Content-Type header is in author request headers and its value is a
 valid MIME type that has a charset parameter whose value is not a
 case-insensitive match for encoding, and encoding is not null, set all
 the charset parameters of that Content-Type header to encoding.

Yeah, this part needs to be updated at some point to actually state
what should happen in terms of parsing and such, but for now it's
clear enough.


 So, testing for a response Content-Type case-sensitively is not correct.

It is if the specification requires a specific byte string as value.


 Things are less clear to me when it comes to white spaces. I find HTTP quite
 evasive on the matter.

You can have a space there, but not per the requirements in XMLHttpRequest.


--
http://annevankesteren.nl/



Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Julian Aubourg
You made the whole thing a lot clearer to me, thank you :)

It seems strange the spec would require a case-sensitive value for
Content-Type in certain circumstances.  Are these deviations from the
case-insensitiveness of the header really necessary ? Are they beneficial
for authors ? It seems to me they promote bad practice (case-sensitive
testing of Content-Type).


On 7 May 2013 01:20, Anne van Kesteren ann...@annevk.nl wrote:

 On Mon, May 6, 2013 at 3:44 PM, Julian Aubourg j...@ubourg.net wrote:
  I don't quite get why you're saying HTTP is irrelevant.

 For the requirements where the XMLHttpRequest says to put a certain
 byte string as a value of a header, that's what the implementation has
 to do, and nothing else. We could make the XMLHttpRequest talk about
 the value in a more abstract manner rather than any particular
 serialization and leave the serialization undefined, but it's not
 clear we should do that.


  As an example, regarding the content-type request header, the XHR spec
  clearly states:
 
  If a Content-Type header is in author request headers and its value is a
  valid MIME type that has a charset parameter whose value is not a
  case-insensitive match for encoding, and encoding is not null, set all
  the charset parameters of that Content-Type header to encoding.

 Yeah, this part needs to be updated at some point to actually state
 what should happen in terms of parsing and such, but for now it's
 clear enough.


  So, testing for a response Content-Type case-sensitively is not correct.

 It is if the specification requires a specific byte string as value.


  Things are less clear to me when it comes to white spaces. I find HTTP
 quite
  evasive on the matter.

 You can have a space there, but not per the requirements in XMLHttpRequest.


 --
 http://annevankesteren.nl/



Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Anne van Kesteren
On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote:
 It seems strange the spec would require a case-sensitive value for
 Content-Type in certain circumstances.  Are these deviations from the
 case-insensitiveness of the header really necessary ? Are they beneficial
 for authors ? It seems to me they promote bad practice (case-sensitive
 testing of Content-Type).

There's only two things that seem to work well over a long period of
time given multiple implementations and developers coding toward the
dominant implementation (this describes the web).

1. Require the same from everyone.

2. Require randomness.

Anything else is likely to lead some subset of developers to depend on
certain things they really should not depend on and will force
everyone to match the conventions of what they depend on (if you're in
bad luck you'll get mutual exclusive dependencies; the web has those
too). E.g. the ordering of the members of the canvas element is one
such thing (trivial bad luck example is User-Agent).


--
http://annevankesteren.nl/



Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Julian Aubourg
I hear you, but isn't having a case-sensitive value of Content-Type *in
certain circumstances* triggering the kind of problem you're talking about
(developers to depend on
certain things they really should not depend on) ?

As I see it, the tests in question here are doing something that is wrong
in the general use-case from an author's POV.

By requiring the same from every *implementor*, aren't we pushing *authors*
into the trap you describe? Case in point: the author of the test is testing
Content-Type case-sensitively while it is improper (from an author POV) in
any other circumstance. The same code will fail if, say, the server sets a
Content-Type. Shouldn't we protect authors from such inconsistencies?



On 7 May 2013 01:39, Anne van Kesteren ann...@annevk.nl wrote:

 On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote:
  It seems strange the spec would require a case-sensitive value for
  Content-Type in certain circumstances.  Are these deviations from the
  case-insensitiveness of the header really necessary ? Are they beneficial
  for authors ? It seems to me they promote bad practice (case-sensitive
  testing of Content-Type).

 There's only two things that seem to work well over a long period of
 time given multiple implementations and developers coding toward the
 dominant implementation (this describes the web).

 1. Require the same from everyone.

 2. Require randomness.

 Anything else is likely to lead some subset of developers to depend on
 certain things they really should not depend on and will force
 everyone to match the conventions of what they depend on (if you're in
 bad luck you'll get mutual exclusive dependencies; the web has those
 too). E.g. the ordering of the members of the canvas element is one
 such thing (trivial bad luck example is User-Agent).


 --
 http://annevankesteren.nl/




Re: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread Charles McCathie Nevile
On Tue, 07 May 2013 01:39:26 +0200, Anne van Kesteren ann...@annevk.nl  
wrote:



On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote:

It seems strange the spec would require a case-sensitive value for
Content-Type in certain circumstances.  Are these deviations from the
case-insensitiveness of the header really necessary ? Are they  
beneficial for authors ?


"This is how the web is" rings like an 'argument from authority'. I'm  
generally less concerned about those than I believe you are, but I think  
Julian's questions here are important.



It seems to me they promote bad practice (case-sensitive testing of
Content-Type).


There's only two things that seem to work well over a long period of
time given multiple implementations and developers coding toward the
dominant implementation (this describes the web).


(maybe.)


1. Require the same from everyone.


So is there a concrete dominant implementation that is case-sensitive?

Because requiring case-insensitive matching from everyone would seem to  
meet your requirement above, in principle. And it might even be that with  
good clear specifications and good test suites that the dominant  
implementation reinforces a simpler path for authors.



Anything else is likely to lead some subset of developers to depend on
certain things they really should not depend on and will force
everyone to match the conventions of what they depend on


I know this has happened on the web for various cases. But it actually  
depends on having a sufficiently non-conformant implementation be  
sufficiently important to dominate (rather than be a known error case that  
is commonly monkey-patched until in a decade or so it just evaporates). I  
don't see any proof that it is *bound* to happen.


cheers

Chaals

--
Charles McCathie Nevile - Consultant (web standards) CTO Office, Yandex
  cha...@yandex-team.ru Find more at http://yandex.com



RE: [XHR] test nitpicks: MIME type / charset requirements

2013-05-06 Thread HU, BIN
Since XHR is the API to facilitate a valid HTTP transaction, IMHO, it should be 
fully compliant with HTTP - no more and no less. A valid HTTP request and 
response should be interpreted consistently across UA's and devices.

Interoperability is very important across UA's and devices. If the XHR, either 
spec or implementation, is not fully compliant with HTTP, it will give users an 
unpleasant experience resulting from the interoperability issue.

Thanks
Bin
-----Original Message-----
From: Charles McCathie Nevile [mailto:cha...@yandex-team.ru] 
Sent: Monday, May 06, 2013 6:06 PM
To: Julian Aubourg; Anne van Kesteren
Cc: Hallvord Reiar Michaelsen Steen; public-webapps WG
Subject: Re: [XHR] test nitpicks: MIME type / charset requirements

On Tue, 07 May 2013 01:39:26 +0200, Anne van Kesteren ann...@annevk.nl  
wrote:

 On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote:
 It seems strange the spec would require a case-sensitive value for
 Content-Type in certain circumstances.  Are these deviations from the
 case-insensitiveness of the header really necessary ? Are they  
 beneficial for authors ?

"This is how the web is" rings like an 'argument from authority'. I'm  
generally less concerned about those than I believe you are, but I think  
Julian's questions here are important.

 It seems to me they promote bad practice (case-sensitive testing of
 Content-Type).

 There's only two things that seem to work well over a long period of
 time given multiple implementations and developers coding toward the
 dominant implementation (this describes the web).

(maybe.)

 1. Require the same from everyone.

So is there a concrete dominant implementation that is case-sensitive?

Because requiring case-insensitive matching from everyone would seem to  
meet your requirement above, in principle. And it might even be that with  
good clear specifications and good test suites that the dominant  
implementation reinforces a simpler path for authors.

 Anything else is likely to lead some subset of developers to depend on
 certain things they really should not depend on and will force
 everyone to match the conventions of what they depend on

I know this has happened on the web for various cases. But it actually  
depends on having a sufficiently non-conformant implementation be  
sufficiently important to dominate (rather than be a known error case that  
is commonly monkey-patched until in a decade or so it just evaporates). I  
don't see any proof that it is *bound* to happen.

cheers

Chaals

-- 
Charles McCathie Nevile - Consultant (web standards) CTO Office, Yandex
   cha...@yandex-team.ru Find more at http://yandex.com