Re: Re: [XHR] test nitpicks: MIME type / charset requirements
It seems strange the spec would require a case-sensitive value for Content-Type in certain circumstances.

There's only two things that seem to work well over a long period of time given multiple implementations and developers coding toward the dominant implementation (this describes the web). 1. Require the same from everyone. 2. Require randomness.

We're discussing the case of a MIME type parameter sent from a client to a server; the question is basically where to draw the line between what we spec and what we leave up to the implementation.

Currently, according to the spec, the charset param is expected to be sent in lower case if the charset the JS sets matches (case-insensitively!) the charset the implementation sends data in and the JS used lower case (i.e. text/plain;charset=utf-8 will send charset=utf-8), and in upper case if the implementation rewrites any charset parameter (text/plain;charset=foo => text/plain;charset=UTF-8 and, perhaps least expected, text/plain;charset=utf-8;charset=foo => text/plain;charset=UTF-8;charset=UTF-8). So per the spec itself the value may sometimes be lower cased, sometimes upper cased, and it may sometimes be transformed to upper case even if it was originally given in lower case.

We have no evidence that servers require or prefer a certain case. Servers (like Apache, IIS and Nginx) are generally written by professionals who understand case insensitivity. Server-side scripting, on the other hand, is not necessarily of high quality and might end up requiring a certain case. If such scripts exist, and if it's not documented what case is expected, we will end up with one of those small gotchas that are so harmful to cross-implementation compat.

(On the other hand, if we already have a state where a variety of input is accepted and narrow down what is considered legal, content may well follow. This risks creating one of those backwards incompatibilities that annoy users with older devices and versions. IMO as spec authors we should also keep backwards compatibility in mind and not diverge from existing implementations unless we have good reasons.)

TL;DR: I'm not aware of evidence that spec'ing this is required for compat. I do buy the argument that precision might lead to better future compat. I'm however concerned about back compat, and find it surprising that a strictly spec'ed implementation detail will sometimes transform the case the script actually used.

Anything else is likely to lead some subset of developers to depend on certain things they really should not depend on and will force everyone to match the conventions of what they depend on (if you're in bad luck you'll get mutual exclusive dependencies; the web has those too). E.g. the ordering of the members of the canvas element is one such thing (trivial bad luck example is User-Agent).

-- http://annevankesteren.nl/

-- Hallvord R. M. Steen Core tester, Opera Software
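The rewriting behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not spec text: the function name rewrite_charset, the regular expression, and the default encoding "UTF-8" are assumptions made for the example, and the crude pattern ignores quoted parameter values.

```python
import re

# Hypothetical helper, not taken from the XHR spec: a rough model of
# "set all the charset parameters of that Content-Type header to encoding".
# The pattern is deliberately simple and does not handle quoted values.
CHARSET_PARAM = re.compile(r"(charset=)([^;]*)", re.IGNORECASE)


def rewrite_charset(content_type: str, encoding: str = "UTF-8") -> str:
    """Return the Content-Type value as it might go on the wire."""
    values = [m.group(2) for m in CHARSET_PARAM.finditer(content_type)]
    # No charset parameter, or every value already matches the encoding
    # case-insensitively: the author's string (and its case) is kept.
    if all(v.lower() == encoding.lower() for v in values):
        return content_type
    # At least one value differs: every charset parameter is overwritten,
    # including ones that already matched in lower case.
    return CHARSET_PARAM.sub(lambda m: m.group(1) + encoding, content_type)


print(rewrite_charset("text/plain;charset=utf-8"))
print(rewrite_charset("text/plain;charset=foo"))
print(rewrite_charset("text/plain;charset=utf-8;charset=foo"))
```

The third call is the surprising case from the message: a lower-case utf-8 supplied by the script comes out in upper case because a sibling parameter triggered the rewrite.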
Re: [XHR] test nitpicks: MIME type / charset requirements
On 2013-05-07 00:44, Julian Aubourg wrote:

Hey Anne, I don't quite get why you're saying HTTP is irrelevant. As an example, regarding the content-type /request/ header, the XHR spec clearly states:

If a Content-Type header is in author request headers http://www.w3.org/TR/XMLHttpRequest/#author-request-headers and its value is a valid MIME type http://dev.w3.org/html5/spec/infrastructure.html#valid-mime-type that has a charset parameter whose value is not a case-insensitive match for encoding, and encoding is not null, set all the charset parameters of that Content-Type header to encoding.

So, at least, the encoding in the request content-type is clearly stated as being case-insensitive. BTW, Valid MIME type leads to (HTML 5.1):

A string is a valid MIME type if it matches the media-type rule defined in section 3.7 Media Types of RFC 2616. In particular, a valid MIME type http://www.w3.org/html/wg/drafts/html/master/infrastructure.html#valid-mime-type may include MIME type parameters. [HTTP] http://www.w3.org/html/wg/drafts/html/master/iana.html#refsHTTP

Of course, nothing is explicitly specified regarding the /response/ content-type, because it is implicitly covered by HTTP (seeing as the value is generated outside of the client, except when using overrideMimeType). Its usage as defined by the XHR spec is irrelevant to the fact that it is to be considered case-insensitively: any software or hardware along the network path is perfectly entitled to change the case of the Content-Type header because HTTP clearly states case does not matter. So, testing for a response Content-Type case-sensitively is /not/ correct.

Things are less clear to me when it comes to white spaces. I find HTTP quite evasive on the matter.

...

RFC 2616 is pretty clear if and only if you understand how implied linear whitespace works in its version of ABNF. In HTTPbis, we removed implied whitespace rules, so you may want to look at http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p2-semantics-latest.html#media.type instead (note that this is past WGLC and will be in IETF Last Call soonish).

Best regards, Julian
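To make the whitespace point concrete, here is a small sketch of a media-type check written with explicit optional whitespace around ";", which is how I understand the HTTPbis grammar (media-type = type "/" subtype *( OWS ";" OWS parameter )). The names TOKEN, QUOTED, PARAM, MEDIA_TYPE and is_media_type are made up for this example, and the quoted-string pattern is simplified, so treat it as an approximation rather than a conformance checker.

```python
import re

# token characters as I understand the HTTPbis "tchar" rule
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# simplified quoted-string: anything between double quotes, with backslash escapes
QUOTED = r'"(?:[^"\\]|\\.)*"'
# parameter = token "=" ( token / quoted-string ), no whitespace around "="
PARAM = rf"{TOKEN}=(?:{TOKEN}|{QUOTED})"
# optional spaces/tabs are allowed only around the ";" separator
MEDIA_TYPE = re.compile(rf"{TOKEN}/{TOKEN}(?:[ \t]*;[ \t]*{PARAM})*")


def is_media_type(value: str) -> bool:
    """True if the whole string matches the sketched media-type grammar."""
    return MEDIA_TYPE.fullmatch(value) is not None


for candidate in (
    "text/html;charset=windows-1252",
    "text/html; charset=windows-1252",
    "text/html;charset = windows-1252",
):
    print(candidate, "->", is_media_type(candidate))
```

Under this reading both spellings from the original question match, while whitespace around "=" does not.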
Re: [XHR] test nitpicks: MIME type / charset requirements
On 2013-05-07 11:39, Hallvord Reiar Michaelsen Steen wrote:

It seems strange the spec would require a case-sensitive value for Content-Type in certain circumstances.

There's only two things that seem to work well over a long period of time given multiple implementations and developers coding toward the dominant implementation (this describes the web). 1. Require the same from everyone. 2. Require randomness.

We're discussing the case of a MIME type parameter sent from a client to a server, the question is basically where to draw the line between what we spec and what we leave up to the implementation.

Currently, according to the spec the charset param is expected to be sent in lower case if the charset the JS sets matches (case insensitively!) the charset the implementation sends data in, and the JS used lower case (i.e. text/plain;charset=utf-8 will send charset=utf-8), in upper case if the implementation rewrites any charset parameter (text/plain;charset=foo => text/plain;charset=UTF-8 and perhaps least expected text/plain;charset=utf-8;charset=foo => text/plain;charset=UTF-8;charset=UTF-8). So per the spec itself the value may sometimes be lower cased, sometimes upper cased, and it may sometimes be transformed to upper case even if it was originally given in lower case.

We have no evidence that servers require or prefer a certain case. Servers (like Apache, IIS and Nginx) are generally written by professionals who understand case insensitivity. Server-side scripting, on the other hand, is not necessarily of high quality and might end up requiring a certain case. If such scripts exist, and if it's not documented what case is expected, we will end up with one of those small gotchas that are so harmful to cross-implementation compat.

(On the other hand, if we already have a state where a variety of input is accepted and narrow down what is considered legal, content may well follow - this risks creating one of those backwards incompatibilities that annoy users with older devices and versions. IMO as spec authors we should also keep backwards compatibility in mind and not diverge from existing implementations unless we have good reasons.)

TL;DR: I'm not aware of evidence that spec'ing this is required for compat, I do buy the argument that precision might cause better future compat, I'm however concerned about back compat and find it surprising that a strictly spec'ed implementation detail will sometimes transform the case the script actually used.

...

Indeed. See also http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/0955.html about the requirement to rewrite charset parameters in-place, and - slightly related - https://www.w3.org/Bugs/Public/show_bug.cgi?id=15312 about the requirement to lowercase header field names in CORS.

Best regards, Julian
Re: Re: [XHR] test nitpicks: MIME type / charset requirements
On Tue, May 7, 2013 at 2:39 AM, Hallvord Reiar Michaelsen Steen hallv...@opera.com wrote: TL;DR: I'm not aware of evidence that spec'ing this is required for compat, I do buy the argument that precision might cause better future compat, I'm however concerned about back compat and find it surprising that a strictly spec'ed implementation detail will sometimes transform the case the script actually used. Yeah, we might have to preserve casing of the encoding label if there's a (byte case-insensitive) match to begin with. -- http://annevankesteren.nl/
Re: [XHR] test nitpicks: MIME type / charset requirements
Aren't both text/html;charset=windows-1252 and text/html; charset=windows-1252 valid MIME types? Should we make the tests a bit more accepting?

Reading http://www.w3.org/Protocols/rfc1341/4_Content-Type.html it's not crystal clear if spaces are accepted, although white space and space are clearly cited in the grammar as forbidden in tokens. My understanding is that the intent is for white space to be ignored, but I could be wrong. Truth is the spec could use some consistency and precision.

test script sets charset=utf-8 and charset=UTF-8 on the wire is considered a failure

Those tests must ignore case. The type, subtype, and parameter names are not case sensitive.

On 6 May 2013 18:31, Hallvord Reiar Michaelsen Steen hallv...@opera.com wrote:

Two of the tests in http://w3c-test.org/web-platform-tests/master/XMLHttpRequest/send-content-type-string.htm fail in Firefox just because there is a space before the word charset. Aren't both text/html;charset=windows-1252 and text/html; charset=windows-1252 valid MIME types? Should we make the tests a bit more accepting?

Also, there's a test in http://w3c-test.org/web-platform-tests/master/XMLHttpRequest/send-content-type-charset.htm that fails in Chrome because it asserts charset must be lower case, i.e. test script sets charset=utf-8 and charset=UTF-8 on the wire is considered a failure. Does that make sense?

-- Hallvord R. M. Steen Core tester, Opera Software
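A "more accepting" comparison of the kind asked about here could look something like the sketch below. It is only one possible reading, written in Python for brevity: the helper names normalize_content_type and same_content_type are invented for the example, the charset value is folded to lower case on the assumption that charset labels are case-insensitive, and splitting on ";" would mishandle quoted values that contain a semicolon.

```python
def normalize_content_type(value: str) -> str:
    """Fold the differences the thread argues a test should tolerate."""
    media_type, *params = [part.strip() for part in value.split(";")]
    # type and subtype are not case sensitive
    normalized = [media_type.lower()]
    for param in params:
        name, sep, val = param.partition("=")
        name = name.strip().lower()  # parameter names are not case sensitive
        val = val.strip()
        if name == "charset":
            # assumption: charset labels compare case-insensitively
            val = val.lower()
        normalized.append(name + sep + val)
    return ";".join(normalized)


def same_content_type(a: str, b: str) -> bool:
    """Compare two Content-Type values after normalization."""
    return normalize_content_type(a) == normalize_content_type(b)


# the two spellings from the failing Firefox tests
print(same_content_type("text/html;charset=windows-1252",
                        "text/html; charset=windows-1252"))
# the case difference from the failing Chrome test
print(same_content_type("text/plain;charset=utf-8",
                        "text/plain;charset=UTF-8"))
```

Whether the tests should actually compare this way depends on whether the XHR spec is read as mandating an exact byte string, which is the open question in this thread.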
Re: [XHR] test nitpicks: MIME type / charset requirements
On Mon, May 6, 2013 at 9:31 AM, Hallvord Reiar Michaelsen Steen hallv...@opera.com wrote: ... The reason the tests test that is because the specification requires exactly that. If you want to change the tests, you'd first have to change the specification. (What HTTP says on the matter is not relevant.) -- http://annevankesteren.nl/
Re: [XHR] test nitpicks: MIME type / charset requirements
Hey Anne, I don't quite get why you're saying HTTP is irrelevant. As an example, regarding the content-type *request* header, the XHR spec clearly states:

If a Content-Type header is in author request headers http://www.w3.org/TR/XMLHttpRequest/#author-request-headers and its value is a valid MIME type http://dev.w3.org/html5/spec/infrastructure.html#valid-mime-type that has a charset parameter whose value is not a case-insensitive match for encoding, and encoding is not null, set all the charset parameters of that Content-Type header to encoding.

So, at least, the encoding in the request content-type is clearly stated as being case-insensitive. BTW, Valid MIME type leads to (HTML 5.1):

A string is a valid MIME type if it matches the media-type rule defined in section 3.7 Media Types of RFC 2616. In particular, a valid MIME type http://www.w3.org/html/wg/drafts/html/master/infrastructure.html#valid-mime-type may include MIME type parameters. [HTTP] http://www.w3.org/html/wg/drafts/html/master/iana.html#refsHTTP

Of course, nothing is explicitly specified regarding the *response* content-type, because it is implicitly covered by HTTP (seeing as the value is generated outside of the client, except when using overrideMimeType). Its usage as defined by the XHR spec is irrelevant to the fact that it is to be considered case-insensitively: any software or hardware along the network path is perfectly entitled to change the case of the Content-Type header because HTTP clearly states case does not matter. So, testing for a response Content-Type case-sensitively is *not* correct.

Things are less clear to me when it comes to white spaces. I find HTTP quite evasive on the matter.

Please correct me if I'm wrong, and feel free to point me to the exact sentences in the XHR spec that call for an exception regarding case-insensitivity of MIME types (as defined in HTTP, which XHR references through HTML 5.1). I may very well have missed those.

Cheers, -- Julian

On 6 May 2013 19:22, Anne van Kesteren ann...@annevk.nl wrote:

On Mon, May 6, 2013 at 9:31 AM, Hallvord Reiar Michaelsen Steen hallv...@opera.com wrote: ...

The reason the tests test that is because the specification requires exactly that. If you want to change the tests, you'd first have to change the specification. (What HTTP says on the matter is not relevant.)

-- http://annevankesteren.nl/
Re: [XHR] test nitpicks: MIME type / charset requirements
On Mon, May 6, 2013 at 3:44 PM, Julian Aubourg j...@ubourg.net wrote: I don't quite get why you're saying HTTP is irrelevant. For the requirements where the XMLHttpRequest says to put a certain byte string as a value of a header, that's what the implementation has to do, and nothing else. We could make the XMLHttpRequest talk about the value in a more abstract manner rather than any particular serialization and leave the serialization undefined, but it's not clear we should do that. As an example, regarding the content-type request header, the XHR spec clearly states: If a Content-Type header is in author request headers and its value is a valid MIME type that has a charset parameter whose value is not a case-insensitive match for encoding, and encoding is not null, set all the charset parameters of that Content-Type header to encoding. Yeah, this part needs to be updated at some point to actually state what should happen in terms of parsing and such, but for now it's clear enough. So, testing for a response Content-Type case-sensitively is not correct. It is if the specification requires a specific byte string as value. Things are less clear to me when it comes to white spaces. I find HTTP quite evasive on the matter. You can have a space there, but not per the requirements in XMLHttpRequest. -- http://annevankesteren.nl/
Re: [XHR] test nitpicks: MIME type / charset requirements
You made the whole thing a lot clearer to me, thank you :) It seems strange the spec would require a case-sensitive value for Content-Type in certain circumstances. Are these deviations from the case-insensitiveness of the header really necessary ? Are they beneficial for authors ? It seems to me they promote bad practice (case-sensitive testing of Content-Type). On 7 May 2013 01:20, Anne van Kesteren ann...@annevk.nl wrote: On Mon, May 6, 2013 at 3:44 PM, Julian Aubourg j...@ubourg.net wrote: I don't quite get why you're saying HTTP is irrelevant. For the requirements where the XMLHttpRequest says to put a certain byte string as a value of a header, that's what the implementation has to do, and nothing else. We could make the XMLHttpRequest talk about the value in a more abstract manner rather than any particular serialization and leave the serialization undefined, but it's not clear we should do that. As an example, regarding the content-type request header, the XHR spec clearly states: If a Content-Type header is in author request headers and its value is a valid MIME type that has a charset parameter whose value is not a case-insensitive match for encoding, and encoding is not null, set all the charset parameters of that Content-Type header to encoding. Yeah, this part needs to be updated at some point to actually state what should happen in terms of parsing and such, but for now it's clear enough. So, testing for a response Content-Type case-sensitively is not correct. It is if the specification requires a specific byte string as value. Things are less clear to me when it comes to white spaces. I find HTTP quite evasive on the matter. You can have a space there, but not per the requirements in XMLHttpRequest. -- http://annevankesteren.nl/
Re: [XHR] test nitpicks: MIME type / charset requirements
On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote: It seems strange the spec would require a case-sensitive value for Content-Type in certain circumstances. Are these deviations from the case-insensitiveness of the header really necessary ? Are they beneficial for authors ? It seems to me they promote bad practice (case-sensitive testing of Content-Type). There's only two things that seem to work well over a long period of time given multiple implementations and developers coding toward the dominant implementation (this describes the web). 1. Require the same from everyone. 2. Require randomness. Anything else is likely to lead some subset of developers to depend on certain things they really should not depend on and will force everyone to match the conventions of what they depend on (if you're in bad luck you'll get mutual exclusive dependencies; the web has those too). E.g. the ordering of the members of the canvas element is one such thing (trivial bad luck example is User-Agent). -- http://annevankesteren.nl/
Re: [XHR] test nitpicks: MIME type / charset requirements
I hear you, but isn't having a case-sensitive value of Content-Type *in certain circumstances* triggering the kind of problem you're talking about (developers depending on certain things they really should not depend on)? As I see it, the tests in question here are doing something that is wrong in the general use case from an author's POV. By requiring the same from every *implementor*, aren't we pushing *authors* into the trap you describe? Case in point: the author of the test is testing Content-Type case-sensitively while it is improper (from an author's POV) in any other circumstance. The same code will fail if, say, the server sets a Content-Type. Shouldn't we protect authors from such inconsistencies?

On 7 May 2013 01:39, Anne van Kesteren ann...@annevk.nl wrote:

On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote:

It seems strange the spec would require a case-sensitive value for Content-Type in certain circumstances. Are these deviations from the case-insensitiveness of the header really necessary ? Are they beneficial for authors ? It seems to me they promote bad practice (case-sensitive testing of Content-Type).

There's only two things that seem to work well over a long period of time given multiple implementations and developers coding toward the dominant implementation (this describes the web). 1. Require the same from everyone. 2. Require randomness. Anything else is likely to lead some subset of developers to depend on certain things they really should not depend on and will force everyone to match the conventions of what they depend on (if you're in bad luck you'll get mutual exclusive dependencies; the web has those too). E.g. the ordering of the members of the canvas element is one such thing (trivial bad luck example is User-Agent).

-- http://annevankesteren.nl/
Re: [XHR] test nitpicks: MIME type / charset requirements
On Tue, 07 May 2013 01:39:26 +0200, Anne van Kesteren ann...@annevk.nl wrote:

On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote:

It seems strange the spec would require a case-sensitive value for Content-Type in certain circumstances. Are these deviations from the case-insensitiveness of the header really necessary ? Are they beneficial for authors ?

"This is how the web is" rings like an 'argument from authority'. I'm generally less concerned about those than I believe you are, but I think Julian's questions here are important.

It seems to me they promote bad practice (case-sensitive testing of Content-Type).

There's only two things that seem to work well over a long period of time given multiple implementations and developers coding toward the dominant implementation (this describes the web).

(maybe.)

1. Require the same from everyone.

So is there a concrete dominant implementation that is case-sensitive? Because requiring case-insensitive matching from everyone would seem to meet your requirement above, in principle. And it might even be that, with good clear specifications and good test suites, the dominant implementation reinforces a simpler path for authors.

Anything else is likely to lead some subset of developers to depend on certain things they really should not depend on and will force everyone to match the conventions of what they depend on

I know this has happened on the web for various cases. But it actually depends on having a sufficiently non-conformant implementation be sufficiently important to dominate (rather than be a known error case that is commonly monkey-patched until in a decade or so it just evaporates). I don't see any proof that it is *bound* to happen.

cheers

Chaals

-- Charles McCathie Nevile - Consultant (web standards) CTO Office, Yandex cha...@yandex-team.ru Find more at http://yandex.com
RE: [XHR] test nitpicks: MIME type / charset requirements
Since XHR is the API to facilitate a valid HTTP transaction, IMHO, it should be fully compliant with HTTP - no more and no less. A valid HTTP request and response should be interpreted consistently across UAs and devices. Interoperability is very important across UAs and devices. If XHR, either the spec or an implementation, is not fully compliant with HTTP, it will give users an unpleasant experience resulting from the interoperability issue.

Thanks

Bin

-----Original Message-----
From: Charles McCathie Nevile [mailto:cha...@yandex-team.ru]
Sent: Monday, May 06, 2013 6:06 PM
To: Julian Aubourg; Anne van Kesteren
Cc: Hallvord Reiar Michaelsen Steen; public-webapps WG
Subject: Re: [XHR] test nitpicks: MIME type / charset requirements

On Tue, 07 May 2013 01:39:26 +0200, Anne van Kesteren ann...@annevk.nl wrote:

On Mon, May 6, 2013 at 4:33 PM, Julian Aubourg j...@ubourg.net wrote:

It seems strange the spec would require a case-sensitive value for Content-Type in certain circumstances. Are these deviations from the case-insensitiveness of the header really necessary ? Are they beneficial for authors ?

"This is how the web is" rings like an 'argument from authority'. I'm generally less concerned about those than I believe you are, but I think Julian's questions here are important.

It seems to me they promote bad practice (case-sensitive testing of Content-Type).

There's only two things that seem to work well over a long period of time given multiple implementations and developers coding toward the dominant implementation (this describes the web).

(maybe.)

1. Require the same from everyone.

So is there a concrete dominant implementation that is case-sensitive? Because requiring case-insensitive matching from everyone would seem to meet your requirement above, in principle. And it might even be that, with good clear specifications and good test suites, the dominant implementation reinforces a simpler path for authors.

Anything else is likely to lead some subset of developers to depend on certain things they really should not depend on and will force everyone to match the conventions of what they depend on

I know this has happened on the web for various cases. But it actually depends on having a sufficiently non-conformant implementation be sufficiently important to dominate (rather than be a known error case that is commonly monkey-patched until in a decade or so it just evaporates). I don't see any proof that it is *bound* to happen.

cheers

Chaals

-- Charles McCathie Nevile - Consultant (web standards) CTO Office, Yandex cha...@yandex-team.ru Find more at http://yandex.com