Thanks, Julian. Dick and I will integrate this into the draft.
-- Mike
-----Original Message-----
From: Julian Reschke [mailto:[email protected]]
Sent: Thursday, July 12, 2012 1:31 AM
To: Mike Jones
Cc: [email protected]
Subject: Re: [OAUTH-WG] Preliminary OAuth Core draft -29
On 2012-07-09 17:01, Julian Reschke wrote:
> On 2012-07-09 16:48, Mike Jones wrote:
>> HTML5 is not cited because it's a working draft - not an approved
>> standard. In what way is "the definition of the media type in HTML4
>> is known to be insufficient"? People have been successfully
>> implementing form-urlencoding with it for quite some time. :-) Is
>> there a specific wording change that you'd suggest that we make that
>> doesn't involve citing a working draft, rather than an approved standard?
>
> For instance, the HTML4 "definition" doesn't even mention what to do
> with non-ASCII characters.
>
> I understand that it's not particularly attractive, but citing HTML4
> just because it's a "standard" isn't really helpful for people who
> actually follow the link and try to understand what needs to be
> implemented.
> ...
Here's an attempt to describe the encoding in terms of HTML4, plus additional
instructions. It would need to be referenced wherever the spec currently
refers to the HTML4 media type definition:
-- snip --
Appendix X. Use of the application/x-www-form-urlencoded Media Type
At the time of publication of this specification, the
"application/x-www-form-urlencoded" media type was defined in Section
17.13.4 of [HTML4], but not registered in the IANA media types registry
(<http://www.iana.org/assignments/media-types/index.html>). Furthermore, the
definition is incomplete as it does not consider non-US-ASCII characters.
To address this shortcoming, when generating payloads using this media type,
names and values MUST be encoded using the "UTF-8" character encoding scheme
([RFC3629]) first; the resulting octet sequence then needs to be further
encoded using the escaping rules defined in [HTML4].
When parsing data from a payload using this media type, the names and values
resulting from reversing the name/value encoding consequently need to be
treated as octet sequences, to be decoded using the "UTF-8"
character encoding scheme.
Example: A value consisting of the six Unicode code points (1) U+0020 (SPACE),
(2) U+0025 (PERCENT SIGN), (3) U+0026 (AMPERSAND), (4) U+002B (PLUS SIGN), (5)
U+00A3 (POUND SIGN), and (6) U+20AC (EURO SIGN) would be encoded into the octet
sequence below (using hexadecimal notation):
20 25 26 2B C2 A3 E2 82 AC
and then represented in the payload as:
+%25%26%2B%C2%A3%E2%82%AC
-- snip --
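The two-step rule in the proposed appendix can be sketched in Python. This is
only an illustration, not part of the proposed text. The function names
form_encode_component and form_decode_component are hypothetical, and the
sketch assumes a conservative reading of the HTML4 escaping rules: every octet
other than an ASCII letter, digit, or space is percent-escaped. Line-break
normalization and handling of malformed input are left out.

```python
import string

# Octets emitted literally: ASCII letters and digits only (a conservative
# assumption about the HTML4 escaping rules).
_ALNUM = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def form_encode_component(text: str) -> str:
    """Encode one name or value: UTF-8 first, then escape the octets."""
    octets = text.encode("utf-8")          # step 1: characters -> octets
    parts = []
    for octet in octets:                   # step 2: escape each octet
        if octet == 0x20:
            parts.append("+")              # space becomes "+"
        elif octet in _ALNUM:
            parts.append(chr(octet))       # alphanumerics pass through
        else:
            parts.append(f"%{octet:02X}")  # everything else becomes %HH
    return "".join(parts)


def form_decode_component(encoded: str) -> str:
    """Reverse the escaping to octets, then decode the octets as UTF-8."""
    octets = bytearray()
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch == "+":
            octets.append(0x20)
            i += 1
        elif ch == "%":
            octets.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            octets.extend(ch.encode("ascii"))
            i += 1
    return octets.decode("utf-8")


# The six code points from the example in the appendix text.
value = "\u0020\u0025\u0026\u002B\u00A3\u20AC"
print(value.encode("utf-8").hex(" ").upper())
print(form_encode_component(value))
```

Decoding goes through an octet buffer on purpose: the percent-escapes stand
for UTF-8 octets, not for individual characters, so the UTF-8 decoding has to
happen only after all escapes have been reversed.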
Best regards, Julian