Thanks, Julian.  Dick and I will integrate this into the draft.

                                -- Mike

-----Original Message-----
From: Julian Reschke [mailto:[email protected]] 
Sent: Thursday, July 12, 2012 1:31 AM
To: Mike Jones
Cc: [email protected]
Subject: Re: [OAUTH-WG] Preliminary OAuth Core draft -29

On 2012-07-09 17:01, Julian Reschke wrote:
> On 2012-07-09 16:48, Mike Jones wrote:
>> HTML5 is not cited because it's a working draft - not an approved 
>> standard.  In what way is "the definition of the media type in HTML4 
>> is known to be insufficient"?  People have been successfully 
>> implementing form-urlencoding with it for quite some time. :-)  Is 
>> there a specific wording change that you'd suggest that we make that 
>> doesn't involve citing a working draft, rather than an approved standard?
>
> For instance, the HTML4 "definition" doesn't even mention what to do 
> with non-ASCII characters.
>
> I understand that it's not particularly attractive, but citing HTML4 
> just because it's a "standard" isn't really helpful for people who 
> actually follow the link and try to understand what needs to be 
> implemented.
> ...

Here's an attempt to describe the encoding in terms of HTML4, plus additional 
instructions. This would need to be referenced anyway where the spec currently 
refers to the HTML4 media type definition:

-- snip --
Appendix X. Use of the application/x-www-form-urlencoded Media Type

At the time of publication of this specification, the 
"application/x-www-form-urlencoded" media type was defined in Section
17.13.4 of [HTML4], but not registered in the IANA media types registry 
(<http://www.iana.org/assignments/media-types/index.html>). Furthermore, the 
definition is incomplete as it does not consider non-US-ASCII characters.

To address this shortcoming, when generating payloads using this media type, 
names and values MUST be encoded using the "UTF-8" character encoding scheme 
([RFC3629]) first; the resulting octet sequence then needs to be further 
encoded using the escaping rules defined in [HTML4].

When parsing data from a payload using this media type, the names and values 
resulting from reversing the name/value encoding consequently need to be 
treated as octet sequences, to be decoded using the "UTF-8" 
character encoding scheme.

Example: A value consisting of the six Unicode code points (1) U+0020 (SPACE), 
(2) U+0025 (PERCENT SIGN), (3) U+0026 (AMPERSAND), (4) U+002B (PLUS SIGN), (5) 
U+00A3 (POUND SIGN), and (6) U+20AC (EURO SIGN) would be encoded into the octet 
sequence below (using hexadecimal notation):

   20 25 26 2B C2 A3 E2 82 AC

and then represented in the payload as:

   +%25%26%2B%C2%A3%E2%82%AC

-- snip --
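
For anyone who wants to check the example mechanically, here is a minimal sketch in Python. It is an illustration only and not part of the proposed text. The helper names encode_component and decode_component are made up for this note. It assumes that urllib.parse.quote_plus is close enough to the HTML4 escaping rules for this particular value; note that quote_plus leaves "_", ".", "-" and "~" unescaped, which may not match every reading of HTML4.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_component(text: str) -> str:
    """Hypothetical helper: UTF-8 encode first, then form-urlencode the octets."""
    # safe="" so that no reserved character (such as "/") is left unescaped.
    return quote_plus(text, safe="", encoding="utf-8", errors="strict")


def decode_component(escaped: str) -> str:
    """Hypothetical helper: reverse the escaping, then decode the octets as UTF-8."""
    # errors="strict" rejects octet sequences that are not valid UTF-8.
    return unquote_plus(escaped, encoding="utf-8", errors="strict")


# The six code points from the example: SPACE, PERCENT SIGN, AMPERSAND,
# PLUS SIGN, POUND SIGN, EURO SIGN.
value = "\u0020\u0025\u0026\u002B\u00A3\u20AC"

# Step 1: the UTF-8 octet sequence, in hexadecimal notation.
octets = " ".join(f"{b:02X}" for b in value.encode("utf-8"))

# Step 2: the escaped form as it would appear in the payload.
escaped = encode_component(value)

print(octets)   # 20 25 26 2B C2 A3 E2 82 AC
print(escaped)  # +%25%26%2B%C2%A3%E2%82%AC
print(decode_component(escaped) == value)  # True
```

Passing errors="strict" on the decoding side is a choice made for this sketch; the proposed text does not say how a parser should treat octets that are not valid UTF-8.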

Best regards, Julian


_______________________________________________
OAuth mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/oauth
