Anne van Kesteren wrote:
> On Tue, 08 Sep 2009 21:40:22 +0200, NARUSE, Yui <[email protected]> wrote:
>> First is about 4.10.16.4 URL-encoded form data.
>> http://www.whatwg.org/specs/web-apps/current-work/#application/x-www-form-urlencoded-encoding-algorithm
>>
>> In this algorithm at 6.2.1,
>> "SP, *, -, ., 0 .. 9, A .. Z, _, a .. z" is not escaped.
>> But many other specs which use application/x-www-form-urlencoded refers
>
> Which other specifications?

The following specifications (sorry, some of them refer to earlier RFCs):

XForms 1.0
http://www.w3.org/TR/xforms/#serialize-urlencode
"then non-ASCII and reserved characters (as defined by [RFC 2396] as
amended by subsequent documents in the IETF track) are escaped"
-> so RFC3986

HTML 4
http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1
"reserved characters are escaped as described in [RFC1738]"

RFC1738
http://www.faqs.org/rfcs/rfc1738.html
unreserved = alpha | digit | safe | extra
safe       = "$" | "-" | "_" | "." | "+"
extra      = "!" | "*" | "'" | "(" | ")" | ","

TAG Finding
"refer to section 2.1 of [RFC2396]."
http://www.w3.org/2001/tag/doc/whenToUseGet.html#i18n

RFC2396
http://www.faqs.org/rfcs/rfc2396.html
unreserved = alphanum | mark
mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

WSDL 2.0
http://www.w3.org/TR/wsdl20-bindings/#_http_x-www-form-urlencoded
"Replacement values falling outside the range (ALPHA and DIGIT below are
defined as per [IETF RFC 4234]): ALPHA | DIGIT | "-" | "." | "_" | "~" |
"!" | "$" | "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | ":" |
"@", MUST be percent-encoded."

>> URI's unreserved. And it in RFC3986 is
>> unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>> Why ~ is escaped and * is not escaped?
>
> What do browsers do?

IE8
QUERY_STRING: t=+%21%5c%22%5c%23%24%25%26%27%28%29*%2b%2c-.%2f0123456789%3a%3b%3c%3d%3e...@abcdefghijklmnopqrstuvwxyz%5b%5c%5c%5d%5e_%60abcdefghijklmnopqrstuvwxyz%7b%7c%7d%7e
not escaped: *...@_

Firefox 3.5
QUERY_STRING: t=+%21%5C%22%5C%23%24%25%26%27%28%29*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
not escaped: *-._

Chrome2
QUERY_STRING: t=+%21%5C%22%5C%23%24%25%26%27%28%29*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
not escaped: *-._

Opera9
QUERY_STRING: t=+%21%5C%22%5C%23%24%25%26%27%28%29%2A%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
not escaped: -._

Hmm, Firefox and Chrome follow this algorithm, IE additionally leaves @
unescaped, and Opera escapes *.
If this spec wants to stay on the safer side, * could also be escaped.

>> Third is about Web addresses in HTML 5. (this spec is also this ML?)
>> http://www.w3.org/html/wg/href/draft
>
> You want [email protected] or [email protected] for that draft.

Thanks, I'll send it.

-- 
NARUSE, Yui <[email protected]>
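
P.S. A minimal sketch of how the "not escaped" lists above can be derived
from the captured QUERY_STRING values, and of how the HTML5 draft's
unescaped punctuation compares with RFC3986's unreserved set. The function
and constant names are hypothetical and only for illustration; the two
sample values are the Firefox 3.5 and Opera9 strings quoted above (without
the leading "t=").

```python
import re
import string

# Sample values copied from the Firefox 3.5 and Opera9 results above.
FIREFOX = ("+%21%5C%22%5C%23%24%25%26%27%28%29*%2B%2C-.%2F0123456789"
           "%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_"
           "%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E")
OPERA = FIREFOX.replace("*", "%2A")  # Opera9 differs only in escaping "*"

# Punctuation left unescaped by each rule set, as quoted in this thread.
HTML5_DRAFT = set("*-._")
RFC3986_UNRESERVED = set("-._~")


def unescaped_punctuation(encoded: str) -> str:
    """Return the non-alphanumeric characters left literal in a value."""
    # Remove %XX escapes first so their hex digits are not counted.
    literal = re.sub(r"%[0-9A-Fa-f]{2}", "", encoded)
    # "+" stands for an encoded space here, not a literal plus sign.
    literal = literal.replace("+", "")
    alnum = string.ascii_letters + string.digits
    return "".join(sorted({c for c in literal if c not in alnum}))


print(unescaped_punctuation(FIREFOX))            # *-._
print(unescaped_punctuation(OPERA))              # -._
print(sorted(HTML5_DRAFT - RFC3986_UNRESERVED))  # ['*']
print(sorted(RFC3986_UNRESERVED - HTML5_DRAFT))  # ['~']
```

The last two lines show the mismatch asked about earlier: the draft leaves
"*" literal but escapes "~", while RFC3986 treats "~" as unreserved and "*"
as reserved.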
