Thanks for the extensive list, but I am hoping there is some consensus
amongst the various browsers about which characters they actually encode.
It would appear that IE and Netscape both pass the % symbol unencoded.
For example, suppose you want the following link to carry the string 50%230,
meaning discount=50% off $230:
<a href="http://host/servlet/Turbine/discount/50%230">
If you count on the browser to do the encoding, the % goes out unencoded, the
server reads %23 as an encoded #, and you are going to get
discount=50#0
However, suppose there are browsers out there that actually do encode the %
symbol, and you try to account for the fact that most do not by encoding
the % on the server before sending the link in the page:
<a href="http://host/servlet/Turbine/discount/50%25230">
This mythical browser is going to encode the % a second time, so you get
discount=50%25230.
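
To make the two cases above concrete, here is a minimal sketch using Python's
standard urllib.parse rather than anything from Turbine. The helper name
decode_once is made up for illustration and simply stands in for whatever
single round of decoding the server performs.

```python
from urllib.parse import quote, unquote

def decode_once(path_info: str) -> str:
    """Percent-decode once, standing in for the server's single decode pass."""
    return unquote(path_info)

# Case 1: the link carries a bare %, and the browser passes it through.
raw_link = "50%230"
print(decode_once(raw_link))        # 50#0  (%23 is read as an encoded #)

# Case 2: the server encodes the % before writing the link.
server_encoded = quote(raw_link, safe="")
print(server_encoded)               # 50%25230
print(decode_once(server_encoded))  # 50%230  (the intended value)

# Case 3: a hypothetical browser encodes the % of the link a second time.
double_encoded = quote(server_encoded, safe="")
print(double_encoded)               # 50%2525230
print(decode_once(double_encoded))  # 50%25230  (one layer of encoding left)
```

Only the second case gives back the intended string, which is why it matters
whether any browser really does re-encode the %.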
Does the spec say that % should be encoded, and the browsers are implementing
it wrong? Or are they correct in their implementation, so that one should
never expect a browser to double encode the %, which would cause an error
when we encode on the server? I guess I will have to go look at the spec.
There is some ambiguity. You cannot encode a "/" and expect it to work.
Whether you send out the path info as fraction/1/2 or as fraction/1%2f2,
it is going to come back as fraction/1/2, because the browser is going to
encode the / and not the %.
If the browser sees fraction/1/2, it converts it to fraction%2f1%2f2, and
Apache converts that back to fraction/1/2. But if the browser sees
fraction/1%2f2, it also converts it to fraction%2f1%2f2, and Apache again
converts that to fraction/1/2. To prevent this ambiguity the browser should
encode the %. Since we know it doesn't (in all circumstances?), we must
encode the % on the server if we expect it on return, and it is impossible
to send fraction=1/2 as path info.
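
The ambiguity can be shown in a few lines. This is only a sketch with
Python's urllib.parse, assuming the server applies one round of percent
decoding to the whole path info; it does not model what any particular
browser or Apache version actually does.

```python
from urllib.parse import unquote

# Two different links a page might contain.
plain = "fraction/1/2"      # "/" used as a path delimiter
escaped = "fraction/1%2f2"  # "/" meant as data inside one path element

# After one round of decoding, the two can no longer be told apart.
decoded_plain = unquote(plain)
decoded_escaped = unquote(escaped)
print(decoded_plain)    # fraction/1/2
print(decoded_escaped)  # fraction/1/2

# Splitting on "/" afterwards yields three elements in both cases,
# so the value 1/2 is lost as a single element.
print(decoded_escaped.split("/"))  # ['fraction', '1', '2']
```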
But are there any other characters which must be encoded on the server for
compatibility with some inept browser?
----- Original Message -----
From: Ilmari Karonen <[EMAIL PROTECTED]>
To: Turbine <[EMAIL PROTECTED]>
Sent: Thursday, March 23, 2000 3:01 PM
Subject: Re: url encoding and DynamicURI
>
>
> On Thu, 23 Mar 2000, John McNally wrote:
> > My limited investigation led me to believe that only % needs encoding
> > on the server side. Does anyone agree with this or is the problem more
> > difficult?
>
> It is more difficult. Essentially, the characters that MUST be encoded
> are those which have a special meaning in unencoded form. These include
> ";", "/", "?", ":", "@", "&", "=", "#" and "%". Other characters that
> should be encoded are " ", "<", ">", """ (double quote) and possibly also
> "{", "}", "|", "\", "^", "~", "[", "]" and "`". All nonprintable control
> characters must also be encoded, of course.
>
> The first nine characters are the problematic ones, though, since while
> the others can be encoded unconditionally, these must _not_ be encoded
> if they are used as metacharacters, for example to delimit query string
> parameters or path elements, and _must_ be encoded otherwise.
>
> The authoritative reference on this is RFC 1738, specifically section 2.2
> and the BNF definitions in section 5. Be careful in reading it, though,
> since some of the terms used are defined in ways that are subtly different
> from the common usage.
>
> --
> Ilmari Karonen
> http://www.sci.fi/~iltzu/
>
>
>
> ------------------------------------------------------------
> To subscribe: [EMAIL PROTECTED]
> To unsubscribe: [EMAIL PROTECTED]
> Problems?: [EMAIL PROTECTED]
>
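
The rule in the quoted reply (encode the special characters when they are
data, leave them alone when they act as delimiters) might be sketched like
this. It uses Python's urllib.parse for illustration; build_url and its
parameters are hypothetical names, not part of Turbine or DynamicURI.

```python
from urllib.parse import quote, urlencode

def build_url(base, segments, params=None):
    """Join already-split path elements and query parameters into a URL.

    Each element is encoded on its own with safe="", so a "/", "%", "#",
    and so on inside the data is escaped, while the "/" characters added
    by the join stay as real delimiters.
    """
    path = "/".join(quote(segment, safe="") for segment in segments)
    url = base.rstrip("/") + "/" + path
    if params:
        # urlencode escapes "&" and "=" inside keys and values only.
        url += "?" + urlencode(params)
    return url

base = "http://host/servlet/Turbine"
print(build_url(base, ["discount", "50% off $230"]))
# http://host/servlet/Turbine/discount/50%25%20off%20%24230
print(build_url(base, ["fraction", "1/2"], {"a&b": "x=y"}))
# http://host/servlet/Turbine/fraction/1%2F2?a%26b=x%3Dy
```

Whether the encoded "/" survives the trip back depends on the decoding
issue described earlier, so this covers only the sending side.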