On Mon, 4 Feb 2002, Bill Barker wrote:

> My understanding of this is that if the request is for:
>     /el-niño.jsp
> then most of the time Tomcat will read it correctly. But it will return for
> requestURI:
>     /el-ni%A1o.jsp
> The "safe chars" map to the same code points under iso-latin-1 and utf-8
> (that's why they are "safe chars").  UEncoder is strict in what is safe, but
> the RFC isn't.  You are allowed to use extended chars if the other side is
> capable of detecting the charset.

I wouldn't change this behavior - I think it's better to return the
second form rather than the first. The URL is supposed to be 7-bit safe.
It is something you can write on paper or type on any keyboard.

%A1 is not the same character under 8859_1 and utf8 (AFAIK - I may be
wrong). And "/el-niño.jsp" is hard to type on a keyboard, or to view,
for people with non-8859_1 charsets (there %A1 will show up as a very
different char).
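
To make the charset point concrete, here is a minimal Java sketch. It is
not Tomcat code: the class and method names are made up for illustration,
and the Charset overloads of URLEncoder/URLDecoder assume Java 10 or
later. It shows the same character turning into different escapes, and
the same escape decoding to different characters, depending on the
charset:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: not Tomcat code, all names here are hypothetical.
public class PercentEncodingDemo {

    // Percent-encode s using the bytes it has in the given charset.
    static String encode(String s, Charset cs) {
        return URLEncoder.encode(s, cs);
    }

    // Decode %XX escapes, interpreting the resulting bytes in the given charset.
    static String decode(String s, Charset cs) {
        return URLDecoder.decode(s, cs);
    }

    public static void main(String[] args) {
        String enye = "\u00f1"; // n with tilde, written as an escape to stay 7-bit

        System.out.println(encode(enye, StandardCharsets.ISO_8859_1)); // %F1
        System.out.println(encode(enye, StandardCharsets.UTF_8));      // %C3%B1

        // The same escape read back under two charsets:
        String latin1 = decode("%A1", StandardCharsets.ISO_8859_1);
        String utf8 = decode("%A1", StandardCharsets.UTF_8);
        System.out.printf("U+%04X%n", (int) latin1.charAt(0)); // U+00A1
        System.out.printf("U+%04X%n", (int) utf8.charAt(0));   // U+FFFD, a lone 0xA1 is malformed UTF-8
    }
}
```

So a bare %XX escape above 0x7f only means something once both sides
agree on the charset.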

IMHO the RFC is clear enough about what a 'safe char' is, and my
understanding was that anything >0x7f isn't.
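
A rough sketch of that reading of "safe", again in Java. This is not
UEncoder; the class name, the method names and the exact safe set are
assumptions for illustration. Every byte outside a small ASCII set,
including everything above 0x7f, is escaped:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: not Tomcat's UEncoder, all names here are hypothetical.
public class SevenBitUri {

    // Assumed safe punctuation for a path; the real safe set may differ.
    private static final String SAFE_MARKS = "-_.~/";

    // A byte is safe only if it is an ASCII letter, digit, or listed mark.
    static boolean isSafe(int b) {
        return (b >= 'a' && b <= 'z')
            || (b >= 'A' && b <= 'Z')
            || (b >= '0' && b <= '9')
            || SAFE_MARKS.indexOf(b) >= 0;
    }

    // Escape every unsafe byte of the path, using its bytes in charset cs.
    static String encodePath(String path, Charset cs) {
        StringBuilder out = new StringBuilder();
        for (byte raw : path.getBytes(cs)) {
            int b = raw & 0xff; // treat the byte as unsigned
            if (isSafe(b)) {
                out.append((char) b);
            } else {
                out.append(String.format("%%%02X", b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String path = "/el-ni\u00f1o.jsp";
        System.out.println(encodePath(path, StandardCharsets.ISO_8859_1)); // /el-ni%F1o.jsp
        System.out.println(encodePath(path, StandardCharsets.UTF_8));      // /el-ni%C3%B1o.jsp
    }
}
```

Either way the output is pure 7-bit ASCII, which is the property I care
about here.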

(The 'encoded' URI is something you are supposed to be able to print,
carry to a different computer, type in, and get to the page. You can't
type ñ on a Chinese or Greek keyboard.)

Costin


