On Mon, 4 Feb 2002, Bill Barker wrote:

> My understanding of this is that if the request is for:
>   /el-niño.jsp
> then most of the time Tomcat will read it correctly. But it will return for
> requestURI:
>   /el-ni%A1o.jsp
> The "safe chars" map to the same code points under iso-latin-1 and utf-8
> (that's why they are "safe chars"). UEncoder is strict in what is safe, but
> the RFC isn't. You are allowed to use extended chars if the other side is
> capable of detecting the charset.

I wouldn't change this behavior. I think it's better to return the second
form rather than the first. The URL is supposed to be 7-bit safe: it is
something you can write on paper or type on any keyboard.

%A1 is not the same under 8859_1 and utf8 (AFAIK; I may be wrong). And
"/el-niño.jsp" is hard to type on a keyboard, or to view, for people with
non-8859_1 charsets (%A1 will be a very different char there).

IMHO the RFC is clear enough about what a 'safe char' is, and my
understanding was that anything >0x7f isn't. (The 'encoded' URI is
something you are supposed to print, go to a different computer, type,
and get to the page. You can't type ñ on a Chinese or Greek keyboard.)

Costin
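
To make the charset point concrete, here is a minimal sketch of percent-encoding the same path under both charsets. It assumes the character in the example is ñ (U+00F1). PercentEncodeSketch and encodePath are hypothetical names for illustration only: this is not Tomcat's UEncoder, and the set of characters passed through unescaped is a simplification.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PercentEncodeSketch {

    // Hypothetical helper (not Tomcat's UEncoder): percent-encode every byte
    // that is not an ASCII letter, digit, or one of - . _ ~ /
    public static String encodePath(String path, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : path.getBytes(charset)) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean passThrough =
                    (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9')
                    || v == '-' || v == '.' || v == '_' || v == '~' || v == '/';
            if (passThrough) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: the character in the thread's example is U+00F1.
        String path = "/el-ni\u00F1o.jsp";
        System.out.println(encodePath(path, StandardCharsets.ISO_8859_1)); // /el-ni%F1o.jsp
        System.out.println(encodePath(path, StandardCharsets.UTF_8));      // /el-ni%C3%B1o.jsp
    }
}
```

The ASCII part of the path comes out identical either way; only the byte(s) above 0x7f differ, so the escaped form is unambiguous only if both sides agree on the charset.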