> From: Chris Burdess [mailto:[EMAIL PROTECTED]]
> Subject: Possible bug in request parameter decoding
> 
> According to
>   http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars
> request parameters are encoded in UTF-8.

Well, that's not quite how I read it.  By definition (RFC 2396), URIs
are not supposed to contain non-ASCII values.  The HTML 4.0 appendix
referred to above makes the somewhat contradictory suggestion to use
UTF-8 to handle non-ASCII, ignoring the fact that UTF-8 encoding
produces byte values outside of the ASCII range.  Since the discussion
in this area of the appendix is related to browser, not server,
behavior, it's not really relevant to what Tomcat should do when it
encounters illegal (non-ASCII) values in a URI supplied on a browser
request.

In any event, as Tim noted, you can configure how the connector
interprets non-ASCII bytes by setting the URIEncoding attribute on the
<Connector> element in your server.xml file.  I suspect that the default
value of ISO-8859-1 is there largely for historical reasons, since it
was the predominant encoding before UTF-8 became popular.
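
To see why the setting matters, here is a rough sketch (not Tomcat's
actual code) that decodes the same percent-escaped bytes under both
charsets.  The class name UriEncodingDemo and the sample value are
made up for illustration, and java.net.URLDecoder is only standing in
for whatever the connector does internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; URLDecoder is a stand-in for the connector's decoding.
public class UriEncodingDemo {

    // Decode a percent-escaped value, interpreting the raw bytes in the named charset.
    static String decode(String escaped, String charsetName) {
        try {
            return URLDecoder.decode(escaped, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "café" with the é sent as its two UTF-8 bytes (0xC3 0xA9), each escaped.
        String escaped = "caf%C3%A9";

        // Bytes read as UTF-8: the two bytes form one character.
        System.out.println(decode(escaped, "UTF-8"));

        // Bytes read as ISO-8859-1: each byte becomes its own character.
        System.out.println(decode(escaped, "ISO-8859-1"));
    }
}
```

If the browser sent UTF-8 and the bytes are read as ISO-8859-1, the
two bytes of the accented character come back as two separate
characters, which is the kind of mangling the original report
describes.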

 - Chuck

