-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jan,

On 10/24/2009 5:58 AM, Pfeifer Jan wrote:
> String decoded = new String(param.getBytes("iso-8859-1"),"UTF-8");

(I'm all out of breath from replacing those &quot; escapes with "
symbols... I need to get more exercise).

The above line of code is only valid if:

1. The bytes coming from the client were supposed to be UTF-8
2. Your server has been configured to interpret the data coming from
   clients unconditionally as ISO-8859-1
3. The characters you are trying to decode are in the ASCII character
   set

Why the third constraint? Because, if the client sends UTF-8 and the
server decodes that as ISO-8859-1, information is lost in the
translation... the bytes are not going to be magically re-combined into
UTF-8 bytes when you call getBytes("ISO-8859-1") on them. It's only
going to make things worse.
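
To make the first two conditions concrete, here is a rough sketch of
what the server-side misdecoding looks like, and of what your line does
to a string that was already decoded correctly. The class and method
names (RecodeDemo, recode, misdecode) are made up for illustration and
are not part of Tomcat; StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here is a Tomcat API.
class RecodeDemo {

    // The line from the original message, using charset constants
    // instead of charset names.
    static String recode(String param) {
        return new String(param.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    // Simulates a server that reads the client's UTF-8 bytes as ISO-8859-1.
    static String misdecode(String sentByClient) {
        return new String(sentByClient.getBytes(StandardCharsets.UTF_8),
                          StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // e with acute accent, two bytes in UTF-8

        // One character from the client turns into two on the server.
        System.out.println(misdecode(eAcute).length());    // 2

        // Plain ASCII passes through the re-coding line unchanged.
        System.out.println(recode("abc"));                 // abc

        // If the server had already decoded the parameter correctly
        // (condition 2 does not hold), re-coding damages it.
        System.out.println(recode(eAcute).equals(eAcute)); // false
    }
}
```

The last case is the one to worry about: the moment the container is
configured correctly, the re-coding line starts breaking data that was
fine before.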

The only time transcoding bytes is appropriate is when you are decoding
GET parameters, because any POST parameters ought to have been sent with
a correct Content-Type header (including a charset parameter).

It would be better to install a filter that sets the character encoding
of the request /before/ any data has been read from it, if you are
worried about the client sending an incorrect content type.

As for GET parameters, you're pretty much screwed as Andre points out:
there's just no standard for URL encoding (okay, yes, there is a
standard: use URL/%-encoded ISO-8859-1, unless the browser is modern and
uses UTF-8 instead of ISO-8859-1 as its default URL encoding). It's just
a mess.
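
A small sketch of why this is a mess, using only java.net.URLEncoder and
java.net.URLDecoder from the JDK. The class name UrlCharsetDemo and its
helper methods are hypothetical; the point is just that the same
character produces different %-escapes depending on the charset the
browser picked, and the server cannot tell which one it got.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; the helpers only wrap the JDK calls so that
// callers do not have to deal with the checked exception.
class UrlCharsetDemo {

    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static String decode(String escaped, String charset) {
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // e with acute accent

        // Same character, two different query-string forms.
        System.out.println(encode(eAcute, "ISO-8859-1")); // %E9
        System.out.println(encode(eAcute, "UTF-8"));      // %C3%A9

        // Decoding with the charset the client used gives the character back.
        System.out.println(decode("%C3%A9", "UTF-8").equals(eAcute)); // true

        // Decoding with the other charset gives two wrong characters.
        System.out.println(decode("%C3%A9", "ISO-8859-1").length());  // 2
    }
}
```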

> for
> a start, I know about URIEncoding in server.xml and about using
> Encoding filter, but we use this for decoding GET request for
> historical reasons. Or is there more "correct" way to
> decode String?

If you always want your strings decoded as UTF-8, then set
URIEncoding="UTF-8" on your <Connector> and be done with it. Don't have
your webapp's code re-coding strings that come from clients.

Again, read the CharacterEncoding page on the Wiki, as previously
suggested. All will become clear.

Well, the solution becomes clear, at least.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrlyfgACgkQ9CaO5/Lv0PAenQCgsmZN7pMGMuhuBO9x1hZ3z5A2
MV0AoJW1MtGpPwWDGrdwy50NhETwvedX
=2ZXB
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org