form values being sent as HTML entities instead of UTF-8 chars

jeff . guttadauro Thu, 10 Jan 2002 09:03:08 -0800

Hello.

     I'm having a problem which I believe is related to Tomcat 4, since I
didn't see this happening on 3.2 before I upgraded.  I have a form on a page
that is set to UTF-8 character encoding.  When I paste a Unicode character
into an input field and submit the form, the character is received as an
HTML character entity (for example, &#1234;) instead of as UTF-8 bytes.


     There seem to be so many encoding settings all over the place, and I'm
really confused as to which settings control what.  I used to just have the
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"> tag in my
page, and things worked fine before the upgrade.  I thought that the tag would
tell the browser which encoding to use, and it would handle displaying content
using that encoding and would also submit form values in that same encoding.
But it seems as though Tomcat is affecting how the values are submitted.
I've seen some discussion of a character encoding filter, but I'm not sure how
it all works.  I've seen a bunch of different approaches: the META tag, a page
directive for contentType, a page directive for pageEncoding,
request.setCharacterEncoding, response.setCharacterEncoding, and the
SetCharacterEncodingFilter class.  Could anyone clarify what each of these
ways of setting the character encoding actually controls, and in which
situations each should be used?
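
To make the question concrete, here is a minimal sketch in plain Java (no
servlet API) of what I expect to receive versus what I actually get.  The
class and method names are made up for illustration, and the sample character
is simply the one whose code point is 1234.  It only shows how the charset
used for decoding changes the result; it does not reproduce whatever Tomcat
or the browser is doing.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Tomcat or the servlet API.
public class FormEncodingDemo {

    // Percent-encode a form value as if the form were submitted in `charset`.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decode a percent-encoded form value, interpreting the bytes as `charset`.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "\u04D2"; // code point 1234 (decimal)

        // What I expect on the wire: the UTF-8 bytes, percent-encoded.
        String onWire = encode(typed, "UTF-8");
        System.out.println(onWire);

        // Decoding with the matching charset recovers the character;
        // decoding the same bytes as ISO-8859-1 yields two wrong characters.
        System.out.println(decode(onWire, "UTF-8").equals(typed));
        System.out.println(decode(onWire, "ISO-8859-1").length());

        // What I am receiving instead: a plain-ASCII entity, which comes
        // through unchanged no matter which charset is used for decoding.
        String entity = "&#1234;";
        System.out.println(decode(encode(entity, "UTF-8"), "ISO-8859-1"));
    }
}
```

Since the entity form is pure ASCII, my guess is that it is produced before
the value is ever decoded on the server side, but I have not confirmed that.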

Thanks in advance for any light you can shed on this for me!
-Jeff



