signoff SERVLET-INTEREST

-----Original Message-----
From: Tomas Zeman [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, May 29, 2001 3:57 AM
To: [EMAIL PROTECTED]
Subject: UTF-8 and getParameter() WAS: unicode special characters

Hi Marco,

I have been experimenting with servlets/JSP and UTF-8 for a while, and this is what I have found:

If I set res.setContentType("text/html; charset=UTF-8;"); in my servlet and POST data to the server, then each special character (one not in ISO-8859-1) is sent as 2 bytes in hex. For example, if I send the server a small "e" with stud/wedge (ě), the server gets text=%C4%9B, which is the correct UTF-8 value for that character in hex.

And now comes the main question: what will the servlet engine do with it? I use Resin 1.2.5, and if I do

String par = req.getParameter("text");

for that one e with stud, the par string has a length() of 2, which is OK, and it displays OK on the page as 1 character thanks to the header.

But if I set setContentType("text/html; charset=UTF-8"); without the last ";" after UTF-8, the par string length() is only 1 !! And it displays OK on the page too.

Does anyone here have a working UTF-8 site? I have spent too many hours trying to find a good solution to this problem.

(btw1.: I use the <form> as <form method=\"POST\" accept-charset=\"UTF-8\">)
(btw2.: I think this mess is because Java's char has 2 bytes by default)

Tomas Zeman

>Date: Mon, 28 May 2001 15:47:51 +0200
>From: Marco Trevisan <[EMAIL PROTECTED]>
>Subject: Re: How can i send unicode special characters with my request ?
>
>Actually I found a solution for a subset of characters of our interest
>with Tomcat 3.2.1.
>We made a patch for the euro symbol by setting the character set with:
>response.setContentType("text/html; charset=UTF-8");
>and manually decoding from responses an "intermediate" 8859_1 codepage.
>This is due to the 3.2.1 implementation, which ALWAYS opens a reader
>from the request with 8859_1 encoding.
>I think you can do the same with the characters you are interested in.
>I'm interested in alternative solutions while waiting for a 2.3 compliant
>servlet container where one can specify with which encoding the request is
>formed
>(the browser doesn't tell anything about it) or how other servlet containers
>behave.
>
>Reference:
>Search "charset used for parameters decoding on HTTP request Tomcat3.x,4 "
>in [EMAIL PROTECTED]
>
>Regards,
>Marco

___________________________________________________________________________
To unsubscribe, send email to [EMAIL PROTECTED] and include in the body
of the message "signoff SERVLET-INTEREST".

Archives: http://archives.java.sun.com/archives/servlet-interest.html
Resources: http://java.sun.com/products/servlet/external-resources.html
LISTSERV Help: http://www.lsoft.com/manuals/user/user.html
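The length difference described in the message (2 versus 1) and the "intermediate 8859_1" workaround can be reproduced outside any servlet container. The following is only a minimal sketch, not the code either poster used: the class name Utf8ParamDemo and the helpers decodeAs and repair are hypothetical, and it assumes the container decoded the raw bytes C4 9B as ISO-8859-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative only.
public class Utf8ParamDemo {

    // Decode a percent-encoded form value using the named charset.
    static String decodeAs(String encoded, String charsetName)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, charsetName);
    }

    // Assumes the input was wrongly decoded as ISO-8859-1: recover the
    // original bytes, then decode those bytes as UTF-8.
    static String repair(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String wire = "%C4%9B"; // the value quoted in the message

        String asLatin1 = decodeAs(wire, "ISO-8859-1"); // one char per byte
        String asUtf8 = decodeAs(wire, "UTF-8");         // one char for both bytes
        String repaired = repair(asLatin1);

        System.out.println(asLatin1.length());        // prints 2
        System.out.println(asUtf8.length());          // prints 1
        System.out.println(repaired.equals(asUtf8));  // prints true
    }
}
```

In a servlet, the same repair would be applied to the String returned by req.getParameter("text"), but only when the container is known to have decoded the request as ISO-8859-1. On a Servlet 2.3 container, calling request.setCharacterEncoding("UTF-8") before the first getParameter() call is the API intended to make this manual step unnecessary.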
Re: UTF-8 and getParameter() WAS: unicode special characters
Athar, Zarina (MED, Exec Search) Tue, 29 May 2001 09:20:37 -0700