Try HttpServletResponse.setContentType("text/html; charset=UTF-8"); in the
servlet that generates the page.

The page directive is the JSP tag for doing this.
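
If the container has already decoded the parameter with the wrong charset, a
common workaround is to recover the raw bytes and decode them again. The
following is only a sketch under assumptions: it assumes the browser really
sent UTF-8 and the container decoded it as ISO-8859-1, it uses a modern JDK
(java.nio.charset.StandardCharsets), and the class and method names are made
up for illustration. The test character is the one with decimal value 20840
(U+5168).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any servlet API.
class ParamRedecode {

    // Simulates a container that decodes the posted bytes as ISO-8859-1.
    static String decodeAsLatin1(byte[] wireBytes) {
        return new String(wireBytes, StandardCharsets.ISO_8859_1);
    }

    // ISO-8859-1 maps every byte to exactly one char, so encoding the
    // mis-decoded string back to ISO-8859-1 returns the original bytes,
    // which can then be decoded with the charset the browser actually used
    // (assumed here to be UTF-8).
    static String redecodeAsUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+5168 is three bytes in UTF-8.
        byte[] wireBytes = "\u5168".getBytes(StandardCharsets.UTF_8);

        String wrong = decodeAsLatin1(wireBytes);
        String fixed = redecodeAsUtf8(wrong);

        System.out.println("mis-decoded length: " + wrong.length());
        System.out.println("re-decoded length:  " + fixed.length());
        System.out.println("code unit value:    " + (int) fixed.charAt(0));
    }
}
```

In a servlet, the string passed to redecodeAsUtf8 would be the result of
request.getParameter("uniname"). If the browser sent some other encoding, the
second charset has to match that encoding instead.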

Addison

===========================================================
Addison P. Phillips                    Principal Consultant
Inter-Locale LLC                http://www.inter-locale.com
Los Gatos, CA, USA          mailto:[EMAIL PROTECTED]

+1 408.210.3569 (mobile)              +1 408.904.4762 (fax)
===========================================================
Globalization Engineering & Consulting Services

On Wed, 29 Nov 2000, Bhalchandra Patil wrote:

> thanks addison,
> 
> but i am still not clear.
> I have rectified the mistake in the META tag.
> 
> I could not find any equivalent command for the
> <%@ page contentType="text/html; charset=UTF-8" %> in servlet apis.
> 
> but, i have changed the default character set of the webserver to utf-8.
> 
> Now i am entering only one chinese character in the Textfield (with name say
> "uniname") in html page, whose unicode value is 20840 decimal or 0x5168 hex.
> When i submit the page to the server, the request goes to servlet and when i
> say String str = request.getParameter("uniname"), it should give me a string
> with length 1 ( and (long)str.charAt(0) should give me 20840 ).
> [ Rather i want such string ]
> 
> Is it the right format of the unicode string what i am expecting?
> 
> Instead, it gives me a string with two characters whose values are 145 and
> 83.
> 
> Is there any fundamental mistake i am doing or its something to do with
> webserver's handling of posted unicode data?
> 
> 
> regards,
> bhala
> 
> ----- Original Message -----
> From: <[EMAIL PROTECTED]>
> To: Bhalchandra Patil <[EMAIL PROTECTED]>
> Cc: Unicode List <[EMAIL PROTECTED]>
> Sent: Wednesday, November 29, 2000 10:42 PM
> Subject: Re: posting of unicode data to servlet
> 
> 
> > Hi Bhala,
> >
> > When you use request.getParameter( ) the request class converts the data
> > POSTed to a Java String object. This includes converting the data from
> > whatever the servlet *thinks* the page is encoded as to Java's internal
> > representation, which is UCS-2 (i.e. Unicode).
> >
> > It is important to tell the servlet what the encoding of the page is,
> > therefore. Just putting a META tag into the page won't do it. In a JSP
> > page, for example, you can declare:
> >
> >  <%@ page contentType="text/html; charset=UTF-8" %>
> >
> > Note that your META tag has a typo in it. There should not be a
> > double-quote after the charset=.
> >
> > You should be aware that you can generate the page in any valid character
> > set and weblogic's servlet engine will convert the results to Unicode for
> > you. For example, you might choose to use the Big5 character encoding for
> > a Traditional Chinese page. The page directive will result in data POSTed
> > to you being converted to a Java String (and thus Unicode).
> >
> > If you want to get access to the specific *characters* in the String you
> > can use the various methods for accessing chars and char arrays in the
> > String class in conjunction with the Character class to access all kinds
> > of useful information about specific characters. Using getBytes() the way
> > you've described will result in converting the characters to a byte
> > oriented encoding, such as UTF-8, which is not really what you want to do
> > in this case.
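
To make the distinction concrete, here is a small sketch (the class and method
names are hypothetical, and a modern JDK is assumed) that reads the numeric
value of each char directly from the String and, for contrast, counts the
bytes that getBytes produces for UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class InspectChars {

    // Returns the numeric value of each char (UTF-16 code unit) in the string.
    static int[] codeUnits(String s) {
        int[] values = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            values[i] = s.charAt(i);
        }
        return values;
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String str = "\u5168"; // stands in for request.getParameter("uniname")

        for (int value : codeUnits(str)) {
            System.out.println(value
                    + " = 0x" + Integer.toHexString(value)
                    + ", letter: " + Character.isLetter((char) value));
        }

        // getBytes answers a different question: how the text is serialized.
        System.out.println("UTF-8 bytes: " + utf8Length(str));
    }
}
```

The char values are what the question asks for; the byte count only describes
one particular serialization of the same text.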
> >
> > Best Regards,
> >
> > Addison
> >
> > On Wed, 29 Nov 2000, Bhalchandra Patil wrote:
> >
> > > Hi,
> > >
> > > i am running a servlet on weblogic (jre 1.2). The html page should
> > > accept input in any character set, say chinese. That value is posted
> > > to the servlet. I want to retrieve the unicode value of the character
> > > in the servlet.
> > >
> > > In the html page, i have specified meta tag
> > > <META HTTP_EQUIV="Content-Type" content="text-html; charset="UTF-8">
> > >
> > > in servlet, i am using String str  = request.getParameter("name")
> > > str.getBytes("UTF8") does not work.
> > >
> > > What should i do to get the unicode values of the characters entered.
> > >
> > > Please help!!!!!
> > >
> > > regards,
> > > bhala
> > >
> > >
> > >
> > >
> >
> 
> 
