This is exactly what should happen. You are working with characters, not bytes,
hence a single character decoded from UTF-8 has a length of 1.
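
To make the character/byte distinction concrete, here is a minimal standalone
sketch (not part of the test JSP; the class and method names are made up for
illustration). It assumes the letter entered was U+05D0, which is code point
1488 as reported. It also suggests, as a guess rather than a confirmed
diagnosis, that the length of 7 seen on 4.0.x matches the numeric character
reference for that letter rather than the letter itself.

```java
import java.nio.charset.StandardCharsets;

public class CharsVsBytes {
    // String.length() counts UTF-16 code units (chars), not bytes.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes the same text occupies once encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // Hebrew letter alef, code point 1488 (assumed input)
        System.out.println(charCount(alef));      // 1 character
        System.out.println(utf8ByteCount(alef));  // 2 bytes in UTF-8
        // The HTML numeric character reference for the same letter is 7 characters long.
        System.out.println(charCount("&#1488;")); // 7
    }
}
```

So a length of 1 is what a correctly decoded parameter should report.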

Mark

> -----Original Message-----
> From: Asher Tarnopolski [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, July 04, 2004 11:18 PM
> To: Tomcat Users List
> Subject: Re: utf-8 with tomcat 5: second round 
> 
> hey mark, thanks for response.
> i run the code i pasted below.
> for example, i enter one hebrew letter. its unicode
> code point is 1488.
> on tc 4.0.xx i get the following results:
> 
> 7 (the length of its utf-8 code)
>  א (the letter itself in utf-8 encoding)
>  &#1488; (same as above parsed to be visible in browser)
> 
> in tc 5 i get this:
> 1(which already lets me know that this is not really utf-8)
> the entered hebrew letter
> the entered hebrew letter (nothing is parsed, so the '&' sign
> wasn't even met)
> this is it.
> 
> ----- Original Message -----
> From: "Mark Thomas" <[EMAIL PROTECTED]>
> To: "'Tomcat Users List'" <[EMAIL PROTECTED]>; "'Asher
> Tarnopolski'" <[EMAIL PROTECTED]>
> Sent: Sunday, July 04, 2004 8:46 PM
> Subject: RE: utf-8 with tomcat 5: second round
> 
> 
> > Asher,
> >
> > A few questions...
> >
> > What do you put in the text box on the form and what output do you see?
> >
> > Are you really using "<form act="/tests/utf.jsp" method=post>" or do you mean
> > <form action="/tests/utf.jsp" method=post>?
> >
> > When I did my test I copied your UTF-8 character from the bugzilla report and
> > pasted it into the text box. I was seeing question marks in the output until I
> > added the <%@ page pageEncoding="UTF-8"%>. The test was on XP (as per the bug
> > report) and I assume you used IE as the browser.
> >
> > The URI encoding is a red herring in this case. Because you are using post it is
> > only the request encoding that matters.
> >
> > The full text of my test JSP is below.
> >
> > Mark
> >
> > <%@ page language="java" import="java.lang.*,java.util.*" %>
> > <%@ page pageEncoding="UTF-8" %>
> > <html>
> > <body>
> >
> > <form action="bug29900.jsp" method=post>
> > <input type=text name=source >
> > <input type=submit>
> > </form>
> > <p>
> >
> > <%
> > request.setCharacterEncoding("UTF-8");
> >
> > if(request.getParameter("source")!=null)
> > {
> >   out.println(request.getParameter("source").length()+"<p>");
> >
> >   out.println(request.getParameter("source"));
> >
> >   StringBuffer sb = new StringBuffer();
> >   for(int i=0; i<request.getParameter("source").length(); i++)
> >   {
> >     if(request.getParameter("source").charAt(i) == '&')
> >       sb.append("&amp;");
> >     else
> >       sb.append(request.getParameter("source").charAt(i));
> >
> >   }
> >   out.println("<p>"+ sb.toString());
> > }
> > %>
> >
> > </body>
> > </html>
> >
> >
> >
> > > -----Original Message-----
> > > From: Asher Tarnopolski [mailto:[EMAIL PROTECTED]
> > > Sent: Sunday, July 04, 2004 6:25 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: utf-8 with tomcat 5: second round
> > >
> > > hi folks,
> > > i've published a question about it a couple of days ago, but
> > > didn't get any responses.
> > > i've tried some things i found in bugzilla, but they didn't
> > > help. so, i wanna try to get your help once more.
> > > once more about my problem:
> > > i try to send utf-8 encoded parameters in POST body, but they
> > > arrived encoded in ISO...
> > > this worked perfectly with tomcat 4.0.x.
> > > from the info i've got from a developer at bugzilla i learned
> > > that the difference between tc4.0 and tc5
> > > that causes the change is actually in coyote http1.1
> > > connector. there is an  attribute
> > > called useBodyEncodingForURI which was set to "true" in tc4,
> > > but became "false" in tc5.
> > > setting it to "true" together with <%@ page
> > > pageEncoding="UTF-8" %> and
> > > > <%request.setCharacterEncoding("UTF-8");%> will make the difference.
> > > i made the change, the jsp tags are in the code and coyote
> > > settings look like this now:
> > >
> > > <code>
> > > <!-- Define a non-SSL Coyote HTTP/1.1 Connector on port 8080 -->
> > >     <Connector port="8080"
> > >                maxThreads="150" minSpareThreads="25"
> > > maxSpareThreads="75"
> > >                enableLookups="false" redirectPort="8443"
> > > acceptCount="100"
> > >                debug="0" connectionTimeout="20000"
> > >                useBodyEncodingForURI="true"
> > >                disableUploadTimeout="true" />
> > > </code>
> > >
> > > but this doesn't help! another request to bugzilla didn't
> > > help either, i was told that this is not a bug in tomcat,
> > > so they are not going to deal with the question. well, may be
> > > it's not a tomcat bug, but it should be some kind of bug.
> > > any ideas?
> > >
> > > my testing code comes here:
> > >
> > > <code>
> > >
> > > > <%@ page contentType="text/html; charset=utf-8"%>
> > > > <%@ page pageEncoding="utf-8"%>
> > > <html>
> > > <head>
> > > </head>
> > > <body>
> > >
> > > <form act="/tests/utf.jsp" method=post>
> > > <input type=text name=source >
> > > <input type=submit>
> > > > </form>
> > > <p>
> > >
> > > <%
> > > request.setCharacterEncoding("UTF-8");
> > >
> > > if(request.getParameter("source")!=null)
> > > {
> > >   out.println(request.getParameter("source").length()+"<p>");
> > >
> > >   out.println(request.getParameter("source"));
> > >
> > >   StringBuffer sb = new StringBuffer();
> > >   for(int i=0; i<request.getParameter("source").length(); i++)
> > >   {
> > >     if(request.getParameter("source").charAt(i) == '&')
> > > >       sb.append("&amp;");
> > >     else
> > >       sb.append(request.getParameter("source").charAt(i));
> > >
> > >   }
> > >   out.println("<p>"+ sb.toString());
> > > }
> > > %>
> > >
> > > </body>
> > > </html>
> > >
> > >
> > > </code>
> > >
> >
> >
> >
> > 
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> 
