Problems with utf-8 encoding

2005-09-19 Thread Yair Zohar


Hello,
I'm using Tomcat 4.1.18.
I'm trying to read Hebrew data in UTF-8 encoding from the database. As a 
check, I entered a UTF-8 encoded 'alef' letter into a database field
(I see it in the database as the single letter 'alef'). The JSP page that 
displays the data prints two chars instead of one. I checked the values 
of these chars and they are 215 114, which are the UTF-8 byte combination 
that creates the letter 'alef' (so I was told).
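
A small sketch of what the symptom above may correspond to, assuming the two characters come from UTF-8 bytes being decoded with a single-byte charset such as ISO-8859-1 somewhere between the database and the page (that cause is an assumption, not something the post confirms). The class and method names are made up for illustration. Note that the UTF-8 encoding of alef (U+05D0) is the byte pair 215 144, so the reported "215 114" may be a typo.

```java
import java.nio.charset.StandardCharsets;

public class AlefBytes {

    // Returns the unsigned values of the UTF-8 bytes of the given text.
    static int[] utf8Bytes(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        int[] values = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            values[i] = raw[i] & 0xFF; // bytes are signed in Java
        }
        return values;
    }

    // Simulates the suspected fault: UTF-8 bytes read back as ISO-8859-1,
    // which turns every byte into its own char.
    static String misdecode(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // Hebrew letter alef
        int[] bytes = utf8Bytes(alef);
        System.out.println(bytes[0] + " " + bytes[1]); // prints "215 144"

        String wrong = misdecode(alef);
        System.out.println(wrong.length());            // prints "2"
    }
}
```

If a page shows two chars with these values, the bytes themselves are intact and only the decoding step is wrong, which is why fixing the request/response/driver charset settings can repair the output without touching the stored data.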


JSP code:

<%@ page language="java" contentType="text/html;charset=UTF-8" 
pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %>


<jsp:useBean id="tables" scope="page" class="tablesHandler.TableViewer" />
<jsp:setProperty name="tables" property="*"/>

<% request.setCharacterEncoding("UTF-8"); %>

<html>

<head>
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>


I tried all the combinations of the 'UTF-8' directives.
Does anyone have an idea how I can tell Tomcat to display it as one char 
(the letter alef) and not as two separate gibberish chars?

Or maybe it's another issue?
Thanks in advance,
Yair.



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Problems with utf-8 encoding - continue

2005-09-19 Thread Yair Zohar


Sorry for the double mail,
I forgot to add my server.xml encoding definitions:

<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
  port="8080" URIEncoding="UTF-8" useBodyEncodingForURI="true"
  minProcessors="5" maxProcessors="75"
  enableLookups="true" redirectPort="8443"
  acceptCount="100" debug="0" connectionTimeout="2"
  useURIValidationHack="false" disableUploadTimeout="true" />

I tried it with and without the useBodyEncodingForURI="true" directive.
Yair.





Re: Problems with utf-8 encoding

2005-09-19 Thread Anto Paul

Move <% request.setCharacterEncoding("UTF-8"); %> to before the <jsp:useBean> tag.

-- 
rgds
Anto Paul




Re: Problems with utf-8 encoding

2005-09-19 Thread Yair Zohar


Thanks for replying,
It didn't fix the problem, I still see the same two chars.
Yair.



Re: Problems with utf-8 encoding - continue

2005-09-19 Thread Jilles van Gurp

Why aren't you calling setContentType("text/html; charset=UTF-8") on the response?

What content-type is the server actually returning (use the live http 
headers extension for firefox or something similar to find out).


What database and jdbc driver are you using? What method are you using 
to store the string in the database?


I've had UTF-8 trouble with several databases. For example, MySQL 4.1 + 
the latest JDBC driver + setCharacterStream had some strange effects. 
First of all, you need to tell MySQL to use UTF-8 (it defaults to 
something else), and even if you do that, setCharacterStream has some 
issues that go away if you use setString. Oracle, on the other hand, 
cannot insert strings larger than 4KB with setString, so you need to use 
setCharacterStream. Incidentally, the MySQL driver's setCharacterStream 
is implemented using setString!


Regards,

Jilles




Re: Problems with utf-8 encoding

2005-09-19 Thread Anto Paul
Does your browser support Hebrew? If you are just getting the data
from the database and displaying it, it should work fine. What database and
JDBC driver are you using?



-- 
rgds
Anto Paul




Re: Problems with utf-8 encoding - continue

2005-09-19 Thread Christoph Kutzinski

Jilles van Gurp wrote:
> Oracle on the other hand 
> cannot insert strings larger than 4KB with setString so you need to use 
> setCharacterStream.


FYI:

This is common knowledge that used to be right, but isn't anymore.
With the Oracle 10g JDBC driver you can set arbitrary-length strings 
with setString.


http://www.oracle.com/technology/sample_code/tech/java/codesnippet/jdbc/clob10g/handlingclobsinoraclejdbc10g.html





RE: Problems with utf-8 encoding

2005-09-19 Thread Guy Katz
Put an encoding filter in front of your servlets/JSPs that sets UTF-8 
encoding for incoming requests and outgoing responses. It's your safest bet for 
Tomcat 4, as far as I remember.




Re: Problems with utf-8 encoding - continue

2005-09-19 Thread Yair Zohar


Jilles van Gurp wrote:

> Why aren't you calling setContentType("text/html; charset=UTF-8") on the response?

As I use JSP, I don't know how I can control the response that way.

> What content-type is the server actually returning (use the live http 
> headers extension for firefox or something similar to find out).

I couldn't find out whether this extension is installed, or how to install it.
The page info shows:

Type: text/html
Encoding: UTF-8

Meta:
Content-Type text/html; charset=UTF-8

> What database and jdbc driver are you using? What method are you using 
> to store the string in the database?

I use MySQL 4.1.14 + connector 3.1.10.
The URL for the driver is: 
"jdbc:mysql://"+Utils.getServerName()+":3306/"+Utils.getDatabaseName()+"?characterEncoding=UTF-8&characterSetResults=UTF-8"

The table definitions:
ENGINE=MyISAM DEFAULT CHARSET=utf8

> I've had utf-8 trouble with several databases. For example mysql 4.1 + 
> the latest jdbc driver + setCharacterStream had some strange effects. 
> First of all you need to tell mysql to use utf-8 (it defaults to 
> something else) and even if you do that setCharacterStream has some 
> issues that go away if you use setString. Oracle on the other hand 
> cannot insert strings larger than 4KB with setString so you need to 
> use setCharacterStream. Incidentally, the mysql driver implementation of 
> setCharacterStream is implemented using setString!

I use the Statement class's executeUpdate(String str) method to update and 
executeQuery(String str) to query the database.
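
A rough sketch tying together two points from this exchange: the connection URL needs its two encoding parameters separated by `&`, and text can be handed to the driver through PreparedStatement.setString (the call reported above to behave well with MySQL) rather than concatenated into a SQL string. The class name, the `letters` table, and the `name` column are hypothetical and only serve the illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Utf8Jdbc {

    // Builds a MySQL JDBC URL with the two encoding parameters from the post.
    // Each parameter after the '?' must be separated by '&'.
    static String mysqlUrl(String host, String database) {
        return "jdbc:mysql://" + host + ":3306/" + database
                + "?characterEncoding=UTF-8&characterSetResults=UTF-8";
    }

    // Inserts one string using a bound parameter, so the driver handles the
    // character conversion. Table and column names are hypothetical.
    static int insertLetter(Connection conn, String letter) throws SQLException {
        String sql = "INSERT INTO letters (name) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, letter);
            return ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        System.out.println(mysqlUrl("localhost", "test"));
    }
}
```

Whether this changes anything for the problem in this thread is not established by the posts; it only shows the shape of the calls being discussed.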







Re: Problems with utf-8 encoding

2005-09-19 Thread Yair Zohar


Guy, can you direct me to practical documentation on implementing such a 
filter?


RE: Problems with utf-8 encoding

2005-09-19 Thread Guy Katz
Google it.
There's a lot.




Re: Problems with utf-8 encoding

2005-09-19 Thread Yair Zohar



Hi again,
I implemented the SetCharacterEncodingFilter from the Tomcat 4 examples.
To check how much control I have over the character encoding of the 
request and response, I changed the doFilter method to:

public void doFilter(ServletRequest request, ServletResponse response,
                     FilterChain chain)
    throws IOException, ServletException {

    // Force UTF-8 on the request and log what the container reports
    request.setCharacterEncoding("UTF-8");
    System.out.println("Request: " + request.getCharacterEncoding());

    // Declare the response as UTF-8 HTML and log the resulting encoding
    response.setContentType("text/html; charset=UTF-8");
    System.out.println("Response: " + response.getCharacterEncoding());

    // Pass control on to the next filter
    chain.doFilter(request, response);

}

request.getCharacterEncoding() returns null. It also returns null if I 
comment out the request.setCharacterEncoding("UTF-8"); line 
(my page contains UTF-8 encoding directives).
The response, however, is set to UTF-8.
Can this explain my problem with UTF-8 encoding?
Does anybody know how to solve it?



Re: Problems with utf-8 encoding

2005-09-19 Thread Yair Zohar




The filter does the work, it solved the problem.
Thanks all who helped,
Yair.





Tomcat 5.0.30 - UTF-8 encoding not working

2005-06-02 Thread Karanjkar, Sanjay V (IT)
Hi msjava,

I'm trying to migrate our webapp from ServletExec4.1.1/JDK1.3.1 to 
Tomcat5.0.30/JDK1.4.2.
On ServletExec, our app was showing/saving UTF-8 strings correctly. However, 
after migration to Tomcat, the pages are not showing UTF-8 encoded content 
correctly.

All our JSP pages contain the following:

<%@ page import="java.util.*, java.lang.*" contentType="text/html; charset=UTF-8" %> ...
...
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">


The web.xml file contains:

<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
"http://java.sun.com/dtd/web-app_2_3.dtd">


A filter servlet for all JSPs:

public void doFilter(ServletRequest request, ServletResponse response, 
                     FilterChain filterChain)
{
  try 
  {
    if (null != encoding)
    {
      request.setCharacterEncoding(encoding);
    }
    filterChain.doFilter(request, response); 



Do I need to do something else for Tomcat? In particular, do I need to do the 
stuff mentioned here: http://wiki.apache.org/jakarta-tomcat/Tomcat/UTF-8

 
Thanks in advance
Sanjay 

 
NOTICE: If received in error, please destroy and notify sender.  Sender does 
not waive confidentiality or privilege, and use is prohibited. 
 




Re: Tomcat 5.0.30 - UTF-8 encoding not working

2005-06-02 Thread Mark Thomas

Karanjkar, Sanjay V (IT) wrote:

> Hi msjava,
> 
> I'm trying to migrate our webapp from ServletExec4.1.1/JDK1.3.1 to 
> Tomcat5.0.30/JDK1.4.2.
> On ServletExec, our app was showing/saving UTF-8 strings correctly. However, 
> after migration to Tomcat, the pages are not showing UTF-8 encoded content 
> correctly.

If you are POSTing your data, request.setCharacterEncoding("UTF-8") 
should do the trick, but you MUST call this before any parameters are read.

If you are encoding your data in the URI, you will need to set the 
URIEncoding attribute on the Coyote connector to UTF-8 to ensure that 
the URI is decoded correctly.
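
To illustrate why the decoding charset matters for data carried in the URI, here is a minimal sketch using the JDK's URLDecoder as a stand-in for the connector's decoding step (Tomcat uses its own decoder internally, so this is only an analogy; the class and method names are invented). The same percent-escaped bytes give one Hebrew letter when decoded as UTF-8 and two unrelated chars when decoded as ISO-8859-1, the default URIEncoding in Tomcat releases of this period.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecoding {

    // Percent-decodes a query value using the named charset.
    static String decode(String escaped, String charset) {
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String escaped = "%D7%90"; // UTF-8 bytes of the Hebrew letter alef

        String asUtf8 = decode(escaped, "UTF-8");
        String asLatin1 = decode(escaped, "ISO-8859-1");

        System.out.println(asUtf8.length());   // prints "1"
        System.out.println(asLatin1.length()); // prints "2"
    }
}
```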



> Do I need to do something else for Tomcat? In particular, do I need to do the 
> stuff mentioned here: http://wiki.apache.org/jakarta-tomcat/Tomcat/UTF-8

1. Yes.
2 & 3. No. These might work under some circumstances, but 2 is trying 
to change a read-only property and 3 is hacking around the data not 
being handled correctly in the first place.


When I am testing this, I use the following JSP to make sure Tomcat is 
correctly configured. A clean Tomcat install should only require 
URIEncoding="UTF-8" to be added to the connector in server.xml for this 
to work for any UTF-8 data. You should test it with both method="post" 
and method="get".


<%@ page contentType="text/html; charset=UTF-8" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <title>UTF-8 test page</title>
  </head>
  <body>
    <p>UTF-8 data posted to this form was:
<%
  request.setCharacterEncoding("UTF-8");
  out.print(request.getParameter("mydata"));
%>

    </p>
    <form method="post" action="index.jsp"
      enctype="application/x-www-form-urlencoded">
      <input type="text" name="mydata">
      <input type="submit" value="Submit" />
      <input type="reset" value="Reset" />
    </form>
  </body>
</html>

If this works, then the chances are your app isn't quite right. If you 
have a test case that doesn't work (try and make it as simple as 
possible) post it to the list and I'll take a look.


Mark




RE: Tomcat 5.0.30 - UTF-8 encoding not working

2005-06-02 Thread Karanjkar, Sanjay V (IT)
Hi,

Apologies, my previous mail was missing a few things...

Correction - Tomcat *does* show UTF-8 encoded data correctly (after fetching 
from the database). It also saves UTF-8 encoded data correctly (I verified this 
by looking at a saved record). However, the place where it fails is when I pass 
UTF-8 data as a URL parameter to a popup screen.

In the attached screenshot, you can see that the main screen fetches and 
displays UTF-8 data correctly but the popup screen (which pops up on clicking 
the Edit button) shows garbled characters.

I checked the encoding on the popup screen and it does show me UTF-8. Am I 
losing the encoding when constructing the URL string? Note that this all works 
fine when I use ServletExec.

FYI, the popup screen is launched via the following JavaScript code:

function editConfirmComment()
{
  var form = document.frm_update;
  var confirmComment = form.updComment.value;

  var url = '../../fc3Common/view/externalCommentDetails.jsp?dummy=dummy'
  + '&confirmComment=' + encodeURIComponent(confirmComment);
  popupWindow(url, 'ExternalCommentDetails', 480, 240);
}
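
For comparison, a server-side sketch of the same URL construction in Java, assuming java.net.URLEncoder as a rough counterpart to encodeURIComponent (the two differ in details, for example URLEncoder writes a space as "+"). The class and method names are hypothetical. It shows that the value travels as UTF-8 percent-escapes, so whatever decodes the query string has to use UTF-8 as well.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryBuilder {

    // Appends the comment as a UTF-8 percent-encoded query parameter.
    static String commentUrl(String base, String comment) {
        try {
            return base + "?dummy=dummy&confirmComment="
                    + URLEncoder.encode(comment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hebrew letter alef as the comment text
        System.out.println(commentUrl("externalCommentDetails.jsp", "\u05D0"));
        // prints "externalCommentDetails.jsp?dummy=dummy&confirmComment=%D7%90"
    }
}
```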

Mark, in this case would I need to do as you said in your comments?
> If you are encoding your data in the URI, you will need to set the 
> URIEncoding attribute on the coyote connector to UTF-8 to ensure 
> that the URI is decoded correctly.
> A clean Tomcat install should only require URIEncoding="UTF-8" 
> to be added to the connector in server.xml for these to work 
> for any UTF-8 data.

One issue is that my app would be hosted on a web farm. As the above looks to 
be a server-wide change, it will affect other apps hosted on the instance too, 
right?

Thanks and regards
Sanjay
Morgan Stanley


RE: Tomcat 5.0.30 - UTF-8 encoding not working

2005-06-02 Thread Karanjkar, Sanjay V (IT)
 
Hi Mark,

Adding URIEncoding="UTF-8" to the Coyote connector did the trick. Thanks a 
bunch!
My guess is that our app will be hosted on a Tomcat instance that only hosts 
UTF-8-aware apps.

Thanks and regards
Sanjay

-Original Message-
From: Karanjkar, Sanjay V (IT) 
Sent: Friday, June 03, 2005 10:44 AM
To: Tomcat Users List; [EMAIL PROTECTED]
Subject: RE: Tomcat 5.0.30 - UTF-8 encoding not working

Hi,

Apologies, my previous mail was missing a few things...

Correction - Tomcat *does* show UTF-8 encoded data correctly (after fetching 
from the database). It also saves UTF-8 encoded data correctly (I verified this 
by looking at a saved record). However, the place where it fails is when I pass 
UTF-8 data as a URL parameter to a popup screen.

In the attached screenshot, you can see that the main screen fetches and 
displays UTF-8 data correctly but the popup screen (which pops up on clicking 
the Edit button) shows garbled characters.

I checked the encoding on the popup screen and it does show me UTF-8. Am I 
losing the encoding when constructing the URL string? Note that this all works 
fine when I use ServletExec..

Fyi, the popup screen is launched via the following javascript code:

function editConfirmComment()
{
  var form = document.frm_update;
  var confirmComment = form.updComment.value;

  var url = '../../fc3Common/view/externalCommentDetails.jsp?dummy=dummy'
  + 'confirmComment=' + encodeURIComponent(confirmComment);
  popupWindow(url, 'ExternalCommentDetails', 480, 240);
}

Mark, in this case would I need to do as you said in your comments?
 If you are encoding your data in the URI, you will need to set the 
 URIEncoding attribute on the coyote connector to UTF-8 to ensure 
 that the URI is decoded correctly.
 A clean Tomcat install should only require URIEncoding=UTF-8 
 to be added to the connector in server.xml for these to work for any 
 UTF-8 data.

One issue is that my app would be hosted on a web farm. As the above looks to 
be a server-wide change, it will affect other apps hosted on the instance too, 
right?

Thanks and regards
Sanjay
Morgan Stanley

-Original Message-
From: Mark Thomas [mailto:[EMAIL PROTECTED]
Sent: Friday, June 03, 2005 12:27 AM
To: Tomcat Users List
Subject: Re: Tomcat 5.0.30 - UTF-8 encoding not working

Karanjkar, Sanjay V (IT) wrote:
 Hi msjava,
 
 I'm trying to migrate our webapp from ServletExec4.1.1/JDK1.3.1 to 
 Tomcat5.0.30/JDK1.4.2.
 On ServletExec, our app was showing/saving UTF-8 strings correctly. However, 
 after migration to Tomcat, the pages are not showing UTF-8 encoded content 
 correctly.

If you are POSTing your data, request.setCharacterEncoding("UTF-8")
should do the trick, but you MUST call this before any parameters are read.

If you are encoding your data in the URI, you will need to set the URIEncoding 
attribute on the coyote connector to UTF-8 to ensure that the URI is decoded 
correctly.
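
Why the order matters can be sketched with a toy model. The class below is hypothetical and is not how Tomcat is implemented; it only assumes what the Servlet spec describes, namely that parameters are decoded once, on first access, with whatever encoding is in effect at that moment (ISO-8859-1 if none was set).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a request; it is NOT Tomcat's implementation.
class LazyRequestModel {
    private final byte[] rawValue;      // bytes the browser sent for "mydata"
    private Charset encoding = StandardCharsets.ISO_8859_1; // default when none is set
    private Map<String, String> parsed; // filled on the first parameter read

    LazyRequestModel(byte[] rawValue) {
        this.rawValue = rawValue;
    }

    void setCharacterEncoding(String name) {
        encoding = Charset.forName(name); // does not touch already parsed values
    }

    String getParameter(String name) {
        if (parsed == null) { // decode once, using the encoding set at this moment
            parsed = new HashMap<>();
            parsed.put("mydata", new String(rawValue, encoding));
        }
        return parsed.get(name);
    }

    // Encoding set before the first read: the value is decoded as UTF-8.
    static String readAfterSetting(byte[] raw) {
        LazyRequestModel r = new LazyRequestModel(raw);
        r.setCharacterEncoding("UTF-8");
        return r.getParameter("mydata");
    }

    // Encoding set after the first read: the ISO-8859-1 result is already cached.
    static String readBeforeSetting(byte[] raw) {
        LazyRequestModel r = new LazyRequestModel(raw);
        r.getParameter("mydata");
        r.setCharacterEncoding("UTF-8");
        return r.getParameter("mydata");
    }

    public static void main(String[] args) {
        byte[] alef = "\u05D0".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAfterSetting(alef).length());  // 1
        System.out.println(readBeforeSetting(alef).length()); // 2
    }
}
```

In a real application this usually means setting the encoding in a filter or at the very top of the JSP, before any framework code touches the parameters.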

 Do I need to do something else for Tomcat? In particular, do I need to 
 do the stuff mentioned here:
 http://wiki.apache.org/jakarta-tomcat/Tomcat/UTF-8
1. Yes
2 & 3 - No. These might work under some circumstances, but 2. is trying to 
change a read-only property and 3. is hacking around the data not being handled 
correctly in the first place.

When I am testing this, I use the following JSP to make sure Tomcat is 
correctly configured. A clean Tomcat install should only require 
URIEncoding="UTF-8" to be added to the connector in server.xml for these to 
work for any UTF-8 data. You should test it with both method="post" 
and method="get".

<%@ page contentType="text/html; charset=UTF-8" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <title>UTF-8 test page</title>
  </head>
  <body>
    <p>UTF-8 data posted to this form was:
    <%
      request.setCharacterEncoding("UTF-8");
      out.print(request.getParameter("mydata"));
    %>
    </p>
    <form method="post" action="index.jsp"
      enctype="application/x-www-form-urlencoded">
      <input type="text" name="mydata">
      <input type="submit" value="Submit" />
      <input type="reset" value="Reset" />
    </form>
  </body>
</html>

If this works, then the chances are your app isn't quite right. If you have a 
test case that doesn't work (try and make it as simple as possible), post it 
to the list and I'll take a look.

Mark


 
NOTICE: If received in error, please destroy and notify sender.  Sender does 
not waive confidentiality or privilege, and use is prohibited. 

 

RE: tomcat 5 and UTF-8 encoding

2004-12-07 Thread Allistair Crossley
Someone else had a similar issue with Hebrew, and you can read what happened 
here:

http://issues.apache.org/bugzilla/show_bug.cgi?id=32500

Allistair.


---
QAS Ltd.
Developers of QuickAddress Software
http://www.qas.com
Registered in England: No 2582055
Registered in Australia: No 082 851 474
---





RE: tomcat 5 and UTF-8 encoding

2004-12-07 Thread Shapira, Yoav

Hi,
There are tons of other such stories in the archives of this list and of
Bugzilla.  Nothing new here.

Yoav Shapira http://www.yoavshapira.com






This e-mail, including any attachments, is a confidential business 
communication, and may contain information that is confidential, proprietary 
and/or privileged.  This e-mail is intended only for the individual(s) to whom 
it is addressed, and may not be saved, copied, printed, disclosed or used by 
anyone else.  If you are not the(an) intended recipient, please immediately 
delete this e-mail from your computer system and notify the sender.  Thank you.





tomcat 5 and UTF-8 encoding

2004-12-06 Thread Sarah
Hi,
   I need to use JSP to display some Japanese characters from an MS SQL 
Server database.  I have already set the encoding in the JSP to be:
 
<%@ page language="java" contentType="text/html; charset=UTF-8" %> 
 
If I use Tomcat version 5.0.18, the Japanese characters are displayed 
correctly.  However, if I use 5.0.28 or 5.5.4, the characters come out as 
something like ???.  If I right-click the HTML page generated from the JSP on 
those versions, I can see the encoding is Western instead of UTF-8, as it was 
with 5.0.18.  Does anyone know what causes this problem, and whether any 
configuration of Tomcat needs to be changed?  Thank you very much for your help.
 
 
Sarah



Re: tomcat 5 and UTF-8 encoding

2004-12-06 Thread Peter Johnson
Sarah,
I recall a post a week or so ago regarding the contentType string losing 
the space after the ;

This may be causing the issue.
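
One way to check this is to look at the Content-Type value the response actually carries and see which charset it declares. The helper below is a hypothetical sketch (the class and method names are invented, and it does not call into Tomcat); it accepts the value with or without the space after the semicolon, since both forms declare the same charset. If it returns null for the header your page sends, the charset is missing altogether, which would fit a browser falling back to Western.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the charset parameter from a Content-Type value.
class ContentTypeCharset {

    // Matches "charset=<value>" after a ';', with or without whitespace or quotes.
    private static final Pattern CHARSET = Pattern.compile(
            ";\\s*charset\\s*=\\s*\"?([^\";\\s]+)\"?", Pattern.CASE_INSENSITIVE);

    // Returns the declared charset name, or null when none is declared.
    static String charsetOf(String contentType) {
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8")); // UTF-8
        System.out.println(charsetOf("text/html;charset=UTF-8"));  // UTF-8
        System.out.println(charsetOf("text/html"));                // null
    }
}
```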
PJ


Re: UTF-8 Encoding in Jsp | RESOLVED

2004-12-03 Thread Andoni
I concur, thanks for posting your findings.

Also if I may ask: please don't change the subject of your mails. Those of
us who view this list as a newsgroup get all messed up!

Andoni.




RE: UTF-8 Encoding in Jsp | RESOLVED

2004-12-02 Thread Arnab Chakravarty
Hi all,

First of all, thanks to all the people who helped in the first place (I
am grateful). The problem was resolved and was due to some problem with
the home-grown framework we were using with the application.

Tomcat had nothing to do with the problem, and the content type is the only
thing required to make it work. As far as database persistence was
concerned, Oracle made no mistake in storing the data, but when our
framework was persisting the values, it somehow corrupted the data
somewhere between submitting the page with non-English
characters and writing to the database.

We found this problem by writing a simple JSP page without
the framework, which rendered some non-English characters successfully.

Thanks again,
Arnab





RE: UTF-8 Encoding in Jsp | RESOLVED

2004-12-02 Thread Shapira, Yoav

Hi,
Thanks for posting your findings ;)

Yoav Shapira http://www.yoavshapira.com





RE: UTF-8 Encoding in Jsp

2004-12-01 Thread Arnab Chakravarty
Hi,

Thanks for the reply, but it did not work. Maybe I didn't explain the
problem correctly.

I am running an application that supports all languages, but only in
some specific places of the application, and I have made those places
UTF-8 compliant.

Further, the values are being saved to the database (Oracle 9). When we
read the data back from the database, junk characters are displayed
on the screen. Yes, the database is set to support UTF-8 encoding, and
this works with the old version, Tomcat 3.3, but not with the current
upgraded version, Tomcat 5.0.

There are also places in the application where drop-downs contain text in
different languages, and we can see those charsets (Japanese,
Chinese, etc.) appearing. Only when I try to display the data on the screen
through the JSP file do I encounter this problem of junk characters
being displayed.

I hope I have set more context around the problem. Please help me resolve
this issue.

Thanks,
Arnab




RE: UTF-8 Encoding in Jsp

2004-12-01 Thread Allistair Crossley
Hi,

These encoding issues are always a nightmare ;) 

There are some relevant areas of the Servlet spec you may want to look at with 
regard to encoding, notably "Internationalization" and "Request data encoding".

In terms of UTF-8 not coming back correctly from your database, you need to 
ensure that the character encoding was UTF-8 when the data was _added_. You 
should also verify your database is in UTF-8 mode. If both these statements are 
true, then you need to read "Internationalization" in the Servlet spec, which says:

If the servlet does not specify a character encoding before the getWriter
method of the ServletResponse interface is called or the response is committed,
the default ISO-8859-1 is used.

In other words, you need to call setLocale or setCharacterEncoding before the 
response is committed. I am not entirely sure whether that is actually what 
that JSP page directive is doing; maybe it is. Perhaps in your JSP you can 
output <%= request.getCharacterEncoding() %> to make sure your UTF-8 has been 
set. If it is null, it has not been set. If it _is_ UTF-8, then the character 
data coming from the database is not actually UTF-8, because either a) 
your database driver connection URL is not operating in UTF-8 mode, b) the data 
was not UTF-8 when it was put into the database, or c) the database is not 
running UTF-8.
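
The effect of the ISO-8859-1 default on the response side can be reproduced outside a container. The sketch below is an assumption-laden stand-in: it uses a plain OutputStreamWriter in place of the writer returned by getWriter, and the class name is invented. A Japanese character written through an ISO-8859-1 writer becomes a single '?' byte, whereas a UTF-8 writer emits the expected three bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

// Stand-in for a response writer: shows what reaches the wire for a given charset.
class WriterEncodingDemo {

    // Write the text through a writer using the named charset and return the bytes.
    static byte[] write(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            w.write(text);
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory stream
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String hiraganaA = "\u3042";
        byte[] latin1 = write(hiraganaA, "ISO-8859-1");
        byte[] utf8 = write(hiraganaA, "UTF-8");
        System.out.println(latin1.length + " byte: " + (char) latin1[0]); // 1 byte: ?
        System.out.println(utf8.length + " bytes");                       // 3 bytes
    }
}
```

Once the character has been replaced by '?', nothing downstream can recover it, so the encoding has to be right before the writer is obtained.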

In terms of sending data to the database as UTF-8, check your driver parameters 
(normally on the URL string) and also the database settings. You also need to take 
note of this section of the Servlet spec. We had to write a servlet filter to 
change our inbound form posts to the correct encoding for our database, Cp1252.

"Request data encoding" extract:

The default encoding of a request the container uses to create the
request reader and parse POST data must be ISO-8859-1 if none has been
specified by the client request. However, in order to indicate to the
developer in this case the failure of the client to send a character
encoding, the container returns null from the getCharacterEncoding method.

If the client hasn't set character encoding and the request data is encoded with
a different encoding than the default as described above, breakage can occur. To
remedy this situation, a new method setCharacterEncoding(String enc) has
been added to the ServletRequest interface. Developers can override the
character encoding supplied by the container by calling this method. It must be
called prior to parsing any post data or reading any input from the request.

Hope this info gets you thinking, Allistair.


Re: UTF-8 Encoding in Jsp

2004-12-01 Thread Andoni
I would recommend that you make the entire site UTF-8. The parts that are in
English will still work with no problem, but I would really not try mixing
encodings for requests.

The junk characters you are getting back are also not actually junk. You
can work out what encoding is being used by interpreting these strings and
knowing what the intended string is. Also, the fact that you are not just
getting lots of ? characters means that it is not Oracle that is having
the problem.
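
That detective work can be automated for a handful of likely charsets. The following is only a sketch under stated assumptions: the class name and the candidate list are invented for illustration, and it covers a single encode/decode mismatch, not data that has been mangled twice. Given the intended string and the junk that was observed, it reports which "bytes written as X, read as Y" combinations reproduce the junk.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic: finds charset mismatches that turn one string into another.
class MojibakeGuesser {

    // Candidate charsets to try; extend as needed (this list is an assumption).
    static final String[] CANDIDATES = {"UTF-8", "ISO-8859-1", "windows-1252"};

    // Returns every "X bytes read as Y" pair that maps intended to observed.
    static List<String> explain(String intended, String observed) {
        List<String> hits = new ArrayList<>();
        for (String enc : CANDIDATES) {
            for (String dec : CANDIDATES) {
                if (enc.equals(dec) || !Charset.isSupported(enc) || !Charset.isSupported(dec)) {
                    continue; // same charset is no mismatch; skip charsets this JVM lacks
                }
                byte[] bytes = intended.getBytes(Charset.forName(enc));
                String seen = new String(bytes, Charset.forName(dec));
                if (seen.equals(observed)) {
                    hits.add(enc + " bytes read as " + dec);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Intended: e with acute accent. Observed: A with tilde + copyright sign.
        System.out.println(explain("\u00E9", "\u00C3\u00A9"));
    }
}
```

For the sample in main, the matches point at UTF-8 bytes being read with a single-byte Western charset, which is the usual signature of a missing request or response encoding.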

I will read the other reply when I get a chance and see if I have any
further contributions, but for now I strongly suggest making ALL
the pages/servlets UTF-8.

Regards,
Andoni.




UTF-8 Encoding in Jsp

2004-11-30 Thread Arnab Chakravarty
Hi all,

I need to make all my JSP files compatible with UTF-8 encoding, and even
though I am using the directives:

<%@ page pageEncoding="UTF-8"%>
<%@ page contentType="text/html;charset=UTF-8"%>

in the JSP files, I cannot make it work.

I am using Tomcat version 5. Are there any config changes I need to make for
UTF-8 encoding to work?

Please help.

Thanks in advance,
Arnab




Re: UTF-8 Encoding in Jsp

2004-11-30 Thread Andoni
Hello,

First and foremost I would say: be absolutely sure that it is the JSP's
fault. I hope you are not getting some data from a database and trying to
show it? Be sure that your editor is saving the JSP in UTF-8 format.

Add the flag:
-Dfile.encoding=UTF-8
to the CATALINA_OPTS environment variable in your catalina.bat (or
equivalent) startup file.

and use:
req.setCharacterEncoding("UTF-8");
to set the encoding on the request.
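
As background on what the file.encoding flag influences: it sets the JVM's default charset, which is what calls such as String.getBytes() fall back to when no charset is named. A small sketch (class name invented for illustration) shows the difference between relying on that default and naming the charset explicitly; the explicit form gives the same bytes no matter how the JVM was started.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrates the dependence on the JVM default charset versus an explicit charset.
class DefaultCharsetDemo {

    // Result depends on the JVM default charset (influenced by -Dfile.encoding).
    static byte[] implicitBytes(String s) {
        return s.getBytes();
    }

    // Result is the same on every platform and with every startup flag.
    static byte[] explicitBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String alef = "\u05D0";
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("implicit length: " + implicitBytes(alef).length); // varies
        System.out.println("explicit length: " + explicitBytes(alef).length); // 2
    }
}
```

Where you control the code, naming the charset is usually the more robust choice than depending on a startup flag.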

This may help:

http://marc.theaimsgroup.com/?l=tomcat-user&m=105524550416364&w=2

Though you can ignore the method which is used to set the encoding as the
above line does the same job in servlets.

Andoni.






RE: UTF-8 Encoding in Jsp

2004-11-30 Thread Mariano
You should also use:

<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>

and this scriptlet:

request.setCharacterEncoding("UTF-8");

at the beginning.

I hope this helps.

-Mensaje original-
De: Arnab Chakravarty [mailto:[EMAIL PROTECTED]
Enviado el: martes, 30 de noviembre de 2004 15:28
Para: Tomcat Users List
Asunto: UTF-8 Encoding in Jsp


Hi all,

I need to make my all jsp files compatible with UTF-8 Encoding and even
though I am using the directives:

<%@ page pageEncoding="UTF-8" %>
<%@ page contentType="text/html; charset=UTF-8" %>

in the jsp files, cannot make it work.

Using tomcat version 5. Is there any config changes I need to make for
the UTF-8 Encoding to work.

Please help.

Thanks in advance,
Arnab


RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )

2004-09-02 Thread Rick
Thanks for your info, Mark.
  It does appear that part of my issue stems from my .properties files
being in UTF-8.
So I have to ask: why has this changed? If I run the same code in 5.0.24
I have no issue, but 5.0.28 has a problem. It sounds like a substantial
problem if UTF-8 resource bundles aren't supported any more.


Besides this simple example, I'm still seeing problems with a servlet
returning XML in UTF-8. Again, no issue in 5.0.24, only after 5.0.25.

I will put together a sample and post it shortly.

Thanks again for the help,

Rick

-Original Message-
From: Mark Thomas [mailto:[EMAIL PROTECTED] 
Posted At: Wednesday, September 01, 2004 4:14 PM
Posted To: Tomcat Dev
Conversation: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
Subject: RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )







RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )

2004-09-01 Thread Mark Thomas
The change (which is required by the spec) is that if the character set has not
been set before a call to getWriter() then it will default to ISO-8859-1. There
was some discussion on the tomcat-dev list about this (see
http://marc.theaimsgroup.com/?l=tomcat-dev&m=109104739719572&w=2)
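
The effect of a writer that is stuck on ISO-8859-1 can be sketched without a
container. Assuming the JDK's String.getBytes(Charset) behaves like the
response writer's encoder (an illustrative assumption; the class and method
names are hypothetical), a character that ISO-8859-1 cannot represent is
replaced with a question mark:

```java
import java.nio.charset.Charset;

class WriterCharsetDemo {

    // Encode text with the named charset and return the bytes as lowercase hex.
    static String encodeHex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String katakana = "\u30d5"; // one Japanese character
        System.out.println(encodeHex(katakana, "ISO-8859-1")); // 3f, which is '?'
        System.out.println(encodeHex(katakana, "UTF-8"));      // e38395
    }
}
```

Once the writer has been created with the default charset, the substitution
has already happened by the time the bytes leave the server, so the charset
needs to be in place before getWriter() is called.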

I'll try and put together a very simple JSP test case and get back to you.

Mark

 -Original Message-
 From: Rick [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, September 01, 2004 3:44 AM
 To: 'Tomcat Users List'; [EMAIL PROTECTED]
 Subject: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
 
 Since 5.0.27, pretty much all of my UTF-8 i8 code seems to be 
 messed up. 
 
 The problem seems to have been caused by whatever fix was 
 created for issue
 --
 ServletResponse.setContentType sets response encoding after 
 getWriter was
 called (Bugtraq 5062838) (luehe) 
 --
 
 Now it seems almost impossible to properly set the encoding 
 type of some of
 my JSPs and all of my Servlets that return UTF-8 XML data.
 
 As an example, my login page allows the user to switch to 
 Japanese text.
 Text data is read with a ResourceBundle, which reads from a 
 UTF-8 encoded
 .properties file.
 
 If the encoding of the .jsp page itself is in ASCII, then I 
 can't get the
 characters to show up at all any more.
 I have to save the .jsp page as UTF-8.  
 Added set JAVA_OPTS=-Dfile.encoding=UTF-8 to my catalina.bat file
 
 Then, If I try to set a character set in my page header, it messes up.
 
 This works in some cases...
 %@ page language=java import=java.util.* 
 contentType=text/html %
 response.getCharacterEncoding() = ISO-8859-1
 
 The really scary part is that with no meta or charset 
 actually set, that the
 browser(IE) correctly changes to UTF-8 and displays the 
 content fine.   But
 if I change the actual file encoding of the .jsp page from 
 UTF-8 back to
 ASCII. Then IE does not change to UTF-8 and the page is 
 messed up again.
 Why does the actual encoding of the .jsp file itself dictate 
 the response
 sent to the client?
 
 It appears that the actual encoding of the source file 
 someone how gets past
 along and then I'm unable to alter the character encoding, 
 and if I try, it
 just causes everything to go to hell.
 
 
 This use to work before 5.0.27, but now doesn't, even though 
 all data and
 pages are encoded in UTF-8.
 %@ page language=java import=java.util.* contentType=text/html;
 charset=UTF-8 %
 response.getCharacterEncoding() = UTF-8
 
 
 Before 5.0.27, all I had to do to get my output in UTF-8 was ...
  contentType=text/html; charset=UTF-8
 
 Now I have to mess with the actual .jsp file page encodings 
 and still can't
 get most to work properly as well as none of my servlets will 
 return correct
 UTF-8 data.  
 
 I have tried setting pageEncoding in the page tag as well 
 with no luck.
 
 
 Thanks for anyone's insight or help on this, its never fun to 
 find out that
 something that had been working quite solid , up and blows up 
 for no good
 reason.
 
 Current dev machine is on windows xp by the way, vanilla 
 install of Tomcat
 5.0.28.
 I will be setting this up on a Linux box for more testing shortly.
 
 
 
 






RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )

2004-09-01 Thread Mark Thomas
OK. I have a simple test case and all seems to be well. See the end of this
message for the contents of my test files.

My environment:
Win XP SP2 - brave I know but all has been OK so far ;)
JDK 1.4.2_05
Tomcat 5.0 branch, HEAD (latest) from CVS (very close to 5.0.28)

Points to note:
1. All my test files are ASCII files.
2. I had all sorts of problems with non-ASCII properties files. I didn't get to
the bottom of it but I think Windows was adding junk to the start of the file if
it was UTF-8 encoded. Maybe having the first line as a comment would fix this
but I haven't tested this.
3. There were times where Eclipse and Windows were reporting the exact same file
as having different encodings. There is something odd here but I didn't look at
this any further.
4. I had property file issues with 4.1.HEAD as well as 5.0.HEAD.
5. The downside of using ASCII files is that entering the UTF-8 characters by
hand is a real pain. A simple conversion app should fix this though.
6. Apart from the property file issue, everything seems fine.
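
Point 2 can be reproduced in isolation under one assumption: that the "junk"
is the three-byte UTF-8 byte order mark (EF BB BF) that some Windows editors
write. This is a guess, not something confirmed above. The sketch uses only
java.util.Properties, which reads its input as ISO-8859-1; the class and
method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class BomPropertiesDemo {

    // The UTF-8 byte order mark that is assumed to be the leading junk.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Load properties from ASCII text that is preceded by a UTF-8 BOM.
    static Properties loadWithBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] all = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, all, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, all, UTF8_BOM.length, body.length);
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(all));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The BOM is glued onto the first key, so "test" is not found.
        System.out.println(loadWithBom("test=hello\n").getProperty("test"));
        // With a comment on the first line, the BOM is absorbed by that line instead.
        System.out.println(loadWithBom("# comment\ntest=hello\n").getProperty("test"));
    }
}
```

If the assumption holds, a leading comment line does work around the problem,
because the BOM then ends up in a throwaway entry rather than in a real key.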

Test files follow.

Hope this helps,

Mark

PS I noticed that you cross-posted to the dev list. Please don't do this. Any
message cross-posted is less likely rather than more likely to get a response.

=== utf8.jsp 
<%@ page language="java" import="java.lang.*,java.util.*"
contentType="text/html; charset=UTF-8" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <title>UTF-8 Encoding issue</title>
  </head>
  <body>
    <p>Text from JSP page (which is ASCII encoded).</p>
    <form action="utf8.jsp" method="post">
      <p>English<input type="radio" value="en" name="language" /></p>
      <p>Japanese<input type="radio" value="ja" name="language" /></p>
      <input type="submit" value="Post form data" />
    </form>
    <p>Text from resources bundle:</p>
<%
  String language = request.getParameter("language");

  if (language == null) {
    language = "en";
  }

  Locale locale = null;
  if (language.equalsIgnoreCase("en")) {
    locale = Locale.ENGLISH;
  } else {
    locale = Locale.JAPAN;
  }

  ResourceBundle bundle = ResourceBundle.getBundle("foo.bar.LocalStrings",
      locale);
  out.println("<p>" + bundle.getString("test") + "</p>");
%>
    <p><%=request.getParameter("language") %></p>
  </body>
</html>

= LocalStrings_en.properties =
test=Test string from resources bundle

= LocalStrings_ja.properties =
test=\u30d5\u30a1\u30a4\u30eb\u30ed
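
The escapes in LocalStrings_ja.properties are the kind of output a small
conversion step would produce. A minimal sketch of such a converter follows,
similar in spirit to the JDK's native2ascii tool; the class and method names
are hypothetical and this is not the converter used for the test above.

```java
class UnicodeEscaper {

    // Replace every char outside 7-bit ASCII with a backslash-u escape,
    // which is the form java.util.Properties understands.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints an ASCII-only line that can be pasted into a .properties file.
        System.out.println(escape("test=\u30d5\u30a1\u30a4\u30eb\u30ed"));
    }
}
```

Keeping the .properties file pure ASCII this way sidesteps any question about
which encoding the file was saved in.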






UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )

2004-08-31 Thread Rick
Since 5.0.27, pretty much all of my UTF-8 i18n code seems to be messed up.

The problem seems to have been caused by whatever fix was created for issue
--
ServletResponse.setContentType sets response encoding after getWriter was
called (Bugtraq 5062838) (luehe) 
--

Now it seems almost impossible to properly set the encoding type of some of
my JSPs and all of my Servlets that return UTF-8 XML data.

As an example, my login page allows the user to switch to Japanese text.
Text data is read with a ResourceBundle, which reads from a UTF-8 encoded
.properties file.

If the encoding of the .jsp page itself is in ASCII, then I can't get the
characters to show up at all any more.
I have to save the .jsp page as UTF-8.  
Added set JAVA_OPTS=-Dfile.encoding=UTF-8 to my catalina.bat file

Then, if I try to set a character set in my page header, it messes up.

This works in some cases...
<%@ page language="java" import="java.util.*" contentType="text/html" %>
response.getCharacterEncoding() = ISO-8859-1

The really scary part is that with no meta tag or charset actually set, the
browser (IE) correctly changes to UTF-8 and displays the content fine. But
if I change the actual file encoding of the .jsp page from UTF-8 back to
ASCII, then IE does not change to UTF-8 and the page is messed up again.
Why does the actual encoding of the .jsp file itself dictate the response
sent to the client?

It appears that the actual encoding of the source file somehow gets passed
along and then I'm unable to alter the character encoding, and if I try, it
just causes everything to go to hell.


This used to work before 5.0.27, but now doesn't, even though all data and
pages are encoded in UTF-8.
<%@ page language="java" import="java.util.*" contentType="text/html; charset=UTF-8" %>
response.getCharacterEncoding() = UTF-8


Before 5.0.27, all I had to do to get my output in UTF-8 was ...
 contentType="text/html; charset=UTF-8"

Now I have to mess with the actual .jsp file page encodings and still can't
get most to work properly as well as none of my servlets will return correct
UTF-8 data.  

I have tried setting pageEncoding in the page tag as well with no luck.


Thanks for anyone's insight or help on this. It's never fun to find out that
something that had been working quite solidly up and blows up for no good
reason.

Current dev machine is on windows xp by the way, vanilla install of Tomcat
5.0.28.
I will be setting this up on a Linux box for more testing shortly.





Re: UTF-8 encoding

2004-04-07 Thread Harry Mantheakis
Hello Niki

 Just send UTF8 encoded data and everything will be allright.

Yes, that seems to work for me at the moment, though I am relying on default
settings because I do not even specify UTF-8. (Java defaults to Unicode
anyway.)

I'm only using LATIN-1 characters at the moment, so I cannot comment on what
would happen if I was working with (say) Chinese characters.

I have to leave it at that because this is something I shall be looking into
later.

All the best!

Harry

 Simply I don't get it. You send data over HTTP. You can send data as you
 wish. What about servlet serving images?
 Just send UTF8 encoded data and everything will be allright.
 No way Tomcat knows do you want to send cyrrilic letter or french accent
 letter. It's up to you.
 Niki
 Harry Mantheakis wrote:





Re: UTF-8 encoding

2004-04-06 Thread Niki Ivanchev
I simply don't get it. You send data over HTTP. You can send data as you
wish. What about a servlet serving images?
Just send UTF-8 encoded data and everything will be all right.
Tomcat has no way of knowing whether you want to send a Cyrillic letter or
a French accented letter. It's up to you.
Niki
Harry Mantheakis wrote:

Okay, thanks Yoav.

I got the source, and I can see what's happening - thanks to Google - at
this URL:
http://java.sun.com/blueprints/code/jps131/src/com/sun/j2ee/blueprints/encod
ingfilter/web/EncodingFilter.java.html
The 'doFilter' method sets the encoding for the *request* which does not
seem to address the original question, which was asking how to 'force tomcat
to send data in UTF-8 encoding'.
Interesting filter nevertheless! It is a subject that concerns me.

Kind regards

Harry

 

Hi,

   

implement a EncodingFilter class
   

Where's the interface?
 

javax.servlet.Filter is the interface.  He probably had
http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/e
ncodingfilter/web/EncodingFilter.html in mind.
Yoav Shapira
   





 




UTF-8 encoding

2004-04-05 Thread b . somer
Hi!

I have a web application which needs UTF-8 encoding on the server side. I
tried to install and run Apache/Tomcat in a Windows XP environment, and
the server says the encoding is not UTF-8. The same application with the same
Apache/Tomcat version runs correctly in a Windows 2000 environment. Is
this an XP-specific problem, and is there any way to force Tomcat to
send data in UTF-8 encoding?



Best regards
 bab








 

RE: UTF-8 encoding

2004-04-05 Thread Yansheng Lin
Hi, you can specify the UTF-8 encoding with a filter.  All you need to do is
implement an EncodingFilter class, and then add the filter element to your
deployment descriptor as follows:

  <filter>
    <filter-name>EncodingFilter</filter-name>
    <display-name>EncodingFilter</display-name>
    <description>UTF-8 encoding</description>
    <filter-class>org.mysite.EncodingFilter</filter-class>
    <init-param>
      <param-name>targetEncoding</param-name>
      <param-value>utf-8</param-value>
    </init-param>
  </filter>

Hope this helps:)

-Yan

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Monday, April 05, 2004 6:49 AM
To: Tomcat Users List
Subject: UTF-8 encoding


Hi!

I have a web-application which on the serverside needs UTF-8 encoding. I 
tried to install and run apache/tomcat on a Windows-XP environment, and 
the server says, the encoding is not UTF-8. same applicationwith the same 
apache/tomcat version runs correctly on a windows 2000 environment. Is 
this a XP specific problem and is there any possibility to force tomcat to 
send data in UTF-8 encoding.



Best regards
 bab








 





Re: UTF-8 encoding

2004-04-05 Thread Harry Mantheakis


 implement a EncodingFilter class


Where's the interface?


 Hi, you can specify the utf-8 encoding with a filter.  All you need to do is
 implement a EncodingFilter class, and then in your deployment descriptor add
 the filter element as follows:
 
 filter
   filter-nameEncodingFilter/filter-name
   display-nameEncodingFilter/display-name
   descriptionUTF-8 encoding/description
   filter-classorg.mysite.EncodingFilter/filter-class
   init-param
   param-nametargetEncoding/param-name
   param-valueutf-8/param-value
   /init-param
 /filter
 
 Hope this helps:)
 
 -Yan
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Sent: Monday, April 05, 2004 6:49 AM
 To: Tomcat Users List
 Subject: UTF-8 encoding
 
 
 Hi!
 
 I have a web-application which on the serverside needs UTF-8 encoding. I
 tried to install and run apache/tomcat on a Windows-XP environment, and
 the server says, the encoding is not UTF-8. same applicationwith the same
 apache/tomcat version runs correctly on a windows 2000 environment. Is
 this a XP specific problem and is there any possibility to force tomcat to
 send data in UTF-8 encoding.
 
 
 
 Best regards
 bab





RE: UTF-8 encoding

2004-04-05 Thread Shapira, Yoav

Hi,

 implement a EncodingFilter class


Where's the interface?

javax.servlet.Filter is the interface.  He probably had
http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.html
in mind.

Yoav Shapira






RE: UTF-8 encoding

2004-04-05 Thread Shapira, Yoav


javax.servlet.Filter is the interface.  He probably had
http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.html
in mind.

BTW, swap .java for .html (or google with the above) to see the full
java source code for the blueprint encoding filter implementation.

Yoav Shapira






Re: UTF-8 encoding

2004-04-05 Thread Harry Mantheakis
Okay, thanks Yoav.

I got the source, and I can see what's happening - thanks to Google - at
this URL:

http://java.sun.com/blueprints/code/jps131/src/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.java.html

The 'doFilter' method sets the encoding for the *request* which does not
seem to address the original question, which was asking how to 'force tomcat
to send data in UTF-8 encoding'.
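
The distinction between the two directions can be sketched with plain JDK
calls. The snippet assumes a client that sends UTF-8 bytes, then decodes them
with one charset (the request side) and encodes the result with another (the
response side). It is an illustration only, not what the blueprint filter
does internally, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripDemo {

    // Decode UTF-8 input bytes with requestCharset, re-encode with responseCharset,
    // and return the resulting bytes as lowercase hex.
    static String roundTripHex(String text, String requestCharset, String responseCharset) {
        byte[] sentByClient = text.getBytes(StandardCharsets.UTF_8);
        String seenByApp = new String(sentByClient, Charset.forName(requestCharset));
        byte[] sentToClient = seenByApp.getBytes(Charset.forName(responseCharset));
        StringBuilder sb = new StringBuilder();
        for (byte b : sentToClient) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both sides UTF-8: the original three bytes come back.
        System.out.println(roundTripHex("\u30d5", "UTF-8", "UTF-8"));
        // Request misread as ISO-8859-1, response UTF-8: six bytes of double-encoded text.
        System.out.println(roundTripHex("\u30d5", "ISO-8859-1", "UTF-8"));
    }
}
```

Setting the request encoding fixes only the first step. The bytes that go
back to the browser depend on the charset of the response, which has to be
set separately.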

Interesting filter nevertheless! It is a subject that concerns me.

Kind regards

Harry


 Hi,
 
 implement a EncodingFilter class
 
 
 Where's the interface?
 
 javax.servlet.Filter is the interface.  He probably had
 http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/e
 ncodingfilter/web/EncodingFilter.html in mind.
 
 Yoav Shapira






UTF-8 encoding problem with file included using jsp:include

2003-12-23 Thread Camilla Clifford
Hello,

I have a jsp page with the following code at the top of the page, in 
order to display the page contents in UTF-8:

<%@ page contentType="text/html; charset=UTF-8" %>
<% response.setContentType("text/html; charset=UTF-8"); %>

In this page is a jsp:include tag that includes a static HTML file, the
name of which is determined at runtime. This included file contains
UTF-8 encoded characters, but they are not displayed correctly in my
browser (Mozilla 1.5/Debian); they show up as generic 'unknown unicode'
chars. If I use an include directive instead, however, the characters are
displayed correctly. If I change the extension of the included file to
.jsp so that it's compiled (just to see what happens), the characters
still don't display, because the .java file generated by Jasper has a
response.setContentType(iso-8859-1) line inserted into it, which I've
been unable to figure out how to change.

It seems likely that somewhere along the line the content type of the
included file (HTML or JSP) is being set, and this setting is taking
precedence over the page directives I have in the including page. I've
tried setting everything I can think of to UTF-8 (file encoding, response
and request objects), and I've checked that the JSP compiler should be
compiling using UTF-8 (I'm using Tomcat 4.1.29), even though this
shouldn't really affect an included HTML file, but I can't seem to get
the included file encoded correctly.

Does anyone know what setting is responsible for the 
response.setContentType line inserted by jasper, or have any further 
ideas that I could investigate ?

Many thanks,
..camilla


Setting UTF-8 Encoding

2003-01-31 Thread Affan Qureshi
I am having trouble setting the encoding to UTF-8 and hence my web pages are
unable to render characters like the Trademark or Copyright symbols. In
Tomcat's source the character encoding is hard-coded to ISO-8859-1 in
various places. I have tried to use the filter in the examples to set the
encoding type but that did not help and I kept seeing question marks for
those characters. I have also tried to modify the source and build again but
that doesn't work either (I know I must be doing something wrong here.)

Somehow tomcat doesn't allow me to change the character encoding to UTF-8.
The same JSPs are looking fine on Weblogic and Resin without any
configuration/modification to the server settings.

Any ideas how I can fix this ugly problem in my app? The app is unusable
without this.

Thanks a lot.

Affan






Re: Setting UTF-8 Encoding

2003-01-31 Thread Affan Qureshi
I forgot to paste my code which is there at the bottom now.

 I am having trouble setting the encoding to UTF-8 and hence my web pages
are
 unable to render characters like the Trademark or Copyright symbols. In
 Tomcat's source at various places teh character encoding is hard-coded to
be
 ISO-8859-1. I have tried to use the filter in the examples to set the
 encoding type but that did not help and I kept seeing questionamarks for
 those characters. I have also tried to modify the source and build again
but
 that doesn't work either (I know I must be doing something wrong here.)

 Somehow tomcat doesn't allow me to change the character encoding to UTF-8.
 The same JSPs are looking fine on Weblogic and Resin without any
 configuration/modification to the server settings.

 Any ideas how can I fix this ugly problem in my app. The app is unusable
 without this.

 Thanks a lot.

 Affan

 Here is my code for the Test JSP:
<%@page contentType="text/html; charset=UTF-8"%>
<html>
<head><title>Test JSP</title></head>
<body>
<% out.println('\u00A9'); %>
<% System.out.println("This © is test");%>
<BR>
<% out.println("This &#176; is test"); %>
<BR>
<% out.println("This © is test"); %>
<BR>
<% out.println("This \u00A9 is test"); %>  <%= "©" %>
<BR>
<% out.println("This \u00B0 is test"); %>
<BR>
<% out.println("This \u00AE is test"); %>
<BR>
<% out.println("This \u0099 is test"); %>
<BR>
<% out.println("This \u00F6 is test"); %>
<% out.flush(); %>
</body>
</html>






Re: Setting UTF-8 Encoding

2003-01-31 Thread Masood Ahmed
Have you tried setting the locale directly on the
request object? See if that helps.

What version of tomcat are you using?

thanks,
-Masood

--- Affan Qureshi [EMAIL PROTECTED] wrote:
 I forgot to paste my code which is there at the
 bottom now.
 
  I am having trouble setting the encoding to UTF-8
 and hence my web pages
 are
  unable to render characters like the Trademark or
 Copyright symbols. In
  Tomcat's source at various places teh character
 encoding is hard-coded to
 be
  ISO-8859-1. I have tried to use the filter in the
 examples to set the
  encoding type but that did not help and I kept
 seeing questionamarks for
  those characters. I have also tried to modify the
 source and build again
 but
  that doesn't work either (I know I must be doing
 something wrong here.)
 
  Somehow tomcat doesn't allow me to change the
 character encoding to UTF-8.
  The same JSPs are looking fine on Weblogic and
 Resin without any
  configuration/modification to the server settings.
 
  Any ideas how can I fix this ugly problem in my
 app. The app is unusable
  without this.
 
  Thanks a lot.
 
  Affan
 
  Here is my code for the Test JSP:
 %@page contentType=text/html; charset=UTF-8%
 html
 headtitleTest JSP/title/head
 body
 % out.println('\u00A9'); %
 % System.out.println(This © is test);%
 BR
 % out.println(This ° is test); %
 BR
 % out.println(This © is test); %
 BR
 % out.println(This \u00A9 is test); %  %= ©%
 BR
 % out.println(This \u00B0 is test); %
 BR
 % out.println(This \u00AE is test); %
 BR
 % out.println(This \u0099 is test); %
 BR
 % out.println(This \u00F6 is test); %
 % out.flush(); %
 /body
 /html
 
 

-
 To unsubscribe, e-mail:
 [EMAIL PROTECTED]
 For additional commands, e-mail:
 [EMAIL PROTECTED]
 






Re: Setting UTF-8 Encoding

2003-01-31 Thread Affan Qureshi
The Locale object is set to en_US by default. And I am using Tomcat 4.1.18
on Win2K. I have also tried the same on Sun Solaris and Linux.

If I use servlets instead of JSP it works fine and outputs the characters as
required. But I guess it's the JspWriter that does something which shows the
question marks in place of those characters.

However, the same JSP code works in Resin and Weblogic.

Baffled

Affan

- Original Message -
From: Masood Ahmed [EMAIL PROTECTED]
To: Tomcat Users List [EMAIL PROTECTED]
Sent: Friday, January 31, 2003 5:13 PM
Subject: Re: Setting UTF-8 Encoding


 Have you tried setting the locale directly on the
 request object? See if that helps.

 What version of tomcat are you using?

 thanks,
 -Masood

 --- Affan Qureshi [EMAIL PROTECTED] wrote:
  I forgot to paste my code which is there at the
  bottom now.
 
   I am having trouble setting the encoding to UTF-8
  and hence my web pages
  are
   unable to render characters like the Trademark or
  Copyright symbols. In
   Tomcat's source at various places teh character
  encoding is hard-coded to
  be
   ISO-8859-1. I have tried to use the filter in the
  examples to set the
   encoding type but that did not help and I kept
  seeing questionamarks for
   those characters. I have also tried to modify the
  source and build again
  but
   that doesn't work either (I know I must be doing
  something wrong here.)
  
   Somehow tomcat doesn't allow me to change the
  character encoding to UTF-8.
   The same JSPs are looking fine on Weblogic and
  Resin without any
   configuration/modification to the server settings.
  
   Any ideas how can I fix this ugly problem in my
  app. The app is unusable
   without this.
  
   Thanks a lot.
  
   Affan
 
   Here is my code for the Test JSP:
  %@page contentType=text/html; charset=UTF-8%
  html
  headtitleTest JSP/title/head
  body
  % out.println('\u00A9'); %
  % System.out.println(This © is test);%
  BR
  % out.println(This ° is test); %
  BR
  % out.println(This © is test); %
  BR
  % out.println(This \u00A9 is test); %  %= ©%
  BR
  % out.println(This \u00B0 is test); %
  BR
  % out.println(This \u00AE is test); %
  BR
  % out.println(This \u0099 is test); %
  BR
  % out.println(This \u00F6 is test); %
  % out.flush(); %
  /body
  /html
 
 
 
 -
  To unsubscribe, e-mail:
  [EMAIL PROTECTED]
  For additional commands, e-mail:
  [EMAIL PROTECTED]
 






RE: Setting UTF-8 Encoding

2003-01-31 Thread Daniel Brown
Affan,

The encoding is set just fine. If I copy and paste your JSP, and run it
here, I get the following as the content type in the HTTP headers:

Content-Type: text/html; charset=UTF-8

You're seeing empty squares where you'd expect characters for a couple of
reasons:

The Unicode escape for trademark is \u2122, according to the HTML 4.01 spec.
The raw copyright characters in your document are in ISO-8859-1, not UTF-8.

If you replace \u0099 by \u2122, it works just fine.
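
Both reasons can be checked with a few JDK calls. The sketch below is an
illustration under the assumption that the raw copyright character reaches
the browser as the single ISO-8859-1 byte A9; the class and method names are
hypothetical.

```java
import java.nio.charset.StandardCharsets;

class TrademarkDemo {

    // Hex dump of the UTF-8 encoding of a string.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decode one raw byte as if it were a complete UTF-8 sequence.
    static String decodeSingleByteAsUtf8(int value) {
        return new String(new byte[] {(byte) value}, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Character.isISOControl('\u0099')); // true: a control code, not a symbol
        System.out.println(Character.isISOControl('\u2122')); // false: the trademark sign
        System.out.println(utf8Hex("\u2122"));                // e284a2
        System.out.println(utf8Hex("\u00A9"));                // c2a9
        // A lone A9 byte is not valid UTF-8 and decodes to the replacement character.
        System.out.println(decodeSingleByteAsUtf8(0xA9).equals("\uFFFD")); // true
    }
}
```

In other words, U+0099 has no glyph to draw, and a copyright sign stored as
one ISO-8859-1 byte is malformed when the page is declared as UTF-8.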

Alternatively, why not use &trade;, &copy;, and &reg; instead?

HTH,

Dan.

 -Original Message-
 From: Affan Qureshi [mailto:[EMAIL PROTECTED]]
 Sent: 31 January 2003 11:38
 To: Tomcat Users List
 Subject: Setting UTF-8 Encoding


 I am having trouble setting the encoding to UTF-8 and hence my
 web pages are
 unable to render characters like the Trademark or Copyright symbols. In
 Tomcat's source at various places teh character encoding is
 hard-coded to be
 ISO-8859-1. I have tried to use the filter in the examples to set the
 encoding type but that did not help and I kept seeing questionamarks for
 those characters. I have also tried to modify the source and
 build again but
 that doesn't work either (I know I must be doing something wrong here.)

 Somehow tomcat doesn't allow me to change the character encoding to UTF-8.
 The same JSPs are looking fine on Weblogic and Resin without any
 configuration/modification to the server settings.

 Any ideas how can I fix this ugly problem in my app. The app is unusable
 without this.

 Thanks a lot.

 Affan


 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]







Re: Setting UTF-8 Encoding

2003-01-31 Thread Affan Qureshi
Thanks for your help.

 Affan,

 The encoding is set just fine. If I copy and paste your JSP, and run it
 here, I get the following as the content type in the HTTP headers:

 Content-Type: text/html; charset=UTF-8

 You're seeing empty squares where you'd expect characters for a couple of
 reasons:

 The Unicode escape for trademark is \u2122, according to the HTML 4.01
spec.
 The raw copyright characters in your document are in ISO-8859-1, not
UTF-8.
 If you replace \u0099 by \u2122, it works just fine.

Unfortunately I get the same '?' in place of those characters even after
replacing it.

 Alternatively, why not use &trade;, &copy;, and &reg; instead?

The problem is that the data entry people will copy and paste these symbols
from web pages, and they will be stored just as they come. When the data is
read back from the database, the JSP has to recognize the characters and
display them accordingly. Even if I use a filter to replace these characters
it won't help, because the JspWriter will already have placed the '?' in the
stream being sent to the browser.

The same thing works for Servlets but not for JSPs. I read that the writer
in Servlets uses the content-type set in request to determine the encoding
while JSPWriter uses the system settings of the Locale or something.

Nasty problem isn't it?
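
One plausible source of the '?' is worth spelling out: when Java encodes a character that the target charset cannot represent, it substitutes that charset's replacement byte, which for ISO-8859-1 is a question mark. The sketch below shows this with plain JDK classes. QuestionMarkDemo is a made-up name, and whether this is what happens inside the JspWriter in this particular setup is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class QuestionMarkDemo {
    /** Encodes text with the given charset and decodes it again with the same charset. */
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The trademark sign (U+2122) has no ISO-8859-1 encoding, so it becomes '?'.
        System.out.println(roundTrip("\u2122", StandardCharsets.ISO_8859_1));
        // The copyright sign (U+00A9) exists in ISO-8859-1 and survives.
        System.out.println(roundTrip("\u00A9", StandardCharsets.ISO_8859_1).equals("\u00A9"));
        // UTF-8 can represent both, so nothing is lost.
        System.out.println(roundTrip("\u2122", StandardCharsets.UTF_8).equals("\u2122"));
    }
}
```

Once the '?' byte is in the stream the original character is gone, which matches the observation that a later replacement filter cannot repair it.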

 HTH,

 Dan.

Affan






RE: Setting UTF-8 Encoding

2003-01-31 Thread Ralph Einfeldt
You have to keep in mind that there are several levels at which
international characters can be mangled:

- Generating Java for the JSP
  You can verify this by looking at the generated source
  in the work directory. Does it look like you expect?

- Compiling the generated Java
  I don't know how you can control which encoding is used
  to compile the source. (As I don't use Tomcat much,
  I haven't had to deal with this yet.)

- Handling at runtime
- Setting no/wrong headers
- The browser
  As your code works in other engines, these levels are the
  ones least likely to be the cause of the problem.

My guess is that it is #2 (compiling the generated Java) that is causing
the pain.
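
What a source-encoding mismatch at the compile step looks like can be sketched without Tomcat. The example assumes a source file saved as UTF-8 that is then read as ISO-8859-1, which is the kind of mismatch the javac -encoding option exists to prevent. SourceEncodingDemo and readAs are illustrative names only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class SourceEncodingDemo {
    /** Interprets the raw bytes of a source file using the charset the reader assumes. */
    static String readAs(byte[] sourceBytes, Charset assumed) {
        return new String(sourceBytes, assumed);
    }

    public static void main(String[] args) {
        // A copyright sign as it would sit on disk in a UTF-8 source file: C2 A9.
        byte[] saved = "\u00A9".getBytes(StandardCharsets.UTF_8);

        // Read with the right charset: one character, U+00A9.
        System.out.println(readAs(saved, StandardCharsets.UTF_8).length());      // 1

        // Read as ISO-8859-1: two characters, U+00C2 followed by U+00A9.
        System.out.println(readAs(saved, StandardCharsets.ISO_8859_1).length()); // 2
    }
}
```

If the generated .java file in the work directory shows two characters where the JSP had one, the damage happened at or before this step.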

 -Original Message-
 From: Affan Qureshi [mailto:[EMAIL PROTECTED]]
 Sent: Friday, January 31, 2003 1:47 PM
 To: Tomcat Users List
 Subject: Re: Setting UTF-8 Encoding
 
 
 The Locale object is set to en_US by default, and I am using Tomcat 4.1.18
 on Win2K. I have also tried the same on Sun Solaris and Linux.

 If I use servlets instead of JSPs it works fine and outputs the characters
 as required. But I guess it's the JspWriter that does something which shows
 the question marks in place of those characters.

 However, the same JSP code works on Resin and WebLogic.
 
 Baffled
 
 Affan
 
 - Original Message -
 From: Masood Ahmed [EMAIL PROTECTED]
 To: Tomcat Users List [EMAIL PROTECTED]
 Sent: Friday, January 31, 2003 5:13 PM
 Subject: Re: Setting UTF-8 Encoding
 
 
  Have you tried setting the locale directly on the
  request object? See if that helps.
 
  What version of tomcat are you using?
 
  thanks,
  -Masood
 
  --- Affan Qureshi [EMAIL PROTECTED] wrote:
   I forgot to paste my code which is there at the
   bottom now.
  
   Here is my code for the test JSP:

   <%@ page contentType="text/html; charset=UTF-8" %>
   <html>
   <head><title>Test JSP</title></head>
   <body>
   <% out.println('\u00A9'); %>
   <% System.out.println("This © is test"); %>
   <BR>
   <% out.println("This ° is test"); %>
   <BR>
   <% out.println("This © is test"); %>
   <BR>
   <% out.println("This \u00A9 is test"); %>  <%= "©" %>
   <BR>
   <% out.println("This \u00B0 is test"); %>
   <BR>
   <% out.println("This \u00AE is test"); %>
   <BR>
   <% out.println("This \u0099 is test"); %>
   <BR>
   <% out.println("This \u00F6 is test"); %>
   <% out.flush(); %>
   </body>
   </html>
  
  
  
  




Re: Setting UTF-8 Encoding

2003-01-31 Thread Affan Qureshi
I am sorry. It has something to do with Struts, because if I place my JSP in
examples or tomcat-docs it works fine. Somehow Struts is messing things up,
or I haven't configured things properly.

Affan

- Original Message -
From: Daniel Brown [EMAIL PROTECTED]
To: Tomcat Users List [EMAIL PROTECTED]
Sent: Friday, January 31, 2003 5:50 PM
Subject: RE: Setting UTF-8 Encoding


 Affan,

 The encoding is set just fine. If I copy and paste your JSP, and run it
 here, I get the following as the content type in the HTTP headers:

 Content-Type: text/html; charset=UTF-8

 You're seeing empty squares where you'd expect characters for a couple of
 reasons:

 The Unicode escape for trademark is \u2122, according to the HTML 4.01 spec.
 The raw copyright characters in your document are in ISO-8859-1, not UTF-8.

 If you replace \u0099 by \u2122, it works just fine.

 Alternatively, why not use &trade;, &copy;, and &reg; instead?

 HTH,

 Dan.





Switching on UTF-8 Encoding

2002-02-07 Thread Antony Stace
Hi

What do I need to do so that data returned from Tomcat 4 is sent to a
requesting browser in UTF-8 encoding, and requests received are read as
UTF-8?

-- 


Cheers

Tony。


--
To unsubscribe:   mailto:[EMAIL PROTECTED]
For additional commands: mailto:[EMAIL PROTECTED]
Troubles with the list: mailto:[EMAIL PROTECTED]


Re: Switching on UTF-8 Encoding

2002-02-07 Thread jeff . guttadauro

You can use <%@ page contentType="text/html;charset=UTF-8" %> in the JSP or
alternatively include the <META HTTP-EQUIV="Content-Type" CONTENT="text/html;
charset=UTF-8"> tag in your HTML.  This will tell the browser to use the UTF-8
encoding.

Then when getting the requests, you can call
request.setCharacterEncoding("UTF-8") before getting anything from the request
to allow you to read in parameters as UTF-8.  You could also try just reading
in the parameters without setting that, and then doing
param.getBytes("UTF-8").

I've been struggling with some encoding issues for a little while now, but I
have it working, so if you have any other questions, please feel free to email
me and I'll see if I can help.

Good luck,
-Jeff



   



Re: Switching on UTF-8 Encoding

2002-02-07 Thread Craig R. McClanahan



On Thu, 7 Feb 2002, Antony Stace wrote:

 Date: Thu, 7 Feb 2002 22:45:23 +0900
 From: Antony Stace [EMAIL PROTECTED]
 Reply-To: Tomcat Users List [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: Switching on UTF-8 Encoding

 Hi

 What do I need to do so that data returned from Tomcat 4 is returned in UTF-8 
encoding to a requesting browser and
 requests received are read as UTF-8.


For writing UTF-8 content, your servlet needs to set the character
encoding *before* it gets the response's writer:

  response.setContentType("text/html;charset=UTF-8");
  PrintWriter writer = response.getWriter();
  writer.println("This line will be written in UTF-8");

For reading, the browser should have set a character encoding on its
Content-Type header.  If it didn't (or if this is a GET request and you
are trying to process query string parameters), call the following
*before* calling any of the request.getParameter methods (or
request.getReader):

  request.setCharacterEncoding("UTF-8");

Note that this method was added in Servlet 2.3, so it won't work in Tomcat
3.x environments.
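
The "set it before you get the writer" rule has a close analogue in plain java.io: a writer is bound to its charset when it is constructed, and nothing done afterwards changes the bytes it produces. The sketch below is container-free and uses only JDK classes. WriterEncodingDemo is an illustrative name, and the analogy to response.getWriter() is my assumption about why the ordering matters, not a description of Tomcat's internals.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WriterEncodingDemo {
    /** Writes text through a writer whose charset is fixed at construction time. */
    static byte[] encodeWith(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for simplicity.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // U+00F6 (o with diaeresis) is one byte in ISO-8859-1 and two in UTF-8.
        System.out.println(encodeWith("\u00F6", StandardCharsets.ISO_8859_1).length); // 1
        System.out.println(encodeWith("\u00F6", StandardCharsets.UTF_8).length);      // 2
    }
}
```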


Craig






Re: Switching on UTF-8 Encoding

2002-02-07 Thread timothy
I did it by using a filter; it works quite well.


From Timothy



Re: Switching on UTF-8 Encoding

2002-02-07 Thread Antony Stace
Thanks Jeff, Timothy, Craig for your replies.

I have a situation where I have a form which is in UTF-8 format.  In the
servlet (I am actually using Struts), when I am processing a user request,
I use


name = userForm.getName();  //Struts saves the information from a form in a Bean
name = new String(name.getBytes(),"UTF-8");

I can then save the name value in a database without problems.

I then use the contents of the Bean to write output in a jsp file but I get garbage.
Does this mean that the format of the data in the Bean is incorrect?  Should the
values in this bean be written in a different format?

If it is any use, I printed out the request and response encoding to a log file in the 
servlet,

 request.getCharacterEncoding()  = null
 response.getCharacterEncoding() = ISO-8859-1


Cheers

Tony




Re: Switching on UTF-8 Encoding

2002-02-07 Thread Karthik Gopal

Hi Tony,

The issue may be in these places:

1. Request object
- Jeff has covered the issue.

2. Database I/O
- You have to find out which Unicode encoding the database supports
(UTF-8 or UCS-2). If it is UCS-2, then you have to convert the data into
UTF-8 at the Java end.

3. The JSP's encoding should be set to UTF-8, as mentioned by Jeff.
Moreover, the browser should have access to the appropriate fonts to show
the data.
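
For point 2, a conversion at the Java end could look roughly like the sketch below. It assumes the database hands back raw big-endian UCS-2 bytes and treats them as UTF-16BE, which is identical for characters in the Basic Multilingual Plane. Many JDBC drivers already return a decoded String, in which case no such step is needed. Ucs2ToUtf8 is an illustrative name.

```java
import java.nio.charset.StandardCharsets;

final class Ucs2ToUtf8 {
    /** Decodes big-endian UCS-2 bytes (treated as UTF-16BE) and re-encodes them as UTF-8. */
    static byte[] convert(byte[] ucs2BigEndian) {
        String text = new String(ucs2BigEndian, StandardCharsets.UTF_16BE);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hebrew letter alef, U+05D0, as two UCS-2 bytes.
        byte[] utf8 = convert(new byte[] {0x05, (byte) 0xD0});
        for (byte b : utf8) {
            System.out.printf("%02X ", b & 0xFF); // D7 90
        }
        System.out.println();
    }
}
```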


Regards,
Karthik





Re: Switching on UTF-8 Encoding

2002-02-07 Thread Craig R. McClanahan



On Fri, 8 Feb 2002, Antony Stace wrote:

 Date: Fri, 8 Feb 2002 12:03:35 +0900
 From: Antony Stace [EMAIL PROTECTED]
 Reply-To: Tomcat Users List [EMAIL PROTECTED]
 To: Tomcat Users List [EMAIL PROTECTED],
  [EMAIL PROTECTED]
 Subject: Re: Switching on UTF-8 Encoding

 Thanks Jeff, Timothy, Craig for your replies.

 I have a situation where I have a form which is in UTF-8 format.  In the
 servlet (I am actually using Struts), when I am processing a user request,
 I use


 name = userForm.getName();  //Struts saves the information from a form in a Bean
 name = new String(name.getBytes(),"UTF-8");

 I can then save the name value in a database without problems.

 I then use the contents of the Bean to write output in a jsp file but I get garbage.
 Does this mean that the format of the data in the Bean is incorrect?  Should the
 values in this bean be written in a different format?

 If it is any use, I printed out the request and response encoding to a log file in 
the servlet,

  request.getCharacterEncoding()  = null
  response.getCharacterEncoding() = ISO-8859-1


This means that your browser didn't include a character encoding in its
Content-Type header on the form submission (sadly typical, unfortunately).

If you know that you're running on a Servlet 2.3 environment (like Tomcat
4), you can call request.setCharacterEncoding() *before* calling any of
the getParameter() methods, and Tomcat will do the translation for you.
One approach to this is to use a Filter -- an example filter that does
this sort of thing (SetCharacterEncodingFilter) is included in the
WEB-INF/classes of the example webapp that is shipped with Tomcat 4.
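
The decision such a filter makes can be sketched without the servlet API: use the charset the client declared, and fall back to a configured default only when none was sent. The class and method names below are hypothetical and this is not the code of SetCharacterEncodingFilter. In a real filter the result would be passed to request.setCharacterEncoding() before any parameter is read.

```java
final class RequestCharset {
    /** Returns the charset named in a Content-Type header, or the fallback if none is present. */
    static String charsetOrDefault(String contentType, String fallback) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Parameter names are case-insensitive, so compare ignoring case.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String value = p.substring(8).trim();
                    // Strip optional surrounding double quotes.
                    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                        value = value.substring(1, value.length() - 1);
                    }
                    if (!value.isEmpty()) {
                        return value;
                    }
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Typical browser form post: no charset declared, so the default applies.
        System.out.println(charsetOrDefault("application/x-www-form-urlencoded", "UTF-8"));
        // Declared charset wins over the default.
        System.out.println(charsetOrDefault("text/plain; charset=ISO-8859-1", "UTF-8"));
    }
}
```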


 Cheers

 Tony


Craig







Shouldn't Tomcat 3.2.1 decode the UTF-8 encoding of request parameters?

2001-02-19 Thread Mike Spreitzer

Consider a form that is encoded in UTF-8.  Here's how it comes down:

HTTP/1.0 200 OK
Content-Type: text/html; charset=UTF-8
Servlet-Engine: Tomcat Web Server/3.2.1 (JSP 1.1; Servlet 2.2; Java 1.3.0; 
AIX 4.3 ppc; java.vendor=IBM Corporation)


<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/DTD/loose.dtd">
<html>
...
<FORM METHOD=POST ACTION="/servlet/SusrReg">
...
<INPUT NAME="usr" TYPE=text SIZE="20">
...

I fill in the "usr" field with a single character, U+201D, and submit. 
Here's how the submission goes up:

POST /servlet/SusrReg HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, 
application/x-comet, application/pdf, */*
Referer: http://9.2.43.70:8085/servlet/SusrReg
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Host: 9.2.43.70:8085
Content-Length: 165
Connection: Keep-Alive
Cookie: JSESSIONID=loj2w5hcz1

usr=%E2%80%9D&B1=Submit

In my servlet, I find the value of the request parameter named "usr" is a 
string of three characters: U+00E2, U+0080, U+009D.  Should I be offended, 
or expect that the servlet should have to decode the UTF-8?  I find the 
servlet spec v2.2 fairly silent on the issue, leading me to expect that 
the servlet container is supposed to handle the full parameter decoding.

Thanks,
Mike






RE: Shouldn't Tomcat 3.2.1 decode the UTF-8 encoding of request parameters?

2001-02-19 Thread Marc Saegesser

This is the way it is supposed to work.  The default form submission
encoding is application/x-www-form-urlencoded (which you'll notice is what
got sent in the Content-Type header).  This means that all non-ASCII data is
going to get URL encoded using %HH (where H is a single hex digit).  Your
single input character got turned into Unicode and then encoded into UTF-8,
which turned it into 3 bytes.  These three bytes were then URL encoded and
sent to the servlet.

You'll also notice that nothing in the POST request sent to the servlet
indicates the character encoding.  There is no way for the servlet container
to convert this data from the three bytes it receives back into characters
because nothing supplies the appropriate encoding.  This is not the fault of
the container; it's a major hole in the HTTP and HTML specifications that
makes any I18n effort a royal pain in the a**.

There are a couple of ways to decode the data, but what I use is something
like this:

   sValue = new String(sOriginal.getBytes("8859_1"), sEncoding);

where sEncoding is the encoding used in the client (e.g. Shift_JIS).  You
can't determine sEncoding a priori.  You'll need to either assume that all
data sent to your application is in a given encoding or pass the correct
encoding in a hidden form field, etc.
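
The whole path for the usr=%E2%80%9D value can be reproduced with JDK classes alone. In the sketch below, the hand-written percentDecode helper and the FormValueDecoder class are illustrative and not part of any servlet API, and sEncoding is fixed to UTF-8 because that is the encoding of the form page in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class FormValueDecoder {
    /** Turns form-urlencoded text into raw bytes: %HH becomes one byte, '+' becomes a space. */
    static byte[] percentDecode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%' && i + 2 < encoded.length()) {
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c == '+') {
                out.write(' ');
            } else {
                out.write(c); // remaining characters are assumed to be plain ASCII
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // What the container does with no charset information: one char per byte.
        String sOriginal = new String(percentDecode("%E2%80%9D"), StandardCharsets.ISO_8859_1);
        System.out.println(sOriginal.length()); // 3

        // The repair step from the message above, with sEncoding = UTF-8.
        String sValue = new String(sOriginal.getBytes(StandardCharsets.ISO_8859_1),
                                   StandardCharsets.UTF_8);
        System.out.println(sValue.equals("\u201D")); // true
    }
}
```

The first step yields the three characters U+00E2, U+0080, U+009D reported in the question, and the second recovers the single U+201D that was typed.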


