RE: character encoding in HTTP GET / POST requests

2001-11-27 Thread Hoogervorst, Nico


If I migrated to Tomcat 3.3, would changing the defaultEncoding
of the DecodeInterceptor in server.xml do the trick for me?

See:
http://jakarta.apache.org/tomcat/tomcat-3.3-doc/serverxml.html#DecodeInterceptor


But how can I do the same in Tomcat 4.0?

thx,

Nico Hoogervorst
Compuware




--
To unsubscribe:   mailto:[EMAIL PROTECTED]
For additional commands: mailto:[EMAIL PROTECTED]
Troubles with the list: mailto:[EMAIL PROTECTED]




RE: character encoding in HTTP GET / POST requests

2001-11-27 Thread Hoogervorst, Nico


I keep answering my own questions.  :-)

In Tomcat 4.0 the trick is to create a filter that sets the
character encoding of each request, as explained in:
  http://java.sun.com/products/servlet/Filters.html#72673

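Then add the filter to web.xml, like this:

<web-app>
  <!-- Filter to set character encoding on each request -->
  <filter>
    <filter-name>Set Character Encoding</filter-name>
    <filter-class>filters.SetCharacterEncodingFilter</filter-class>
    <init-param>
      <param-name>encoding</param-name>
      <param-value>UTF-8</param-value>
    </init-param>
  </filter>
  <!-- Assumed mapping (not in the original snippet): apply the filter to every request -->
  <filter-mapping>
    <filter-name>Set Character Encoding</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>
</web-app>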
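
For anyone wondering what such a filter does inside, here is a minimal sketch of the decision it makes. It is not the code of filters.SetCharacterEncodingFilter. To keep it runnable without the servlet jar, RequestLike is a made-up stand-in for the two javax.servlet.ServletRequest methods involved (getCharacterEncoding and setCharacterEncoding), and FakeRequest and applyDefaultEncoding are hypothetical names as well.

```java
// Hypothetical sketch; RequestLike stands in for javax.servlet.ServletRequest.
final class EncodingFilterSketch {

    // The two request methods the filter relies on.
    interface RequestLike {
        String getCharacterEncoding();          // null when the client sent no charset
        void setCharacterEncoding(String name); // the real method can throw UnsupportedEncodingException
    }

    // Trivial in-memory request used only for this demonstration.
    static final class FakeRequest implements RequestLike {
        private String encoding;
        FakeRequest(String encoding) { this.encoding = encoding; }
        public String getCharacterEncoding() { return encoding; }
        public void setCharacterEncoding(String name) { this.encoding = name; }
    }

    // Core of a doFilter(): fill in the encoding only when the client gave none.
    static void applyDefaultEncoding(RequestLike request, String defaultEncoding) {
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding(defaultEncoding);
        }
    }

    public static void main(String[] args) {
        FakeRequest bare = new FakeRequest(null);
        applyDefaultEncoding(bare, "UTF-8");
        System.out.println(bare.getCharacterEncoding());     // UTF-8

        FakeRequest declared = new FakeRequest("ISO-8859-2");
        applyDefaultEncoding(declared, "UTF-8");
        System.out.println(declared.getCharacterEncoding()); // ISO-8859-2
    }
}
```

In a real filter this check would run in doFilter() before the request is passed down the chain, because the encoding has to be set before any parameter is read. Note also that setCharacterEncoding is specified for the request body; how the query string of a GET is decoded depends on the container.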








character encoding in HTTP GET / POST requests

2001-11-26 Thread Hoogervorst, Nico

Hi there,

About: HTTP GET/POST requests in UTF-8
   with Tomcat 3.2, the Apache web server and the Struts framework

To be able to display foreign characters, we want to use
Unicode. We therefore specify the following contentType in
the page directive of the JSP pages:
   contentType="text/html; charset=utf-8"
This means that the response of the JSP page uses the
utf-8 charset. The browser recognizes this and displays all
characters correctly.

However, when the page uses the utf-8 charset, the browser
also uses utf-8 in its HTTP GET/POST requests to the server.
This would be fine if the browser declared the content type
of the request as:
   application/x-www-form-urlencoded; charset=utf-8
But MSIE 6.0 and Netscape 6.2 do not send the character
encoding information, even when the enctype attribute of the
form tag explicitly specifies this content type. I have
tested this with a CGI script on an Apache web server.

According to the HTTP/1.1 specification, when no character
encoding is provided, the server must assume ISO-8859-1
(ISO Latin 1). Tomcat (I use Tomcat 3.2) does that, but that
leaves us with a problem: how can we handle requests in utf-8?
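
To make the mismatch concrete, here is a minimal sketch in plain Java with no servlet container involved. It assumes a form field containing an e with acute accent (U+00E9), which a browser on a utf-8 page submits as the percent-encoded bytes %C3%A9. The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Tomcat or Struts.
final class DecodeMismatchDemo {

    // Decode a percent-encoded form value using the given charset name.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String raw = "%C3%A9"; // e-acute as sent from a utf-8 page

        String asLatin1 = decode(raw, "ISO-8859-1"); // what the server assumes
        String asUtf8 = decode(raw, "UTF-8");        // what the browser meant

        System.out.println("ISO-8859-1: " + asLatin1.length() + " chars"); // 2 chars
        System.out.println("UTF-8:      " + asUtf8.length() + " chars");   // 1 chars
    }
}
```

Decoded as ISO-8859-1, each of the two bytes becomes a character of its own (U+00C3 and U+00A9), which is the garbage the application ends up seeing.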

In the following message:

http://mikal.org/interests/java/tomcat/archive/view?mesg=35060

the suggested solution is to convert parameters like this:

 String s = new String(
     request.getParameter("my_param").getBytes("ISO-8859-1"),
     "UTF-8");
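
The conversion above can be checked in isolation with a small sketch. The helper name latin1ToUtf8 and the class ParamRecoder are hypothetical, and the input string simulates what getParameter() would return for a utf-8 encoded e with acute accent that was decoded as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrates the round trip only.
final class ParamRecoder {

    // Re-interpret a value that was decoded as ISO-8859-1
    // although its original bytes were UTF-8.
    static String latin1ToUtf8(String misdecoded) {
        if (misdecoded == null) {
            return null; // parameter was absent
        }
        // ISO-8859-1 maps every byte to exactly one character,
        // so this step recovers the original bytes without loss.
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String misdecoded = "\u00c3\u00a9"; // two chars: the mis-decoded e-acute
        String fixed = latin1ToUtf8(misdecoded);
        System.out.println(fixed.length());        // 1
        System.out.println((int) fixed.charAt(0)); // 233, i.e. U+00E9
    }
}
```

The round trip is lossless only because the first decoding used ISO-8859-1; with another default charset the original bytes might not be recoverable.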

However, we use the Struts framework (see
http://jakarta.apache.org/struts/index.html), which reads
the request parameters and puts the values directly into
the bean. It would be inappropriate to do extra
conversions in the set methods of the bean.

Does anyone have a solution for the described problem?

Is it possible to change the default encoding from ISO-8859-1
to UTF-8? Has anybody made a patch to solve this?


TIA

Nico Hoogervorst
Compuware

