Hi All!
Support for different encodings in Servlet/JSP is an old, well-known problem.
The setCharacterEncoding() method of HttpServletRequest allows changing the
request encoding before the parameters are read, so a servlet can set the
encoding it needs. (A small lyrical digression: what does this encoding
actually mean? I'll post my thoughts about it separately.)
However, the problem still exists in JSP (there have been several postings
about it on this mailing list). The purpose of this mail is to propose a
solution for encoding support in JSP.
Problem description
===================
A JSP programmer is not able to change the request encoding for an incoming
JSP request, since "This method [setCharacterEncoding] must be called prior to
parsing any post data or reading any input from the request. Calling this
method once data has been read will not affect the encoding." (Servlet 2.3
Spec). This happens because the request parameters are read inside
org.apache.jasper.servlet.JspServlet, before the generated JSP servlet is
called.
As a result, compiled JSPs behave as follows in non-English environments:
1) The incoming request is read using 'ISO-8859-1'.
2) getParameter() returns a value decoded as 'ISO-8859-1', but the JSP servlet
   assumes the returned value is in the JVM default encoding (say "KOI8-R"),
   so ??????? appears instead of the real parameter value.
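
The two steps above can be reproduced outside a container. This is only a
minimal sketch: the class and method names are made up for illustration,
KOI8-R stands in for the JVM default encoding, and the way the question marks
arise here (unmappable characters replaced on output) is one plausible path,
not a trace of what Jasper does internally.

```java
import java.nio.charset.Charset;

public class EncodingMismatchDemo {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");
    // Assumption: KOI8-R plays the role of the JVM default encoding.
    static final Charset KOI8R = Charset.forName("KOI8-R");

    // Step 1: the raw request bytes are decoded as ISO-8859-1.
    static String readAsLatin1(byte[] wireBytes) {
        return new String(wireBytes, LATIN1);
    }

    // Step 2: the value is written out in the page encoding and read back.
    // Characters the page encoding cannot represent are replaced with '?'.
    static String writeAndReadBack(String value, Charset pageCharset) {
        return new String(value.getBytes(pageCharset), pageCharset);
    }

    public static void main(String[] args) {
        // A six-letter Cyrillic word, as the browser would send it in KOI8-R.
        String typed = "\u043f\u0440\u0438\u0432\u0435\u0442";
        byte[] wire = typed.getBytes(KOI8R);

        String seenByJsp = readAsLatin1(wire);
        System.out.println(writeAndReadBack(seenByJsp, KOI8R));
    }
}
```

Decoding the same bytes as KOI8-R in step 1 gives back the original word,
which is what the proposal below aims for.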
Problem solution
================
There should be an optional, configurable parameter for JspServlet (say
'requestEncoding') that sets the request encoding. When this parameter is
present, JspServlet should call setCharacterEncoding() with its value before
processing the request. This does not conflict with the JSP 1.2 Spec, since
it says nothing about the default encoding of an incoming request.
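
To show the intended effect of such a parameter, here is a rough standalone
sketch. It does not use the Servlet API: parseForm() and effectiveEncoding()
are hypothetical helpers that only imitate form-parameter decoding, with a
null 'requestEncoding' falling back to ISO-8859-1 as described above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestEncodingSketch {
    // Encoding used when no 'requestEncoding' is configured.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Hypothetical helper: the configured value wins, otherwise the default.
    static String effectiveEncoding(String requestEncoding) {
        return requestEncoding != null ? requestEncoding : DEFAULT_ENCODING;
    }

    // Hypothetical helper: decode a form-encoded body with that encoding.
    static Map<String, String> parseForm(String body, String requestEncoding) {
        String enc = effectiveEncoding(requestEncoding);
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            try {
                params.put(URLDecoder.decode(name, enc),
                           URLDecoder.decode(value, enc));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unsupported encoding: " + enc, e);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // A Cyrillic word submitted by a browser using KOI8-R.
        String body = "name=%D0%D2%C9%D7%C5%D4";
        System.out.println(parseForm(body, "KOI8-R").get("name")); // parameter set
        System.out.println(parseForm(body, null).get("name"));     // parameter absent
    }
}
```

With the parameter set to "KOI8-R" the value comes back as the Cyrillic word
that was typed; without it the same bytes come back as Latin-1 characters.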
I have made the necessary changes to implement this feature in
tomcat-4.0-20010807. It works fine with different Cyrillic encodings (I expect
the same result for other non-Latin1 encodings).
I clearly understand that the proposed solution is not a panacea and is open
to discussion.
Regards,
Andrey Aristarkhov
The diffs follow (also as attachments). I have also attached a sample JSP
for encoding testing.
file: org/apache/jasper/EmbededServletOptions.java
147a148,152
> * Java platform encoding for incoming request.
> */
> private String requestEncoding;
>
> /**
219a225,228
> public String getRequestEncoding() {
> return requestEncoding;
> }
>
320a330
> this.requestEncoding = config.getInitParameter("requestEncoding");
file: org/apache/jasper/Options.java
144a145,149
>
> /**
> * Java platform encoding for incoming request.
> */
> public String getRequestEncoding();
file: org/apache/jasper/servlet/JspServlet.java
422c422,426
< String includeUri
---
> // According to section 4.9 of Servlet 2.3 spec we have to
> // setCharacterEncoding() before reading any parameter
> if (options.getRequestEncoding()!=null)
> request.setCharacterEncoding(options.getRequestEncoding());
> String includeUri
JspServlet.diff
EmbededServletOptions.diff
Options.diff
encoding.jsp