Hi, I'm using Tomcat 4.1.30 on a Japanese Windows machine. I believe I found a problem with the way the JSP XML syntax is handled in Jasper: the 'encoding' attribute in the <?xml ...?> declaration does not seem to be taken into account. I searched the lists to no avail for this specific problem, and then dug into the code:
In org.apache.jasper.compiler.ParserController#figureOutJspDocument() there is this comment:

    // Figure out the encoding of the page
    // FIXME: We assume xml parser will take care of
    // encoding for page in XML syntax. Correct?
This is correct if you feed the XML parser with a bare InputStream (i.e. if your InputSource is built around an InputStream). However, ParserController builds an InputStreamReader using the default encoding (ISO-8859-1), and that reader ends up being the InputSource's character stream (see JspDocumentParser#parse). In this case, it seems that the XML parser (Xerces in my case) assumes the reader's encoding is correct instead of trying to work it out from the XML declaration.
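To make the difference concrete, here is a minimal standalone sketch (not Jasper code) that parses the same bytes twice: once through an InputSource built around an InputStream, and once through an InputSource built around an ISO-8859-1 InputStreamReader. The class and method names (EncodingDemo, sampleXml, parseFromByteStream, parseFromReader) are made up for illustration, and it assumes a current JDK with its bundled JAXP parser rather than the Xerces version shipped with Tomcat 4.1.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; none of these names come from Jasper.
class EncodingDemo {

    // A small document whose bytes are UTF-8 and whose declaration says so.
    // The text content is "caf" followed by U+00E9 (e with acute accent).
    static byte[] sampleXml() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>caf\u00e9</root>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Parses the source and returns the text of the root element.
    static String parse(InputSource source) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    // Byte stream: the parser decodes the bytes itself and can honour
    // the encoding named in the XML declaration.
    static String parseFromByteStream(byte[] xml) throws Exception {
        return parse(new InputSource(new ByteArrayInputStream(xml)));
    }

    // Character stream: the bytes are already decoded (here, wrongly, as
    // ISO-8859-1) before the parser sees them, so the declaration has no effect.
    static String parseFromReader(byte[] xml) throws Exception {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(xml), StandardCharsets.ISO_8859_1);
        return parse(new InputSource(reader));
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = sampleXml();
        String fromBytes = parseFromByteStream(xml);
        String fromReader = parseFromReader(xml);
        System.out.println("byte stream decoded correctly: " + fromBytes.equals("caf\u00e9"));
        System.out.println("reader result differs: " + !fromReader.equals(fromBytes));
    }
}
```

If the behaviour I describe above is right, the reader variant returns the two UTF-8 bytes of the accented character as two separate Latin-1 characters, which is the kind of corruption I see with Japanese text.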
Also, it seems the <jsp:directive.page encoding=.../> is not properly taken into account. Compiler#generateJava first sets up the ServletWriter with the default encoding (UTF-8) and only then calls Validator#validate(), which is responsible for handling the page directives. So the writer is already configured with UTF-8 by the time the real encoding is read.
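The ordering issue can be illustrated with a toy example that has nothing Jasper-specific in it: an OutputStreamWriter fixes its charset at construction, so learning the declared encoding afterwards changes nothing. WriterOrderDemo and its write method are hypothetical names, and the "declared" encoding here is simply ISO-8859-1 passed in as a parameter.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical demo class; this is not how Jasper's Compiler is written.
class WriterOrderDemo {

    // Writes text to a byte array. If directiveFirst is false, the writer is
    // created with the UTF-8 default before the declared encoding is known,
    // mimicking the order of operations described above.
    static byte[] write(String text, String declaredEncoding, boolean directiveFirst)
            throws Exception {
        String encoding = "UTF-8"; // default
        if (directiveFirst) {
            encoding = declaredEncoding; // directive read before the writer exists
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(out, encoding);
        if (!directiveFirst) {
            encoding = declaredEncoding; // too late: the writer keeps UTF-8
        }
        writer.write(text);
        writer.close();
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String text = "\u00e9"; // a single accented character
        System.out.println("writer first:    " + write(text, "ISO-8859-1", false).length + " byte(s)");
        System.out.println("directive first: " + write(text, "ISO-8859-1", true).length + " byte(s)");
    }
}
```

The two calls produce different byte sequences for the same text, which is why I would expect the directives to be processed before the writer is created.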
Am I misinterpreting the specs or the code? Has anybody else had problems with encodings in Jasper? /zog