In this particular case (template output to a JSP), explicitly setting
the page's charset seems to turn the whole thing into a
JSP-and-Tomcat-related problem.
----------------------------------------------------------------------
This:
<%@ page contentType="text/html;charset=utf8" %>
leads to:
org.apache.jasper.JasperException: Unsupported encoding: utf8
at org.apache.jasper.compiler.ParserController.getReader(ParserController.java:440)
at org.apache.jasper.compiler.ParserController.parse(ParserController.java:209)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:210)
at org.apache.jasper.servlet.JspServlet.loadJSP(JspServlet.java:548)
...
And this:
<%@ page contentType="text/html;charset=utf-8" %>
unfortunately results in:
org.apache.jasper.compiler.ParseException: Cannot read file: ze file
at org.apache.jasper.compiler.JspReader.pushFile2(JspReader.java:275)
at org.apache.jasper.compiler.JspReader.<init>(JspReader.java:316)
at org.apache.jasper.compiler.Parser.<init>(Parser.java:137)
at org.apache.jasper.compiler.ParserController.parse(ParserController.java:214)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:210)
at org.apache.jasper.servlet.JspServlet.loadJSP(JspServlet.java:548)
at org.apache.jasper.servlet.JspServlet$JspServletWrapper.loadIfNec
...
----------------------------------------------------------------------
Tomcat doesn't seem to be very tolerant about the spelling of "utf-8".
And the "ze file" error message doesn't turn up much in Google. I think
I know what it is: my JSP declares utf-8, but it still contains plain
non-ASCII Latin-1 bytes that are clearly not doing well in a utf-8
context.
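The suspicion above (Latin-1 bytes inside a file that is declared as
utf-8) can be illustrated with a minimal Python sketch. The sample string
and the helper name decodes_as_utf8 are made up for this illustration.
Whether Jasper's "Cannot read file" error is raised for exactly this
reason is an assumption and is not confirmed here.

```python
# Hypothetical sample: one line of a file that was saved as Latin-1.
latin1_bytes = "label.Reset=Zurücksetzen".encode("latin-1")

# In Latin-1, "ü" is the single byte 0xFC.
assert b"\xfc" in latin1_bytes


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The Latin-1 bytes are rejected by a strict UTF-8 decoder ...
print(decodes_as_utf8(latin1_bytes))
# ... while the same text saved as UTF-8 is accepted.
print(decodes_as_utf8("label.Reset=Zurücksetzen".encode("utf-8")))
```

A strict decoder rejects the byte 0xFC because it can never start a valid
UTF-8 sequence. A more lenient reader might substitute a replacement
character instead of failing.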
But that's likely to be off-topic, anyhow. And the problem's nature is
general. (I'm having Template::Toolkit generate Java classes, XML etc.)
So I'm reformulating my problem. I've got a template file containing a
couple of non-ASCII Latin-1 characters (among all the ASCII characters),
and they're getting turned into utf-8 by the template engine (at least
that's what I think). But I've got more text being included into the
output files, and not all of that is handled by Template::Toolkit.
So, there are some utf-8 mutants and some Latin-1 survivors sharing one
file.
That's not nice to look at, and apparently even more difficult to compile.
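To find out which lines of such a mixed file are in which encoding, a
rough per-line check along the following lines might help. This is only a
heuristic sketch in Python: the function name classify_line and the sample
data are invented for illustration, and a line that fails UTF-8 decoding is
merely assumed to be Latin-1.

```python
def classify_line(raw: bytes) -> str:
    """Guess the encoding of one line: 'ascii', 'utf-8' or 'latin-1?'."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8; Latin-1 is only a guess at this point.
        return "latin-1?"


# Hypothetical output file with three differently encoded lines.
mixed = (
    b"label.Ok=OK\n"
    + "label.Reset=Zurücksetzen\n".encode("utf-8")
    + "label.Close=Schließen\n".encode("latin-1")
)

report = [classify_line(line) for line in mixed.splitlines()]
print(report)
```

The check works per line because a single file-wide decode would only
report the first offending byte, not which parts came from which source.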
So, I'd prefer a way to tell Template::Toolkit not to mess around with
my characters.
Or is that a problem related to the Perl interpreter rather than to a
module? I really don't know, and I haven't come across an answer while
consulting Google and this list's mail headers of the past twelve months.
##########
From: Peter Guzis <[EMAIL PROTECTED]>
To: templates <[EMAIL PROTECTED]>
Subject: RE: [Templates] charset / character coding / encoding / utf8 / iso8859-1
Date: Thu, 27 Jun 2002 09:56:08 -0700
Are you explicitly setting the page's character set?
Example:
<meta http-equiv="Content-Type" content="text/html; charset=utf8">
Peter Guzis
Web Administrator, Sr.
ENCAD, Inc.
- A Kodak Company
email: [EMAIL PROTECTED]
www.encad.com
-----Original Message-----
From: Michael Ludwig [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 26, 2002 6:02 PM
To: templates
Subject: [Templates] charset / character coding / encoding / utf8 /
iso8859-1
Hello list,
I've got a template containing Latin-1 characters (like ä, ö, ü, ß) and
they're getting mangled into what might be the utf8 representation of
said characters. Or something different.
To give you an example, a line from my template reads:
label.Reset=Zurücksetzen
This mutates to:
label.Reset=ZurÃ¼cksetzen
And it shows up like that in my browser. I can correct that situation by
switching the browser's assumption of the incoming content over to
utf-8, but to no avail: other characters that have not been
mistreated by the template engine (that's only my assumption) then
show up in the wrong encoding.
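The mutation described above matches what happens when "ü" is encoded as
UTF-8 and the resulting two bytes are then displayed as Latin-1. A small
Python sketch of that round trip follows. The variable names are
illustrative, and it makes no claim about where in the toolchain the
conversion actually takes place.

```python
original = "Zurücksetzen"

# "ü" (U+00FC) becomes the two bytes 0xC3 0xBC in UTF-8.
utf8_bytes = original.encode("utf-8")
print(utf8_bytes)

# A viewer that assumes Latin-1 shows each byte as its own character.
misread = utf8_bytes.decode("latin-1")
print(misread)

# Reversing the wrong assumption recovers the original text.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired)
```

This also explains why switching the browser to utf-8 fixes these
characters but breaks the ones that are still single Latin-1 bytes.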
And it's all the more confusing to me as my editor (vim) does not
display those character sequences the same way my browser does.
So my question is: Does anyone know how to tell the Template::Toolkit
NOT to transform non-ASCII Latin-1 characters to UTF-8?
Thanks a lot!
Michael