In this particular case (template output to a JSP), explicitly setting 
the page's charset seems to turn the whole thing into a 
JSP-and-Tomcat-related problem.

----------------------------------------------------------------------

This:

<%@ page contentType="text/html;charset=utf8" %>

leads to:

org.apache.jasper.JasperException: Unsupported encoding: utf8
        at org.apache.jasper.compiler.ParserController.getReader(ParserController.java:440)
        at org.apache.jasper.compiler.ParserController.parse(ParserController.java:209)
        at org.apache.jasper.compiler.Compiler.compile(Compiler.java:210)
        at org.apache.jasper.servlet.JspServlet.loadJSP(JspServlet.java:548)
...


And this:

<%@ page contentType="text/html;charset=utf-8" %>

unfortunately results in:

org.apache.jasper.compiler.ParseException: Cannot read file: ze file
        at org.apache.jasper.compiler.JspReader.pushFile2(JspReader.java:275)
        at org.apache.jasper.compiler.JspReader.<init>(JspReader.java:316)
        at org.apache.jasper.compiler.Parser.<init>(Parser.java:137)
        at org.apache.jasper.compiler.ParserController.parse(ParserController.java:214)
        at org.apache.jasper.compiler.Compiler.compile(Compiler.java:210)
        at org.apache.jasper.servlet.JspServlet.loadJSP(JspServlet.java:548)
        at org.apache.jasper.servlet.JspServlet$JspServletWrapper.loadIfNec
...
----------------------------------------------------------------------


Tomcat doesn't seem to be very tolerant about the spelling of "utf-8". 
And the "ze file" error message doesn't turn up a lot in Google. I think 
I know what it is. My JSP says "I've got utf-8.", but it still contains 
regular non-ASCII Latin-1 bytes that clearly don't do well in a utf-8 
context.
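
To illustrate that suspicion outside of Tomcat, here is a minimal Python 
sketch. It does not reproduce what Jasper does internally; it only shows 
that a Latin-1 encoded "ü" (the single byte 0xFC) is not a valid UTF-8 
sequence, which is the kind of input a strict UTF-8 reader would refuse. 
The helper name is_valid_utf8 is made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same word, once as Latin-1 bytes and once as UTF-8 bytes.
latin1_bytes = "Zurücksetzen".encode("latin-1")  # contains the byte 0xFC
utf8_bytes = "Zurücksetzen".encode("utf-8")      # contains the bytes 0xC3 0xBC

print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True
```

So a file that is declared as utf-8 but still carries Latin-1 bytes can 
fail as soon as a reader decodes it strictly.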

But that's likely to be off-topic, anyhow. And the problem's nature is 
general. (I'm having Template::Toolkit generate Java classes, XML etc.)

So I'm reformulating my problem. I've got a template file containing a 
couple of non-ASCII Latin-1 characters (among all the ASCII characters), 
and they're getting turned into utf-8 by the template engine (at least 
that's what I think). But I've got more text being included into the 
output files, and not all of that is handled by Template::Toolkit.

So, there are some utf-8 mutants and some Latin-1 survivors sharing one 
file.

That's not nice to look at, and apparently even more difficult to compile.
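
As a workaround sketch for such a mixed file (an assumption on my part, 
not something Template::Toolkit offers), one could normalize the output 
line by line: try to decode each line as UTF-8 and fall back to Latin-1 
if that fails. This is only a heuristic, since some Latin-1 byte pairs 
happen to be valid UTF-8 as well. The name normalize_line is 
hypothetical.

```python
def normalize_line(raw: bytes) -> str:
    """Decode one line: try UTF-8 first, fall back to Latin-1 (heuristic)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never fails.
        return raw.decode("latin-1")


# First line is UTF-8 ("ü" as 0xC3 0xBC), second line is Latin-1 ("ö" as 0xF6).
mixed = b"Zur\xc3\xbccksetzen\nL\xf6schen\n"

lines = [normalize_line(line) for line in mixed.splitlines()]
print(lines)  # ['Zurücksetzen', 'Löschen']

# Re-encode everything consistently, here as UTF-8.
clean = "\n".join(lines).encode("utf-8")
```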

So, I'd prefer a way to tell Template::Toolkit not to mess around with 
my characters.

Or is that a problem related to the Perl interpreter rather than to a 
module? I really don't know and I haven't come across an answer 
consulting Google and this list's mail headers of the past twelve months.


##########




From: Peter Guzis <[EMAIL PROTECTED]>
To: templates <[EMAIL PROTECTED]>
Subject: RE: [Templates] charset / character coding / encoding / utf8 / iso8859-1
Date: Thu, 27 Jun 2002 09:56:08 -0700

Are you explicitly setting the page's character set?

Example:
<meta http-equiv="Content-Type" content="text/html; charset=utf8">

Peter Guzis
Web Administrator, Sr.
ENCAD, Inc.
- A Kodak Company
email: [EMAIL PROTECTED]
www.encad.com

-----Original Message-----
From: Michael Ludwig [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 26, 2002 6:02 PM
To: templates
Subject: [Templates] charset / character coding / encoding / utf8 / iso8859-1


Hello list,

I've got a template containing Latin-1 characters (like ä, ö, ü, ß) and
they're getting mangled into what might be the utf8 representation of
said characters. Or something different.

To give you an example, a line from my template reads:

label.Reset=Zurücksetzen

This mutates to:

label.Reset=ZurÃ¼cksetzen
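
The change from the single byte FC to the byte pair C3 BC is what one 
would expect if the Latin-1 text were re-encoded as UTF-8 somewhere along 
the way. The following Python sketch only demonstrates that byte-level 
effect; it makes no claim about where in Perl or Template::Toolkit the 
conversion actually happens.

```python
# The template line as Latin-1 bytes: "ü" is the single byte 0xFC.
original = b"label.Reset=Zur\xfccksetzen"

# Interpret the bytes as Latin-1 text, then write the text out as UTF-8.
text = original.decode("latin-1")
as_utf8 = text.encode("utf-8")
print(as_utf8)  # b'label.Reset=Zur\xc3\xbccksetzen'

# A viewer that still assumes Latin-1 shows the two bytes as two characters.
print(as_utf8.decode("latin-1"))  # label.Reset=ZurÃ¼cksetzen
```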

And it shows up like that in my browser. I can correct that situation by
switching the browser's assumption of the incoming content over to
utf-8; but that is to no avail, since other characters that have not been
mistreated by the template engine (now, that's only my assumption) then
show up in the wrong encoding.

And it's all the more confusing to me as my editor (vim) does not
display those character sequences the same way my browser does.

So my question is: Does anyone know how to tell the Template::Toolkit
NOT to transform non-ASCII Latin-1 characters to UTF-8?

Thanks a lot!

Michael



