Hi Tom,

Thanks for your quick reply. The output is being generated by Perl. The strange thing is that when I use the ENCODING => 'UTF-8' configuration parameter, the output breaks. If I don’t include it, the £ sign is returned in the correct UTF-8 format.
I’ll have a look at the debugging options that you suggest and see if I can find anything else out.

Thanks again,
David

From: Tom Molesworth [mailto:[email protected]]
Sent: 21 October 2014 09:00
To: [email protected]; David Hickman
Subject: Re: [Templates] Template Toolkit and UTF-8 template files

Hi,

On 21/10/14 08:44, David Hickman wrote:
> I wondered if anyone had any experience of using the Perl Template Toolkit and UTF-8 encoded files? I appear to be facing a rather strange issue with a UTF-8 encoded template file (although the only UTF-8 encoded characters in the file are the £ sign):
>
> • If I don’t tell the template toolkit anything about the encoding of the template file then the resulting output from the template is correct and the £ signs are output in UTF-8 format

How are you generating output? Is this via tpage/ttree, or Perl? A test case should make it much easier to track down any encoding issues.

> • If I specifically tell the template toolkit that the file is encoded in UTF-8 (either using the configuration parameters, a BOM on the template file, or both) then the £ signs in the template are converted to character code 163 (the ANSI equivalent). This breaks my intended output of the template as the character encoding reported to the browser in the response header is UTF-8 even though the file now contains characters that are not compatible with UTF-8.

The £ Unicode character is codepoint 163 - http://www.fileformat.info/info/unicode/char/a3/index.htm - so it sounds like you might be trying to write Unicode strings directly without going through UTF-8 encoding?

For what it's worth, tpage does the right thing:

    $ echo '[% "test: " %] £' > test.tt2
    $ tpage test.tt2
    test:  £
    $ tpage test.tt2 | od -t x1z
    0000000 74 65 73 74 3a 20 20 c2 a3 0a  >test:  ...<

That 0xC2 0xA3 is the expected UTF-8 encoding.
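
The same distinction between codepoint 163 and its two-byte UTF-8 form can be checked from Perl itself. The following is only a minimal sketch using the core Encode module; hex_dump is a hypothetical helper introduced for illustration and is not part of Template Toolkit or of the original messages.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: hex value of each element of a string
# (characters for a decoded string, bytes for an encoded one).
sub hex_dump {
    my ($string) = @_;
    return join ' ', map { sprintf '%02x', ord } split //, $string;
}

my $chars = "\x{a3}";                 # one character: U+00A3 POUND SIGN
my $bytes = encode('UTF-8', $chars);  # the UTF-8 encoding of that character

printf "characters:  %s\n", hex_dump($chars);  # prints "characters:  a3"
printf "UTF-8 bytes: %s\n", hex_dump($bytes);  # prints "UTF-8 bytes: c2 a3"
```

If a character string like $chars is printed to a filehandle with no encoding layer, the single byte 0xA3 is likely what reaches the browser, which would match the symptom described above.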

In Perl, I think you'd need the ENCODING parameter if TT2 is reading the template file:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Template;

    # default ->process target is STDOUT,
    binmode STDOUT, ':encoding(UTF-8)';
    my $tt = Template->new(ENCODING => 'UTF-8');
    $tt->process('test.tt2', {}) or die $tt->error;

Maybe try writing the template output to a scalar, and see if it's a valid Unicode (not UTF-8) string? Data::Dumper should report \x{a3} as the character in this case.

cheers,
Tom
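
The suggestion of rendering to a scalar could be tried along these lines. This is a rough sketch rather than a confirmed diagnosis: it assumes Template Toolkit is installed, writes its own throwaway template to a temporary directory, and uses two hypothetical helpers (hex_dump and render) that do not come from the thread. It inspects the result with ord rather than Data::Dumper so that the values shown do not depend on Dumper settings.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempdir);
use Template;

# Hypothetical helper: hex value of each element of a string.
sub hex_dump {
    my ($string) = @_;
    return join ' ', map { sprintf '%02x', ord } split //, $string;
}

# Hypothetical helper: process a template file into a scalar
# instead of letting ->process print to STDOUT.
sub render {
    my ($dir, $file) = @_;
    my $tt = Template->new(ENCODING => 'UTF-8', INCLUDE_PATH => $dir);
    my $output = '';
    $tt->process($file, {}, \$output) or die $tt->error;
    return $output;
}

# Write a template whose only non-ASCII content is the UTF-8
# byte pair for the pound sign.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>:raw', "$dir/test.tt2" or die "Cannot write template: $!";
print {$fh} "[% 'test: ' %]\xc2\xa3\n";
close $fh or die "Cannot close template: $!";

my $chars = render($dir, 'test.tt2');
printf "from TT: %s\n", hex_dump($chars);
printf "encoded: %s\n", hex_dump(encode('UTF-8', $chars));
```

If the first line shows a lone a3 and the second shows c2 a3, the template output is a decoded character string, and it still needs to be encoded (for example with the binmode call in the script above) before it is sent to the browser.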

_______________________________________________
templates mailing list
[email protected]
http://mail.template-toolkit.org/mailman/listinfo/templates
