Daniel McBrearty wrote:
>> It seems to me that Template-Toolkit does no UTF-8-encoding of the
>> outputted variables.
>>     
>
> well, it shouldn't.
>
> In perl, strings are already utf8, internally. If TT was to do
> encoding, they would be double-encoded.
>   

The internal encoding that perl uses is none of your business and has no
relevance to anything. :)  It can be changed at any time, and your
software is extremely unreliable if it depends on it (and nobody wants
that).  Here's my usual advice about Unicode:

 * "use utf8;" if and only if your source code is UTF-8 encoded.
 * Encode::decode() every piece of data that comes from outside your
   program.  If you have UTF-8 encoded data and you forget to do this,
   you'll end up with double-encoded data when perl implicitly upgrades
   it to utf8, because perl treats undecoded octets as latin-1.  Here's
   a simulation of what happens when the octet string "日本語" is
   implicitly upgraded:

 $ recode latin1..utf8
 日本語
 æ¥æ¬èª

 * Encode::encode() your perl strings to UTF-8 right before you print
   them to a UTF-8 terminal or web browser.  If you try to print the raw
   strings, you'll see this error whenever they contain characters
   outside latin-1:

 $ perl -e 'syswrite *STDOUT,"\x{CAFE}\n"'
 Wide character in syswrite at -e line 1.
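
To make the decode/encode rule concrete, here's a minimal sketch using
only the core Encode module.  The variable names are mine, and the
octets are just the UTF-8 bytes for "日本語" written out by hand; treat
it as an illustration, not as anything TT or Catalyst does for you:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# UTF-8 octets for "日本語", as they would arrive from a file or socket.
my $octets = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

# Correct: decode on input, work with characters, encode on output.
my $text = decode('UTF-8', $octets);
my $out  = encode('UTF-8', $text);

# Mistake: skip the decode, so every octet is treated as a latin-1
# character and gets encoded a second time.
my $double = encode('UTF-8', $octets);

printf "characters after decode:    %d\n", length $text;    # 3
printf "octets after round trip:    %d\n", length $out;     # 9
printf "octets when double-encoded: %d\n", length $double;  # 18
```

The 18-octet result is the same mangling the recode example shows:
each of the nine original octets is above 0x7F, so each one turns into
two octets.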

"encoding::warnings" will tell you when something wonky is happening
with your code.  In your case, remember that you need to make sure all
these things are decoded before using them as text:

 * database data
 * templates themselves
 * data the user enters via a form
 * URL params
 * filenames
 * file contents (but not if they're binary)
 * etc.
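
As a rough sketch of decoding at two of those boundaries: file contents
can be decoded by an I/O layer as they are read, and a parameter can be
decoded by hand.  The scratch file and the "$raw_param" value are made
up for the example (a real param would come from your framework's
request object, and I'm assuming it has already been
percent-unescaped):

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Write some UTF-8 octets to a scratch file to stand in for outside data.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "caf\xC3\xA9\n";
close $tmp;

# File contents: let the I/O layer decode while reading.
open my $in, '<:encoding(UTF-8)', $path or die "open $path: $!";
my $line = <$in>;
close $in;
chomp $line;

# URL param (hypothetical value): decode the raw octets by hand.
my $raw_param = "na\xC3\xAFve";
my $param     = decode('UTF-8', $raw_param);

# Both are now character strings, so length counts characters, not octets.
printf "%d %d\n", length $line, length $param;    # 4 5
```

Binary file data should skip the ":encoding" layer and be read with
binmode instead, per the caveat in the list.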

HTH.  Good luck.

Regards,
Jonathan Rockway



_______________________________________________
List: Catalyst@lists.rawmode.org
Listinfo: http://lists.rawmode.org/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.rawmode.org/
Dev site: http://dev.catalyst.perl.org/
