Daniel McBrearty wrote:
>> It seems to me that Template-Toolkit does no UTF-8-encoding of the
>> outputted variables.
>>
>
> well, it shouldn't.
>
> In perl, strings are already utf8, internally. If TT was to do
> encoding, they would be double-encoded.
>

The internal encoding that perl uses is none of your business and has no
relevance to anything. :) It can be changed at any time, and your
software is extremely unreliable if it depends on it (and nobody wants
that).

Here's my usual advice about unicode:

* use utf8 if and only if your source code is utf-8 encoded.

* Encode::decode() every piece of data that comes from outside your
  program. If you have utf8-encoded data and you forget to do this,
  you'll end up with double-encoded data when perl implicitly upgrades
  it to utf8. By default, text is latin-1. Here's a simulation of what
  happens when the octet string "日本語" is implicitly upgraded:

    $ recode latin1..utf8
    日本語
    æ¥æ¬èª

* Encode::encode() your perl strings to utf-8 right before you print
  them to a utf-8 terminal or web browser. If you try to print the raw
  strings, you'll see this error (sometimes; when the characters aren't
  in latin-1):

    $ perl -e 'syswrite *STDOUT,"\x{CAFE}\n"'
    Wide character in syswrite at -e line 1.

"encoding::warnings" will tell you when something wonky is happening
with your code.

In your case, remember that you need to make sure all these things are
decoded before using them as text:

* database data
* templates themselves
* data the user enters in via a form
* url params
* filenames
* file values (but not if they're binary)
* etc.

HTH. Good luck.

Regards,
Jonathan Rockway
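
The decode-on-input, encode-on-output advice above can be sketched roughly as follows. This is only an illustration, not code from the original post: the sample octets and the variable names are made up for the example, and the only library calls used are decode() and encode() from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Octets as they might arrive from outside the program (a file, a form
# field, a database row). These nine bytes are the UTF-8 encoding of the
# three characters U+65E5 U+672C U+8A9E. (Illustrative sample data.)
my $octets = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

# Decode at the boundary: octets in, character string out.
my $text = decode('UTF-8', $octets);

# Encode right before output: character string in, octets out.
my $out = encode('UTF-8', $text);

# What happens when the decode step is skipped: the nine octets are
# treated as nine latin-1 characters, and each one is encoded again.
my $double = encode('UTF-8', $octets);

# Character count after decoding, byte count after a correct round
# trip, and byte count of the double-encoded version.
print join(' ', length($text), length($out), length($double)), "\n";
# prints "3 9 18"
```

The doubled byte count in the last number is the same damage the recode simulation above shows: every byte of the original UTF-8 sequence has been re-encoded as if it were a latin-1 character.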
_______________________________________________
List: Catalyst@lists.rawmode.org
Listinfo: http://lists.rawmode.org/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.rawmode.org/
Dev site: http://dev.catalyst.perl.org/