Mikhael Goikhman wrote:
> Is there any practical unsolvable problem to always work with non utf-8
> flagged data only (input from or output to file, socket, cgi, db, other
> modules)? And whenever you need to operate on multibyte characters you
> may write a function for each such case, for example "trim" or "cut" that
> does "decode_utf8", then regexp or "substr", then "encode_utf8" back. And
> if you like, your function may also support both cases (using _is_utf8)
> and return the output in the same manner (with or without utf8 flag).
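For the archives, here's a minimal sketch of the "trim" wrapper described above, using the Encode module. The name utf8_trim is just illustrative; the point is the decode / operate-on-characters / encode round-trip, plus the optional is_utf8 check so the function returns data in the same form (flagged or byte string) it received:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8 is_utf8);

# Hypothetical helper following the pattern from the quote above:
# decode the bytes to characters, do the string operation, encode back.
# If the input already carries the utf8 flag, skip the decode and
# return a flagged string, so callers get back what they passed in.
sub utf8_trim {
    my ($data) = @_;
    my $was_flagged = is_utf8($data);
    my $str = $was_flagged ? $data : decode_utf8($data);
    $str =~ s/^\s+|\s+$//g;    # regexp now sees characters, not bytes
    return $was_flagged ? $str : encode_utf8($str);
}
```

So utf8_trim("  shalom  ") gives back an unflagged "shalom", while trimming a flagged Hebrew string gives back a flagged string, and the multibyte characters survive the round trip intact.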
Well, that's how it works right now. I'm just worried that Template Toolkit will get confused handling utf8 data as latin1 data, but that's very unlikely. Don't forget that doing it this way will introduce weird characters everywhere. Theoretically, a Hebrew char can be 0x5D + 0x10, and then suddenly you have \r in your stream and weird things happen. And now we have the question of whether all the modules can handle weird/control chars in the text, or whether we should just go all the way and treat it as binary.

I'll go test a few modules...

Shmuel.

_______________________________________________
Perl mailing list
[email protected]
http://perl.org.il/mailman/listinfo/perl
