Hi, I posted some tips for UTF8-based web applications with DB support to this list some time ago.
It now looks like I was wrong on quite a few points:

* use encoding 'utf8'; (which I had put in Embperl_Top_Include) turned out to be a very bad idea, since it breaks regular expressions containing non-ASCII characters. It doesn't apply to all I/O either; sometimes perl decides it has to use Latin1 anyway. Don't ask me why.

* Embperl_Top_Include doesn't work for "use utf8;". I think Embperl_Top_Include "use utf8;" is equivalent to [- use utf8; -] at the top of the page, and this does exactly nothing, because Perl code in [- .. -] has its own scope and "use utf8;" only applies to the current scope. So one must place [* use utf8; *] at the top of every page if literal strings are to be flagged as utf8. One thing puzzles me here: if I place "use utf8;" in my startup.pl (i.e., I tell mod_perl that everything under the main scope is utf8), it still doesn't work! Is it switched off again by Embperl or something?

* [! ... !] blocks need their own "use utf8;". The [* use utf8; *] at the top doesn't apply to them. Argh. Why is this so? The documentation states that [! !] blocks are only executed once, but don't they run in the same scope!? Documentation of all the scopes and namespaces in which startup.pl, the httpd configuration file directives, my .ep pages, the various [ ] expressions and pages started by Execute() run would be a big help.

* File I/O: I thought I could make utf8 I/O the default by placing "use open IO => ':encoding(utf8)';" into Embperl_Top_Include. (This was implied by the use encoding 'utf8'; I had before, so I needed a replacement.) It doesn't work. [* use open IO => ':encoding(utf8)'; *] doesn't work either, nor does any other form I tried. The only solution is to specify the character set in every open statement, like

    open(IN, '<:encoding(utf8)', '/tmp/some/file');

All in all, I can only say that using utf8 in Embperl is very, very frustrating, because so many things break in non-obvious ways. Could we at some point have a general "use-utf8-for-everything" switch?
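For anyone who wants to try the file I/O point outside of Embperl, here is a minimal plain-Perl sketch. It assumes that lexical scoping is also what stops "use open" from taking effect across blocks; I have not confirmed that this is what Embperl does internally. The scratch file, the helper name slurp_with_mode and the use of ':encoding(UTF-8)' (the strict spelling of the layer) are my own choices for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the two UTF-8 bytes of U+00E4 to a scratch file, with no encoding layer.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "\xC3\xA4";
close $fh or die "close: $!";

# Hypothetical helper: read the whole file using the given open mode.
sub slurp_with_mode {
    my ($mode, $file) = @_;
    open(my $in, $mode, $file) or die "open $file: $!";
    local $/;                      # slurp mode
    my $data = <$in>;
    close $in;
    return $data;
}

# Explicit layer in the open call: works regardless of any pragma.
my $decoded = slurp_with_mode('<:encoding(UTF-8)', $path);
my $raw     = slurp_with_mode('<:raw', $path);

# "use open" only affects open() calls compiled inside this block.
my $scoped;
{
    use open IN => ':encoding(UTF-8)';
    open(my $in, '<', $path) or die "open $path: $!";
    local $/;
    $scoped = <$in>;
    close $in;
}

# Outside the block the pragma is gone, so perl's default layers apply again.
open(my $in, '<', $path) or die "open $path: $!";
my $unscoped = do { local $/; <$in> };
close $in;

printf "explicit layer: %d, raw: %d, inside pragma block: %d, outside: %d\n",
    length($decoded), length($raw), length($scoped), length($unscoped);
```

The explicit layer and the read inside the pragma block both give one decoded character, while the raw read gives two bytes. The read outside the block normally gives two bytes as well, unless something like PERL_UNICODE or -C changes the defaults. If each Embperl block is compiled as its own scope, this would be the same effect I see in my pages.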
Regards,
Torsten