Hi All,
I got bit by a bug with Apache::Util's escape_html() function in mod_perl 1. It seems that it doesn't like Perl's Unicode-encoded strings! This patch demonstrates the issue (be sure that your editor understands UTF-8):
Any chance you have a patch to fix that too, David? At the moment we have zero time to look at mp1 bugs. Once mp2 is out it might be more likely. Does it work fine with mp2?
--- modperl/t/net/perl/util.pl.~1.18.~  Sun May 25 03:54:08 2003
+++ modperl/t/net/perl/util.pl  Thu Sep  9 19:38:40 2004
@@ -74,6 +74,25 @@
 #print $esc_2;
 test ++$i, $esc eq $esc_2;
 
+
+# Make sure that escape_html() understands multibyte characters.
+my $utf8 = '<åè>';
+my $esc_utf8 = '&lt;åè&gt;';
+my $test_esc_utf8 = Apache::Util::escape_html($utf8);
+test ++$i, $test_esc_utf8 eq $esc_utf8;
+#print STDERR "Compare '$test_esc_utf8'\n to '$esc_utf8'\n";
+
+eval { require Encode };
+unless ($@) {
+    # Make sure escape_html() properly handles strings with Perl's
+    # Unicode encoding.
+    $utf8 = Encode::decode_utf8($utf8);
+    $esc_utf8 = Encode::decode_utf8($esc_utf8);
+    $test_esc_utf8 = Apache::Util::escape_html($utf8);
+    test ++$i, $test_esc_utf8 eq $esc_utf8;
+    #print STDERR "Compare '$test_esc_utf8'\n to '$esc_utf8'\n";
+}
+
 use Benchmark;
 
 =pod
========================End Patch ======================================
If I enable the print statements and look at the log, I see this:
Compare '&lt;åè&gt;'
 to '&lt;åè&gt;'
Compare '&lt;ÃÂÃÂÂ&gt;'
 to '&lt;åè&gt;'
The first escape appears to work correctly, but when I decode the string to Perl's Unicode representation, you can see how badly escape_html() munges the text!
Curiously, both tests fail, even though the first conversion appears to be correct. This could be due to the behavior of eq, though I'm not sure why. But the second test is the more interesting one, since it really screws things up.
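For the second comparison, one possible explanation can be sketched without mod_perl at all. The sketch below assumes (and this is only an assumption, not something confirmed by the escape_html() source) that the function returns raw UTF-8 bytes without Perl's internal UTF8 flag. It simulates that with Encode::encode_utf8() and shows that eq then sees two different strings, and that encoding such bytes a second time produces the kind of doubled garbage seen in the log. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# UTF-8 bytes for "<åè>", written with escapes so that the editor's
# encoding does not matter.
my $bytes = "<\xc3\xa5\xc3\xa8>";

# Perl's internal Unicode representation: 4 characters.
my $chars = Encode::decode_utf8($bytes);

# ASSUMPTION: simulate a function that hands back the raw UTF-8 bytes
# with the UTF8 flag off, which is what a byte-oriented C routine might do.
my $flagless = Encode::encode_utf8($chars);

print length($chars), "\n";      # 4 characters
print length($flagless), "\n";   # 6 bytes

# eq treats the flagless bytes as Latin-1 characters, so they differ.
print $chars eq $flagless ? "equal" : "different", "\n";   # different

# Encoding the bytes again (for example on output) doubles every
# high byte, which is the classic double-encoding mojibake.
my $double = Encode::encode_utf8($flagless);
print length($double), "\n";     # 10 bytes
```

If that assumption holds, decoding the return value again with Encode::decode_utf8() before comparing would make the second test pass, but that would be a workaround in the caller rather than a fix in escape_html() itself.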
If you have trouble reading the Unicode characters in this email, I've also posted it to my blog.
http://www.justatheory.com/computers/programming/perl/mod_perl/escape_html_utf8.html
Regards,
David
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
-- __________________________________________________________________ Stas Bekman JAm_pH ------> Just Another mod_perl Hacker http://stason.org/ mod_perl Guide ---> http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com