I'm just wondering if anyone has any input on this issue. I'm implementing an output filter, like so:
    <Files *.pcgi>
        SetHandler perl-script
        PerlResponseHandler ModPerl::Registry
        PerlOutputFilterHandler Apache::Kinnetics::Output
    </Files>
and I get the following error on some web pages that have UTF-8 data:
[Wed Nov 05 17:30:00 2003] [error] [client 127.0.0.1] panic: sv_pos_b2u: bad byte offset at /kinnetics/component/perllib/site_perl/Apache/Kinnetics/Output.pm line 221.
Line 221 is the start of this regex:
    s{ <\?nm(               # opening angle bracket
        (?:                 # Non-backreffing grouping paren
            [^>'"] *        # 0 or more things that are neither > nor ' nor "
          |                 # or else
            ".*?"           # a section between double quotes (stingy match)
          |                 # or else
            '.*?'           # a section between single quotes (stingy match)
        ) +                 # repetire ad libitum
                            # hm.... are null tags <> legal? XXX
       )\?>                 # closing angle bracket
    }{kinnetics_handler($r, $1)}geisx;  # mutate
I'd suggest taking whatever data you run the s/// on and trying it outside mod_perl first. Maybe your filter, or some previous filter, has truncated a UTF-8 character in the middle? Keep in mind that other filters know nothing about the encoding; they simply hand you the amount of data your filter asks for. So it's quite possible that you can't process the data as-is when you get it, because you may have received only half of a character. You either need to recognize that and buffer the incomplete bytes until the next filter invocation, or ask for more data to get the other half.
It would definitely be a good test to add to our test suite, once this is resolved on your side.
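To make the buffering idea concrete, here is a minimal sketch in plain Perl, runnable outside mod_perl. It assumes the filter receives raw bytes; the helper name split_incomplete_utf8 and the sample chunks are made up for illustration and are not part of mod_perl or of the original filter. It only looks at the last character of the buffer and does not validate the rest of the data.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw byte string into ($complete, $tail).
# $complete ends on a UTF-8 character boundary; $tail holds the bytes of a
# trailing character whose remaining bytes have not arrived yet (may be '').
sub split_incomplete_utf8 {
    my ($bytes) = @_;
    my $len = length $bytes;
    return ('', '') if $len == 0;

    # Step back over up to 3 continuation bytes (10xxxxxx) to reach the
    # lead byte of the last character in the buffer.
    my $i    = $len - 1;
    my $back = 0;
    while ($i > 0 && $back < 3 && (ord(substr($bytes, $i, 1)) & 0xC0) == 0x80) {
        $i--;
        $back++;
    }

    # The lead byte tells us how many bytes the character needs in total.
    my $lead = ord substr($bytes, $i, 1);
    my $need = $lead >= 0xF0 ? 4
             : $lead >= 0xE0 ? 3
             : $lead >= 0xC0 ? 2
             :                 1;    # ASCII, or a stray continuation byte

    my $have = $len - $i;
    return ($bytes, '') if $have >= $need;    # last character is complete
    return (substr($bytes, 0, $i), substr($bytes, $i));
}

# Simulate two filter invocations where "\xC3\xA9" (e-acute) is split
# across the chunk boundary.
my $pending = '';
my @out;
for my $chunk ("caf\xC3", "\xA9 <?nm x ?>") {
    my ($complete, $tail) = split_incomplete_utf8($pending . $chunk);
    $pending = $tail;          # carry the half character to the next call
    push @out, $complete;      # safe to run the s/// on this part
}
```

In a real filter the leftover bytes would have to survive between invocations, for example in the filter context, and be flushed when the end of the stream is seen; that wiring depends on your handler and is not shown here.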
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com