Matthew Darwin wrote:


Stas Bekman wrote:


I'd suggest taking whatever data you run s/// on and trying it outside mod_perl first. Maybe your filter, or some previous filter, has truncated a UTF-8 character in the middle? Be aware that other filters know nothing about the encoding; they just give you the amount of data your filter asks for. So it's quite possible that you can't process the data as-is when you get it, because you may have received only half of a character. You either need to recognize that and buffer the partial character until the next filter invocation, or ask for more data to get the other half.
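One way to recognize a character cut in half at the end of a buffer is to look back from the last byte for the lead byte of the final sequence and compare how many bytes it announces with how many actually arrived. The following is only a minimal sketch of that idea: split_incomplete_utf8 is a hypothetical helper (not part of mod_perl), it expects a raw byte string, and it makes no attempt to validate the rest of the data.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw byte string into (complete, leftover),
# where leftover is a trailing UTF-8 sequence cut off mid-character.
sub split_incomplete_utf8 {
    my ($bytes) = @_;
    my $len = length $bytes;

    # A UTF-8 sequence is at most 4 bytes, so look back at most 4 bytes.
    for my $back (1 .. ($len < 4 ? $len : 4)) {
        my $byte = ord substr($bytes, $len - $back, 1);

        next if ($byte & 0xC0) == 0x80;   # continuation byte, keep looking back
        last if $byte < 0x80;             # ASCII byte: nothing is pending

        # Lead byte: work out how many bytes the sequence should have.
        my $need = $byte >= 0xF0 ? 4 : $byte >= 0xE0 ? 3 : 2;
        last if $back >= $need;           # the last sequence is complete

        # Incomplete: hold the partial sequence back for the next call.
        return (substr($bytes, 0, $len - $back), substr($bytes, $len - $back));
    }
    return ($bytes, '');
}
```

Inside a filter this might be called as `my ($data, $leftover) = split_incomplete_utf8($leftover . $buffer);`, processing $data now and prepending $leftover to the next chunk.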


It would definitely be a good test to add to our test suite once this is resolved on your side.


Thanks Stas,

Before trying to solve things, please do what I asked. Take the contents of $leftover . $buffer that it's complaining about and run do_it on them outside the mod_perl filter (in a simple script), just to confirm that my suggestion was correct. If it's the in-the-middle problem, you will see the same problem outside of mod_perl.
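A standalone check along those lines could look roughly like the sketch below. It does not reproduce do_it (whose body is not shown in this thread); it only tests whether a given byte string is well-formed UTF-8, using the return value of utf8::decode. The sample data and the decodes_ok helper are made up for illustration; in practice you would dump the real $leftover . $buffer to a file and read it in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if $bytes is well-formed UTF-8.
# utf8::decode works in place, so operate on a copy.
sub decodes_ok {
    my ($bytes) = @_;
    my $copy = $bytes;
    return utf8::decode($copy) ? 1 : 0;
}

# Made-up sample: "caf\xC3\xA9" is "café" in UTF-8; dropping the last
# byte cuts the final character in half.
my $whole     = "caf\xC3\xA9";
my $truncated = substr($whole, 0, length($whole) - 1);

for my $bytes ($whole, $truncated) {
    printf "%d bytes: %s\n", length($bytes),
        decodes_ok($bytes) ? "decodes cleanly" : "malformed UTF-8";
}
```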


Here is my handler(). How can you tell whether you're in the middle of a UTF-8 character? Also, does Perl know at this point that it is a UTF-8 string, or do I need to tell it again (i.e., as the string goes through Apache, does it lose its UTF-8 bit)?

Apache I/O never sets or unsets any bits; it just passes raw data to and from the client. Once Perl passes its data to Apache, all the magic bits are lost, because they live in the SV (scalar) data structure and not in the string itself.
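The effect can be seen in plain Perl, without Apache, by encoding a character string to bytes (roughly what goes out on the wire) and inspecting the result. This is only an illustrative sketch; the sample string is made up, and utf8::is_utf8 reports Perl's internal flag rather than anything Apache knows about.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A character string: 4 characters, carrying Perl's internal UTF-8 flag.
my $chars = decode('UTF-8', "caf\xC3\xA9");

# What leaves Perl is just bytes: 5 of them, with no flag attached.
my $bytes = encode('UTF-8', $chars);

# Whoever receives those bytes must decode them again to get characters.
my $again = decode('UTF-8', $bytes);

printf "chars: length %d, flag %s\n", length($chars), utf8::is_utf8($chars) ? "on" : "off";
printf "bytes: length %d, flag %s\n", length($bytes), utf8::is_utf8($bytes) ? "on" : "off";
printf "again: length %d, flag %s\n", length($again), utf8::is_utf8($again) ? "on" : "off";
```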


I'd suggest that you read the perlunicode.pod and perluniintro.pod manpages and tell us the answers to these questions. I have very limited experience in this area and would need to read those docs myself to give you correct answers. But the problem could be totally different from what was suggested above. It's quite possible that all you need to do to fix the problem is to run: utf8::decode($_);

sub do_it {
        my $r = shift;
        local $_ = shift;
        utf8::decode($_);
...

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com




