>>>>> "Randal" == Randal L Schwartz <mer...@stonehenge.com> writes:

Randal> Getting really frustrated with mod_perl2's apparent inability to
Randal> properly read UTF8 input.

Randal> Here's my mod_perl2 setup:

Randal>   Apache 2.2.[something]
Randal>   mod_perl 2.0.7 (or nearly that)
Randal>   ModPerl::Registry
Randal>   Perl "script" with CGI.pm

Randal> Very early in my app:

Randal>   ## ensure utf8 CGI params:
Randal>   $CGI::PARAM_UTF8 = 1;
Randal>   binmode STDIN, ":utf8";
Randal>   binmode STDOUT, ":utf8";
Randal>   binmode STDERR, ":utf8";

Randal> This works fine in CGI mode: when I ask for $foo = $cgi->param('foo'),
Randal> DBI::data_string_desc($foo) shows a UTF8 string with the proper
Randal> discrepancy between bytes and chars.

Randal> But when I try to run it under mod_perl, the returned string appears
Randal> to be the raw ascii bytes, and definitely not utf8.  Of course, when I
Randal> store that in the database (using DBD::Pg), the "latin-1" is encoded
Randal> to "utf-8", and I get a bunch of weird chars on the output.

Randal> Has anyone managed to round-trip UTF8 from form to database and back
Randal> using a setup similar to this?

Randal> I suspect part of the problem is this in CGI.pm:

Randal>   'read_from_client' => <<'END_OF_FUNC',
Randal>   # Read data from a file handle
Randal>   sub read_from_client {
Randal>       my($self, $buff, $len, $offset) = @_;
Randal>       local $^W=0;                # prevent a warning
Randal>       return $MOD_PERL
Randal>           ? $self->r->read($$buff, $len, $offset)
Randal>           : read(\*STDIN, $$buff, $len, $offset);
Randal>   }
Randal>   END_OF_FUNC

Randal> Since I binmode STDIN, the non-$MOD_PERL works ok here.  What's the
Randal> equivalent of $r->read() that marks the incoming stream as UTF8, so I
Randal> get chars instead of bytes?  Or can I just read(\*STDIN) in mod_perl2
Randal> as well?  (I know that was supported at one point...)

I realized that I never posted my ultimate solution.

I monkey patch CGI.pm:

  require CGI;
  {
    my $orig = \&CGI::param;
    no warnings 'redefine';
    *CGI::param = sub {
      $CGI::LIST_CONTEXT_WARN = 0; # workaround for backward compatibility
      $CGI::PARAM_UTF8 = 1;
      goto &$orig;
    };
  }

And this has been working just fine for both CGI and mod_perl.

Just for the record.

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<mer...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix consulting, Technical writing, Comedy, etc. etc.
Still trying to think of something clever for the fourth line of this .sig
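
As a footnote to the symptom described in the quoted message, here is a minimal sketch, using only the core Encode module, of why undecoded form input turns into weird characters once it is stored. The sample string is made up for illustration, and this is not what CGI.pm or DBD::Pg do internally; it only isolates the same byte/character mix-up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical raw request bytes: "cafe" with e-acute (U+00E9) sent as UTF-8.
my $bytes = "caf\xC3\xA9";

# Decoding turns the UTF-8 bytes into a character string, which is roughly
# what $CGI::PARAM_UTF8 = 1 arranges for values returned by param().
my $chars = decode('UTF-8', $bytes);

# Without that decode, each byte is taken as a Latin-1 character, so
# encoding on the way out double-encodes the accented letter.
my $double = encode('UTF-8', $bytes);

printf "bytes: %d, chars: %d, double-encoded bytes: %d\n",
    length($bytes), length($chars), length($double);
# prints "bytes: 5, chars: 4, double-encoded bytes: 7"
```

The gap between 5 bytes and 4 chars is the "proper discrepancy" that DBI::data_string_desc reports in the working CGI case; the 7-byte result is the garbage that shows up when the decode step never happens.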