Hi,

My application has the following structure: a PerlResponseHandler builds the 
<body> part of a web page by querying a PostgreSQL database for the data, then 
a PerlOutputFilterHandler adds the <head> section to it. The entire stack has 
been converted to UTF-8 (system locales, Perl scripts, databases).

I found that the content passed to the PerlOutputFilterHandler is not marked as 
utf8, so I need to call decode_utf8 on it to get my pages to display 
properly. The same is true of request parameters returned by Apache2::Request. 
The reason seems to be that both use APR::Table, whose documentation 
says: "On the Perl level that means that we convert scalars into strings and 
store those strings. Any special information that was in the Perl scalar is not 
stored. So for example if a scalar was marked as utf8, tainted or tied, that 
information is not stored. When you get the data back as a Perl scalar you get 
only the string" (see: 
http://perl.apache.org/docs/2.0/api/APR/Table.html#Description)
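
A minimal standalone sketch of the effect described above, in plain Perl with 
Encode and no mod_perl involved. It only imitates the loss of the utf8 flag by 
encoding a string to octets, which is an assumption about what the table 
hands back; the variable names are illustrative.

```perl
use strict;
use warnings;
use utf8;                                  # the literal below is written in UTF-8
use Encode qw(encode_utf8 decode_utf8);

# A character string, flagged as utf8 inside Perl.
my $chars = "lâàéè";

# Assumption: what comes back from the table is just the raw octets,
# imitated here by encoding the string to UTF-8 bytes.
my $bytes = encode_utf8($chars);
my $bytes_flag = utf8::is_utf8($bytes) ? 1 : 0;

# Decoding turns the octets back into a character string.
my $decoded = decode_utf8($bytes);
my $decoded_flag = utf8::is_utf8($decoded) ? 1 : 0;

printf "bytes:   length %d, utf8 flag %d\n", length $bytes,   $bytes_flag;
printf "decoded: length %d, utf8 flag %d\n", length $decoded, $decoded_flag;
```

The byte string is longer than the decoded one because each accented 
character occupies two octets in UTF-8.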

So I'm now doing this in the output filter:

    while ( $f->read(my $buffer) ) {
        $content .= decode_utf8($buffer) ;
    }
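
One possible concern with decoding each buffer separately: if a read boundary 
falls in the middle of a multi-byte character, that character cannot be 
decoded correctly. The sketch below shows the difference in plain Perl. The 
@chunks list is a hypothetical stand-in for successive $f->read calls, split 
on purpose inside the euro sign; whether mod_perl ever splits buffers this way 
in practice is not verified here.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# "ôê €" as UTF-8 octets: 2 + 2 + 1 + 3 = 8 bytes.
my $octets = encode_utf8("\x{f4}\x{ea} \x{20ac}");

# Hypothetical chunks standing in for successive reads;
# the split falls after the first byte of the euro sign.
my @chunks = ( substr($octets, 0, 6), substr($octets, 6) );

# Variant 1: accumulate the raw bytes and decode once at the end.
my $raw  = join '', @chunks;
my $once = decode_utf8($raw);

# Variant 2: decode each chunk on its own (working on copies).
my $per_chunk = '';
for my $chunk (@chunks) {
    my $copy = $chunk;
    $per_chunk .= decode_utf8($copy);
}

print "decode once:      ", ( $once eq "\x{f4}\x{ea} \x{20ac}" ? "intact" : "damaged" ), "\n";
print "decode per chunk: ", ( $per_chunk eq $once ? "intact" : "damaged" ), "\n";
```

In the filter this would mean appending $buffer undecoded inside the loop and 
calling decode_utf8 on the accumulated content after the last read.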

And this with the request parameters in the response handler:

    @args = $req->param ;
    for (@args) {
        $args{$_} = decode_utf8($req->param($_)) ;
    }
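
The same idea can be wrapped in a small helper so that every parameter is 
decoded in one place. This is only a sketch: decode_params is a hypothetical 
name, and the plain hash of byte strings stands in for what $req->param 
returns. It keeps all values of a multi-valued parameter, whereas the loop 
above keeps one value per name.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical helper: decode the names and values of raw (byte) parameters.
# $raw is a hash reference of name => array reference of byte strings.
sub decode_params {
    my ($raw) = @_;
    my %decoded;
    for my $name ( keys %$raw ) {
        # Work on copies so the caller's data is left untouched.
        my @values = map { my $bytes = $_; decode_utf8($bytes) } @{ $raw->{$name} };
        my $key = decode_utf8($name);
        $decoded{$key} = \@values;
    }
    return \%decoded;
}

# Stand-in data; in the handler the names and values would come from $req->param.
my $raw  = { myparam => [ encode_utf8("\x{f4}\x{ea} \x{20ac}") ] };
my $args = decode_params($raw);

printf "myparam has %d characters after decoding\n", length $args->{myparam}[0];
```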

Do you see any problem with this approach? I posted a simplified 
example of both modules here: http://pastebin.com/Xr89DeFX

The page displayed by the example is:

OutputFilterHandler $buffer NOT utf8
myparam : ôê € : 1
string : lâàéè : 1
returned_value_1 : lâàéè : 1
returned_value_2 : lâàéè : 1
returned_value_3 : ôê € : 1
OutputFilterHandler $added_string àêôu € IS utf8





-- 
                                        Salutations, Vincent Veyron

https://marica.fr/
Management of litigation, insurance claim files and contracts 
for legal departments
