Hi.

I have a problem with a PerlResponseHandler, regarding the character set used in the response to a request.
Basically, the question is: how do I set the character set properly for the "handle" used in
$r->print("string")?
(where the string can be "äéèöü", for example)

Neither of the following (which I do before starting to print output) seems to work:

$r->headers_out->unset('content-type');
$r->headers_out->set('content-type','text/html;charset=xxxx');

or

$r->content_type('text/html;charset=xxxx');

When I say that it doesn't work, I mean:
- the "Content-Type" response header sent by the server is set correctly, according to what I do above (as verified with a browser plugin)
- but if what I print contains "accented" characters, they are not encoded in the charset that the header declares

So, do I need to set something else so that $r->print($string) outputs the string properly?


Background :

My PerlResponseHandler reads an HTML file from disk, replaces some strings in it, and sends the result out via $r->print. The source HTML file can be encoded in iso-8859-1 or UTF-8, and it contains a proper declaration of the charset in which it is really encoded:

<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
or
<meta http-equiv="content-type" content="text/html; charset=UTF-8">

To read the file, I first open it "raw", read a few lines, checking for the above <meta> tag. If found, I note the charset (say in $charset), close the file, and re-open it as

open(my $fh,"<:encoding($charset)", $file);

(note : if $charset is "UTF-8", then the open becomes
open(my $fh,'<:utf8', $file);)

At that point I also set the response charset by one of the means above.
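
To make the detection step concrete, here is a minimal, self-contained sketch of what is described above. The helper name sniff_charset, the 20-line limit, the regex and the iso-8859-1 fallback are assumptions made for illustration, not the handler's actual code; the sample file is created on the fly so the snippet runs outside Apache.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read the first few lines as raw bytes and look for
# a charset declaration in a <meta> tag. Returns undef if none is found.
sub sniff_charset {
    my ($file, $max_lines) = @_;
    $max_lines = 20 unless defined $max_lines;   # assumed limit
    open(my $raw, '<:raw', $file) or die "Cannot open $file: $!";
    my $charset;
    while (defined(my $line = <$raw>)) {
        last if $. > $max_lines;
        if ($line =~ /<meta[^>]+charset\s*=\s*["']?([\w.:-]+)/i) {
            $charset = $1;
            last;
        }
    }
    close($raw);
    return $charset;
}

# Build a small iso-8859-1 sample file (a umlaut is the single byte 0xE4).
my ($tmp, $file) = tempfile(UNLINK => 1);
binmode($tmp, ':raw');
print {$tmp} qq{<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">\n};
print {$tmp} "M\xE4nner\n";
close($tmp);

# Detect the charset, then re-open the file with the matching decoding layer.
my $charset = sniff_charset($file);
$charset = 'iso-8859-1' unless defined $charset;   # assumed fallback
open(my $fh, "<:encoding($charset)", $file) or die "Cannot open $file: $!";
my @lines = <$fh>;
close($fh);
```

After this, @lines holds decoded character strings rather than the raw bytes of the file, which matters for what happens when they are printed.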

Then I read the file line by line, substituting some strings in the line, and print out the line via
$r->print($line);
etc..

My problem is that, if the input file is for example iso-8859-1 and contains the word "Männer", the output comes out as "M(A tilde)(some byte)nner". In other words, the "a umlaut" is sent as the two bytes of its UTF-8 encoding, which the browser then displays as two iso-8859-1 characters.
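
The symptom can be reproduced outside Apache with the core Encode module. This is only a sketch of the byte-level effect, under the assumption that the decoded string is being serialized as UTF-8 somewhere on the way out; it does not show where in mod_perl that happens.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they sit in the iso-8859-1 file: a umlaut is the single byte 0xE4.
my $latin1_bytes = "M\xE4nner";

# What reading through :encoding(iso-8859-1) yields: a character string.
my $chars = decode('iso-8859-1', $latin1_bytes);

# If that character string is serialized as UTF-8, U+00E4 becomes the
# two bytes 0xC3 0xA4 ("A tilde" and "currency sign" in iso-8859-1).
my $utf8_bytes = encode('UTF-8', $chars);

# Encoding to the declared charset instead gives back the single byte 0xE4.
my $expected_bytes = encode('iso-8859-1', $chars);

# Hex dump of both byte sequences for comparison.
for my $pair (['as UTF-8', $utf8_bytes], ['as iso-8859-1', $expected_bytes]) {
    my ($label, $bytes) = @$pair;
    my $hex = join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
    printf "%-14s %s\n", $label, $hex;
}
```

The first dump contains C3 A4 where the second contains E4, which matches the "M(A tilde)(some byte)nner" that shows up in the browser.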

Can I / should I do something like
binmode($r,":$charset"); # ??
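
An alternative to binmode that might be worth trying is to encode each line explicitly to the declared charset just before printing, so that $r->print only ever receives bytes. The sketch below is an assumption, not something verified against mod_perl: FakeRequest is a hypothetical stand-in for the request object, there only so the snippet runs outside Apache.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for $r: it just records what print() receives.
{
    package FakeRequest;
    sub new   { my $class = shift; return bless { out => '' }, $class }
    sub print { my $self = shift; $self->{out} .= join '', @_; return 1 }
}

my $charset = 'iso-8859-1';          # as detected from the <meta> tag
my $r       = FakeRequest->new;

# A line as it would arrive from a handle opened with :encoding($charset).
my $line = decode($charset, "M\xE4nner\n");

# Encode back to the response charset, so only bytes are handed to print().
$r->print(encode($charset, $line));
```

With this approach the charset in the Content-Type header and the bytes actually sent are both derived from the same $charset variable.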

TIA
