Hi.
I have a problem with a PerlResponseHandler, regarding the character set used in the
response to a request.
Basically, the question is : how do I set the character set properly for the
"handle" used in
$r->print("string") ?
(where string can be "äéèöü" for example)
Neither of the following (which I do before starting to print output) seems to
work :
$r->headers_out->unset('content-type');
$r->headers_out->set('content-type','text/html;charset=xxxx');
or
$r->content_type('text/html;charset=xxxx');
When I say that it doesn't work, I mean in fact :
- the "Content-Type" response header sent by the server is properly set according to what
I do above (as verified in a browser plugin)
- but if what I print contains "accented" characters, they are not being
encoded properly
So, do I need to set something else so that the $r->print(string) will output "string"
properly ?
Background :
My PerlResponseHandler reads an HTML file from disk, replaces some strings in it, and
sends the result out via $r->print.
The source HTML file can be encoded in iso-8859-1 or UTF-8, and it contains a proper
declaration of the charset under which it is really encoded :
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
or
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
To read the file, I first open it "raw", read a few lines, checking for the above <meta>
tag. If found, I note the charset (say in $charset), close the file, and re-open it as
open(my $fh,"<:encoding($charset)", $file);
(note : if $charset is "UTF-8", then the open becomes
open(my $fh,'<:utf8', $file);)
I also at that point set the response charset by one of the means above.
Then I read the file line by line, substituting some strings in the line, and print out
the line via
$r->print($line);
etc.
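
For reference, the read step described above can be sketched roughly as follows, outside of mod_perl. This is only an illustration, not the actual handler: the helper name sniff_charset, the 20-line limit, the regex and the iso-8859-1 fallback are assumptions made for the example, and the sample file is created on the fly so the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: open the file raw and look for a charset
# declaration in a <meta> tag within the first few lines.
sub sniff_charset {
    my ($file, $max_lines) = @_;
    $max_lines ||= 20;                       # assumed limit
    open(my $raw, '<:raw', $file) or die "Cannot open $file: $!";
    my $charset;
    while (my $line = <$raw>) {
        last if $. > $max_lines;
        if ($line =~ /<meta[^>]+charset=["']?([\w-]+)/i) {
            $charset = $1;
            last;
        }
    }
    close($raw);
    return $charset;
}

# Build a small iso-8859-1 sample file (byte 0xE4 is "a umlaut").
my ($tmp, $file) = tempfile(UNLINK => 1);
binmode($tmp, ':raw');
print {$tmp} qq{<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">\n};
print {$tmp} "M\xE4nner\n";
close($tmp);

# First pass: note the declared charset (the fallback is an assumption).
my $charset = sniff_charset($file) || 'iso-8859-1';

# Second pass: re-open with a decoding layer, so each line read
# is a Perl character string rather than raw bytes.
open(my $fh, "<:encoding($charset)", $file) or die "Cannot open $file: $!";
my @lines = <$fh>;
close($fh);

# In the handler, each line would now go through the string
# substitutions and then be sent with $r->print($line).
```

After the second open, the lines hold decoded characters, so "Männer" is 6 characters long regardless of how the file was encoded on disk.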
My problem is that, if the input file is for example iso-8859-1 and contains the word
"Männer", the output comes out as "M(A tilde)(some byte)nner" (the bytes corresponding to
the UTF-8 encoding of the "a umlaut").
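
The symptom can be reproduced outside of mod_perl with the core Encode module. The sketch below only shows which bytes each encoding of the decoded string produces; whether $r->print really writes the string in its UTF-8 form is my assumption, based on the output I see, not something I have verified. The helper hex_bytes is made up for the illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split(//, $bytes);
}

my $latin1_bytes = "M\xE4nner";                          # bytes as stored in the iso-8859-1 file
my $chars        = decode('iso-8859-1', $latin1_bytes);  # what the :encoding layer hands back

my $as_latin1 = encode('iso-8859-1', $chars);  # bytes a Latin-1 client expects
my $as_utf8   = encode('UTF-8', $chars);       # bytes that match what I actually see

print hex_bytes($as_latin1), "\n";   # 4D E4 6E 6E 65 72
print hex_bytes($as_utf8),   "\n";   # 4D C3 A4 6E 6E 65 72
```

Read as iso-8859-1, the bytes C3 A4 display as "A tilde" followed by another character, which matches the output described above. If that is indeed what happens, encoding each line explicitly with encode($charset, $line) before calling $r->print might be one thing to try, but I do not know whether that is the recommended way.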
Can I / should I do something like
binmode($r,":$charset"); # ??
TIA