Geoffrey Young wrote:
> John ORourke wrote:
>> Eli Shemer wrote:
>>> For some reason the following test doesn’t print anything out to the
>>> screen
>> I'm not sure why you get nothing, but I can tell you strings read
>> from Apache objects come through as octets and need to be decoded
>> before use. We're using UTF-8 chars in URLs but I've never used one
>> in a GET request parameter.
> I can't say why it doesn't work, but I'm surprised it would in either
> case - the only characters explicitly allowed in a uri are us-ascii.
> from rfc2396:
My bad memory there - you are quite correct. The way we do it is the
accepted way - to URL-encode the UTF-8 encoded text, and that will work
with URLs and parameters.
eg:
http://www....../categories/name/ty%C3%B6kalut-lamput
is the correct form of:
http://www....../categories/name/työkalut-lamput
encode before printing:
use Encode;
$octets = Encode::encode('UTF-8', $my_utf8_string); # characters -> UTF-8 octets
$octets =~ s/([^\041-\176])/sprintf("%%%02X",ord($1))/ge; # URL-encode everything outside printable ASCII
$r->print($octets);
(the above is simplified - you'll also need to encode reserved characters such as '?', '&', '#' and '%' itself)
decode after reading:
$url = Encode::decode('UTF-8', $r->uri()); # octets -> characters (needs "use Encode;")
or
$param = Encode::decode('UTF-8', $r->param('info'));
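
For what it's worth, here is a rough standalone sketch of the round trip using only the core Encode module. The helper names escape_segment and unescape_segment are made up for illustration (they are not part of mod_perl or any library), and the sketch assumes you are escaping a single path segment, so everything outside the RFC 3986 unreserved set gets percent-encoded:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: character string -> percent-encoded UTF-8 octets.
sub escape_segment {
    my ($text) = @_;
    my $octets = encode('UTF-8', $text);    # characters -> octets
    # Leave only unreserved characters (letters, digits, - . _ ~) untouched.
    $octets =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $octets;
}

# Hypothetical helper: percent-encoded octets -> character string.
sub unescape_segment {
    my ($escaped) = @_;
    (my $octets = $escaped) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return decode('UTF-8', $octets);        # octets -> characters
}

# "\x{F6}" is o-umlaut, written as an escape so the file encoding doesn't matter.
my $escaped   = escape_segment("ty\x{F6}kalut-lamput");
my $roundtrip = unescape_segment($escaped);
```

With the category name from the example URL, $escaped comes out as ty%C3%B6kalut-lamput and $roundtrip is the original character string again. If you have the URI distribution installed, URI::Escape does the same job and is probably the better choice in real code.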
cheers
John