Stanisław T. Findeisen wrote:
> Gunnar Hjalmarsson wrote:
>>> What assumptions does Perl make regarding input file (i.e., the
>>> program/script file) encoding?
>> AFAIK, it just converts the bytes into Perl's internal format, but it
>> does not assume anything (at least not by default) with respect to the
>> character encoding.
>>> Is it so that string literals in Perl are byte arrays in fact?
>> String literals in a Perl script are byte *strings* until decoded.
> Yeah, it looks so. With "use utf8" (http://perldoc.perl.org/utf8.html)
> one can however make them parsed (decoded) (provided they are valid UTF-8).
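Not quite. The utf8 pragma tells Perl that the source text is UTF-8
encoded. That allows UTF-8 encoded *symbols*, e.g. variable names or
subroutine names, and it also decodes string literals, but only those
that follow the pragma in its lexical scope. A string assigned before
"use utf8" stays a byte string: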
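$ perl -MEncode -le '
    $s = "smörgåsbord";        # literal compiled before "use utf8": 13 bytes
    print length $s;
    use utf8;                  # lexically scoped; $s is not re-parsed
    print length $s;
    $s = decode "UTF-8", $s;   # explicit decoding yields 11 characters
    print length $s;
'
13
13
11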
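For comparison, here is a minimal sketch with the pragma placed before the literal. It assumes the script file is saved as UTF-8 with precomposed characters; the variable names are only illustrative and not from the thread.

```perl
use strict;
use warnings;
use utf8;                  # in effect for the source text that follows
use Encode qw(encode);

# Assumption: this file is saved as UTF-8, so the parser decodes the literal.
my $s = "smörgåsbord";

my $char_count = length $s;                   # length in characters
my $byte_count = length encode("UTF-8", $s);  # length once encoded back to bytes

binmode STDOUT, ":encoding(UTF-8)";           # encode output explicitly
print "$s: $char_count characters, $byte_count bytes\n";
```

Under those assumptions the character count should be 11 and the byte count 13, i.e. no explicit decode call is needed for literals that the pragma covers.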
> It's all about the UTF8 flag:
> http://perldoc.perl.org/Encode.html#The-UTF8-flag
Maybe... That's above my head right now, I'm afraid.
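As a rough illustration of the flag mentioned above, the sketch below inspects a string before and after decoding with utf8::is_utf8, which is available without loading the pragma. The bytes are written with \x escapes so the result does not depend on how the file is saved; the variable names are only illustrative. The flag is an internal detail, so treat this as a way to look at it, not as something to build program logic on.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 encoding of "smörgåsbord", spelled out as raw bytes.
my $bytes = "sm\xC3\xB6rg\xC3\xA5sbord";

# Decoding turns the byte string into a character string.
my $text = decode("UTF-8", $bytes);

# utf8::is_utf8 reports whether the internal UTF8 flag is set.
my $flag_before = utf8::is_utf8($bytes) ? 1 : 0;
my $flag_after  = utf8::is_utf8($text)  ? 1 : 0;

print "bytes: flag=$flag_before length=", length($bytes), "\n";
print "text:  flag=$flag_after length=",  length($text),  "\n";
```

The byte string should report a length of 13 with the flag off, and the decoded string a length of 11 with the flag on.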
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl