Pierre Nugues wrote on 06.09.2010 at 22:02 (+0200):

> 2/ The output with "use utf8;"

This pragma tells the interpreter that your script source is in UTF-8.
So it affects the literals in your tr/// list. It does not tell the
interpreter what output encoding to use.

> 3/ With
> use utf8;
> binmode(STDOUT, ':utf8');
> I get (this time, the terminal can display the <C2> as a Â. This is
> not correct. It strips the accented characters):

Some bytes might have been butchered away by the tr operator.

> 4/ With binmode(STDOUT, ':utf8') only (Then, there is a combination of
> wrongly coded quotes in Latin 1 or Latin 9 that the terminal displays
> and accented characters that are shown with their UTF-8 substitutes
> interpreted as Latin 1 or Latin 9 characters);
>
> »Tjuvgömmare
> !
> »
> säga

Your output is double-encoded. This is what happens here:

(1) You're reading text encoded as UTF-8 in binary mode.
(2) Consequently, you don't have text in Perl: you have octets.
(3) You're applying some butchery to the octets using the tr operator.
(4) You're outputting the remaining octets, encoding them as UTF-8 a
    second time.
(5) You're seeing garbage on the screen.

-- 
Michael Ludwig
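
A minimal sketch of steps (1) to (5) and of one possible fix: decode the octets into characters, apply tr to the characters, and encode once on output. The sample string and the tr list are assumptions for illustration; the original script's tr list is not shown in this thread. With real files, the same effect would typically come from opening the input with the '<:encoding(UTF-8)' layer and calling binmode(STDOUT, ':encoding(UTF-8)').

```perl
use strict;
use warnings;
use utf8;                        # the script source (and the tr/// list) is UTF-8
use Encode qw(decode encode);

# Assumption: sample text standing in for the input file.
my $text   = "»Tjuvgömmare!» säga";
my $octets = encode('UTF-8', $text);    # what a binary-mode read delivers

# Broken path: the octets are treated as characters and encoded again.
my $double = encode('UTF-8', $octets);

# Fixed path: decode first, so that tr/// sees characters, not octets.
my $chars = decode('UTF-8', $octets);

# Hypothetical tr list: keep letters, squeeze everything else to one space.
(my $clean = $chars) =~ tr/a-zA-ZåäöÅÄÖ/ /cs;

# Encode exactly once, on the way out.
my $out = encode('UTF-8', $clean);

# Prints: octets: 23, double-encoded: 31, characters: 19
printf "octets: %d, double-encoded: %d, characters: %d\n",
    length $octets, length $double, length $chars;
print $out, "\n";
```

The length figures show the problem: the 19 characters occupy 23 octets, and encoding those octets a second time inflates them to 31, which is the garbage that ends up on the terminal.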