Perl 5.8 does not default to byte-oriented I/O when it runs in a UTF-8
locale.

See http://www.perldoc.com/perl5.8.0/pod/perlunicode.html#Locales

That means you can't simply read a file and pass the data on to a
socket: the file handle translates its input, while the socket uses a
:raw PerlIO layer.

With Perl 5.8, if you are using a "legacy" locale, input goes through
the legacy platform-default layer (:crlf where line endings are
translated), which is compatible with Perl 5.6.  But if you're using a
UTF-8 locale, Perl does things the new way, with input translation.
I think these defaults are broken (they should have preferred
compatibility), but on the plus side, they force you to think about what
you really want and to tell Perl explicitly.

If you really want to get the bytes exactly as they are in the file, you
have to do

use open IN => ':raw';
binmode(STDIN);
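
As an alternative to the pragma, the layer can also be requested per
handle with three-argument open.  The following is only a minimal
sketch: copy_bytes is a hypothetical helper name, not something from
Perl itself, and the 8192-byte buffer size is an arbitrary choice.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): copy a file to an already-open
# handle byte for byte, whatever the locale says.
sub copy_bytes {
    my ($path, $out) = @_;

    # '<:raw' asks for no decoding and no CRLF translation on this handle.
    open(my $in, '<:raw', $path) or die "Cannot open $path: $!";
    binmode($out);    # no translation on the output side either

    my $buf;
    while (1) {
        my $n = read($in, $buf, 8192);
        die "Read failed on $path: $!" unless defined $n;
        last if $n == 0;    # end of file
        print {$out} $buf or die "Write failed: $!";
    }
    close($in);
    return;
}
```

The output handle could be a socket, e.g. copy_bytes($file, $sock),
where $sock is assumed to be a socket you have already connected.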

Or if you want CRLF translation but nothing more, do

use open IN => ':crlf';
binmode(STDIN, ':crlf');
(On platforms that translate line endings, this is the default in
non-UTF-8 locales.)
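
To illustrate what "CRLF translation but nothing more" means, here is a
small sketch that requests the layer on a single handle.  slurp_crlf is
a hypothetical helper name; the point is only that CR LF pairs become
"\n" while bytes above 127 are left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file with CRLF translation only.
sub slurp_crlf {
    my ($path) = @_;
    open(my $in, '<:crlf', $path) or die "Cannot open $path: $!";
    local $/;                 # slurp mode: read the whole file at once
    my $text = <$in>;
    close($in);
    return $text;
}
```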


If you want to assume all input files are in the locale's encoding, you
can get Perl to translate that into Unicode for you with:

use open IN => ':locale';
(This automatically binmodes STDIN appropriately, too, but it requires
Perl 5.8.)
(This is the default in UTF-8 locales.)
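
The difference between bytes and translated input is easiest to see on
one handle at a time.  The sketch below does not use ':locale', since
that depends on the environment; instead it names the encoding
explicitly, and UTF-8 is just an assumption for the example.  first_line
is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: read the first line of a file through the given
# PerlIO layer string and return it without the trailing newline.
sub first_line {
    my ($path, $layers) = @_;
    open(my $in, "<$layers", $path) or die "Cannot open $path: $!";
    my $line = <$in>;
    close($in);
    chomp($line) if defined $line;
    return $line;
}

# For a file whose first line is the two bytes C3 A9 (U+00E9 in UTF-8):
#   length(first_line($path, ':raw'))              is 2 (bytes)
#   length(first_line($path, ':encoding(UTF-8)'))  is 1 (character)
```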

Chris




--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
