NTFS allows you to store filenames in some liberal Microsoft interpretation of UTF-16 [1]. Windows Explorer makes that easy to test: just save a file under a Chinese name and there you go.
[1] http://blogs.msdn.com/b/michkap/archive/2006/09/10/748699.aspx

How do you save a file with a Chinese or other name requiring Unicode using ActivePerl? Here's a script that tries to save a file as Катюша.txt (Katyusha) and just won't work with any of four encodings, or even with none at all! What do I have to do in order to just save my Катюша.txt?

          \,,,/
          (o o)
    ------oOOo-(_)-oOOo------
    use 5.010;
    use utf8;
    use strict;
    use warnings;
    use Encode qw/encode/;

    my $chars = 'Катюша';
    # say length $chars;
    my $count = 0;
    for ( '', qw/UTF-16 UTF-16BE UTF-16LE UTF-8/ ) {
        say 'encoding: ', $_;
        my $n1 = $chars . '.' . ++$count . '.txt';
        my $n2 = $_ ? encode( $_, $n1 ) : $n1;
        if ( open my $fh, '>:encoding(UTF-16)', $n2 ) {
            print $fh $chars, "\n";
            close $fh;
        }
        else {
            warn "open $n2: $!";
        }
    }

The output of this script in cmd.exe using CHCP 1252 is:

    encoding:
    encoding: UTF-16
    open þÿBNH0 . 2 . t x t: Invalid argument at ntfs_uni_filename.pl line 19.
    encoding: UTF-16BE
    open BNH0 . 3 . t x t: Invalid argument at ntfs_uni_filename.pl line 19.
    encoding: UTF-16LE
    open BNH. 4 . t x t : Invalid argument at ntfs_uni_filename.pl line 19.
    encoding: UTF-8

The filenames it manages to save are disfigured:

    07.01.2012 20:44 17 Катюша.1.txt
    07.01.2012 20:44 17 Катюша.5.txt

Tested versions, outcome always as described:

* v5.10.1 built for MSWin32-x86-multi-thread
* (v5.12.3) built for MSWin32-x64-multi-thread
* (v5.12.4) built for MSWin32-x86-multi-thread
* (v5.14.0) built for MSWin32-x64-multi-thread
* (v5.14.1) built for MSWin32-x86-multi-thread

Note that DIR in cmd.exe otherwise has no trouble displaying Russian or Greek filenames; there are problems only with Chinese, Arabic and such exotic stuff, and only because the font I'm using doesn't support those scripts.

Cygwin perl 5.10.1, by the way, displayed no errors and got it right:

    07.01.2012 20:51 16 Катюша.1.txt
    07.01.2012 20:51 16 0BNH0
    07.01.2012 20:51 16 0BNH0
    07.01.2012 20:51 16 0BNH0.
    07.01.2012 20:51 16 Катюша.5.txt

You can feed either a character string or a UTF-8 octet string to this Cygwin perl.exe open() and it creates the proper filename, proving that it's not technically impossible. :)

-- 
Michael Ludwig
_______________________________________________
ActivePerl mailing list
ActivePerl@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
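
A side note on the error output in the post: the garbled names in the warnings look like the raw bytes that encode() produced, displayed in code page 1252. The sketch below, which should run on any platform, dumps those bytes in hex for each of the four encodings used in the script. The helper name hex_bytes is made up for illustration, and the explanation of the errors given after the code is an assumption, not something verified on the Windows builds listed in the post.

```perl
use strict;
use warnings;
use utf8;                     # the source file itself must be saved as UTF-8
use Encode qw/encode/;

# Hypothetical helper: encode $name and return the bytes as hex pairs.
sub hex_bytes {
    my ( $encoding, $name ) = @_;
    my $octets = encode( $encoding, $name );
    return join ' ', map { sprintf '%02X', ord($_) } split //, $octets;
}

# Same base name as in the script above; the output is plain ASCII hex,
# so no output layer is needed on STDOUT.
my $chars = 'Катюша';
for my $encoding (qw/UTF-16 UTF-16BE UTF-16LE UTF-8/) {
    printf "%-9s %s\n", $encoding, hex_bytes( $encoding, $chars );
}
```

For UTF-16 the dump starts with FE FF, the byte order mark, which code page 1252 renders as þÿ; the low bytes 42 4E 48 30 of т, ю, ш, а are the BNH0 seen in the warnings. The high byte 04 and the 1A from К are control characters, and ASCII characters such as the dot gain a 00 byte in every UTF-16 variant, so an open() that hands these bytes to a byte-oriented system call would plausibly fail with "Invalid argument".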