Stumped on this, and some of the examples in the perluniintro just seem wrong.

On a Japanese version of Windows, when you run a Perl script, length()
returns the wrong number of characters for anything passed in as $ARGV[0],
and split() seems to behave the same way.

Using some of the samples shown in perluniintro, we do not get the same
results, so something is wrong.

Using ActivePerl 5.8.8 Build 819 on Win2003 Server, Japanese. No emulation,
all default Japanese installation.

Here is what we are doing:

perl script.pl テスト

(there are three characters in $ARGV[0], the Japanese word for 'test')

The perl script does this:

print length($ARGV[0]);  # prints 6

If one tries to use split(//, $ARGV[0]), there are 6 iterations.

Tried use encoding UTF8, the -C6 flag and a ton of other stuff.
Oddly, if one does 'print $ARGV[0]' the output is テスト.

Even used something from perluniintro:
$Unicode_string = pack("U*", unpack("W*", $ARGV[0]));
print $Unicode_string;          # prints テスト
print length($Unicode_string);  # prints 6

We need to capture each character in テスト (3 of them) and get the hex
or Unicode value for each character. Since Perl thinks the length is 6, we
cannot get correct hex/Unicode values using pack/unpack or anything else
for that matter.
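
For reference, one possible explanation is that Perl is seeing @ARGV as raw
bytes in the console's code page, not as characters. Below is a minimal
sketch of that idea, assuming the code page is CP932 (Shift-JIS), which has
not been confirmed on this machine. The codepoints() helper and the
hard-coded byte string (テスト as it would be encoded in CP932) are
illustrative only; on the real system $ARGV[0] would take the place of $raw.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode a byte string from the given encoding and
# return one "U+XXXX" label per character.
sub codepoints {
    my ($bytes, $encoding) = @_;
    my $chars = decode($encoding, $bytes);   # bytes -> Perl character string
    return map { sprintf 'U+%04X', ord($_) } split //, $chars;
}

# Assumption: these six bytes are what arrives in $ARGV[0] for テスト
# when the command line is encoded as CP932.
my $raw = "\x83\x65\x83\x58\x83\x67";

my @cps = codepoints($raw, 'cp932');
print length($raw), " bytes, ", scalar(@cps), " characters: @cps\n";
```

With those bytes this prints "6 bytes, 3 characters: U+30C6 U+30B9 U+30C8".
If the real code page is something else, the encoding name passed to
decode() would need to change accordingly.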

Thanks for any comments you can offer.

alg
_______________________________________________
Perl-Win32-Users mailing list
Perl-Win32-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
