Re: Chinese word problem

Mumia W. Wed, 16 May 2007 00:53:36 -0700

On 05/16/2007 12:57 AM, Neil wrote:

Dear All:


Question:

How come the length of the Chinese word I print shows “3”?
Isn’t it supposed to be 2 bytes?

Program:
-----------------------------------
$str=”我”;

$str_len = length($str);

Print $str_len, “\n\n”;

------------------------------------
The result is 3

I took a picture of the program, in case the Chinese word doesn’t display
on some of your systems.
[...]
My environment:
[...]
Encode: Big5

Something is messed up with your locale or environment. Since you only have one character in $str, the length should be "1"--and that's what I get.

I saved your program two ways: as a UTF-8 file and as a Big5 file. Both programs produce the same result on my system: 1. However, to get your program to run, I had to change the quotes.


Here is the first program (saved in UTF8):
-----------------------------------
#!/usr/bin/perl
use utf8;
use strict;
use warnings;

my $str="我";

my $str_len = length($str);

print $str_len, "\n\n";
----------------------------------

Here is the second program (saved in Big5):
--------------------------------------------
#!/usr/bin/perl
use encoding big5 => STDOUT => 'utf8';
use strict;
use warnings;

my $str="我";  # in the Big5 file this literal is the two bytes \xA7\xDA

my $str_len = length($str);

print $str_len, "\n\n";
print "data = $str\n";
--------------------------------------------

The second program displays this:
------start output-------
1

data = 我
-------end output--------

Evidently the Big5 byte sequence \xA7\xDA represents the single Unicode character U+6211 (written \x{6211} in Perl), which is the Chinese character 我. You probably just need to tell Perl about the encoding of your script.
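
To see the difference between bytes and characters directly, here is a minimal sketch using the Encode module that ships with Perl 5.8 and later. It assumes Encode's 'big5' alias is available on your system, and the variable names are only illustrative. The bytes are written as escapes so the encoding of the script file itself does not matter.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The two Big5 bytes for the character, written as escapes.
my $big5_bytes = "\xA7\xDA";

# Decode the raw bytes into a Perl character string.
my $chars = decode('big5', $big5_bytes);

# Re-encode the same character as UTF-8 bytes.
my $utf8_bytes = encode('UTF-8', $chars);

printf "Big5 bytes:  %d\n", length($big5_bytes);   # 2
printf "characters:  %d\n", length($chars);        # 1
printf "UTF-8 bytes: %d\n", length($utf8_bytes);   # 3
printf "code point:  U+%04X\n", ord($chars);       # U+6211
```

If your file was really saved as UTF-8 and Perl was not told so, length() would be counting the 3 UTF-8 bytes rather than 1 character. That might explain the "3" you saw, but it is only a guess on my part.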


My environment:
Perl 5.8.4
Debian 3.1
Encoding: UTF8

