Hi,

I am running the following on z/OS with perl-5.8.6:

    $a = 128; $b = 256;
    for ($i = $a; $i <= $b; $i++) {
        $str = join '', $str, pack 'U*', $i;
    }
    if ($str =~ /(\p{inlatin1supplement}+)/) {
        print "\$1 : $1\n";
    }

I get the following results:

a) $a = 128, $b = 256: $1 has a 1-byte representation for each of 128-159 and a 2-byte representation for each of 160-255
b) $a = 160, $b = 240: $1 has a 2-byte representation for each of 160-240
c) $a = 192, $b = 240: $1 has a 1-byte representation for the complete range of code points (192-240)
d) $a = 192, $b = 256: $1 has a 1-byte representation for each of 192-255

So $1 contains 1-byte representations, 2-byte representations, or a mix of both for the matching code points, depending on the range used to construct $str.

1) Is this behaviour incorrect, and does it need to be fixed so that $1 always contains 1-byte representations only? (On ASCII platforms, $1 always contains 1-byte representations for any matching code point < 256.)

2) If it is correct, what is significant about code point 192, which changes $1 from the 2-byte representation (case b above) to the 1-byte representation (case c above), even though $b = 240 in both cases?

Thanks in advance,
Rajarshi.