Am I correct in thinking that the only way to get ord() to return a value over 256 is to send the character as a Unicode string instead of a byte string?

Dan Muey Thu, 28 Oct 2010 12:54:44 -0700

In other words, is there any character that will make ord() return over 256
when passed in as a byte string?


For example, note the difference in ord() output between a Unicode string and
a byte string for character 257 (U+0101). As a Unicode string it is 257; as a
byte string it is 196, the value of the first byte of its UTF-8 encoding.

$ perl -C6 -le '
    print "Character 257 info:";
    print "\tunicode \\x{} notation: " . sprintf(q{\x{%x}}, 257);
    print "\tOutput as Unicode string \x{101}";
    print "\tunicode string \\x{} notation ord(): " . ord("\x{101}");
    print "\tbyte string grapheme ord(): " . ord "\xc4\x81";
    print "\tbyte string literal ord(): " . ord "ā";
'
Character 257 info:
        unicode \x{} notation: \x{101}
        Output as Unicode string ā
        unicode string \x{} notation ord(): 257
        byte string grapheme ord(): 196
        byte string literal ord(): 196
$
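
The same comparison can be written as a short script. This is only a sketch to make the distinction concrete: it assumes the source file does not `use utf8`, so `"\xc4\x81"` is a plain two-byte string, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Character string: one character, code point U+0101.
my $chars = "\x{101}";

# Byte string: the two UTF-8 bytes that encode U+0101.
my $bytes = "\xc4\x81";

# ord() only looks at the first character of its argument.
my $ord_chars = ord $chars;    # code point of the single character
my $ord_bytes = ord $bytes;    # value of the first byte, 0xC4

# Each element of a byte string is a single byte, so every ord() is 0..255.
my @byte_values = map { ord($_) } split //, $bytes;

printf "chars: %d, bytes: %d, all bytes: %s\n",
    $ord_chars, $ord_bytes, join(',', @byte_values);
# prints "chars: 257, bytes: 196, all bytes: 196,129"
```

Since a byte holds at most 0xFF, ord() on a string that Perl stores as bytes cannot return more than 255; a larger value means the string is being treated as characters.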

The reason this is relevant is that on a given project I am using byte
strings only, for consistency, and some encoders (e.g. Scalar::Quote::Q())
will change from byte-string-friendly grapheme-cluster notation (e.g.
\xE3\x8A\xB7) to Unicode-string notation (e.g. \x{32B7}). I want to be sure
I always use data that gets me the former rather than the latter :)
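
One way to force the byte-string form before handing data to an encoder might look like the sketch below. The helper name `to_byte_string` is hypothetical, and the approach assumes the project treats all data as UTF-8 bytes; `utf8::is_utf8` and `utf8::encode` are core functions that are available without loading the `utf8` pragma.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative): return a copy of the input
# that is a byte string, UTF-8 encoding it when Perl has it flagged as a
# character string.
sub to_byte_string {
    my ($s) = @_;                           # work on a copy
    utf8::encode($s) if utf8::is_utf8($s);  # encodes in place, clears the flag
    return $s;
}

my $bytes = to_byte_string("\x{32B7}");

# Render each byte in \xNN notation.
my $notation = join '', map { sprintf '\x%02X', ord($_) } split //, $bytes;
print "$notation\n";    # prints \xE3\x8A\xB7
```

Note that a string without the UTF-8 flag is returned unchanged, so this only helps if such strings already hold the intended UTF-8 bytes.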

TIA!

--
Dan Muey
