On 4/18/05, Rajarshi Das <[EMAIL PROTECTED]> wrote:
> Hi,
> I am using perl-5.8.6 on z/OS.
> 1) What is the BOM on z/OS ? Basically, I cant print the chars "\xFE\xFF".
> Even though \xFE is defined as Latin Capital Letter U with Acute, the char
> doesnt display. Also, \xFF isnt defined.
>
> 2) What is the difference between the utf8::encode and utf8::upgrade
> routines ?
> e.g. $a = 'hello';
> utf8::upgrade($a);
>
> $a = "\xFE\xFF";
> utf8::encode($a);
>
> Should I use 'encode' when the scalar contains bytes and I need to convert
> those bytes into utf8 bytes (as in byte representation in unicode) ?
> And use 'upgrade' when the scalar contains a normal string that I want to
> convert to a utf8 string of characters ?
>
> Thanks in advance,
> Rajarshi.
>
1) This depends on your character display and on your system's Unicode support; see perldoc perlebcdic for the EBCDIC details. Also make sure that you are actually using UTF-8, through I/O layers, "use utf8", or the utf8:: functions.

2) These functions do different things. utf8::upgrade changes only Perl's internal representation of the string: the characters stay the same, the UTF-8 flag is turned on, and it returns the number of bytes needed to represent the string, which can be handy. utf8::encode changes the content: it replaces each character with the bytes of its UTF-8 encoding (UTF-EBCDIC on EBCDIC platforms) and leaves the UTF-8 flag off. That can be important when switching back and forth between different encodings.

Do not, though, pass bytes that are already encoded to utf8::encode. It does not try to detect an encoding; it treats every byte as a character and encodes it again, so in many cases, including your example, you end up with double-encoded data. You can use your byte string as-is. To go the other way, utf8::decode turns UTF-8 bytes back into characters, and utf8::downgrade converts the internal representation back to native bytes (might be needed for EBCDIC, I don't know).

Check out perldoc perluniintro.

HTH,
--jay
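
A minimal sketch of the upgrade/encode difference, assuming an ASCII-based platform (on z/OS the code points and the internal encoding differ, see perldoc perlebcdic). The variable names are made up for illustration:

```perl
use strict;
use warnings;

# Two copies of the same one-character string (U+00E9 on an ASCII platform).
my $upgraded = "\xE9";
my $encoded  = "\xE9";

# upgrade: only the internal representation changes; still one character.
# The return value is the number of octets used internally.
my $octets = utf8::upgrade($upgraded);

# encode: the character is replaced, in place, by its UTF-8 octets.
utf8::encode($encoded);

# Record the state after encode so it can be compared with the upgraded copy.
my $encoded_len  = length($encoded);
my $encoded_flag = utf8::is_utf8($encoded) ? 1 : 0;

printf "upgrade: length=%d flag=%d octets=%d\n",
    length($upgraded), (utf8::is_utf8($upgraded) ? 1 : 0), $octets;
printf "encode:  length=%d flag=%d\n", $encoded_len, $encoded_flag;

# decode reverses encode: the octets become one character again.
utf8::decode($encoded);
print "round trip ok\n" if $encoded eq $upgraded;
```

The upgraded copy still compares equal to the original string, while the encoded copy is a different, longer string until it is decoded again. That is why encode is the one to reach for when you need bytes for output, and upgrade is rarely needed by hand.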