E R wrote 2007-10-15 16:25 (-0500):
> 1. What is the result of Encode::encode("iso-8559-1", $x) if $x is not
> a utf8 string (i.e. Encode::is_utf8($x) returns false.)

"utf8 string" is already confusing. It can be either one of the following: 1. byte string with UTF8 encoded text 2. Perl Unicode string that at this point in time is encoded as UTF8 *internally* Encode::is_utf8 indicates that the latter is true. You should NOT have to peek at the status of this internal flag, except for debugging perl itself. Encode::encode expects a Unicode string, which can be encoded as ISO-8859-1 or UTF8 internally. If the Unicode string is ISO-8859-1 internally, is_utf8 returns false, and if it is UTF8 internally, it returns true. This is how Encode::encode knows, again: *internally*, how to convert the string. Assuming you meant 8859, not 8559, the answer to your question is: a copy of $x is returned, because the encoding you used happens to equal the encoding that Perl used internally. > 2. What is the result of $string = decode("iso-8859-1", $octets) if > $octets is a utf8 string? Do not use Encode::decode on unicode strings, but use it on bytestrings only. Every individual byte of the bytestring is seen as a single ISO-8859-1 character, so a multi-byte UTF8 sequence will *not* be interpreted as a single character. Perhaps helpful: http://tnx.nl/perlunitut,perlunifaq -- Met vriendelijke groet, Kind regards, Korajn salutojn, Juerd Waalboer: Perl hacker <[EMAIL PROTECTED]> <http://juerd.nl/sig> Convolution: ICT solutions and consultancy <[EMAIL PROTECTED]>