From: Diomedes_01 at yahoo dot com
Operating system: Solaris 9
PHP version: 5.0.4
PHP Bug Type: Strings related
Bug description: Unable to properly convert from ISO-8859-1 to UTF-8
Description:
------------
I am unable to properly encode certain strings from ISO-8859-1 to UTF-8. I
have tried using utf8_encode, mb_convert_encoding and iconv with no
success. The code I am attempting this on is as follows:
Reproduce code:
---------------
<?php
// Source string given as ISO-8859-1 bytes (\xE9 is "é" in ISO-8859-1)
$main_test_string = "r\xE9f\xE9rendum sur la Constitution europ\xE9enne";
$string_test = mb_detect_encoding($main_test_string, 'UTF-8, ISO-8859-1');
echo "Encoding used: $string_test<br>"; // Properly displays ISO-8859-1

// First try converting with iconv
$iconv_test = iconv("ISO-8859-1", "UTF-8", $main_test_string);
echo "Iconv test: $iconv_test<br>"; // Displays nothing. No data whatsoever

// Now try converting with mb_convert_encoding
$mb_test = mb_convert_encoding($main_test_string, "UTF-8", "ISO-8859-1");
$string_test2 = mb_detect_encoding($mb_test, 'UTF-8, ISO-8859-1');
echo "Encoding used: $string_test2<br>"; // Indicates string is now UTF-8 encoded (which is wrong)
echo "MB Test convert value: $mb_test<br>"; // Displays: référendum sur la Constitution européenne; doesn't look like UTF-8 to me

// Finally try utf8_encode
$utf8_encode_test = utf8_encode($main_test_string);
$string_test3 = mb_detect_encoding($utf8_encode_test, 'UTF-8, ISO-8859-1');
echo "Encoding used: $string_test3<br>"; // Indicates string is now UTF-8 encoded (which is wrong)
echo "Abstract post conversion: $utf8_encode_test<br>"; // Same as before, displays: référendum sur la Constitution européenne
?>
Expected result:
----------------
I should be seeing UTF-8 (Unicode) translated text of the style:
'Ελληνι'
Note that the above does work for non-Latin character sets such as
Chinese, Japanese, Russian, Greek, etc.
Actual result:
--------------
What I am seeing is the following string:
référendum sur la Constitution européenne
Definitely not UTF-8. Could be Klingon. :-)
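It might help to look at the raw bytes rather than at what the browser
renders, since the same bytes can look different depending on the charset
the page is displayed in. A minimal sketch, assuming mbstring is loaded;
the shortened sample string and the variable names are only for
illustration:

```php
<?php
// "référendum" written with explicit ISO-8859-1 bytes (0xE9 = é), so the
// result does not depend on how this file itself is saved.
$latin1 = "r\xE9f\xE9rendum";

// Same conversion call as in the reproduce code.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Dump both strings as hex. In UTF-8, é (U+00E9) is the two-byte
// sequence C3 A9, so each e9 in the first line should become c3a9
// in the second if the conversion produced UTF-8.
echo bin2hex($latin1), "\n";
echo bin2hex($utf8), "\n";
```

If the second line shows c3a9 wherever the first shows e9, the bytes are
the UTF-8 encoding of é, and the open question becomes which charset the
page is being displayed in.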
I will admit I am not a Unicode master but this is certainly quite
puzzling. According to the documentation, iconv is supposed to work in
this case but it is not displaying any data. I am running PHP 5.0.4 with
iconv enabled. (I see it in my phpinfo output)
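Since iconv() returns false when it cannot convert, echoing its result
directly prints an empty string and hides the failure. A small sketch that
makes the failure visible; iconv_or_explain() is a hypothetical helper
written for this report, not a PHP function:

```php
<?php
// Hypothetical helper: wraps iconv() so that a failed conversion is
// reported instead of silently printing nothing.
function iconv_or_explain($from, $to, $text)
{
    // iconv() returns false on failure; @ hides the notice/warning so
    // the outcome can be reported in one place.
    $result = @iconv($from, $to, $text);
    if ($result === false) {
        return "[iconv failed: $from -> $to]";
    }
    return $result;
}

// Same charset names as in the reproduce code, with the string given
// as explicit ISO-8859-1 bytes.
echo iconv_or_explain("ISO-8859-1", "UTF-8", "r\xE9f\xE9rendum"), "\n";
```

If this prints the "[iconv failed ...]" marker, the iconv library PHP was
built against may not recognise one of the charset names as spelled here.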
Please advise.
--
Edit bug report at http://bugs.php.net/?id=32880&edit=1