ID:               49528
Updated by:       sjo...@php.net
Reported By:      moriyo...@php.net
Status:           Open
Bug Type:         mbstring related
Operating System: N/A
PHP Version:      5.3SVN-2009-09-11 (SVN)
New Comment:

It can be argued that the BOM character U+FEFF should never be converted, as it is not a real character. I don't think it is the task of mb_convert_encoding to detect the byte order and interpret the BOM.


Previous Comments:
------------------------------------------------------------------------

[2009-09-11 07:45:05] moriyo...@php.net

Description:
------------
The first character of a UTF-16 string prefixed by "\xff\xfe" (LE BOM) gets converted to the wrong Unicode code point. Moreover, the resulting string contains the BOM itself, although it is uncalled for.

Reproduce code:
---------------
<?php
var_dump(bin2hex(mb_convert_encoding("\xff\xfe\x01\x02\x03\x04", "UCS-2", "UTF-16")));
?>

Expected result:
----------------
string(8) "02010403"

Actual result:
--------------
string(12) "feffff010403"

------------------------------------------------------------------------
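
Workaround sketch:
------------------
Until the handling of the BOM is settled, the conversion in the report can be sidestepped by stripping the BOM by hand and naming the byte order explicitly, so that mb_convert_encoding never sees U+FEFF. This is only a minimal sketch: utf16_to_ucs2() is a hypothetical helper, not an mbstring function, it assumes that "UCS-2" in mbstring means big-endian without a BOM (as the expected result in the report implies), and treating BOM-less input as big-endian is likewise an assumption.

```php
<?php
// Hypothetical helper (not part of mbstring): drop a leading UTF-16 BOM and
// pass the byte order explicitly instead of relying on the generic "UTF-16".
function utf16_to_ucs2($input)
{
    $bom = substr($input, 0, 2);

    if ($bom === "\xff\xfe") {
        // Little-endian BOM: strip it and decode the rest as UTF-16LE.
        return mb_convert_encoding((string) substr($input, 2), "UCS-2", "UTF-16LE");
    }

    if ($bom === "\xfe\xff") {
        // Big-endian BOM: strip it and decode the rest as UTF-16BE.
        return mb_convert_encoding((string) substr($input, 2), "UCS-2", "UTF-16BE");
    }

    // No BOM present: assume big-endian (an assumption of this sketch).
    return mb_convert_encoding($input, "UCS-2", "UTF-16BE");
}

// Same input as the reproduce code in the report.
var_dump(bin2hex(utf16_to_ucs2("\xff\xfe\x01\x02\x03\x04")));
```

With the input from the reproduce code, this should yield the value given under "Expected result", provided the big-endian assumption about "UCS-2" holds.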