 ID:               49528
 Updated by:       j...@php.net
 Reported By:      moriyo...@php.net
-Status:           Open
+Status:           Assigned
 Bug Type:         mbstring related
-Operating System: N/A
+Operating System: *
 PHP Version:      5.3SVN-2009-09-11 (SVN)
-Assigned To:
+Assigned To:      moriyoshi
 New Comment:

Moriyoshi probably added this report as a reminder for himself.


Previous Comments:
------------------------------------------------------------------------

[2009-09-11 08:18:38] sjo...@php.net

It can be argued that the BOM character U+FEFF should never be
converted, as it is not a real character. I don't think it is the task
of mb_convert_encoding to detect the byte order and interpret the BOM.

------------------------------------------------------------------------

[2009-09-11 07:45:05] moriyo...@php.net

Description:
------------
The first character of a UTF-16 string prefixed by "\xff\xfe" (LE BOM)
gets converted to the wrong Unicode code point. Moreover, the resulting
string contains the BOM itself, which is uncalled for.

Reproduce code:
---------------
<?php
var_dump(bin2hex(mb_convert_encoding("\xff\xfe\x01\x02\x03\x04",
"UCS-2", "UTF-16")));
?>

Expected result:
----------------
string(8) "02010403"

Actual result:
--------------
string(12) "feffff010403"

------------------------------------------------------------------------
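
Possible userland workaround (a sketch only, not a fix in mbstring):
until the "UTF-16" decoder handles the BOM itself, the caller can strip
the BOM and pass an explicit byte order. split_utf16_bom() is a
hypothetical helper name, not an existing function, and the fallback to
big-endian for BOM-less input is an assumption taken from RFC 2781.

```php
<?php
// Hypothetical helper, not part of mbstring: peel off a leading UTF-16
// BOM and report the byte order it indicates.
function split_utf16_bom($bytes)
{
    $bom = substr($bytes, 0, 2);
    if ($bom === "\xff\xfe") {
        // Little-endian BOM; the cast guards against substr() returning
        // false on older PHP versions when nothing follows the BOM.
        return array("UTF-16LE", (string) substr($bytes, 2));
    }
    if ($bom === "\xfe\xff") {
        // Big-endian BOM.
        return array("UTF-16BE", (string) substr($bytes, 2));
    }
    // No BOM: assume big-endian (assumption, see RFC 2781).
    return array("UTF-16BE", $bytes);
}

// Same input as the reproduce code, with the byte order made explicit.
list($encoding, $payload) = split_utf16_bom("\xff\xfe\x01\x02\x03\x04");
var_dump(bin2hex(mb_convert_encoding($payload, "UCS-2", $encoding)));
?>
```

With the byte order given explicitly, the conversion no longer depends
on how mbstring treats the BOM, so the output should match the value
listed under "Expected result" above.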