ID:               49528
 Updated by:       j...@php.net
 Reported By:      moriyo...@php.net
-Status:           Open
+Status:           Assigned
 Bug Type:         mbstring related
-Operating System: N/A
+Operating System: *
 PHP Version:      5.3SVN-2009-09-11 (SVN)
-Assigned To:      
+Assigned To:      moriyoshi
 New Comment:

Moriyoshi probably added this report as a reminder for himself.


Previous Comments:
------------------------------------------------------------------------

[2009-09-11 08:18:38] sjo...@php.net

It can be argued that the BOM character U+FEFF should never be
converted, as it is not a real character.

I don't think it is the task of mb_convert_encoding to detect the byte
order and interpret the BOM.
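
One way a caller might handle the BOM itself, as the comment above
suggests, is sketched below. This is only an illustration: the helper
name split_utf16_bom() and its default byte order are assumptions, not
part of mbstring.

```php
<?php
// Hypothetical helper (not part of mbstring): strip a leading UTF-16 BOM
// and return the remaining bytes together with the byte order it indicates.
function split_utf16_bom(string $bytes, string $default = "UTF-16BE"): array
{
    $bom = substr($bytes, 0, 2);
    if ($bom === "\xff\xfe") {
        // Little-endian BOM
        return [substr($bytes, 2), "UTF-16LE"];
    }
    if ($bom === "\xfe\xff") {
        // Big-endian BOM
        return [substr($bytes, 2), "UTF-16BE"];
    }
    // No BOM: fall back to an assumed default byte order.
    return [$bytes, $default];
}

[$payload, $encoding] = split_utf16_bom("\xff\xfe\x01\x02\x03\x04");
var_dump(bin2hex($payload), $encoding);
```

The payload could then be passed to mb_convert_encoding() with the
explicit source encoding returned here, so the conversion does not have
to interpret the BOM.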

------------------------------------------------------------------------

[2009-09-11 07:45:05] moriyo...@php.net

Description:
------------
The first character of a UTF-16 string prefixed by "\xff\xfe" (LE BOM)
gets converted to the wrong Unicode code point. Moreover, the resulting
string contains the BOM itself, even though it should not be there.



Reproduce code:
---------------
<?php
var_dump(bin2hex(mb_convert_encoding("\xff\xfe\x01\x02\x03\x04",
"UCS-2", "UTF-16")));
?>

Expected result:
----------------
string(8) "02010403"

Actual result:
--------------
string(12) "feffff010403"


------------------------------------------------------------------------

