ID:               49528
 Updated by:       sjo...@php.net
 Reported By:      moriyo...@php.net
 Status:           Open
 Bug Type:         mbstring related
 Operating System: N/A
 PHP Version:      5.3SVN-2009-09-11 (SVN)
 New Comment:

It can be argued that the BOM character U+FEFF should never be
converted, as it is not a real character.

I don't think it is the task of mb_convert_encoding to detect the byte
order and interpret the BOM.
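
If BOM handling is left to the caller, it could be done in userland
roughly as sketched below. This is only an illustration:
convert_utf16_with_bom() is a hypothetical helper, not part of
mbstring, and treating BOM-less input as big-endian is an assumption.

```php
<?php
// Hypothetical userland helper (not an mbstring function): inspect the
// first two bytes, strip a BOM if present, and pass an explicit byte
// order to mb_convert_encoding().
function convert_utf16_with_bom($bytes, $to = "UCS-2")
{
    $bom = (string) substr($bytes, 0, 2);
    if ($bom === "\xff\xfe") {
        // Little-endian BOM: drop it and decode as UTF-16LE.
        $from  = "UTF-16LE";
        $bytes = (string) substr($bytes, 2);
    } elseif ($bom === "\xfe\xff") {
        // Big-endian BOM: drop it and decode as UTF-16BE.
        $from  = "UTF-16BE";
        $bytes = (string) substr($bytes, 2);
    } else {
        // Assumption: input without a BOM is big-endian.
        $from = "UTF-16BE";
    }
    return mb_convert_encoding($bytes, $to, $from);
}

var_dump(bin2hex(convert_utf16_with_bom("\xff\xfe\x01\x02\x03\x04")));
?>
```

With the byte order given explicitly and the BOM removed beforehand,
the input from the report should come out as the "02010403" that the
reporter expects.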


Previous Comments:
------------------------------------------------------------------------

[2009-09-11 07:45:05] moriyo...@php.net

Description:
------------
The first character of a UTF-16 string prefixed by "\xff\xfe" (LE BOM)
gets converted to the wrong Unicode code point. Moreover, the resulting
string contains the BOM itself, which it should not.



Reproduce code:
---------------
<?php
var_dump(bin2hex(mb_convert_encoding("\xff\xfe\x01\x02\x03\x04",
"UCS-2", "UTF-16")));
?>

Expected result:
----------------
string(8) "02010403"

Actual result:
--------------
string(12) "feffff010403"


------------------------------------------------------------------------

