From:             jp at df5ea dot net
Operating system: 
PHP version:      5CVS-2007-04-12 (CVS)
PHP Bug Type:     *Unicode Issues
Bug description:  JSON_decode() does not handle surrogate pairs

Description:
------------
When decoding a string that contains a surrogate pair, JSON_decode() produces
incorrect UTF-8. Instead of encoding the two surrogate code units as one
UTF-8 sequence, it encodes them as two sequences which represent the two
surrogate code points.

The decoded string is actually CESU-8. The JSON_encode() function cannot
encode such a string.

I have a patch to JSON_parse.c that transcodes the UTF-16 properly to
UTF-8.
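
For illustration only, here is a minimal sketch in C of the transcoding
described above. It is not the patch itself: the function names
combine_surrogates() and utf8_encode() are hypothetical and do not come from
JSON_parse.c. It assumes the parser has already read the two \uXXXX escapes
as 16-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into a single code point.
 * Returns 0 if hi/lo are not a valid high/low surrogate pair. */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF) {
        return 0;
    }
    return 0x10000u + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}

/* Encode one code point as UTF-8 into out (room for at least 4 bytes).
 * Returns the number of bytes written, or 0 if cp is a lone surrogate
 * or lies beyond U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0; /* surrogates must never be encoded on their own */
        }
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

With the pair from this report, combine_surrogates(0xD834, 0xDD00) gives
U+1D100, and utf8_encode() turns that into the four bytes f0 9d 84 80.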

Reproduce code:
---------------
<?php
// U+1D100 MUSICAL SYMBOL SINGLE BARLINE as UTF-8 (octal escapes for f0 9d 84 80)
$single_barline = "\360\235\204\200";
$array = array($single_barline);
print bin2hex($single_barline) . "\n";
// print $single_barline . "\n\n";
$json = json_encode($array);
print $json . "\n\n";
$json_decoded = json_decode($json, true);
// print $json_decoded[0] . "\n";
print bin2hex($json_decoded[0]) . "\n";
print "END\n";
?>


Expected result:
----------------
The output from the two bin2hex() calls should be the same:

f09d8480

["\ud834\udd00"]

f09d8480
END


Actual result:
--------------
The second string differs from the input string and is illegal UTF-8.

f09d8480

["\ud834\udd00"]

eda0b4edb480
END
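
The bytes ed a0 b4 and ed b4 80 are the three-byte encodings of U+D834 and
U+DD00 taken separately. As a rough sketch, output like this can be detected
by looking for the byte 0xED followed by a byte in the range 0xA0..0xBF,
which is how any code point in U+D800..U+DFFF would be encoded. The function
name has_encoded_surrogate() is hypothetical, and this is not a full UTF-8
validator: it assumes the input is otherwise well-formed.

```c
#include <stddef.h>

/* Returns 1 if buf contains a UTF-8-style encoding of a surrogate code
 * point (U+D800..U+DFFF), i.e. 0xED followed by a byte in 0xA0..0xBF.
 * Returns 0 otherwise. */
static int has_encoded_surrogate(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == 0xED && buf[i + 1] >= 0xA0 && buf[i + 1] <= 0xBF) {
            return 1;
        }
    }
    return 0;
}
```

Applied to the actual result (ed a0 b4 ed b4 80) this returns 1; applied to
the expected result (f0 9d 84 80) it returns 0.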


-- 
Edit bug report at http://bugs.php.net/?id=41067&edit=1
-- 