From:             php at koterov dot ru
Operating system: all
PHP version:      5.2.1
PHP Bug Type:     Unknown/Other Function
Bug description:  Suggestion: json_encode() and non-UTF8 strings

Description:
------------
Could you please explain why json_encode() cares about the encoding at
all? Why not treat all string data as a binary stream? The current
behavior is very inconvenient and prevents the use of json_encode() on
non-UTF-8 sites! :-(

I have written a small substitute for json_encode(), but note that it is,
of course, much slower than json_encode() on large arrays.

    /**
     * Convert PHP scalar, array or hash to JS scalar/array/hash.
     * Strings are passed through byte-for-byte, whatever their encoding.
     */
    function php2js($a)
    {
        if (is_null($a)) return 'null';
        if ($a === false) return 'false';
        if ($a === true) return 'true';
        if (is_scalar($a)) {
            // Numbers and strings both become single-quoted JS strings.
            $a = addslashes($a);
            $a = str_replace("\n", '\n', $a);
            $a = str_replace("\r", '\r', $a);
            // Split "</script" so the literal cannot close an inline script.
            $a = preg_replace('{(</)(script)}i', "$1'+'$2", $a);
            return "'$a'";
        }
        // A list is an array whose keys are exactly 0..n-1, in order.
        $isList = true;
        for ($i=0, reset($a); $i<count($a); $i++, next($a))
            if (key($a) !== $i) { $isList = false; break; }
        $result = array();
        if ($isList) {
            foreach ($a as $v) $result[] = php2js($v);
            return '[ ' . join(', ', $result) . ' ]';
        } else {
            foreach ($a as $k=>$v)
                $result[] = php2js($k) . ': ' . php2js($v);
            return '{ ' . join(', ', $result) . ' }';
        }
    }

So my suggestion is to remove all string analysis from the json_encode()
code. This would also make the function faster.
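
For context on why the encoding matters here, the sketch below shows the
same Cyrillic letter as bytes in two encodings. json_encode() by default
writes non-ASCII characters as \uXXXX escapes, so it has to decode the
input bytes into code points first. Windows-1251 is an assumed example of
a 1-byte encoding (the report does not name one), and mb_check_encoding()
requires the mbstring extension.

```php
<?php
// The letter "п" (U+043F) as raw bytes in two encodings.
$utf8   = "\xD0\xBF"; // UTF-8: two bytes
$cp1251 = "\xEF";     // Windows-1251 (assumed encoding): one byte

// Valid UTF-8 input is escaped to a \uXXXX sequence by default.
echo json_encode($utf8), "\n";

// The single-byte form is not a valid UTF-8 sequence, so the encoder
// cannot map it to a code point without knowing the source encoding.
var_dump(mb_check_encoding($cp1251, 'UTF-8'));
```

How json_encode() reacts to the invalid input (empty string, null, or a
false return value) depends on the PHP version.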

Reproduce code:
---------------
<?php
$a = array('a' => 'проверка', 'b' => array('слуха', 'глухого'));
echo json_encode($a);
?>

Expected result:
----------------
Correctly encoded strings in the source 1-byte encoding.

Actual result:
--------------
Empty strings everywhere (and sometimes notices that a string contains
non-UTF-8 characters).
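
A possible workaround, sketched here rather than taken from the report, is
to convert every string to UTF-8 before calling json_encode(). The helper
name to_utf8() is hypothetical, the default source encoding Windows-1251
is an assumption (the report only says "1-byte"), and the mbstring
extension is assumed to be available.

```php
<?php
/**
 * Recursively convert all strings (keys and values) in $value to UTF-8.
 * $from defaults to Windows-1251, which is an assumption for this sketch.
 */
function to_utf8($value, $from = 'Windows-1251')
{
    if (is_string($value)) {
        return mb_convert_encoding($value, 'UTF-8', $from);
    }
    if (is_array($value)) {
        $out = array();
        foreach ($value as $k => $v) {
            // Integer keys are kept as-is; string keys are converted too.
            $key = is_string($k) ? mb_convert_encoding($k, 'UTF-8', $from) : $k;
            $out[$key] = to_utf8($v, $from);
        }
        return $out;
    }
    // null, booleans and numbers carry no encoding.
    return $value;
}

// "проверка" written out as Windows-1251 bytes.
$a = array('a' => "\xEF\xF0\xEE\xE2\xE5\xF0\xEA\xE0");
echo json_encode(to_utf8($a)), "\n";
```

The output is pure ASCII with \uXXXX escapes, so it can be served from a
page in any encoding; the cost is an extra pass over the data.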

-- 
Edit bug report at http://bugs.php.net/?id=40506&edit=1