It's not FUD.
It is different from writing json_decode('ä\u0123'), because json_decode() in
PHP only accepts UTF-8 encoded input, whereas a literal in a script is taken in
whatever encoding the file uses.
Give it a shot:
<?php
$chr = "\xC3\xA4"; // "ä" as UTF-8
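// UTF-8 "ä" followed by the \u00e4 escape: valid UTF-8 input, decodes fine
var_dump(json_decode('["' . $chr . '\u00e4"]'));
// latin1 "ä" (\xE4) followed by the same escape: not valid UTF-8, yields NULL
var_dump(json_decode('["' . utf8_decode($chr) . '\u00e4"]'));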
?>
That'll produce:
> array(1) {
>   [0]=>
>   string(4) "ää"
> }
> NULL
Understand what the problem is now?
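For anyone who wants to see why the second call returns NULL, here is a minimal sketch that asks json_last_error() for the reason. It hard-codes the latin1 byte as "\xE4" instead of calling utf8_decode(); the variable names are just for illustration.

```php
<?php
// "ä" as a single latin1 byte (\xE4), followed by the \u00e4 escape.
$latin1Json = '["' . "\xE4" . '\u00e4"]';

$decoded = json_decode($latin1Json);
$error   = json_last_error();

// NULL: the whole input is rejected, the \u00e4 escape does not save it.
var_dump($decoded);
// true: the raw \xE4 byte is reported as malformed UTF-8.
var_dump($error === JSON_ERROR_UTF8);
```

So the escape sequence itself is not the issue; the raw non-UTF-8 byte next to it is.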
If someone does this in a latin1-encoded file:
<?php $fancyNewArray = {"yay": "ä"}; ?>
Then that is valid as a PHP array (the "ä" is a latin1 byte, \xE4), but the
very same text cannot be consumed by PHP's json_decode(), which rejects it as
invalid UTF-8. And that would be terribly inconsistent behavior.
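To make the inconsistency concrete, here is a rough sketch of what userland would have to do before handing such text to json_decode(). Note that latin1ToUtf8() is a hypothetical helper written for this example, not a PHP built-in, and the sketch assumes the source bytes really are latin1.

```php
<?php
// Hypothetical helper (not a PHP built-in): re-encode latin1 bytes as UTF-8.
// Every latin1 byte maps to the code point with the same number, so bytes
// below 0x80 stay as-is and the rest become a two-byte UTF-8 sequence.
function latin1ToUtf8(string $bytes): string
{
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $b = ord($bytes[$i]);
        $out .= $b < 0x80
            ? chr($b)
            : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
    }
    return $out;
}

// What the text between the braces looks like in a latin1-encoded file.
$latin1Source = '{"yay": "' . "\xE4" . '"}';

// Fed to json_decode() as-is, the latin1 byte makes the input invalid.
$raw = json_decode($latin1Source, true);
// After re-encoding, the same text decodes, with "ä" stored as \xC3\xA4.
$converted = json_decode(latin1ToUtf8($latin1Source), true);

var_dump($raw);
var_dump($converted);
```

The same characters on screen give two different results depending on the file encoding, which is exactly the inconsistency I mean.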
David
On 02.06.2011, at 22:15, Andrei Zmievski wrote:
> Stop spreading FUD, please.
>
> It's no different than writing json_decode("ä\u0123").
>
> Your statement, "the stuff in bar in UTF-8" is wrong. The \u0123
> escape sequence is a representation of a Unicode character, not the
> character itself. This representation can be encoded in any
> ASCII-compatible encoding, such as Latin-1, UTF-8, etc. So putting it
> directly in a Latin-1 encoded script is just fine.
>
> -Andrei
>
> On Thu, Jun 2, 2011 at 12:00 PM, David Zülke wrote:
>> No we can't; I already explained why in another email last night. Copypasta:
>>
>> json_decode() can deal with Unicode sequences because it decodes to UTF-8. That
>> is not possible in a language construct:
>>
>> What if I do this, in a latin1 encoded file:
>>
>> $x = {foo: "ä", bar: "\u0123"}
>>
>> Should that then give mixed encodings? The "ä" in foo in latin1 and the
>> stuff in bar in UTF-8?
>>
>> And what if I do:
>>
>> $x = {foo: "ä\u0123"}
>>
>> I'll either end up with an invalid UTF-8 sequence, or with latin1 character
>> soup.
>>
>> David
>>
>>
>> On 02.06.2011, at 18:04, Martin Scotta wrote:
>>
>>> Could we first go out with a fully JSON-compatible version for 5.4,
>>> and then later decide the => stuff based on how that worked?
>>>
>>> Native JSON is a big deal for userland, and I'm pretty sure it will bring a
>>> whole lot of core version upgrades.
>>>
>>> Martin Scotta
>>>
>>>
>>> On Wed, Jun 1, 2011 at 7:09 PM, Sean Coates wrote:
>>>
>>>>> Now, the only reason I would personally support the array shortcut is
>>>>> if it was an implementation of JSON. I know that's not on the table here
>>>>
>>>> I don't think anything is officially off the table, unless we forego
>>>> discussion.
>>>>
>>>> My application is largely JSON-powered. We pass data from back- to
>>>> front-end via JSON, we interact with MongoDB via the extension (which is an
>>>> altered JSON-like protocol (arrays instead of objects), but would be a lot
>>>> more fluent with actual objects—they're just too hard to make in current
>>>> PHP), and we interface with ElasticSearch. The paste I linked earlier is
>>>> our primary ElasticSearch query.
>>>>
>>>> The benefits of first-class JSON are important and wide-reaching;
>>>> especially when interacting with systems like the ones I've mentioned.
>>>> There's a huge amount of value in being able to copy JSON out of PHP and
>>>> into e.g. cURL to make a query to ElasticSearch without worrying that I've
>>>> accidentally nested one level too deep or shallow, or accidentally
>>>> mistranslated my arrays into JSON.
>>>>
>>>> This is not about saving five characters every time I type array(), it's
>>>> about making my systems all work together in a way that's a little less
>>>> abstracted, and a lot less prone to error.
>>>>
>>>> S
>>>> --
>>>> PHP Internals - PHP Runtime Development Mailing List
>>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>>>
>>>>
>>
>
