On Fri, 25 Jan 2008, Stanislav Malyshev wrote:

> Hi!
> 
> Right now, if json_encode sees wrong UTF-8 data, it just cuts the string in
> the middle, no error returned, no message produced. Example:
> 
> var_dump(json_encode("ab\xE0"));
> var_dump(json_encode("ab\xE0\""));
> 
> Both strings get cut at "ab". I think it's not a good idea to just silently
> cut the data. In fact, I think it is a bug caused by this code in
> ext/json/utf8_to_utf16.c:
>         if (c < 0) {
>             return UTF8_END ? the_index : UTF8_ERROR;
>         }
> which inherited this bug from code published on json.org. It should be:
>         if (c < 0) {
>             return (c == UTF8_END) ? the_index : UTF8_ERROR;
>         }
> Now this is an easy fix but would lead to bad strings silently converted to
> empty strings. The question is - should we have an error there? If so, which
> one - E_WARNING, E_NOTICE? I'm for E_WARNING.
> Also filed as bug #43941.
> Any comments?
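
To illustrate the ternary problem quoted above, here is a minimal C sketch. It assumes UTF8_END is -1 and UTF8_ERROR is -2 (values modeled on the json.org decoder, stated here as assumptions), and the function names finish_buggy and finish_fixed are hypothetical, not taken from ext/json. Because UTF8_END is a nonzero constant, the buggy condition is always true and the error branch can never be reached.

```c
/* Assumed sentinel values, modeled on the json.org decoder. */
#define UTF8_END   (-1)
#define UTF8_ERROR (-2)

/* Buggy form: tests the constant UTF8_END, which is nonzero and
 * therefore always true, so UTF8_ERROR is never returned. */
int finish_buggy(int c, int the_index)
{
    if (c < 0) {
        return UTF8_END ? the_index : UTF8_ERROR;
    }
    return the_index;
}

/* Proposed form: compares the decoder result c against UTF8_END,
 * so a decoding error is reported instead of looking like a clean end. */
int finish_fixed(int c, int the_index)
{
    if (c < 0) {
        return (c == UTF8_END) ? the_index : UTF8_ERROR;
    }
    return the_index;
}
```

With these definitions, finish_buggy(UTF8_ERROR, 2) yields 2, which matches the reported symptom of the string being cut at "ab", while finish_fixed(UTF8_ERROR, 2) yields UTF8_ERROR.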

As I mentioned before (I think), it should of course not return an empty
string, because the caller then has no programmatic way to detect the
failure. Most of our functions return false in such cases, and so should
this one.
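
The point about a distinguishable failure value can be sketched in C as follows. This is an illustration only, not how ext/json is implemented: count_utf8 and CHECK_FAILED are hypothetical names, and the check is simplified (it does not reject overlong forms or surrogates). The idea is that a dedicated failure value lets the caller tell malformed input apart from a legitimately empty one.

```c
/* Hypothetical failure sentinel, distinct from every valid count. */
#define CHECK_FAILED (-1L)

/* Hypothetical helper: count the code points in a NUL-terminated
 * string, or return CHECK_FAILED if a sequence is malformed.
 * Simplified: overlong forms and surrogates are not rejected. */
long count_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    long count = 0;

    while (*p) {
        int extra;                      /* continuation bytes expected */

        if (*p < 0x80)                extra = 0;
        else if ((*p & 0xE0) == 0xC0) extra = 1;
        else if ((*p & 0xF0) == 0xE0) extra = 2;
        else if ((*p & 0xF8) == 0xF0) extra = 3;
        else return CHECK_FAILED;       /* invalid lead byte */

        p++;
        while (extra-- > 0) {
            /* A NUL here also fails the test, so we never read past the end. */
            if ((*p & 0xC0) != 0x80) return CHECK_FAILED;
            p++;
        }
        count++;
    }
    return count;
}
```

Here count_utf8("") returns 0, while count_utf8("ab\xE0") returns CHECK_FAILED, so the two cases cannot be confused.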

Derick

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
