On Saturday, 9 May 2015 at 02:33, Philippe Verdy wrote:
> 2015-05-08 14:32 GMT+02:00 Daniel Bünzli <[email protected]>:
> > Well, did you test them all? There's quite a big list here 
> > http://www.json.org. Taking a random one mentioned on that page leads me to 
> > http://golang.org/pkg/encoding/json/ in which they say that they replace 
> > invalid UTF-16 surrogate pairs by U+FFFD. This is really not very 
> > surprising since apparently Go's strings as text are UTF-8 encoded, so when 
> > you need to produce your results as UTF-8 you don't have a lot of 
> > solutions... error and/or U+FFFD.
>  
>  
> I've already said that JSON is UTF-8 encoded by default, but this does not 
> mean that JSON invalidates the escape sequence '\uD800' isolated in a string.

You didn't get what I said. Suppose a parser has just parsed a JSON string and 
wants to hand it back to the programmer as a native string of the language, and 
that native strings in this language happen to be UTF-8 encoded. In the 
presence of such lone surrogates you are then stuck and need to do something, 
since you cannot encode them in a UTF-8 string.
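
To make the point concrete, here is a minimal sketch in Python. Python's own 
strings are not UTF-8 encoded, so its json module can return the lone 
surrogate as is; the problem only shows up at the moment you have to produce 
UTF-8. The names to_utf8 and REPLACEMENT are hypothetical and only there for 
illustration; this is not how any particular parser is implemented.

```python
import json

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def to_utf8(text: str, strict: bool = True) -> bytes:
    """Encode a decoded JSON string as UTF-8 (hypothetical helper).

    This stands in for what a parser must do when the host language's
    native strings are UTF-8 encoded.
    """
    if strict:
        # Option 1: error. A lone surrogate has no UTF-8 encoding, so
        # this raises UnicodeEncodeError.
        return text.encode("utf-8")
    # Option 2: replace every surrogate code point by U+FFFD. Valid
    # \uXXXX pairs were already combined by json.loads, so any code
    # point left in D800..DFFF is a lone surrogate.
    cleaned = "".join(
        REPLACEMENT if 0xD800 <= ord(c) <= 0xDFFF else c for c in text
    )
    return cleaned.encode("utf-8")


# The JSON text contains the escape sequence \ud800 on its own.
parsed = json.loads('"a\\ud800b"')
```

Calling to_utf8(parsed) raises an error, while to_utf8(parsed, strict=False) 
substitutes U+FFFD. Those are the two options I mentioned: error and/or U+FFFD.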

(I understand that in *your* interpretation this should not happen, since I 
should define a special data type to represent these JSON strings so that they 
behave like JavaScript strings; that would indeed be very practical: none of my 
language's native string tools could be used on it…)
  
Anyways, we are largely OT at this point.  

Best,

Daniel


