[ 
https://issues.apache.org/jira/browse/COUCHDB-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990250#comment-12990250
 ] 

Paul Joseph Davis commented on COUCHDB-1057:
--------------------------------------------

Also, I realized I should probably give more background on this instead of just 
getting irritated with that spec again.

The underlying issue is that CouchDB stores all of its JSON strings as UTF-8, 
which means that every code point we accept in the input must be 
representable as UTF-8. As you can see in the JSON spec, not much foresight 
went into what constitutes a valid Unicode code point. As a result, the JSON 
spec allows Unicode escapes for things that aren't representable as UTF-8.
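
To illustrate the general class of problem, here is a minimal Python sketch. 
It is not CouchDB's own code, and the helper name utf8_encodable is made up 
for this example. It uses an unpaired surrogate escape, which is one kind of 
escape that JSON syntax permits but that has no valid UTF-8 encoding. It does 
not attempt to reproduce the exact checks CouchDB applies.

```python
import json


def utf8_encodable(json_text: str) -> bool:
    """Parse a JSON string literal and report whether the decoded
    value can be encoded as strict UTF-8 (hypothetical helper)."""
    value = json.loads(json_text)
    try:
        value.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


# A lone high surrogate: syntactically fine as a JSON escape,
# but not a code point that UTF-8 can represent.
lone_surrogate = r'"\ud800"'

# A correctly paired surrogate escape, which decodes to U+1F600.
paired_surrogates = r'"\ud83d\ude00"'

print(utf8_encodable(lone_surrogate))     # False
print(utf8_encodable(paired_surrogates))  # True
```

Python's json module happens to accept the unpaired escape and hand back a 
string that it then cannot encode, which is the same tension between the JSON 
grammar and UTF-8 storage described above.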

When I asked about the issue on the es5-discuss list I was actually told that 
JSON requires strings to be stored as 16 bit integers (hence why I'm so fond of 
repeating that). Yeah, I was actually told that JSON supposedly requires a 
specific string implementation. Seeing as how JSON is widely characterized as a 
ubiquitous exchange format, I promptly rejected that assertion and haven't been 
overly motivated to relax our enforcement of valid Unicode code points.

If someone wants to write a patch that carries invalid escapes through the 
system I'd probably be ok with that, though I think we tried once and it gummed 
up something somewhere else.

> Wrong JSON parser behavior on escaped unicode characters
> --------------------------------------------------------
>
>                 Key: COUCHDB-1057
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1057
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 1.0
>         Environment: Ubuntu 10.10
> Doesn't matter
>            Reporter: Fedor Indutny
>
> Try to save following doc to couchdb:
> { "_id" : "json-test", "test": "\u0080-\uffff"}
> And then put it to the database:
> curl -X PUT -d @1.json --basic --user admin:admin -H "Content-Type: application/json" http://couchdb:5984/tadagraph/json-test
> You'll get error:
> {"error":"bad_request","reason":"invalid UTF-8 JSON"}
> jsonlint ( http://www.jsonlint.com/ ) says that it's a valid JSON

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
