[
https://issues.apache.org/jira/browse/COUCHDB-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sam Bisbee closed COUCHDB-333.
------------------------------
This has been resolved for a while. Closing.
> JSON handling of UTF-8 strings not in accordance with RFC 4627
> --------------------------------------------------------------
>
> Key: COUCHDB-333
> URL: https://issues.apache.org/jira/browse/COUCHDB-333
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Affects Versions: 0.9
> Environment: CouchDB 0.9.0, SpiderMonkey 0.7.0, Erlang R12B3
> Reporter: mark
> Attachments: utf16-surrogate-pairs.diff
>
>
> Handling of some Unicode values escaped in the JSON \uXXXX format fails with
> an "invalid_json" error. For example,
> curl -X PUT -d '{"revisions":[],"_id":"U_1d11e","codepoint":"3441","definition":"\uD834\uDD1E G clef character"}' http://localhost:5984/mydb/U_1d11e
> yields
> {"error":"invalid_json","reason":"{\"revisions\":[],\"_id\":\"U_1d11e\",\"codepoint\":\"3441\",\"definition\":\"\\uD834\\uDD1E G clef character\"}"}
> However, RFC 4627 (section 2.5) states:
> To escape an extended character that is not in the Basic Multilingual
> Plane, the character is represented as a twelve-character sequence,
> encoding the UTF-16 surrogate pair. So, for example, a string
> containing only the G clef character (U+1D11E) may be represented as
> "\uD834\uDD1E".
> Furthermore, CouchDB accepts escaped strings of the form \uXXXXXXXX, which
> the JSON RFC does not list as acceptable:
> curl -X PUT -d '{"revisions":[],"_id":"U_1d11e","codepoint":"3441","definition":"\u0001D11E G clef character"}' http://localhost:5984/mydb/U_1d11e
> yields
> {"ok":true,"id":"U_1d11e","rev":"1-1270273433"}
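
For reference, the surrogate-pair escaping that the quoted RFC passage
describes can be sketched in a few lines of Python. This is only an
illustration of the arithmetic, not CouchDB's parser or the attached patch;
the helper name to_json_escape is made up for this example, and Python's
standard json module stands in for a parser that follows the RFC.

```python
import json


def to_json_escape(code_point: int) -> str:
    """Return the \\uXXXX escape(s) for one Unicode code point.

    Hypothetical helper for illustration; not part of CouchDB.
    """
    if code_point < 0x10000:
        # Basic Multilingual Plane: a single six-character escape.
        return "\\u%04X" % code_point
    # Outside the BMP: split the 20-bit offset into a UTF-16 surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return "\\u%04X\\u%04X" % (high, low)


# The G clef character from the report, U+1D11E.
escaped = to_json_escape(0x1D11E)
print(escaped)

# Python's json module joins the pair back into a single code point.
decoded = json.loads('"' + escaped + '"')
print(len(decoded), hex(ord(decoded)))

# Read strictly, \u0001D11E is the escape \u0001 followed by the literal
# text D11E, so it decodes to five characters rather than to U+1D11E.
eight_digit = json.loads('"\\u0001D11E"')
print(len(eight_digit), repr(eight_digit))
```

Under that strict reading the second request in the report is still
syntactically valid JSON, only with a different meaning than intended. What
CouchDB 0.9 actually stored for it is not shown in the report.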