[Zorba-coders] [Bug 1025622] Re: Incorrect JSON serialization of supplementary plane code points
** Changed in: zorba
       Status: In Progress => Fix Committed

--
You received this bug notification because you are a member of Zorba Coders, which is the registrant for Zorba.
https://bugs.launchpad.net/bugs/1025622

Title:
  Incorrect JSON serialization of supplementary plane code points

Status in Zorba - The XQuery Processor:
  Fix Committed

Bug description:
  This bug is a follow-up of bug #1024448.

  Currently, the result of the following JSONiq query:

  let $message := "👊"
  return { "message": $message }

  is serialized into incorrect JSON:

  { "message" : "\ufff0\uff9f\uff91\uff8a" }

  The correct result would be:

  { "message" : "\ud83d\udc4a" }

  Explanation:
  Characters from the supplementary planes are usually represented as UTF-16 surrogate pairs within JSON results. The above result is incorrect in particular because JSON allows only 4 hex digits after '\u'. A UTF-16 code unit always fits into 4 hex digits, so every character fits into either one or two '\uXXXX' escapes, which is most probably the reason why UTF-16 is used.

  This has been fixed in the JSON parser by Paul (see mp: https://code.launchpad.net/~paul-lucas/zorba/bug-1024448/+merge/115248 ), but it still needs to be fixed in the serializer.

  @Paul: I'm not sure whether you are the right person to assign this bug to. Thanks.

To manage notifications about this bug go to:
https://bugs.launchpad.net/zorba/+bug/1025622/+subscriptions

--
Mailing list: https://launchpad.net/~zorba-coders
Post to     : zorba-coders@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zorba-coders
More help   : https://help.launchpad.net/ListHelp
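For reference, the surrogate-pair arithmetic behind the expected output in the bug description can be sketched as follows. This is only an illustration in Python, not the Zorba serializer code; the function name `json_escape_code_point` is hypothetical and chosen for this example.

```python
def json_escape_code_point(cp: int) -> str:
    """Return the JSON \\uXXXX escape(s) for a single Unicode code point.

    Hypothetical helper for illustration; not part of Zorba.
    """
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp <= 0xFFFF:
        # Basic Multilingual Plane: one UTF-16 code unit, one escape.
        return "\\u%04x" % cp
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return "\\u%04x\\u%04x" % (high, low)


# U+1F44A is the character used in the query from the bug description.
print(json_escape_code_point(0x1F44A))
```

For U+1F44A the offset is 0xF44A, which gives the high surrogate 0xD83D and the low surrogate 0xDC4A, matching the `\ud83d\udc4a` escape the report asks for.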
[Zorba-coders] [Bug 1025622] Re: Incorrect JSON serialization of supplementary plane code points
** Summary changed:

- Incorrect JSON serialization of supplementory plane code points
+ Incorrect JSON serialization of supplementary plane code points

--
You received this bug notification because you are a member of Zorba Coders, which is the registrant for Zorba.
https://bugs.launchpad.net/bugs/1025622

Title:
  Incorrect JSON serialization of supplementary plane code points

Status in Zorba - The XQuery Processor:
  In Progress