[v8-dev] Re: Issue 761 in v8: Incorrect UTF-8 encoding/decoding for non-BMP characters in String related functions

codesite-noreply Tue, 13 Sep 2011 01:42:45 -0700

Comment #6 on issue 761 by [email protected]: Incorrect UTF-8 encoding/decoding for non-BMP characters in String related functions

http://code.google.com/p/v8/issues/detail?id=761

We are using WebSocket to send bulk data from a native application to a web browser application written in JavaScript. Our native application sends the bulk data encoded as UTF-8.

The JavaScript application in the browser works fine with data whose characters are in the Basic Multilingual Plane. If there is a UTF-8 encoded character outside the Basic Multilingual Plane (a code point that requires a surrogate pair in UTF-16), it is replaced with U+FFFD (REPLACEMENT CHARACTER), so the JavaScript application never knows what string was actually received.

One option is to fix this by using UTF-16 for the code points that need surrogate pairs. Our original data is in UTF-8 format, and converting these characters from UTF-8 to UTF-16 requires scanning the complete string, identifying the location of those characters, and then replacing them with UTF-16 surrogate pairs. This replacement is a costly operation and slows down the whole application.
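For reference, the conversion described above can be sketched roughly as follows. This is only an illustration, not code from our application or from V8: the function name decodeUtf8 is hypothetical, the input is assumed to be an array of byte values that is already well-formed UTF-8, and no validation of continuation bytes is attempted.

```javascript
// Hypothetical helper (not part of V8 or any browser API): decode an array
// of UTF-8 byte values into a JavaScript string, emitting a UTF-16 surrogate
// pair for every code point above U+FFFF.
// Assumption: the input is well-formed UTF-8; malformed input is not detected.
function decodeUtf8(bytes) {
  var out = '';
  var units = [];
  var i = 0;
  while (i < bytes.length) {
    var b0 = bytes[i];
    var cp;
    if (b0 < 0x80) {            // 1-byte sequence (ASCII)
      cp = b0;
      i += 1;
    } else if (b0 < 0xE0) {     // 2-byte sequence
      cp = ((b0 & 0x1F) << 6) | (bytes[i + 1] & 0x3F);
      i += 2;
    } else if (b0 < 0xF0) {     // 3-byte sequence (rest of the BMP)
      cp = ((b0 & 0x0F) << 12) | ((bytes[i + 1] & 0x3F) << 6) |
           (bytes[i + 2] & 0x3F);
      i += 3;
    } else {                    // 4-byte sequence (outside the BMP)
      cp = ((b0 & 0x07) << 18) | ((bytes[i + 1] & 0x3F) << 12) |
           ((bytes[i + 2] & 0x3F) << 6) | (bytes[i + 3] & 0x3F);
      i += 4;
    }
    if (cp > 0xFFFF) {
      // Split the code point into a high and a low surrogate.
      var v = cp - 0x10000;
      units.push(0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF));
    } else {
      units.push(cp);
    }
    // Flush in chunks so that apply() is never called with a huge array.
    if (units.length >= 8192) {
      out += String.fromCharCode.apply(null, units);
      units = [];
    }
  }
  return out + String.fromCharCode.apply(null, units);
}
```

For example, the bytes F0 9F 98 80 (U+1F600) come out as the pair D83D DE00. The sketch makes a single pass over the data, but it still has to visit every byte in script code, which is the overhead we would like to avoid.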

Is there any plan to support such code points in UTF-8 format in the browser itself, to avoid this costly conversion?


--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
