Comment #15 on issue 761 by [email protected]: Incorrect UTF-8
encoding/decoding for non-BMP characters in String related functions
http://code.google.com/p/v8/issues/detail?id=761
Thanks for your reply. Actually, I am sending UTF-8 encoded data from a
native (C) application to a client inside the Chrome browser, which receives
the data over a WebSocket using JavaScript (I think I am using V8 for this).
The data contains non-BMP characters as well.
But due to a limitation of the V8 engine, as I have seen in the Chrome
browser, each non-BMP character is converted into U+FFFD.
So I tried sending the non-BMP character as a UTF-16 surrogate pair,
e.g. the character (𝍖) U+1D356: in UTF-8 = (f0 9d 8d 96), in UTF-16 = (D834 DF56).
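For reference, the two encodings quoted above can be checked with a few lines of JavaScript. This is only a sketch: toSurrogatePair, toUtf8Bytes and hex are hypothetical helper names used for illustration, not part of V8 or Chrome, and toUtf8Bytes assumes a code point in the range U+10000..U+10FFFF.

```javascript
// Hypothetical helpers for illustration only.

// Split a code point above U+FFFF into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
    var offset = codePoint - 0x10000;
    var high = 0xD800 + (offset >> 10);   // top 10 bits of the offset
    var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits of the offset
    return [high, low];
}

// Encode a code point in U+10000..U+10FFFF as four UTF-8 bytes.
function toUtf8Bytes(codePoint) {
    return [
        0xF0 | (codePoint >> 18),
        0x80 | ((codePoint >> 12) & 0x3F),
        0x80 | ((codePoint >> 6) & 0x3F),
        0x80 | (codePoint & 0x3F)
    ];
}

// Format a list of numbers as space-separated lowercase hex.
function hex(values) {
    return values.map(function (v) { return v.toString(16); }).join(' ');
}

console.log(hex(toUtf8Bytes(0x1D356)));      // f0 9d 8d 96
console.log(hex(toSurrogatePair(0x1D356)));  // d834 df56
```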
Native app:
/* UTF-16 surrogate pair D834 DF56, written as big-endian bytes */
char *p = out;
*p++ = 0xd8;
*p++ = 0x34;
*p++ = 0xdf;
*p++ = 0x56;
*p = '\0';
JavaScript in Chrome:
var ws = new WebSocket('ws://localhost:12345/mySession');
ws.onmessage = function(evt)
{
    var reply = evt.data;
    // Empty string received when the non-BMP char is sent in UTF-16;
    // replacement char U+FFFD received when it is sent in UTF-8.
    console.log('reply :' + reply);
};
The native (C) code above sends the data as UTF-16, written here as hex
bytes. But the JavaScript application in the Chrome browser receives an
empty string.
I am not sure whether this is a problem with V8, WebKit, or Chrome. But in
the end, data sent as UTF-16 (surrogate pair) is not received.
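To tell the possible outcomes apart on the receiving side (empty string, U+FFFD, or an intact surrogate pair), the UTF-16 code units of evt.data could be dumped in hex. A minimal sketch; describeCodeUnits is a hypothetical helper name, not an existing API:

```javascript
// Hypothetical diagnostic helper: list the UTF-16 code units of a string in hex.
function describeCodeUnits(str) {
    var units = [];
    for (var i = 0; i < str.length; i++) {
        units.push(str.charCodeAt(i).toString(16).toUpperCase());
    }
    return units.join(' ');
}

console.log(describeCodeUnits('\uD834\uDF56'));  // D834 DF56 (intact surrogate pair)
console.log(describeCodeUnits('\uFFFD'));        // FFFD (replacement character)
```

Inside the onmessage handler this could be called as describeCodeUnits(evt.data); an empty result would mean the string itself was empty.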
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev