Comment #29 on issue 761 by [email protected]: Incorrect UTF-8 encoding/decoding for non-BMP characters in String related functions
http://code.google.com/p/v8/issues/detail?id=761

I don't see why this issue was closed... I think offering a full UTF-8 codec is still a valid feature request.

I think V8 should take a pragmatic approach here. I don't mind if the JavaScript String.* functions simply ignore surrogate pairs (which conforms to the standard); I just want to be able to move UTF-8 data to and from a socket, a file, or a database by way of a V8 string without losing data because of the few characters outside the BMP.
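
To illustrate the kind of lossless conversion I mean, here is a minimal sketch, not the actual V8 code and not my patch. The helper names AppendUtf8 and Utf16ToUtf8 are hypothetical. It assumes the string is available as a sequence of UTF-16 code units, and it combines each valid surrogate pair into a single code point before encoding it as a 4-byte UTF-8 sequence:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a V8 API): append the UTF-8 encoding of one
// code point to *out. Assumes cp <= 0x10FFFF.
static void AppendUtf8(uint32_t cp, std::string* out) {
  if (cp < 0x80) {
    out->push_back(static_cast<char>(cp));
  } else if (cp < 0x800) {
    out->push_back(static_cast<char>(0xC0 | (cp >> 6)));
    out->push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  } else if (cp < 0x10000) {
    out->push_back(static_cast<char>(0xE0 | (cp >> 12)));
    out->push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  } else {
    // Non-BMP code point: four bytes.
    out->push_back(static_cast<char>(0xF0 | (cp >> 18)));
    out->push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  }
}

// Hypothetical helper (not a V8 API): convert UTF-16 code units to UTF-8,
// merging each lead/trail surrogate pair into one code point.
// An unpaired surrogate is passed through as a 3-byte sequence; a real
// implementation might prefer to emit U+FFFD instead.
std::string Utf16ToUtf8(const std::vector<uint16_t>& units) {
  std::string out;
  for (std::size_t i = 0; i < units.size(); ++i) {
    uint32_t cp = units[i];
    const bool is_lead = cp >= 0xD800 && cp <= 0xDBFF;
    if (is_lead && i + 1 < units.size() &&
        units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
      cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i + 1] - 0xDC00);
      ++i;  // The trail surrogate has been consumed.
    }
    AppendUtf8(cp, &out);
  }
  return out;
}
```

For example, the pair D834 DD1E (U+1D11E) comes out as the four bytes F0 9D 84 9E, rather than as two separately encoded 3-byte sequences. The reverse direction would split any 4-byte sequence back into a surrogate pair.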

If you are interested, I have a patch for V8 that does this.

Regards, M.

--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
