On Tue, Jul 5, 2011 at 3:56 AM, Stephan Beal <[email protected]> wrote:
> Slight correction: one to four bytes.
> http://en.wikipedia.org/wiki/UTF-8
> ASCII text is, by definition, also UTF-8, so to say that a Unicode character
> doesn't use 1 byte isn't strictly correct.

Sorry, my bad. I don't know why I said three; probably a consequence
of posting at 3AM. Three bytes covers the BMP, four bytes will cover
all currently-defined Unicode codepoints. Not significant at the
moment, though.
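
To make the byte counts concrete, here is a minimal sketch of how many
bytes UTF-8 needs per code point. The function name utf8_length is my
own for illustration and is not part of V8.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes UTF-8 needs to encode one Unicode code point.
// Returns 0 for values that cannot be encoded: the surrogate range
// U+D800..U+DFFF and anything above U+10FFFF.
int utf8_length(std::uint32_t cp) {
    if (cp < 0x80) return 1;                     // ASCII
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  // surrogates are not characters
    if (cp < 0x10000) return 3;                  // rest of the BMP
    if (cp <= 0x10FFFF) return 4;                // supplementary planes
    return 0;
}

// A few spot checks for the ranges discussed above.
void utf8_length_examples() {
    assert(utf8_length(0x41) == 1);     // 'A'
    assert(utf8_length(0x20AC) == 3);   // euro sign, inside the BMP
    assert(utf8_length(0x1F600) == 4);  // outside the BMP
}
```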


On Tue, Jul 5, 2011 at 6:40 AM, Henrik Lindqvist
<[email protected]> wrote:
> It's more serious than just a little "quirk". Many binary protocols use
> Pascal type strings where the length is stored explicitly, and there
> String::WriteUtf8 can't be used. V8 should at least skip writing \0
> when HINT_MANY_WRITES_EXPECTED is specified, that would be logical.

The trouble is, any code written now will expect it to include the \0
in the count. Would it suit to add an additional hint, e.g.
HINT_NO_NULL_TERMINATOR, which will then (a) not write the null, and
(b) not include it in the count?
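
Roughly what I have in mind, as a standalone sketch rather than actual
V8 code. Everything here is an assumption for illustration: the
WriteHints enum, the write_utf8 and write_pascal_string functions, and
the flag value are all made up, and the input is assumed to be UTF-8
already. It only models the two behaviours (terminator written and
counted, versus neither).

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical flags for this sketch; NOT V8's real enum or values.
enum WriteHints {
    NO_HINTS = 0,
    HINT_NO_NULL_TERMINATOR = 1  // the hint proposed above
};

// Copies UTF-8 bytes from src into buf (at most `capacity` bytes).
// Default: appends '\0' when there is room and includes it in the count.
// With HINT_NO_NULL_TERMINATOR: writes no '\0' and does not count one.
int write_utf8(const std::string& src, char* buf, int capacity,
               int hints = NO_HINTS) {
    assert(buf != nullptr && capacity >= 0);
    int n = static_cast<int>(src.size());
    if (n > capacity) {
        n = capacity;
        // Back up so a multi-byte sequence is never cut in half:
        // continuation bytes look like 10xxxxxx.
        while (n > 0 && (static_cast<unsigned char>(src[n]) & 0xC0) == 0x80) {
            --n;
        }
    }
    std::memcpy(buf, src.data(), static_cast<std::size_t>(n));
    if (!(hints & HINT_NO_NULL_TERMINATOR) && n < capacity) {
        buf[n] = '\0';
        ++n;  // terminator is part of the returned count
    }
    return n;
}

// Pascal-style string: one length byte, then the payload, no terminator.
// Returns the total number of bytes written, including the length byte.
int write_pascal_string(const std::string& src, unsigned char* out,
                        int capacity) {
    assert(out != nullptr && capacity >= 1);
    int payload_capacity = capacity - 1;
    if (payload_capacity > 255) payload_capacity = 255;  // fits in one byte
    int n = write_utf8(src, reinterpret_cast<char*>(out + 1),
                       payload_capacity, HINT_NO_NULL_TERMINATOR);
    out[0] = static_cast<unsigned char>(n);
    return n + 1;
}
```

With the hint, the returned count is exactly the payload length, so it
can go straight into the length prefix without subtracting one.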

It'll be a fairly simple change. I could make it when I get to work in
an hour or so, and submit a patch. Where are such things handled?

Chris Angelico

-- 
v8-users mailing list
[email protected]
http://groups.google.com/group/v8-users
