https://codereview.chromium.org/12390057/diff/4001/src/api.cc
File src/api.cc (right):
https://codereview.chromium.org/12390057/diff/4001/src/api.cc#newcode3864
src/api.cc:3864: int last_character = unibrow::Utf16::kNoPreviousCharacter;
I wonder if it's worth modifying this to take advantage of the fact that
Latin1 never contains surrogate pairs, and that the high bit of each
Latin1 character tells you whether its UTF-8 encoding takes one byte or
two.
  if (sizeof(Char) == 2) {
    // Current implementation
  } else {
    for (int i = 0; i < length; i++) {
      utf8_length += (chars[i] >> 7);  // Assumes chars[i] is unsigned.
    }
    utf8_length += length;
  }
This has the advantage that there is no branch in the inner loop, which
is normally great for deeply pipelined CPUs.
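For illustration, the Latin1 branch suggested above could be pulled out into a standalone function along these lines. This is only a sketch under assumptions: the name Latin1Utf8Length is hypothetical and not part of V8, and the characters are assumed to arrive as unsigned 8-bit values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not existing V8 code): returns the number of bytes
// needed to encode a Latin1 string as UTF-8.
inline size_t Latin1Utf8Length(const uint8_t* chars, size_t length) {
  assert(chars != nullptr || length == 0);
  size_t utf8_length = length;  // Every character needs at least one byte.
  for (size_t i = 0; i < length; i++) {
    // chars[i] is unsigned, so the shift yields 1 for 0x80..0xFF
    // (two-byte encodings) and 0 for 0x00..0x7F (one-byte encodings).
    utf8_length += chars[i] >> 7;
  }
  return utf8_length;
}
```

With a signed char type the shift would produce negative values for characters above 0x7F, which is why the unsigned assumption matters.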
In general it seems that this could be made vastly simpler for Latin1
strings, perhaps by having a completely different visitor. It's really
a rather trivial operation for Latin1 (not as simple as ASCII, but
still...)
https://codereview.chromium.org/12390057/