On 16.03.2010 22:36, Martin Buchholz wrote:
On Tue, Mar 16, 2010 at 13:58, Ulf Zibis <[email protected]> wrote:
Additionally, toUpperCaseCharArray(), codePointCountImpl(), and String(int[],
int, int) would profit from consecutive use of isBMPCodePoint() +
isSupplementaryCodePoint() or isHighSurrogate() + isLowSurrogate().
For codePointCountImpl(), I do not agree.
1-byte comparisons have a smaller footprint, most likely load faster from
memory, need less L1 CPU cache, and would be faster on small/RISC/etc. CPUs,
so they should improve overall performance.
On CPUs that can benefit from 6933327, the shift could additionally be
omitted.
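To make the point about shifted comparisons concrete, here is a minimal sketch. It assumes that "1-byte comparison" means comparing the shifted value against a constant that fits in one byte; that reading, the class name SurrogateCheck and both method names are my own illustration and not code from the patch under discussion.

```java
// Sketch only: SurrogateCheck and its method names are hypothetical.
final class SurrogateCheck {

    // Range form: two comparisons against 16-bit constants.
    static boolean isHighByRange(char ch) {
        return ch >= Character.MIN_HIGH_SURROGATE
            && ch <= Character.MAX_HIGH_SURROGATE;
    }

    // Shifted form: all high surrogates 0xD800..0xDBFF share the same
    // top 6 bits, so one comparison against the small constant 0x36 suffices.
    static boolean isHighByShift(char ch) {
        return (ch >>> 10) == (Character.MIN_HIGH_SURROGATE >>> 10);
    }
}
```

Both forms accept exactly the same set of chars; whether the shifted form is actually faster depends on the CPU and the JIT and would need to be measured.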
For String(int[], int, int), I do agree.
Here is my latest more readable and more performant implementation:
    int end = offset + count;

    // Pass 1: Compute precise size of char[]
    int n = 0;
    for (int i = offset; i < end; i++) {
        int c = codePoints[i];
        if (Character.isBMPCodePoint(c))
            n += 1;
        else if (Character.isSupplementaryCodePoint(c))
            n += 2;
        else throw new IllegalArgumentException(Integer.toString(c));
    }

    // Pass 2: Allocate and fill in char[]
    char[] v = new char[n];
    for (int i = offset, j = 0; i < end; i++) {
        int c = codePoints[i];
        if (Character.isBMPCodePoint(c)) {
            v[j++] = (char) c;
        } else {
            Character.toSurrogates(c, v, j);
            j += 2;
        }
    }
I suggest:
    // Pass 2: Allocate and fill in char[]
    char[] v = new char[n];
    for (int i = end; n > 0; ) {
        int c = codePoints[--i];
        if (Character.isBMPCodePoint(c))
            v[--n] = (char) c;
        else
            Character.toSurrogates(c, v, n -= 2);
    }
- saves 1 variable (= reduces register pressure)
- testing the loop end against 0 is faster than testing against "end",
see: 6932855 <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6932855>
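The forward and backward fill loops can be compared outside java.lang with the following sketch. The class CodePointFill and its method names are hypothetical. Because Character.toSurrogates() is package-private, the sketch substitutes the public Character.highSurrogate() and Character.lowSurrogate(), and it uses the public name isBmpCodePoint() rather than the isBMPCodePoint() spelling used in this thread.

```java
// Sketch only: CodePointFill and its method names are hypothetical.
final class CodePointFill {

    // Pass 1: compute the exact char[] length, rejecting invalid code points.
    static int countChars(int[] codePoints, int offset, int count) {
        int n = 0;
        for (int i = offset, end = offset + count; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                n += 1;
            else if (Character.isSupplementaryCodePoint(c))
                n += 2;
            else
                throw new IllegalArgumentException(Integer.toString(c));
        }
        return n;
    }

    // Pass 2, forward variant: separate write index j, loop bounded by "end".
    static char[] fillForward(int[] codePoints, int offset, int count) {
        char[] v = new char[countChars(codePoints, offset, count)];
        for (int i = offset, j = 0, end = offset + count; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c)) {
                v[j++] = (char) c;
            } else {
                v[j++] = Character.highSurrogate(c);
                v[j++] = Character.lowSurrogate(c);
            }
        }
        return v;
    }

    // Pass 2, backward variant: n doubles as the write index and the
    // loop counter, so the loop ends when n reaches 0.
    static char[] fillBackward(int[] codePoints, int offset, int count) {
        int n = countChars(codePoints, offset, count);
        char[] v = new char[n];
        for (int i = offset + count; n > 0; ) {
            int c = codePoints[--i];
            if (Character.isBmpCodePoint(c)) {
                v[--n] = (char) c;
            } else {
                // Writing backward, so the low surrogate goes in first.
                v[--n] = Character.lowSurrogate(c);
                v[--n] = Character.highSurrogate(c);
            }
        }
        return v;
    }
}
```

Both variants should produce the same char[] for any valid input; the sketch only shows that the backward loop is correct, not that it is faster.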
BTW:
int end = offset + count;
could be omitted, as the VM would do that itself, certainly in the HotSpot C2 compiler.
-Ulf