Hi,
Collapsing a couple of emails.
On 03/18/2019 01:43 PM, Martin Buchholz wrote:
Hmmmm ....
I took another look at this and Roger tickled my UTF-8 optimization
spidey sense. Rewritten using Martin style:
static int writeUTF(String str, DataOutput out) throws IOException {
    int strlen = str.length();
    int utflen = strlen; // optimized for ASCII

    for (int i = 0; i < strlen; i++) {
        int c = str.charAt(i);
        if (c >= 0x80 || c == 0)
            utflen += (c >= 0x800) ? 2 : 1;
    }
@Martin, you'd need utflen to be a long to avoid the original
overflow-to-negative problem for an input string length > MAXINT / 3.
And that might be the cleanest fix, compared to adding extra conditions
in the loop.
Otherwise, this code looks good; it might optimize to fewer branches
than the original.
    if (utflen > 65535)
        throw new UTFDataFormatException(
            "encoded string too long: " + utflen + " bytes");

    final byte[] bytearr;
    if (out instanceof DataOutputStream) {
        DataOutputStream dos = (DataOutputStream) out;
        if (dos.bytearr == null || (dos.bytearr.length < (utflen + 2)))
            dos.bytearr = new byte[(utflen * 2) + 2];
        bytearr = dos.bytearr;
    } else {
        bytearr = new byte[utflen + 2];
    }

    int count = 0;
    bytearr[count++] = (byte) ((utflen >>> 8) & 0xFF);
    bytearr[count++] = (byte) ((utflen >>> 0) & 0xFF);

    int i = 0;
    for (i = 0; i < strlen; i++) { // optimized for initial run of ASCII
        int c = str.charAt(i);
        if (c >= 0x80 || c == 0) break;
        bytearr[count++] = (byte) c;
    }

    for (; i < strlen; i++) {
        int c = str.charAt(i);
        if (c < 0x80 && c != 0) {
            bytearr[count++] = (byte) c;
        } else if (c >= 0x800) {
            bytearr[count++] = (byte) (0xE0 | ((c >> 12) & 0x0F));
            bytearr[count++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            bytearr[count++] = (byte) (0x80 | ((c >> 0) & 0x3F));
        } else {
            bytearr[count++] = (byte) (0xC0 | ((c >> 6) & 0x1F));
            bytearr[count++] = (byte) (0x80 | ((c >> 0) & 0x3F));
        }
    }

    out.write(bytearr, 0, utflen + 2);
    return utflen + 2;
}
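To make the overflow point above concrete, the length computation could accumulate into a long so that input strings longer than MAXINT / 3 chars can never wrap negative. A minimal sketch (the helper name and its extraction into a separate method are illustrative, not part of the actual patch):

```java
class UtfLen {
    // Sketch: compute the modified-UTF-8 encoded length in a long to avoid
    // int overflow for very long strings. Mirrors the counting loop above:
    // ASCII (0x01..0x7F) is 1 byte; NUL and 0x80..0x7FF are 2 bytes;
    // 0x800 and above are 3 bytes.
    static long utfLength(String str) {
        int strlen = str.length();
        long utflen = strlen;            // optimistic: one byte per char
        for (int i = 0; i < strlen; i++) {
            int c = str.charAt(i);
            if (c >= 0x80 || c == 0)
                utflen += (c >= 0x800) ? 2 : 1;
        }
        return utflen;
    }
}
```

With this, the `utflen > 65535` check rejects oversized input before any arithmetic can go negative.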
On Mon, Mar 18, 2019 at 6:41 AM Roger Riggs <roger.ri...@oracle.com> wrote:
Hi Brian,
On 03/15/2019 05:06 PM, Brian Burkhalter wrote:
Updated: http://cr.openjdk.java.net/~bpb/8219196/webrev.01/
I'm at odds with Martin on including utflen in the loop (or that's
not what he meant). Now it has to do two comparisons for every input
byte instead of one.
There is no need for an early exit for input strings that are too
long; it is a very unusual case, probably never hit except in testing.
There might be a case for checking up front that strlen <= 65535
(even at one input byte per output byte, a longer string would be too
large), but even that is probably not worth the compare and branch.
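The cheap pre-check described above would be something like this (a sketch; the boolean helper is purely for illustration):

```java
class LenCheck {
    // Illustrative fast-fail: each char encodes to at least one byte in
    // modified UTF-8, so a string of more than 65535 chars can never fit
    // in the 16-bit length field, regardless of its contents.
    static boolean certainlyTooLong(String str) {
        return str.length() > 65535;
    }
}
```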
The test only needs enough memory to create the input string
(MAXINT / 3 + 1 chars) and a bit more for the default-sized
ByteArrayOutputStream, so it should run in 2G configurations.
So yes, removing the enabled = false is ok.
Roger
Instead of disabling the test statically, you could make it
conditional on Runtime.maxMemory, but the test will fail quickly anyway.
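Such a runtime guard might look roughly like this (a sketch; the class name, the 64 MB slack, and the 2-bytes-per-char cost estimate for the MAXINT / 3 + 1 input string are assumptions, not part of the actual test):

```java
class HeapGuard {
    // Assumed cost: the test builds a String of Integer.MAX_VALUE / 3 + 1
    // chars, roughly 1.4 GB at 2 bytes per char, plus some slack (64 MB
    // here, arbitrarily) for the rest of the test.
    static long requiredBytes() {
        return ((long) Integer.MAX_VALUE / 3 + 1) * 2L + (64L << 20);
    }

    // True when the configured max heap can plausibly hold the input.
    static boolean enoughHeap() {
        return Runtime.getRuntime().maxMemory() >= requiredBytes();
    }
}
```

The test body would then return early (or skip) when enoughHeap() is false, instead of relying on a static enabled flag.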
For this I simply added the requirement to the jtreg tags instead.
With the fix, it will throw immediately, before any allocation, so
there is no need for a restriction on memory size. The only time that
would be necessary would be running the test on a runtime without the fix.
The @requires >4g restriction is unnecessary.
$.02, Roger