Hi Raffaello,
Missed sending this yesterday, the changes are in the updated webrev.
On 8/23/20 10:42 AM, Raffaello Giulietti wrote:
...
Hi Roger,
here are my notes about the Decoder.
(1) The only serious issue I could find is on L.548. It should read
L.548 CharBuffer cb = CharBuffer.wrap(chars, index, index + length);
as method wrap() takes a start and an end.
(It's quite confusing and error prone to have both (start, length) and
(start, end) conventions on ranges all over the OpenJDK :-( Nobody's
fault, just annoying.)
Well spotted, fixed.
The rest are minor notes.
(2) While I'm pretty sure that a JIT compiler can remove the index range
check on L.465 when len is replaced by bytes.length on L.463, I'm less
sure that it can do it as currently coded, despite len and bytes.length
having the same value. Readability wouldn't suffer and len could be
removed.
The JIT doesn't have trouble following those values.
(3) The expressions preLen > origLen - sufLen on L.492 and sufLen >
origLen - preLen on L.496 are equivalent. The latter is thus useless on
L.496 because it is always false when execution reaches here.
true; updated to optimize the case of empty prefix and/or suffix.
(4) The code starting at L.588 assumes that delimiter.length() > 0,
which is ensured by the optimization in L.585-586.
Without this assumption, however, neither L.509 nor L.512 are correct
when delimiter.length() == 0.
To make the code to work even when, for some reason, delimiter.length()
== 0) the lines can be replaced by the slightly more involved yet more
robust
L.509 if ((subcs.length() - 2) % stride != 0)
and
L.512 final int len = (subcs.length() - 2) / stride + 1;
fixed
(5) I guess that the rationale for not failing fast on illegal
characters in the loop on L.516-521, and accumulate and postpone the
check only after the whole cs has been processed, is that the code
optimistically assumes that cs is legal, so it would be wasteful to
check at each iteration.
But since each iteration already needs a quite more expensive check for
the delimiter, I wonder if the postponing makes much economical sense.
The check could be directly on v before assignment to bytes and the var
illegal could be removed.
The code is easier to read with the inline checs and throws.
(6) What about moving L.517 at the end of the loop body so that memory
accesses are issued with strictly increasing and hardware predictable
addresses? This move would also better reveal the "left-to-right"
processing intent.
ok, though with a lot of reordering and caching in hardware it won't
make a difference.
(7) On L.561, delimiter.isEmpty() could be removed as checkDelimiter()
is not invoked when the delimiter is empty. But it's perhaps safer to
have it.
It could be an assert, but asserts aren't enabled always
(8) You may want to wrap escapeNL() around the message on L.567-569
ok
(9) fromHexPair() and fromHex() can be declared static.
(10) equals() and hashCode() are missing despite the class being
value-based. This holds for Encoder as well.
Added
I look forward for the next iteration of the code.
Thanks for the comments,
Roger
Greetings
Raffaello