On Sun, 17 Jan 2021 14:56:40 GMT, Peter Levart <plev...@openjdk.org> wrote:

>> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Simplify lookupCharset
>
> This looks good.
> Are you planning to do similar things for encoding too?

I already approved the changes and they are OK. Maybe for a followup: I just noticed after the fact that the logic for `newStringUTF8NoRepl(....)` vs. `new String(...., StringCoding.UTF_8)` differs in the handling of unmappable characters, which is by the spec. But the constructor also contains special handling of input that only contains non-negative bytes:

        if (charset == UTF_8) {
            if (COMPACT_STRINGS && !StringCoding.hasNegatives(bytes, offset, length)) {
                this.value = Arrays.copyOfRange(bytes, offset, offset + length);
                this.coder = LATIN1;
                return;

...while `newStringUTF8NoRepl(....)` does not contain this optimization. I guess ZipCoder could benefit from that optimization too, since paths are mostly ASCII only.

So WDYT of this additional simplification/consolidation of UTF-8 decoding:

https://github.com/plevart/jdk/commit/0b8b12c998e4ed451588442205ebe8f7423db7d8

-------------

PR: https://git.openjdk.java.net/jdk/pull/2102
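
For illustration only, the ASCII fast path discussed above can be approximated outside the JDK using public APIs. This is a rough sketch, not the code from the linked commit: the class `AsciiFastPath` and its `decodeUtf8` method are hypothetical, and `hasNegatives` here is a plain-loop stand-in for the internal `StringCoding.hasNegatives`. It relies on the fact that input with no negative bytes is pure ASCII, so decoding it as ISO-8859-1 gives the same string as decoding it as UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class AsciiFastPath {

    // Hypothetical stand-in for the JDK-internal StringCoding.hasNegatives:
    // returns true if any byte in the range has its high bit set.
    static boolean hasNegatives(byte[] bytes, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            if (bytes[i] < 0) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: decode UTF-8, taking a cheaper path for ASCII-only input.
    static String decodeUtf8(byte[] bytes, int offset, int length) {
        if (!hasNegatives(bytes, offset, length)) {
            // Pure ASCII: each byte maps 1:1 to a char, so Latin-1 decoding
            // produces the same string as UTF-8 decoding would.
            return new String(bytes, offset, length, StandardCharsets.ISO_8859_1);
        }
        // General case: full UTF-8 decoding (malformed input is replaced,
        // as specified for the String constructor).
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ascii = "META-INF/MANIFEST.MF".getBytes(StandardCharsets.UTF_8);
        byte[] nonAscii = "caf\u00e9.txt".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(ascii, 0, ascii.length));
        System.out.println(decodeUtf8(nonAscii, 0, nonAscii.length));
    }
}
```

Whether the Latin-1 path is actually cheaper depends on the JDK in use. With compact strings enabled it can amount to a range check plus an array copy, which is the effect the constructor snippet above achieves directly.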