On Wed, 20 Apr 2022 21:08:19 GMT, XenoAmess <d...@openjdk.java.net> wrote:
>> some thoughts after watching 8285001: Simplify StringLatin1.regionMatches
>> https://github.com/openjdk/jdk/pull/8292/
>>
>>     if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
>>         continue;
>>     }
>>
>> should be changed to
>>
>>     if (((u1 == c1) ? CharacterDataLatin1.instance.toLowerCase(c1) :
>>             c1) == Character.toLowerCase(u2)) {
>>         continue;
>>     }
>>
>> as:
>>
>> 1. c1 is LATIN1, so CharacterDataLatin1.instance.toLowerCase seems faster.
>> 2. because c1 is LATIN1, if u1 != c1, then c1 is already lowercase and
>>    doesn't need a lowercase calculation.
>
> XenoAmess has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   remove = check

Unfortunately, this leads to an error for case-insensitive `regionMatches` between a Latin-1 string that contains either `\u00b5` or `\u00ff` (the upper-case forms of these two code points lie outside the Latin-1 range) and a UTF-16 string:

    jshell> "\u00b5".regionMatches(true, 0, "\u0100", 0, 1)
    |  Exception java.lang.ArrayIndexOutOfBoundsException: Index 924 out of bounds for length 256
    |        at CharacterDataLatin1.getProperties (CharacterDataLatin1.java:74)
    |        at CharacterDataLatin1.toLowerCase (CharacterDataLatin1.java:140)
    |        at StringLatin1.regionMatchesCI_UTF16 (StringLatin1.java:420)
    |        at String.regionMatches (String.java:2238)
    |        at (#4:1)

-------------

PR: https://git.openjdk.java.net/jdk/pull/8308
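
For reference, the index 924 in the exception message is 0x39C, which is what `Character.toUpperCase` returns for `\u00b5`. Below is a minimal sketch that lists the Latin-1 characters whose upper-case form falls outside 0..255. The class name `Latin1CaseDemo` and the helper `upperCaseLeavesLatin1` are hypothetical names chosen for illustration; they are not part of the JDK or of this PR.

```java
public class Latin1CaseDemo {
    // Hypothetical helper: true if c is a Latin-1 char (0..255) whose
    // Character.toUpperCase result no longer fits in the Latin-1 range.
    public static boolean upperCaseLeavesLatin1(char c) {
        return c <= 0xFF && Character.toUpperCase(c) > 0xFF;
    }

    public static void main(String[] args) {
        // Scan the whole Latin-1 range and print each char that escapes it.
        for (char c = 0; c <= 0xFF; c++) {
            if (upperCaseLeavesLatin1(c)) {
                char u = Character.toUpperCase(c);
                System.out.printf("U+%04X -> U+%04X (%d)%n", (int) c, (int) u, (int) u);
            }
        }
    }
}
```

Any value like these, if passed on to a lookup table sized for 256 entries, would be out of bounds, which is consistent with the trace above.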