Hi David,
Glad to hear that you are delighted with the recent fix (JDK-8248655).
The scope of that fix is limited to the String class, so it does not
necessarily affect the case-insensitive regex and Collator operations
you mention. I created the following two issues to track your observations:
https://bugs.openjdk.java.net/browse/JDK-8253058
https://bugs.openjdk.java.net/browse/JDK-8253059
I'm happy to take a look at them.
P.S. "jdk-dev" is for technical discussion related to the "JDK
Project", so I'd recommend using the core-libs-dev or i18n-dev
mailing list for further discussion.
Naoto
On 9/10/20 3:52 PM, Dai Conrad wrote:
I was delighted to hear the longstanding problem with
case-insensitive comparisons of strings with astral
characters (ones outside the basic multilingual plane)
was fixed in JDK 16 build 8. Methods equalsIgnoreCase,
regionMatches, and compareToIgnoreCase all work
correctly now.
I had assumed this would also fix case-insensitive regular
expressions and java.text.Collator, since I guessed they
boiled down to a call to regionMatches somewhere under
the covers. But this appears not to be the... case.
For scripts Deseret, Osage, Old Hungarian, Warang Citi,
Medefaidrin, and Adlam, for strings with upper- and
lowercase variants of the same letter, the following
code fails:
    Pattern pattern = Pattern.compile(lower, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(upper);
    assertThat(matcher.matches()).isTrue();

    Collator collator = Collator.getInstance();
    collator.setStrength(Collator.PRIMARY);
    assertThat(collator.compare(lower, upper)).isEqualTo(0);
I'm not sure why the fix didn't fix these, but it would be
a shame to overlook them while fixing it in other places.
David
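
For anyone who wants to reproduce the report, here is a minimal
self-contained sketch. It is not the code from the original message:
the class name, the choice of the Deseret pair U+10400/U+10428 as the
"upper" and "lower" strings, and the equalsIgnoreCaseByCodePoint helper
are all assumptions made for illustration. It prints what each API
returns on the running JDK rather than asserting a result, since the
outcome depends on the build. Note also that, per the Pattern Javadoc,
CASE_INSENSITIVE alone only considers US-ASCII characters, so the
sketch additionally tries the UNICODE_CASE flag.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.regex.Pattern;

class AstralCaseDemo {
    // Assumed sample data: DESERET CAPITAL/SMALL LETTER LONG I, both outside the BMP.
    static final String UPPER = new String(Character.toChars(0x10400));
    static final String LOWER = new String(Character.toChars(0x10428));

    // Simple per-code-point case folding (upper, then lower).
    static int fold(int codePoint) {
        return Character.toLowerCase(Character.toUpperCase(codePoint));
    }

    // Hypothetical helper: compares two strings code point by code point
    // after folding, so surrogate pairs are treated as single characters.
    static boolean equalsIgnoreCaseByCodePoint(String a, String b) {
        int[] x = a.codePoints().map(AstralCaseDemo::fold).toArray();
        int[] y = b.codePoints().map(AstralCaseDemo::fold).toArray();
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) {
        // String methods covered by JDK-8248655.
        System.out.println("equalsIgnoreCase:      " + LOWER.equalsIgnoreCase(UPPER));
        System.out.println("compareToIgnoreCase:   " + LOWER.compareToIgnoreCase(UPPER));

        // Regex, as in the report (ASCII-only folding by default).
        Pattern ascii = Pattern.compile(LOWER, Pattern.CASE_INSENSITIVE);
        System.out.println("regex CASE_INSENSITIVE: " + ascii.matcher(UPPER).matches());

        // Regex with Unicode-aware case folding requested explicitly.
        Pattern unicode = Pattern.compile(LOWER,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        System.out.println("regex + UNICODE_CASE:   " + unicode.matcher(UPPER).matches());

        // Collator at primary strength, as in the report.
        Collator collator = Collator.getInstance();
        collator.setStrength(Collator.PRIMARY);
        System.out.println("Collator PRIMARY:       " + collator.compare(LOWER, UPPER));

        // Code-point-based comparison for reference.
        System.out.println("by code point:          "
                + equalsIgnoreCaseByCodePoint(LOWER, UPPER));
    }
}
```

The code-point helper is only a reference point for comparison; it uses
simple one-to-one case mappings and is not a substitute for locale-aware
collation.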