On Tue, 28 Oct 2025 20:15:26 GMT, Xueming Shen <[email protected]> wrote:
>> With the following changes
>>
>> - Use int[] / code point storage instead of char[] / char for folding data.
>> - Add a fast path for Latin-1 vs. Latin-1 comparisons.
>> - Speed up the UTF-16 path as well.
>>
>> The only remaining “slow” path is Latin-1 vs. UTF-16 for now.
>> Would appreciate some eagle eyes on the new fast-path implementation. The
>> tests suggest it’s working as expected :-)
>>
>> **StringCompareToFoldCase.latin1UTF16 avgt 15 23.959 ± 2.357
>> ns/op
>> StringCompareToFoldCase.latin1UTF16EQ avgt 15 22.455 ± 0.104 ns/op
>> StringCompareToFoldCase.latin1UTF16EQFC avgt 15 32.782 ± 0.962 ns/op
>> StringCompareToFoldCase.latin1UTF16FC avgt 15 32.581 ± 0.725
>> ns/op**
>>
>>
>> Benchmark Mode Cnt Score Error Units
>> StringCompareToFoldCase.asciiLower avgt 15 17.983 ± 0.208 ns/op
>> StringCompareToFoldCase.asciiLowerEQ avgt 15 9.986 ± 0.254 ns/op
>> StringCompareToFoldCase.asciiLowerEQFC avgt 15 10.781 ± 0.161 ns/op
>> StringCompareToFoldCase.asciiLowerFC avgt 15 10.274 ± 0.136 ns/op
>> StringCompareToFoldCase.asciiUpperLower avgt 15 12.465 ± 0.409 ns/op
>> StringCompareToFoldCase.asciiUpperLowerEQ avgt 15 10.961 ± 0.407 ns/op
>> StringCompareToFoldCase.asciiUpperLowerEQFC avgt 15 9.157 ± 0.166 ns/op
>> StringCompareToFoldCase.asciiUpperLowerFC avgt 15 9.228 ± 0.254 ns/op
>> StringCompareToFoldCase.asciiWithDFFC avgt 15 57.603 ± 1.540 ns/op
>> StringCompareToFoldCase.greekLower avgt 15 39.593 ± 0.128 ns/op
>> StringCompareToFoldCase.greekLowerEQ avgt 15 39.249 ± 0.032 ns/op
>> StringCompareToFoldCase.greekLowerEQFC avgt 15 13.993 ± 0.346 ns/op
>> StringCompareToFoldCase.greekLowerFC avgt 15 14.030 ± 0.454 ns/op
>> StringCompareToFoldCase.greekUpperLower avgt 15 7.137 ± 0.130 ns/op
>> StringCompareToFoldCase.greekUpperLowerEQ avgt 15 7.496 ± 0.104 ns/op
>> StringCompareToFoldCase.greekUpperLowerEQFC avgt 15 6.203 ± 0.316 ns/op
>> StringCompareToFoldCase.greekUpperLowerFC avgt 15 6.235 ± 0.256 ns/op
>> StringCompareToFoldCase.latin1UTF16 avgt 15 23.959 ± 2.357 ns/op
>> StringCompareToFoldCase.latin1UTF16EQ avgt 15 22.455 ± 0.104 ns/op
>> StringCompareToFoldCase.latin1UTF16EQFC avgt 15 32.782 ± 0.962 ns/op
>> StringCompareToFoldCase.latin1UTF16FC avgt 15 32.581 ± 0.725 ns/op
>> StringCompare...
>
> Experimenting with Arrays.mismatch at the beginning of the array iteration as
>
>
> int k = ArraysSupport.mismatch(value, other, lim);
> if (k < 0)
> return len - olen;
> for (; k < lim; k++) {
> ....
> }
>
>
> The benchmark results suggest that it does help 'dramatically' when the
> compared strings share with the same prefix. For example those "UpperLower"
> test cases (which shares the same upper cases text prefix. However it is also
> relatively expensive, with a 20%-ish overhead when the strings do not share
> the same string text but are case-insensitively equals. I would suggest
> let's leave it out for now?
>
> ### With Arrays.mismatch
>
>
> Benchmark Mode Cnt Score Error Units
> StringCompareToFoldCase.asciiLower avgt 15 15.044 ± 0.751 ns/op
> StringCompareToFoldCase.asciiLowerEQ avgt 15 10.033 ± 0.366 ns/op
> StringCompareToFoldCase.asciiLowerEQFC avgt 15 12.094 ± 0.288 ns/op
> StringCompareToFoldCase.asciiLowerFC avgt 15 12.513 ± 0.290 ns/op
> StringCompareToFoldCase.asciiUpperLower avgt 15 11.716 ± 0.471 ns/op
> StringCompareToFoldCase.asciiUpperLowerEQ avgt 15 11.120 ± 0.458 ns/op
> StringCompareToFoldCase.asciiUpperLowerEQFC avgt 15 7.544 ± 0.103 ns/op
> StringCompareToFoldCase.asciiUpperLowerFC avgt 15 7.384 ± 0.167 ns/op
> StringCompareToFoldCase.asciiWithDFFC avgt 15 54.949 ± 1.260 ns/op
> StringCompareToFoldCase.greekLower avgt 15 39.492 ± 0.124 ns/op
> StringCompareToFoldCase.greekLowerEQ avgt 15 39.266 ± 0.071 ns/op
> StringCompareToFoldCase.greekLowerEQFC avgt 15 28.049 ± 0.292 ns/op
> StringCompareToFoldCase.greekLowerFC avgt 15 28.272 ± 0.115 ns/op
> StringCompareToFoldCase.greekUpperLower avgt 15 7.103 ± 0.052 ns/op
> StringCompareToFoldCase.greekUpperLowerEQ avgt 15 7.439 ± 0.079 ns/op
> StringCompareToFoldCase.greekUpperLowerEQFC avgt 15 2.716 ± 0.138 ns/op
> StringCompareToFoldCase.greekUpperLowerFC avgt 15 2.628 ± 0.051 ns/op
> StringCompareToFoldCase.latin1UTF16 avgt 15 23.147 ± 0.094 ns/op
> StringCompareToFoldCase.latin1UTF16EQ avgt 15 22.626 ± 0.081 ns/op
> StringCompareToFoldCase.latin1UTF16EQFC avgt 15 38.453 ± 0.697 ns/op
> StringCompareToFoldCase.latin1UTF16FC avgt 15 38.464 ± 0.441 ns/op
> StringCompareToFoldCase.supLower avgt 15 54.443 ± 0.162 ns/op
> StringCompareToFoldCase....
### Long: packing 1:M-count + 1-3 folding codepoints
https://cr.openjdk.org/~sherman/casefolding_long/
The performance is slightly better, but not as good as I would have expected.
The access to codepoint from the long looks a little clumsy, but the logic
looks smooth. need more work. opinion?
Benchmark Mode Cnt Score Error Units
StringCompareToFoldCase.asciiLower avgt 15 15.487 ± 0.298 ns/op
StringCompareToFoldCase.asciiLowerEQ avgt 15 10.005 ± 0.368 ns/op
StringCompareToFoldCase.asciiLowerEQFC avgt 15 10.755 ± 0.160 ns/op
StringCompareToFoldCase.asciiLowerFC avgt 15 10.349 ± 0.155 ns/op
StringCompareToFoldCase.asciiUpperLower avgt 15 12.188 ± 0.278 ns/op
StringCompareToFoldCase.asciiUpperLowerEQ avgt 15 10.901 ± 0.551 ns/op
StringCompareToFoldCase.asciiUpperLowerEQFC avgt 15 9.218 ± 0.165 ns/op
StringCompareToFoldCase.asciiUpperLowerFC avgt 15 9.335 ± 0.404 ns/op
StringCompareToFoldCase.asciiWithDFFC avgt 15 37.010 ± 0.518 ns/op
StringCompareToFoldCase.greekLower avgt 15 39.572 ± 0.098 ns/op
StringCompareToFoldCase.greekLowerEQ avgt 15 39.317 ± 0.104 ns/op
StringCompareToFoldCase.greekLowerEQFC avgt 15 20.428 ± 0.243 ns/op
StringCompareToFoldCase.greekLowerFC avgt 15 19.623 ± 0.141 ns/op
StringCompareToFoldCase.greekUpperLower avgt 15 7.105 ± 0.048 ns/op
StringCompareToFoldCase.greekUpperLowerEQ avgt 15 7.462 ± 0.092 ns/op
StringCompareToFoldCase.greekUpperLowerEQFC avgt 15 6.518 ± 0.128 ns/op
StringCompareToFoldCase.greekUpperLowerFC avgt 15 6.593 ± 0.240 ns/op
StringCompareToFoldCase.latin1UTF16 avgt 15 23.130 ± 0.152 ns/op
StringCompareToFoldCase.latin1UTF16EQ avgt 15 22.606 ± 0.089 ns/op
StringCompareToFoldCase.latin1UTF16EQFC avgt 15 29.574 ± 0.348 ns/op
StringCompareToFoldCase.latin1UTF16FC avgt 15 29.691 ± 0.445 ns/op
StringCompareToFoldCase.supLower avgt 15 55.027 ± 0.676 ns/op
StringCompareToFoldCase.supLowerEQ avgt 15 55.784 ± 0.368 ns/op
StringCompareToFoldCase.supLowerEQFC avgt 15 24.984 ± 0.157 ns/op
StringCompareToFoldCase.supLowerFC avgt 15 24.865 ± 0.139 ns/op
StringCompareToFoldCase.supUpperLower avgt 15 14.538 ± 0.144 ns/op
StringCompareToFoldCase.supUpperLowerEQ avgt 15 14.728 ± 0.206 ns/op
StringCompareToFoldCase.supUpperLowerEQFC avgt 15 9.932 ± 0.121 ns/op
StringCompareToFoldCase.supUpperLowerFC avgt 15 9.870 ± 0.119 ns/op
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27628#discussion_r2471532292