On Tue, 21 Feb 2023 11:12:18 GMT, Eirik Bjorsnos <[email protected]> wrote:
>> This PR suggests we speed up Character.toUpperCase and Character.toLowerCase
>> for latin1 code points by applying the 'oldest ASCII trick in the book'.
>>
>> This takes advantage of the fact that each latin1 uppercase code point is
>> 0x20 lower than its lowercase counterpart (with the exception of two code
>> points which uppercase out of latin1).
>>
>> To verify the correctness of the new implementation, the test
>> `Latin1CaseConversion` is added with an exhaustive verification of
>> toUpperCase/toLowerCase for all latin1 code points.
>>
>> The implementation needs to balance the performance of the various ranges in
>> latin1. An effort has been made to favour operations on ASCII code points,
>> without causing excessive regression for higher code points.
>>
>> Performance is benchmarked for 7 chosen sample code points, each
>> representing a range or a special case. Results are in the first comment.
>
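
For readers following the thread, here is a minimal sketch of the trick
described above. It is not the code in the PR: the class name
`Latin1CaseSketch` and its method layout are made up for illustration, and no
attempt is made to balance branch order for performance. It assumes the two
code points that uppercase out of latin1 are U+00B5 (micro sign) and U+00FF
(y with diaeresis), and that U+00D7 and U+00F7 (multiplication and division
signs) sit inside the letter ranges and must be skipped.

```java
// Hypothetical illustration only; names and structure are not from the PR.
final class Latin1CaseSketch {

    // Uppercase a latin1 code point (0x00..0xFF) by clearing bit 0x20.
    static int toUpperCase(int cp) {
        if (cp >= 'a' && cp <= 'z') {
            return cp & 0xDF;              // ASCII a-z -> A-Z
        }
        if (cp >= 0xE0 && cp <= 0xFE && cp != 0xF7) {
            return cp & 0xDF;              // U+00E0..U+00FE, skipping the division sign
        }
        if (cp == 0xB5) {
            return 0x39C;                  // micro sign uppercases to Greek capital mu
        }
        if (cp == 0xFF) {
            return 0x178;                  // y with diaeresis uppercases out of latin1
        }
        return cp;                         // everything else has no simple uppercase
    }

    // Lowercase a latin1 code point (0x00..0xFF) by setting bit 0x20.
    static int toLowerCase(int cp) {
        if (cp >= 'A' && cp <= 'Z') {
            return cp | 0x20;              // ASCII A-Z -> a-z
        }
        if (cp >= 0xC0 && cp <= 0xDE && cp != 0xD7) {
            return cp | 0x20;              // U+00C0..U+00DE, skipping the multiplication sign
        }
        return cp;
    }

    // Exhaustive comparison against the JDK for all latin1 code points,
    // in the spirit of the test mentioned in the PR description.
    public static void main(String[] args) {
        for (int cp = 0; cp <= 0xFF; cp++) {
            if (toUpperCase(cp) != Character.toUpperCase(cp)) {
                throw new AssertionError("toUpperCase mismatch at 0x" + Integer.toHexString(cp));
            }
            if (toLowerCase(cp) != Character.toLowerCase(cp)) {
                throw new AssertionError("toLowerCase mismatch at 0x" + Integer.toHexString(cp));
            }
        }
    }
}
```

Since latin1 has only 256 code points, checking every one of them against
`Character.toUpperCase(int)` and `Character.toLowerCase(int)` is cheap, which
is why an exhaustive check is practical here.
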
> Eirik Bjorsnos has updated the pull request incrementally with three
> additional commits since the last revision:
>
> - Allow any integer codePoint by defaulting to Integer.parseInt
> - Rename Latin1CaseConversions to just CaseConversions
> - Remove a whitespace following 'if ('
Thanks for your review and JBS juggling, Claes!
I'll wait for a final word from @naotoj before integrating.
-------------
PR: https://git.openjdk.org/jdk/pull/12623