Re: RFR: 8285255: refine StringLatin1.regionMatchesCI_UTF16 [v3]

Roger Riggs Wed, 20 Apr 2022 14:22:47 -0700

On Wed, 20 Apr 2022 21:08:19 GMT, XenoAmess wrote:


>> some thoughts after watching 8285001: Simplify StringLatin1.regionMatches  
>> https://github.com/openjdk/jdk/pull/8292/
>> 
>>             if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
>>                 continue;
>>             }
>> 
>> should be changed to 
>> 
>>             if (((u1 == c1) ? CharacterDataLatin1.instance.toLowerCase(c1) : c1) == Character.toLowerCase(u2)) {
>>                 continue;
>>             }
>> 
>> as:
>> 
>> 1. c1 is LATIN1, so CharacterDataLatin1.instance.toLowerCase seems faster.
>> 2. because c1 is LATIN1, if u1 != c1 then c1 is already lowercase and 
>> doesn't need a lowercase calculation.
>
> XenoAmess has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   remove = check

Can you run the JMH benchmark against the code before either change (or an
existing JDK)?
It would be interesting to quantify the improvement from going straight to 
Latin1.
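
As a rough illustration of the two variants such a comparison would cover,
here is a minimal sketch in plain Java. The class name, method names, and the
surrounding loop are assumptions made for illustration; only the two `if`
conditions come from the quoted proposal. CharacterDataLatin1 is not accessible
outside java.lang, so Character.toLowerCase stands in for it here, which means
this sketch can check that both variants agree but says nothing about their
relative speed. The JMH harness itself is omitted.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; not the JDK's StringLatin1 code.
public class RegionMatchSketch {

    // Baseline: lower-case the upper-cased Latin-1 char unconditionally.
    static boolean baseline(byte[] latin1, char[] utf16) {
        if (latin1.length != utf16.length) {
            return false;
        }
        for (int i = 0; i < latin1.length; i++) {
            char c1 = (char) (latin1[i] & 0xff);
            char c2 = utf16[i];
            if (c1 == c2) {
                continue;
            }
            char u1 = Character.toUpperCase(c1);
            char u2 = Character.toUpperCase(c2);
            if (u1 == u2) {
                continue;
            }
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                continue;
            }
            return false;
        }
        return true;
    }

    // Proposed: skip the lower-casing when upper-casing changed c1.
    // Assumption: Character.toLowerCase replaces the package-private
    // CharacterDataLatin1.instance.toLowerCase used in the proposal.
    static boolean proposed(byte[] latin1, char[] utf16) {
        if (latin1.length != utf16.length) {
            return false;
        }
        for (int i = 0; i < latin1.length; i++) {
            char c1 = (char) (latin1[i] & 0xff);
            char c2 = utf16[i];
            if (c1 == c2) {
                continue;
            }
            char u1 = Character.toUpperCase(c1);
            char u2 = Character.toUpperCase(c2);
            if (u1 == u2) {
                continue;
            }
            char l1 = (u1 == c1) ? Character.toLowerCase(c1) : c1;
            if (l1 == Character.toLowerCase(u2)) {
                continue;
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] latin1 = "hello".getBytes(StandardCharsets.ISO_8859_1);
        for (String other : new String[] {"HELLO", "hellp"}) {
            char[] utf16 = other.toCharArray();
            System.out.println(other + ": baseline=" + baseline(latin1, utf16)
                    + " proposed=" + proposed(latin1, utf16));
        }
    }
}
```

Checking that both variants return the same answers on the benchmark inputs
is worth doing before comparing timings, so that any difference measured is
not the result of a behavior change.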

(Current hardware architectures and their parallelism are hard to 
understand well.
They do clever things with branch prediction, potentially speculatively 
executing both paths
and then discarding the path not taken.  The existing code for toLower and 
toUpper already includes a branch or two; adding one more branch to the 
sequence likely can't be optimized.)

These interactions at the instruction level are why measuring is important.
Thanks

-------------

PR: https://git.openjdk.java.net/jdk/pull/8308
