subject:"\"Re\\\: RFR\\\: 8248655\\\: Support supplementary characters in String case insensitive operations\""

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-22 Thread Brent Christian

Hi, Naoto The latest changes look good to me. -Brent On 7/22/20 10:23 AM, naoto.s...@oracle.com wrote: Hi, I revised the fix again, based on further suggestions: https://cr.openjdk.java.net/~naoto/8248655.8248434/webrev.05/ Changes from v.04 are (all in StringUTF16.java): - The short cut n

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-22 Thread Joe Wang

On 7/22/20 1:43 PM, naoto.s...@oracle.com wrote: Thanks Roger, Ah, I just saw your email just after I sent mine! They probably saw each other crossing and said hi on the way to inboxes ;-) On 7/22/20 1:38 PM, Roger Riggs wrote: Hi Naoto, Looks fine. (with Joe's suggestion) On 7/22/20 4:

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-22 Thread naoto . sato

Thanks Roger, Ah, I just saw your email just after I sent mine! On 7/22/20 1:38 PM, Roger Riggs wrote: Hi Naoto, Looks fine. (with Joe's suggestion) On 7/22/20 4:20 PM, Joe Wang wrote: Hi Naoto, The change looks good to me. "supLower" is indeed super slow :-) The only minor comment I have

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-22 Thread Roger Riggs

Hi Naoto, Looks fine. (with Joe's suggestion) On 7/22/20 4:20 PM, Joe Wang wrote: Hi Naoto, The change looks good to me. "supLower" is indeed super slow :-) The only minor comment I have is that the compareCodePointCI method performs toUpperCase unconditionally. That's not a problem for the

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-22 Thread naoto . sato

Hi Joe, Thank you for the consecutive reviews! On 7/22/20 1:20 PM, Joe Wang wrote: Hi Naoto, The change looks good to me. "supLower" is indeed super slow :-) The only minor comment I have is that the compareCodePointCI method performs toUpperCase unconditionally. That's not a problem for the

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-22 Thread Joe Wang

Hi Naoto, The change looks good to me. "supLower" is indeed super slow :-) The only minor comment I have is that the compareCodePointCI method performs toUpperCase unconditionally. That's not a problem for the regular case, where a check on cp1 == cp2 (line 337) is done prior to the method ca

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-22 Thread naoto . sato

Hi, I revised the fix again, based on further suggestions: https://cr.openjdk.java.net/~naoto/8248655.8248434/webrev.05/ Changes from v.04 are (all in StringUTF16.java): - The short cut now does case insensitive comparison that makes the fix closer to the previous implementation (for BMP char

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-21 Thread naoto . sato

Hi Brent, On 7/21/20 2:56 PM, Brent Christian wrote: Hi, Naoto I have a few comments: src/java.base/share/classes/java/lang/StringUTF16.java 379 private static int getSupplementaryCodePoint(byte[] ba, int cp, int index, int start, int end) I think it would be worth a small addition to

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-21 Thread Brent Christian

Hi, Naoto I have a few comments: src/java.base/share/classes/java/lang/StringUTF16.java 379 private static int getSupplementaryCodePoint(byte[] ba, int cp, int index, int start, int end) I think it would be worth a small addition to the comment to reflect that non-surrogate chars are re

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-21 Thread naoto . sato

Thank you, Joe. I got it now. Will try out and benchmark. Naoto On 7/21/20 10:05 AM, Joe Wang wrote: On 7/20/2020 8:58 PM, naoto.s...@oracle.com wrote: The short-cut worked well. There's maybe a further optimization we could do to rid us of the performance concern (or the need to run additi

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-21 Thread naoto . sato

Thank you, Roger. On 7/21/20 7:50 AM, Roger Riggs wrote: Hi Naoto, StringUTF16.java: 343, 348: The masking and comparisons seem awkward. I'd suggest just negating the value and testing for < 0 to flag the less usual case. If you continue with the flag bit, define your own static final constan

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-21 Thread Mark Davis ☕️

One option is to have a fast path that uses char functions, up to the point where you hit a high surrogate, then drop into the slower codepoint path. That saves time for the high-runner cases. On the other hand, if the times are good enough, you might not need the complication. Mark On Fri, Jul

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-21 Thread Joe Wang

On 7/20/2020 8:58 PM, naoto.s...@oracle.com wrote: The short-cut worked well. There's maybe a further optimization we could do to rid us of the performance concern (or the need to run additional performance tests). Consider the cases where strings in comparison don't contain any sup characte

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-21 Thread Roger Riggs

Hi Naoto, StringUTF16.java: 343, 348: The masking and comparisons seem awkward. I'd suggest just negating the value and testing for < 0 to flag the less usual case. If you continue with the flag bit, define your own static final constant for that bit; It looks odd to be using Integer.MIN_VALU

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-20 Thread naoto . sato

Hi Joe, On 7/20/20 7:14 PM, Joe Wang wrote: Hi Naoto, "Unless it showed 100% slower", wow, your tolerance is quite high :-). On the other hand, I do agree it's unlikely to show in specJVM (that's a speculation though). I am not saying 100% slowing is permissible :-) That's an example of de

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-20 Thread Joe Wang

Hi Naoto, "Unless it showed 100% slower", wow, your tolerance is quite high :-). On the other hand, I do agree it's unlikely to show in specJVM (that's a speculation though). The short-cut worked well. There's maybe a further optimization we could do to rid us of the performance concern (or

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-20 Thread naoto . sato

Small correction in the updated part: http://cr.openjdk.java.net/~naoto/8248655.8248434/webrev.04/ Naoto On 7/20/20 2:39 PM, naoto.s...@oracle.com wrote: Hi Joe, Thank you for your comments. On 7/20/20 11:20 AM, Joe Wang wrote: Hi Naoto, StringUTF16: line 384 - 388 seem unnecessary since y

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-20 Thread naoto . sato

Hi Joe, Thank you for your comments. On 7/20/20 11:20 AM, Joe Wang wrote: Hi Naoto, StringUTF16: line 384 - 388 seem unnecessary since you'd only get there if 389:isHighSurrogate is not true. Good point. But more importantly, StringUTF16 has existing method "codePointAt" you may want to c

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-20 Thread Joe Wang

Hi Naoto, StringUTF16: line 384 - 388 seem unnecessary since you'd only get there if 389:isHighSurrogate is not true. But more importantly, StringUTF16 has existing method "codePointAt" you may want to consider instead of adding a new method. Comparing to the base benchmark: StringCompareToI

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-19 Thread naoto . sato

Hi Mark, Thank you for your comments. On 7/17/20 8:03 PM, Mark Davis ☕ wrote: One option is to have a fast path that uses char functions, up to the point where you hit a high surrogate, then drop into the slower codepoint path. That saves time for the high-runner cases. On the other hand, if

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-17 Thread naoto . sato

Hi, Based on the suggestions, I modified the fix as follows: https://cr.openjdk.java.net/~naoto/8248655.8248434/webrev.01/ Changes from the initial revision are: - Shared the implementation between compareToCI() and regionMatchesCI() - Enabled immediate short cut if two code points match. - Cr

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread James Laskey

👍 📱 > On Jul 15, 2020, at 4:32 PM, Joe Wang wrote: > > Jim: I was referring to the rest of the process (the calls to > toCodePoint/toUpperCase/toLowerCase), not isHighSurrogate itself. But Roger > has a more comprehensive review on performance, and Naoto is planning on > preparing a JMH tes

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread Joe Wang

Jim: I was referring to the rest of the process (the calls to toCodePoint/toUpperCase/toLowerCase), not isHighSurrogate itself. But Roger has a more comprehensive review on performance, and Naoto is planning on preparing a JMH test. -Joe On 7/15/2020 11:46 AM, Jim Laskey wrote: Joe: This is

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread Jim Laskey

Joe: This is a defensive approach that I believe has minimal cost. public static boolean isHighSurrogate(char ch) { // Help VM constant-fold; MAX_HIGH_SURROGATE + 1 == MIN_LOW_SURROGATE return ch >= MIN_HIGH_SURROGATE && ch < (MAX_HIGH_SURROGATE + 1); } > On Jul 15, 2020,

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread naoto . sato

Hi Joe, Thank you for your review. On 7/15/20 10:57 AM, Joe Wang wrote: Hi Naoto, In StringUTF16.java, if one is isHighSurrogate and the other not, you may quickly return without going through the rest of the process, probably not significant as cp1 and cp2 and/or u1 and u2 won't be equal a

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread naoto . sato

Hi Roger, Thank you for your review and suggestions. On 7/15/20 10:56 AM, Roger Riggs wrote: Hi Naoto, Given the extra tests in the body of the loop, I think its worth finding or creating a JMH test for this and checking the performance. Good point. With performance in mind, I would try

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread Joe Wang

Hi Naoto, In StringUTF16.java, if one is isHighSurrogate and the other not, you may quickly return without going through the rest of the process, probably not significant as cp1 and cp2 and/or u1 and u2 won't be equal anyways. But it could skip a couple of toCodePoint/toUpperCase/toLowerCase

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread Roger Riggs

Hi Naoto, Given the extra tests in the body of the loop, I think its worth finding or creating a JMH test for this and checking the performance. With performance in mind, I would try to fall back to the UC/LC conversions only when the bytes don't match. See java.util.Arrays.mismatch(byte[],

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread naoto . sato

Thank you, Jim, for the quick review! On 7/15/20 9:26 AM, Jim Laskey wrote: I think I'm good with this. +1 Asides: 325 int cp1 = (int)getChar(value, k1); 326 int cp2 = (int)getChar(other, k2); I would be tempted to short cut by exiting when not equal, but I think w

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

2020-07-15 Thread Jim Laskey

I think I'm good with this. +1 Asides: 325 int cp1 = (int)getChar(value, k1); 326 int cp2 = (int)getChar(other, k2); I would be tempted to short cut by exiting when not equal, but I think we agreed we need to allow for upper/lowers on different planes. In the UTF-16

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

30 matches

Site Navigation

Mail list logo

Footer information