On Jan 23, 2015, at 6:22 PM, Xueming Shen <xueming.s...@oracle.com> wrote:
> On 01/23/2015 09:00 AM, Paul Sandoz wrote: >> Hi, >> >> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8071477-String-spliterators/webrev/ >> >> This patch implements better spliterators for >> String/Buffer/Builder.chars/codePoints than those provided by the default >> methods on CharSequence. >> >> The test java/lang/CharSequence/DefaultTest.java is removed as i now pass >> the spliterators through the grinder of the spliterator-and-traversing tests. >> >> Thanks, >> Paul. > > I'm a little confused at following logic. > > 2997 // Mid-point is a high-surrogate > 2998 // Or mid-point and the previous are low-surrogates > 2999 if (Character.isHighSurrogate(array[mid]) || > 3000 Character.isLowSurrogate(array[midOneLess = (mid - 1)])) > 3001 return new CodePointsSpliterator(array, lo, index = mid, > cs); > > > Shouldn't it be something like > > if (!Character.isLowSurrogate(array[mid]) || > !Character.isHighSurrogate(array[midOneLess = (mid -1)])) { > return new CodePointsSpliterator(array, lo, index = mid, cs); > } > > For example, in case both "mid" and "midOneLess" are normal non-surrogate > character, I would assume the trySplit() should return [lo, index=mid) as > wekk? > > or something like > > ... > if (Character.isLowSurrogate(array[mid]) || && > Character.isHighSurrogate(array[midOneLess = (mid -1)])) { > if (lo >= midOneLess) > return null; > return new CodePointsSpliterator(array, lo, index = midOneless, cs); > } > return new CodePointsSpliterator(array, lo, index = mid, cs); > ... > > means, we only return [lo, midOneless), if mid is in the middle of a surrogate > pair (midOneLess is hiSurr, mid is hoSurr)? > Doh! thanks i dunno what i was thinking (isLowSurrogate was the negation of isHighSurrogate perhaps...) > btw, is it worth having a "nextCodePoint()" to be shared by forEachRemaining > and tryAdvance()? > Nice suggestion. I created a static method "advance". I also noticed that test/TEST.groups needs updating to remove a reference to the removed char sequence test. Webrev updated in place. Thanks, Paul.