Re: RFR 8071477: Better Spliterator implementations for String.chars() and String.codePoints()

Xueming Shen Fri, 23 Jan 2015 12:22:01 -0800

The webrev looks good.

However I have a question regarding the spec of 
CharsSequence.chars/codePoints(). It says
     *
     * <p>If the sequence is mutated while the stream is being read, the
     * result is undefined.


But it looks like ABS.chars/codePoints() "predefine/fix" the size of the 
resulting stream when
the stream is being created/initialized. The possible corner case is that the 
"size/length" of
ABS is changed between the stream is created  and read?

I would assume the CharSequence default implementation will go after the real 
size, even
a "length()" is passed in as an estimated size during creation, but the 
optimized implementation
will only return the length/size of char/cp when the stream is created. A 
behavior change?

-Sherman


On 01/23/2015 11:19 AM, Paul Sandoz wrote:

On Jan 23, 2015, at 6:22 PM, Xueming Shen<[email protected]>  wrote:

On 01/23/2015 09:00 AM, Paul Sandoz wrote:

Hi,

http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8071477-String-spliterators/webrev/

This patch implements better spliterators for 
String/Buffer/Builder.chars/codePoints than those provided by the default 
methods on CharSequence.

The test java/lang/CharSequence/DefaultTest.java is removed as i now pass the 
spliterators through the grinder of the spliterator-and-traversing tests.

Thanks,
Paul.

I'm a little confused at following logic.

2997             // Mid-point is a high-surrogate
2998             // Or mid-point and the previous are low-surrogates
2999             if (Character.isHighSurrogate(array[mid]) ||
3000                 Character.isLowSurrogate(array[midOneLess = (mid - 1)]))
3001                 return new CodePointsSpliterator(array, lo, index = mid, 
cs);


Shouldn't it be something like

if (!Character.isLowSurrogate(array[mid]) ||
    !Character.isHighSurrogate(array[midOneLess = (mid -1)])) {
    return new CodePointsSpliterator(array, lo, index = mid, cs);
}

For example, in case both "mid" and "midOneLess" are normal non-surrogate
character, I would assume the trySplit() should return [lo, index=mid) as wekk?

or something like

...
if (Character.isLowSurrogate(array[mid]) ||

&&

    Character.isHighSurrogate(array[midOneLess = (mid -1)])) {
    if (lo>= midOneLess)
        return null;
    return new CodePointsSpliterator(array, lo, index = midOneless, cs);
}
return new CodePointsSpliterator(array, lo, index = mid, cs);
...

means, we only return [lo, midOneless), if mid is in the middle of a surrogate
pair (midOneLess is hiSurr, mid is hoSurr)?

Doh! thanks i dunno what i was thinking (isLowSurrogate was the negation of 
isHighSurrogate perhaps...)

btw, is it worth having a "nextCodePoint()" to be shared by forEachRemaining
and tryAdvance()?

Nice suggestion. I created a static method "advance".

I also noticed that test/TEST.groups needs updating to remove a reference to 
the removed char sequence test.

Webrev updated in place.

Thanks,
Paul.

Re: RFR 8071477: Better Spliterator implementations for String.chars() and String.codePoints()

Reply via email to