Hello!

PS> I don’t particularly want to add a special spliterator for this
PS> case to avoid some profile pollution. Will it not just push the
PS> pollution further down the road to Spliterator.forEachRemaining, or
PS> into other code?

I thought the current idea was to create specialized spliterators
for most methods returning streams. If not, why do
String.chars()/AbstractStringBuilder.chars() in JDK9 use the new
IntCharArraySpliterator instead of the already existing
CharBuffer.wrap(this).chars(), which produces similar performance in
both sequential and parallel cases? For String, another alternative
would be

return IntStream.range(0, value.length).map(i -> value[i]);

That is actually similar to Collections.nCopies().stream().
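
To make the comparison concrete, here is a minimal sketch of the two alternatives side by side. String's internal value field is not accessible from user code, so the char[] here comes from toCharArray(); the class and method names are hypothetical and only for illustration.

```java
import java.nio.CharBuffer;
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical helper class; not part of the JDK.
final class CharsAlternatives {
    // Index-based variant: a range stream mapped through the array.
    // The mapping lambda is what ends up in the RangeIntSpliterator type profile.
    static IntStream viaRange(char[] value) {
        return IntStream.range(0, value.length).map(i -> value[i]);
    }

    // Variant reusing the existing CharBuffer implementation of chars().
    static IntStream viaCharBuffer(CharSequence s) {
        return CharBuffer.wrap(s).chars();
    }

    public static void main(String[] args) {
        String s = "hello";
        int[] a = viaRange(s.toCharArray()).toArray();
        int[] b = viaCharBuffer(s).toArray();
        // Both variants should yield the same char values as String.chars().
        System.out.println(Arrays.equals(a, b) && Arrays.equals(a, s.chars().toArray()));
    }
}
```

Both variants produce the same elements; the difference discussed here is only in which spliterator and which lambdas end up on the hot path.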

Also note that the Collections class already contains singletonSpliterator,
which creates an anonymous class. With my proposed change it can be
replaced with new ConstantSpliterator(1, element) without a performance
drop, so the actual number of classes will not increase.
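
For readers who want the shape of the idea, below is a minimal sketch of what such a ConstantSpliterator might look like. It is an illustration written under assumptions (a remaining-count field, no ORDERED characteristic, a hypothetical static stream(...) helper) and may differ from the actual proposed code.

```java
import java.util.Objects;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Illustrative sketch: emits the same element a fixed number of times.
final class ConstantSpliterator<T> implements Spliterator<T> {
    private long remaining;   // how many copies are still to be emitted
    private final T element;  // may be null, so NONNULL is not reported

    ConstantSpliterator(long remaining, T element) {
        this.remaining = remaining;
        this.element = element;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        Objects.requireNonNull(action);
        if (remaining <= 0) return false;
        remaining--;
        action.accept(element);
        return true;
    }

    @Override
    public void forEachRemaining(Consumer<? super T> action) {
        Objects.requireNonNull(action);
        long r = remaining;
        remaining = 0;
        // The hot loop: no mapping lambda between the source and the consumer.
        while (r-- > 0) action.accept(element);
    }

    @Override
    public Spliterator<T> trySplit() {
        long half = remaining >>> 1;
        if (half == 0) return null;
        remaining -= half;
        return new ConstantSpliterator<>(half, element);
    }

    @Override
    public long estimateSize() {
        return remaining;
    }

    @Override
    public int characteristics() {
        // Deliberately not ORDERED: all elements are identical.
        return SIZED | SUBSIZED | IMMUTABLE;
    }

    // Hypothetical convenience factory for this sketch.
    static <T> Stream<T> stream(long n, T element, boolean parallel) {
        return StreamSupport.stream(new ConstantSpliterator<>(n, element), parallel);
    }

    public static void main(String[] args) {
        System.out.println(stream(3, "x", false).collect(Collectors.joining()));
    }
}
```

With n = 1 the same class covers the singleton case, which is why the total number of classes would not have to grow.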

At the very least, why create two distinct lambdas in CopiesList.stream()
and CopiesList.parallelStream()? They duplicate "i -> element", for
which javac creates two separate methods (like lambda$stream$95(int)
and lambda$parallelStream$96(int)), and at runtime two distinct
anonymous classes may be created. It could be written instead as

public Stream<E> parallelStream() {
    return stream().parallel();
}
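
In context, the delegation could look like the following sketch. CopiesListSketch is a hypothetical stand-in for the private Collections.CopiesList class, reduced to the parts relevant here; only one "i -> element" lambda remains in the source.

```java
import java.util.AbstractList;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical, simplified stand-in for Collections.CopiesList.
final class CopiesListSketch<E> extends AbstractList<E> {
    private final int n;
    private final E element;

    CopiesListSketch(int n, E element) {
        if (n < 0) throw new IllegalArgumentException("List length = " + n);
        this.n = n;
        this.element = element;
    }

    @Override
    public E get(int index) {
        if (index < 0 || index >= n) throw new IndexOutOfBoundsException("Index: " + index);
        return element;
    }

    @Override
    public int size() {
        return n;
    }

    @Override
    public Stream<E> stream() {
        // The only lambda: javac generates a single synthetic method for it.
        return IntStream.range(0, n).mapToObj(i -> element);
    }

    @Override
    public Stream<E> parallelStream() {
        // Reuses stream() instead of duplicating the lambda.
        return stream().parallel();
    }

    public static void main(String[] args) {
        CopiesListSketch<String> list = new CopiesListSketch<>(4, "z");
        System.out.println(list.parallelStream().isParallel());
    }
}
```

Calling parallel() on the mapped stream switches the whole pipeline to parallel mode, so the result behaves the same as calling parallel() on the range before mapping.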

With best regards,
Tagir Valeev.

PS> Alas, I think profile pollution is a current fact of JDK life when
PS> inverting control with lambdas. What we really require is a better
PS> way to specialise the hot loops.

PS> Paul.

PS> On 28 Jul 2015, at 10:37, Tagir F. Valeev <amae...@gmail.com> wrote:

>> Hello!
>> 
>> Current implementation of Collections.nCopies().stream() is as
>> follows:
>> 
>> http://hg.openjdk.java.net/jdk9/dev/jdk/file/f160dec9a350/src/java.base/share/classes/java/util/Collections.java#l5066
>> 
>> public Stream<E> stream() {
>>    return IntStream.range(0, n).mapToObj(i -> element);
>> }
>> 
>> @Override
>> public Stream<E> parallelStream() {
>>    return IntStream.range(0, n).parallel().mapToObj(i -> element);
>> }
>> 
>> The problem is that it adds a lambda expression to the
>> RangeIntSpliterator type profile which can be polluted by some other
>> code and vice versa: using nCopies().stream() may pollute the type
>> profile for other code making it slower.
>> 
>> Another thing missing in the current implementation is unordered
>> mode. This collection is unordered by nature, and its stream is similar
>> to Stream.generate(), so in my opinion it should be unordered, which may
>> improve parallel reduction in some cases.
>> 
>> This can be improved by introducing a custom spliterator class, which
>> is quite simple:
>> https://gist.github.com/amaembo/62f3efee9923b1468e86
>> 
>> With a pre-polluted type profile and a simple mapping and reduction,
>> the custom spliterator is about 25-30% faster in both parallel and
>> sequential cases, as benchmarking shows (performed on a 4-core CPU).
>> 
>> What do you think?
>> 
>> With best regards,
>> Tagir Valeev.
>> 
