Hello!

PS> I don’t particularly want to add a special spliterator for this
PS> case to avoid some profile pollution. Will it not just push the
PS> pollution further down the road to Spliterator.forEachRemaining?
PS> Or to within other code?

I just thought that the current idea is to create specialized
spliterators for most methods returning streams. If not, why do
String.chars()/AbstractStringBuilder.chars() in JDK9 use a new
IntCharArraySpliterator instead of the already existing
CharBuffer.wrap(this).chars(), which gives similar performance in both
sequential and parallel cases? Also, for String an alternative would
be to

    return IntStream.range(0, value.length).map(i -> value[i]);

which is actually similar to Collections.nCopies().stream().

Also note that the Collections class already contains
singletonSpliterator, which creates an anonymous class. With my
proposed change it can be replaced with
new ConstantSpliterator(1, element) without a performance drop, so the
actual number of classes will not increase.

At the very least, why create two distinct lambdas in
CopiesList.stream() and CopiesList.parallelStream()? They duplicate
"i -> element", for which javac creates two separate methods (like
lambda$stream$95(int) and lambda$parallelStream$96(int)), and at
runtime two distinct anonymous classes may be created. It could be
written instead as

    public Stream<E> parallelStream() {
        return stream().parallel();
    }

With best regards,
Tagir Valeev.

PS> Alas I think profile pollution is a current fact of JDK life when
PS> inverting control with lambdas. What we really require is a better
PS> way to specialise the hot loops.

PS> Paul.

PS> On 28 Jul 2015, at 10:37, Tagir F. Valeev <amae...@gmail.com> wrote:
>> Hello!
>>
>> Current implementation of Collections.nCopies().stream() is as
>> follows:
>>
>> http://hg.openjdk.java.net/jdk9/dev/jdk/file/f160dec9a350/src/java.base/share/classes/java/util/Collections.java#l5066
>>
>>     public Stream<E> stream() {
>>         return IntStream.range(0, n).mapToObj(i -> element);
>>     }
>>
>>     @Override
>>     public Stream<E> parallelStream() {
>>         return IntStream.range(0, n).parallel().mapToObj(i -> element);
>>     }
>>
>> The problem is that it adds a lambda expression to the
>> RangeIntSpliterator type profile, which can be polluted by some other
>> code and vice versa: using nCopies().stream() may pollute the type
>> profile for other code, making it slower.
>>
>> Another thing which is missing in the current implementation is
>> unordered mode. This collection is unordered by nature, its stream is
>> similar to Stream.generate(), so in my opinion it should be unordered,
>> which may improve the parallel reduction in some cases.
>>
>> This can be improved by introducing a custom spliterator class, which
>> is quite simple:
>> https://gist.github.com/amaembo/62f3efee9923b1468e86
>>
>> On a pre-polluted type profile with simple mapping and reduction,
>> using the custom spliterator is about 25-30% faster in both parallel
>> and sequential cases, as benchmarking shows (performed on a 4-core
>> CPU).
>>
>> What do you think?
>>
>> With best regards,
>> Tagir Valeev.
>>
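
For readers without access to the gist, the kind of custom spliterator
discussed above could look roughly like the sketch below. This is not
the code from the gist: only the name ConstantSpliterator and its
(count, element) constructor shape come from this thread, while the
wrapper class, the nCopiesStream helper and the chosen characteristics
are assumptions made for illustration. ORDERED is deliberately not
reported, matching the "unordered by nature" argument.

```java
import java.util.Objects;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class ConstantSpliteratorSketch {

    /** Reports the same element a fixed number of times (illustrative only). */
    static final class ConstantSpliterator<E> implements Spliterator<E> {
        private long remaining;
        private final E element;

        ConstantSpliterator(long count, E element) {
            this.remaining = count;
            this.element = element;
        }

        @Override
        public boolean tryAdvance(Consumer<? super E> action) {
            Objects.requireNonNull(action);
            if (remaining <= 0) {
                return false;
            }
            remaining--;
            action.accept(element);
            return true;
        }

        @Override
        public void forEachRemaining(Consumer<? super E> action) {
            Objects.requireNonNull(action);
            // The hot loop lives here, not in RangeIntSpliterator,
            // so no lambda is added to that class's type profile.
            long r = remaining;
            remaining = 0;
            for (; r > 0; r--) {
                action.accept(element);
            }
        }

        @Override
        public Spliterator<E> trySplit() {
            if (remaining < 2) {
                return null; // too small to split
            }
            long half = remaining >>> 1;
            remaining -= half;
            return new ConstantSpliterator<>(half, element);
        }

        @Override
        public long estimateSize() {
            return remaining;
        }

        @Override
        public int characteristics() {
            // Assumption: exact sizes after splitting, no ORDERED flag.
            return SIZED | SUBSIZED | IMMUTABLE;
        }
    }

    /** Hypothetical stand-in for what CopiesList.stream() might return. */
    static <E> Stream<E> nCopiesStream(int n, E element) {
        return StreamSupport.stream(new ConstantSpliterator<>(n, element), false);
    }

    public static void main(String[] args) {
        System.out.println(nCopiesStream(5, "a").count());
        System.out.println(
            nCopiesStream(1000, 2).parallel().mapToInt(Integer::intValue).sum());
    }
}
```

With such a class, parallelStream() needs no lambda of its own and can
simply delegate to stream().parallel(), and a singleton spliterator
would be the count == 1 case.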