On Thu, 10 Feb 2022 04:30:34 GMT, Tagir F. Valeev <[email protected]> wrote:

>> See the bug description for details.
>> 
>> I propose a simple solution. Let's allow ArraySpliterator to be non-SIZED 
>> and report an artificial estimateSize(), much bigger than the real one. This 
>> will allow AbstractSpliterator and IteratorSpliterator to produce prefix 
>> whose size is comparable to Long.MAX_VALUE (say, starting with 
>> Long.MAX_VALUE/2), and this will enable further splitting of the prefix. 
>> This change will drastically improve parallel streaming for affected streams 
>> of size <= 1024 and significantly improve for streams of size 1025..20000. 
>> The cost is higher-grained splitting for huge streams of unknown size. This 
>> might add a minor overhead for such scenarios which, I believe, is 
>> completely tolerable.
>> 
>> No public API changes are necessary, and sequential processing should not 
>> be affected, except for an extra field in ArraySpliterator which increases 
>> the footprint by 8 bytes.
>> 
>> I added a simple test using an artificial collector to ensure that at least 
>> two non-empty parts are created when parallelizing a Stream.iterate source. 
>> More testing ideas are welcome.
>
> Tagir F. Valeev has updated the pull request incrementally with two 
> additional commits since the last revision:
> 
>  - Update copyright year
>  - Cosmetic fixes
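
To make the quoted idea concrete for readers on the list, a rough,
illustrative-only sketch of an array-backed spliterator that withholds SIZED
and carries an inflated estimate might look like the following. This is not
the actual patch; the class name, field names, and the way the estimate is
halved on split are my own placeholders.

    import java.util.Spliterator;
    import java.util.function.Consumer;

    // Illustrative only: an array-backed spliterator that deliberately does
    // not report SIZED/SUBSIZED and returns an artificial, inflated estimate
    // so that the peeled prefix remains eligible for further splitting.
    class UnsizedArraySpliterator<T> implements Spliterator<T> {
        private final Object[] array;
        private int index;            // current position
        private final int fence;      // one past the last element
        private long estimate;        // artificial estimate, e.g. Long.MAX_VALUE / 2

        UnsizedArraySpliterator(Object[] array, int origin, int fence, long estimate) {
            this.array = array;
            this.index = origin;
            this.fence = fence;
            this.estimate = estimate;
        }

        @Override
        public Spliterator<T> trySplit() {
            int lo = index, mid = (lo + fence) >>> 1;
            if (lo >= mid)
                return null;
            estimate >>>= 1;          // halve the artificial estimate on each split
            index = mid;
            return new UnsizedArraySpliterator<>(array, lo, mid, estimate);
        }

        @Override
        @SuppressWarnings("unchecked")
        public boolean tryAdvance(Consumer<? super T> action) {
            if (index < fence) {
                action.accept((T) array[index++]);
                return true;
            }
            return false;
        }

        @Override
        public long estimateSize() { return estimate; }    // not the real size

        @Override
        public int characteristics() { return ORDERED; }   // deliberately not SIZED
    }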

For unknown-sized iterators we have to bias towards some strategy (recognizing 
that the source is a poor choice for parallelism).

Currently the "peeled" split containing an array of elements (copied from the 
iterator) is never split, since the array length will never in practice be 
greater than the size threshold. The parallelism is derived from the splits of 
the iterator, and this is biased towards a large number of elements.
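
As a back-of-the-envelope check (approximating the split threshold as the root
estimate divided by roughly four times the common pool parallelism, where the
root estimate of an unknown-sized source is Long.MAX_VALUE):

    import java.util.concurrent.ForkJoinPool;

    // Approximation, not the framework code: the split threshold is derived
    // from the root estimate, which for an unknown-sized source is
    // Long.MAX_VALUE, so it dwarfs any peeled batch (at most 1 << 25
    // elements). The array split is therefore always a leaf today.
    public class LeafTargetCheck {
        public static void main(String[] args) {
            long rootEstimate = Long.MAX_VALUE;
            int leafTarget = Math.max(ForkJoinPool.getCommonPoolParallelism() << 2, 1);
            long sizeThreshold = rootEstimate / leafTarget;
            long maxBatch = 1L << 25;               // largest batch peeled from the iterator
            System.out.println("size threshold  ~ " + sizeThreshold);
            System.out.println("largest batch   ~ " + maxBatch);
            System.out.println("batch is a leaf : " + (maxBatch <= sizeThreshold));
        }
    }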

In your approach each "peeled" split effectively gets to use half of the total 
parallelism. Over subsequent splits of the source that will over-provision 
tasks, specifically after two splits (i.e. 2 * 2^10 + 2^10 = 3072 elements), 
and this is biased towards a smaller number of elements.
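
To put a number on "half of the total parallelism": modelling the split loop
as halving the estimate until it drops to the threshold, a peeled prefix that
reports about Long.MAX_VALUE / 2 is decomposed into roughly half of the
~4 * parallelism leaf budget, no matter how few elements it actually holds.
A rough model (not the framework code):

    // Rough model: count leaves by halving the estimate until it is no larger
    // than the threshold. A prefix estimate of ~Long.MAX_VALUE / 2 consumes
    // roughly half of the leaf budget (rounded to a power of two here).
    public class LeafBudgetModel {
        static long countLeaves(long estimate, long threshold) {
            return estimate <= threshold ? 1 : 2 * countLeaves(estimate / 2, threshold);
        }

        public static void main(String[] args) {
            int parallelism = Runtime.getRuntime().availableProcessors();
            long threshold = Long.MAX_VALUE / (4L * parallelism);  // leaf target for an unknown-sized root
            long prefixEstimate = Long.MAX_VALUE / 2;              // artificial estimate of the peeled prefix
            System.out.println("leaf budget (~4 * parallelism): " + 4L * parallelism);
            System.out.println("leaves for one peeled prefix  : "
                               + countLeaves(prefixEstimate, threshold));
        }
    }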

In the issue you point out a number of relevant stream sources. In practice 
they are unlikely to have millions of elements, and are more likely to fit 
into the first few steps of the arithmetic splitting sequence. If the cost per 
element is reasonably high, that would outweigh any over-provisioning.
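
For reference, the batches grow arithmetically (1024, 2048, 3072, ...), so the
cumulative element counts after the first few splits stay small; the 3072
figure above is the total after two splits:

    // Cumulative element counts for the arithmetic batch sequence: after k
    // peeled batches the source has yielded 1024 * k * (k + 1) / 2 elements,
    // so sources of up to a few tens of thousands of elements are consumed
    // within the first handful of splits.
    public class BatchSequence {
        public static void main(String[] args) {
            final int unit = 1 << 10;               // 1024-element batch unit
            long cumulative = 0;
            for (int k = 1; k <= 8; k++) {
                cumulative += (long) unit * k;
                System.out.printf("split %d: batch = %d, cumulative = %d%n",
                                  k, unit * k, cumulative);
            }
        }
    }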

Overall it looks reasonable. I ran it through tier 1/2 testing and there were 
no failures. If you don't mind, I would like to let this percolate a little in 
my mind (i.e. sleep on it for a day or two).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7279
