Hi Hiroshi,
This is a good example of what seems like a small feature and yet there are
some unexpected complexities :-)
We will need to refine the implementation specification of List.spliterator,
which currently states:
* @implSpec
* The default implementation creates a
* <em><a href="Spliterator.html#binding">late-binding</a></em> spliterator
* from the list's {@code Iterator}. The spliterator inherits the
* <em>fail-fast</em> properties of the list's iterator.
Since the existing implementation is derived from the iterator:
@Override
default Spliterator<E> spliterator() {
return Spliterators.spliterator(this, Spliterator.ORDERED);
}
concurrent modification checking will occur. The RandomAccessSpliterator should
also support modification checking, which i think requires it be an inner class
to check co-mod state.
I am struggling to understand the points you make about the spliterator of a
sub-list of a Vector being required to be an iterator-based implementation.
Since AbstractList.SubList can access a Vector’s elements through List.get/set
why cannot RandomAccessSpliterator?
If we want to support random access spliterators on sub-lists i think we would
anyway need to override the spliterator method to check for concurrent
modification (as is the case of the iterator method).
Paul.
> On 11 May 2016, at 11:25, Ito, Hiroshi <[email protected]> wrote:
>
> Hi,
>
> Please kindly review the attached patch for RandomAccessSpliterator
> implementation.
>
> Currently default implementation of spliterator is an IteratorSpliterator
> which is not optimal for RandomAccess data structures (besides ArrayList and
> Vector). This patch is to provide a default RandomAccessSpliterator
> implementation for RandomAccess data structure.
>
> The figures below demonstrate the performance difference before and after the
> change. Note the significant performance improvement in test
> SelectTest.parallel_lazy_streams_gsc (parallel streams performance test for
> non-JDK Lists that implement RandomAccess but don't yet implement their own
> spliterators).
>
> Benchmark code:
> https://github.com/goldmansachs/gs-collections/blob/master/jmh-tests/src/main/java/com/gs/collections/impl/jmh/SelectTest.java
>
> With IteratorSpliterator as default:
>
> Benchmark Mode Cnt Score Error Units
> SelectTest.parallel_lazy_jdk thrpt 20 172.671 ± 14.231 ops/s
> SelectTest.parallel_lazy_streams_gsc thrpt 20 20.662 ± 0.493 ops/s
> SelectTest.serial_lazy_jdk thrpt 20 89.384 ± 4.431 ops/s
> SelectTest.serial_lazy_streams_gsc thrpt 20 80.831 ± 7.875 ops/s
>
> With RandomAccessSpliterator as default:
>
> Benchmark Mode Cnt Score Error Units
> SelectTest.parallel_lazy_jdk thrpt 20 174.190 ± 16.870 ops/s
> SelectTest.parallel_lazy_streams_gsc thrpt 20 180.740 ± 9.594 ops/s
> SelectTest.serial_lazy_jdk thrpt 20 85.719 ± 2.414 ops/s
> SelectTest.serial_lazy_streams_gsc thrpt 20 78.760 ± 1.029 ops/s
>
> Majority of the patch is contributed by Paul Sandoz and he should be credited
> in the Contributed-by field.
>
> Along with this patch submission, we have a question for SubList spliterator
> implementation that we retained old behavior for now (i.e. return
> IteratorSpliterator, refer to RandomAccessSubList#spliterator()). We have
> found that Vector's subList is wrapped by RandomAccessSubList, that is
> RandomAccess but not a Vector anymore, and it expects to use
> IteratorSpliterator. We were not sure what's the best approach to distinguish
> Vector from other RandomAccess data structure within RandomAccessSublist, so
> we kept it return IteratorSpliterator for now.
>
> One approach could be to introduce VectorSubList that returns
> IteratorSpliterator (or an implementation similar to VectorSpliterator?).
> Then we could revert the spliterator() override in RandomAccessSublist.
>
> What would be your suggestion to handle this?
>
> Depending on your suggestion, we might fix the subList spliterator in this
> patch, or submit a separate patch if the amount of the change is significant.
>
> Thanks,
> Hiroshi
> <RandomAccessSpliterator.patch.txt>