Hi All,

I’m very confused about what was intended with the ‘UNORDERED’ Collector 
characteristic.  My logical inference was that unordered as a Collector 
characteristic meant that the result of this collector applied to any stream 
was independent of the order of element delivery.  This means that the stream 
backend could safely elide any ordered characteristic of the stream driving the 
collector and open up more possible execution pathways (parallelism for 
example).  The javadoc for unordered is two sentences:

/**
 * Indicates that the collection operation does not commit to preserving
 * the encounter order of input elements.  (This might be true if the
 * result container has no intrinsic order, such as a {@link Set}.)
 */

The first sentence makes sense, although it is phrased in a way that also 
explains the behavior seen if you flag an order sensitive collector as 
unordered.  The second sentence is weird, it implies a relation between 
collectors, encounter order, and an inherent ordering or iteration order of a 
collection object that might be the target of a collection. To me the important 
property of a Collection object with regards to ordering of a Collector that 
fills it is whether reordering the insertions to the collection can change the 
resultant state, for a Set this is clearly true since only the first presented 
element that is equal will be stored.

There is also a paragraph in the Collector javadoc:

* <p>For collectors that do not have the {@code UNORDERED} characteristic,
* two accumulated results {@code a1} and {@code a2} are equivalent if
* {@code finisher.apply(a1).equals(finisher.apply(a2))}.  For unordered
* collectors, equivalence is relaxed to allow for non-equality related to
* differences in order.  (For example, an unordered collector that accumulated
* elements to a {@code List} would consider two lists equivalent if they
* contained the same elements, ignoring order.)

This seems weird, why would we try to define correctness of a parallel 
reduction against a collector that was sensitive to ordering.

Finally, when investigating the properties of the collectors returned from the 
Collectors static factory I was surprised to discover that none of the 
collectors that are truly unordered were marked as such (counting, min, max, 
averaging, summing, summarizing), and the only collector that was marked as 
unordered was Collectors.toSet(), which although it is explicitly marked as 
unordered seems like it really shouldn’t be.

Whats going on here?  Which parts of all this are intended and which (if any) 
are bugs?

Thanks in advance for any enlightenment,

Chris Dennis

P.S. Coincidentally the unordered toSet() collector works perfectly at the 
moment due to the happy interaction of the Spliterator.trySplit() contract and 
the folding/combining behavior of the stream implementations parallel reduction 
algorithm.

Reply via email to