On 11/06/2015 02:39 PM, Paul Sandoz wrote:
On 6 Nov 2015, at 14:19, Fabrizio Giudici <fabrizio.giud...@tidalwave.it> wrote:


I logged an issue:

  https://bugs.openjdk.java.net/browse/JDK-8141608 
Thanks to Remi and Paul for the complete explanation. Concerning JDK-8141608, I 
like Peter Levart's comment about making a specific Collector.
There is a problem with that approach. At the moment the Collector does not get 
to control whether the stream is executed in parallel or sequentially.

Paul.
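
To illustrate Paul's point, here is a minimal sketch (the class and the collect helper are made up for this example). The same Collector instance is handed to a sequential and to a parallel pipeline; the execution mode is a property of the stream, chosen by the caller, and the Collector has no hook to change it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ExecutionModeDemo {
    // Hypothetical helper: the caller picks the mode on the stream;
    // the collector is simply passed along and has no say in it.
    static <T, R> R collect(List<T> source, boolean parallel,
                            Collector<? super T, ?, R> collector) {
        Stream<T> stream = parallel ? source.parallelStream() : source.stream();
        return stream.collect(collector);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c");
        Collector<CharSequence, ?, String> joining = Collectors.joining(",");
        // Same collector, two execution modes.
        System.out.println(collect(lines, false, joining));
        System.out.println(collect(lines, true, joining));
    }
}
```

Collectors.joining is an ordered, non-concurrent collector, so both calls produce the same joined string; only the way the work is split differs.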

Right, but a Collector can also be concurrent, collecting into the same container (or file), like:

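// Needs: java.util.ArrayList, java.util.stream.Collector
public static <E> Collector<E, ArrayList<E>, ArrayList<E>> synchronizedArrayListCollector() {
    // One shared list: every supplier call hands out the same container.
    ArrayList<E> list = new ArrayList<>();
    return Collector.of(
            () -> list,
            (l, e) -> {
                // Accumulation may run on several threads, so guard each add.
                synchronized (l) {
                    l.add(e);
                }
            },
            // Both arguments are the shared list, so there is nothing to merge.
            (l1, l2) -> l1,
            Collector.Characteristics.CONCURRENT,
            Collector.Characteristics.IDENTITY_FINISH
    );
}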
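
One possible way to exercise the collector above, as a sketch only (the demo class and the collectUnordered helper are made up here). According to the Stream.collect documentation, a concurrent reduction is performed only when the stream is parallel, the collector is CONCURRENT, and either the stream or the collector is unordered; hence the unordered() call below. The collector is repeated so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class SynchronizedCollectorDemo {
    // Same shape as the collector above, repeated for self-containment.
    static <E> Collector<E, ArrayList<E>, ArrayList<E>> synchronizedArrayListCollector() {
        ArrayList<E> list = new ArrayList<>();
        return Collector.of(
                () -> list,
                (l, e) -> {
                    synchronized (l) {
                        l.add(e);
                    }
                },
                (l1, l2) -> l1,
                Collector.Characteristics.CONCURRENT,
                Collector.Characteristics.IDENTITY_FINISH);
    }

    // Hypothetical helper: collects 0..n-1 from an unordered parallel stream.
    static ArrayList<Integer> collectUnordered(int n) {
        return IntStream.range(0, n)
                .boxed()
                .parallel()
                .unordered() // allows the library to choose a concurrent reduction
                .collect(SynchronizedCollectorDemo.<Integer>synchronizedArrayListCollector());
    }

    public static void main(String[] args) {
        ArrayList<Integer> result = collectUnordered(10_000);
        // Every element arrives exactly once; the order is unspecified.
        System.out.println(result.size());
    }
}
```

Note that each call to synchronizedArrayListCollector() creates one shared list, so a collector instance should not be reused across streams unless accumulating into the same list is intended.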

...or it can even be more sophisticated, collecting into multiple local buffers (lists) and combining them all in small batches into a single container (or file), like:

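// Needs: java.util.ArrayList, java.util.stream.Collector
public static <E> Collector<E, ArrayList<E>, ArrayList<E>> batchedArrayListCollector() {
    // Shared destination; elements are first gathered in local buffers.
    ArrayList<E> list = new ArrayList<>();
    return Collector.of(
            ArrayList::new,   // a fresh local buffer per supplier call
            ArrayList::add,   // unsynchronized: buffers are not shared during accumulation
            (l1, l2) -> {
                // Flush each local buffer into the shared list as one batch.
                if (l1 != list) synchronized (list) {
                    list.addAll(l1);
                }
                if (l2 != list) synchronized (list) {
                    list.addAll(l2);
                }
                return list;
            },
            (l) -> {
                // If the combiner never ran (e.g. a sequential stream),
                // the single buffer is flushed here.
                if (l != list) {
                    list.addAll(l);
                }
                return list;
            }
    );
}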


If the stream is serial, it keeps encounter order. If it is parallel, it does not, but that is something you might not care about if you are going parallel.
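
A small sketch of that ordering behaviour, with the batched collector repeated so the example compiles on its own (the demo class and the collect helper are made up for illustration). The serial run yields the elements in encounter order; the parallel run yields the same elements, but the order depends on which buffers get flushed first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class EncounterOrderDemo {
    // Copy of the batched collector above.
    static <E> Collector<E, ArrayList<E>, ArrayList<E>> batchedArrayListCollector() {
        ArrayList<E> list = new ArrayList<>();
        return Collector.of(
                ArrayList::new,
                ArrayList::add,
                (l1, l2) -> {
                    if (l1 != list) synchronized (list) { list.addAll(l1); }
                    if (l2 != list) synchronized (list) { list.addAll(l2); }
                    return list;
                },
                l -> {
                    if (l != list) list.addAll(l);
                    return list;
                });
    }

    // Hypothetical helper: collects 0..n-1 either serially or in parallel.
    static List<Integer> collect(int n, boolean parallel) {
        IntStream range = IntStream.range(0, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.boxed()
                .collect(EncounterOrderDemo.<Integer>batchedArrayListCollector());
    }

    public static void main(String[] args) {
        List<Integer> expected =
                IntStream.range(0, 1000).boxed().collect(Collectors.toList());
        List<Integer> serial = collect(1000, false);
        List<Integer> parallel = collect(1000, true);
        System.out.println("serial in encounter order:   " + serial.equals(expected));
        // May print false: batches land in whatever order the threads finish.
        System.out.println("parallel in encounter order: " + parallel.equals(expected));
        System.out.println("parallel has all elements:   "
                + parallel.containsAll(expected));
    }
}
```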

How would you otherwise write a parallel stream into a file, if not by using a Collector? Would you make it serial first?
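
For illustration, a file-writing Collector could look roughly like the sketch below. This is only an assumption about what such a Collector might be, not what JDK-8141608 proposes: linesToFile is a made-up name, and the approach differs from the shared-container collectors above in that it buffers lines per thread, merges the buffers in encounter order, and writes the file once in the finisher. That keeps the line order even for parallel streams, at the cost of holding all lines in memory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class FileCollectorSketch {
    // Hypothetical helper, not a JDK API: collects lines and writes them to target.
    static Collector<CharSequence, List<CharSequence>, Path> linesToFile(Path target) {
        return Collector.of(
                ArrayList::new,                 // one buffer per chunk of the stream
                List::add,                      // buffer the lines of that chunk
                (left, right) -> {              // merge buffers, left chunk first
                    left.addAll(right);
                    return left;
                },
                lines -> {                      // single write once everything is merged
                    try {
                        return Files.write(target, lines, StandardCharsets.UTF_8);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("lines", ".txt");
        Stream.of("first", "second", "third")
                .parallel()
                .collect(linesToFile(target));
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

With this shape the call site stays a terminal operation on the stream itself, whether the stream is serial or parallel.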

Regards, Peter

IMHO it fits the API design better, as in the end we're collecting the 
stream's products into a file. I think it would also be better for 
readability, as it would preserve the typical pattern of Stream code, where you 
don't pass a reference to the Stream to a method, but invoke a terminal 
operation on the stream itself.

Thanks.

--
Fabrizio Giudici - Java Architect @ Tidalwave s.a.s.
"We make Java work. Everywhere."
http://tidalwave.it/fabrizio/blog - fabrizio.giud...@tidalwave.it
