Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
On 31 Jul 2015, at 23:19, Stuart Marks stuart.ma...@oracle.com wrote: Hi all, The sequential() and parallel() methods on a stream set the execution mode for the entire pipeline. Unfortunately this isn't particularly clear from the documentation. This has been a recurring question. Please review this change to the java.util.stream package documentation to help clarify this point. I've also added a couple anchor tags that were missing from various sections. Looks ok. Paul. This is intended to be a purely editorial change, not a specification change. Bug: https://bugs.openjdk.java.net/browse/JDK-8132800 Webrev: http://cr.openjdk.java.net/~smarks/reviews/8132800/webrev.0/ Thanks, s'marks
Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
On 4 Aug 2015, at 01:09, Stuart Marks stuart.ma...@oracle.com wrote: Hi Tagir, Interesting issues. Regarding Stream.concat, it may be that, today, changes to the sequential/parallel execution mode aren't propagated to the streams being concatenated. The execution mode is propagated if either stream to concat is parallel i.e. isParallel() The issue here is that Stream.spltierator() is a form of terminal operation that will result in pipeline evaluation if there are non-lazy stateful operations present. What we don’t currently do is propagate the parallelism back to a sequential stream when the other is a parallel stream. We could easily do that as a bug fix. I agree with Stuart it does not require any specification e.g.: public static T StreamT concat(Stream? extends T a, Stream? extends T b) { Objects.requireNonNull(a); Objects.requireNonNull(b); boolean isPar = a.isParallel() || b.isParallel(); @SuppressWarnings(unchecked) SpliteratorT split = new Streams.ConcatSpliterator.OfRef( (SpliteratorT) (isPar ? a.parallel() : a).spliterator(), (SpliteratorT) (isPar ? b.parallel() : b).spliterator()); StreamT stream = StreamSupport.stream(split, isPar); return stream.onClose(Streams.composedClose(a, b)); } But is that something inherent to the specification of concatenation, or is it something that might change in the future? It's currently unspecified, so adding a clarification really sounds more like changing the specification to reflect the current implementation, which I'd prefer not to do. Regarding the default implementations of takeWhile/dropWhile, again, today, they don't propagate the execution mode upstream. But is this just a bug? Granted the API for doing so isn't obvious, but isn't this something that could just be fixed? The default implementations are specified to propagate the execution mode in terms of correctly reporting isParallel() but may choose not to split: * the wrapped spliterator. The returned stream preserves the execution * characteristics of this stream (namely parallel or sequential execution * as per {@link #isParallel()}) but the wrapped spliterator may choose to * not support splitting. When the returned stream is closed, the close * handlers for both the returned and this stream are invoked. So this is a little different from the Stream.concat case. The only way a default implementation can be implemented is to derive functionality from the upstream spliterator(). Rather than choosing to add yet more code to support this (and most likely poorly too) i opted for not splitting and reuse some existing code (the unordered spliterators that are configured not to split). Nor did i want to wrap the non-splitting spliterator around one which copies a prefix, which introduces a different set of poorly splitting characteristics (and would also penalise sequential operation). This is a tradeoff, and seems to me a reasonable compromise for a default. Paul.
Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
On 4 Aug 2015, at 09:20, Paul Sandoz paul.san...@oracle.com wrote: On 4 Aug 2015, at 01:09, Stuart Marks stuart.ma...@oracle.com wrote: Hi Tagir, Interesting issues. Regarding Stream.concat, it may be that, today, changes to the sequential/parallel execution mode aren't propagated to the streams being concatenated. The execution mode is propagated if either stream to concat is parallel i.e. isParallel() The issue here is that Stream.spltierator() is a form of terminal operation that will result in pipeline evaluation if there are non-lazy stateful operations present. What we don’t currently do is propagate the parallelism back to a sequential stream when the other is a parallel stream. We could easily do that as a bug fix. I agree with Stuart it does not require any specification e.g.: public static T StreamT concat(Stream? extends T a, Stream? extends T b) { Objects.requireNonNull(a); Objects.requireNonNull(b); boolean isPar = a.isParallel() || b.isParallel(); @SuppressWarnings(unchecked) SpliteratorT split = new Streams.ConcatSpliterator.OfRef( (SpliteratorT) (isPar ? a.parallel() : a).spliterator(), (SpliteratorT) (isPar ? b.parallel() : b).spliterator()); StreamT stream = StreamSupport.stream(split, isPar); return stream.onClose(Streams.composedClose(a, b)); } And on reflection this may not always be the right thing to do, there are surprises either way. I am leaning towards not doing this and Stream.concat should obey the mode of the stream that is passed to it in terms of each evaluation. Thereby it is explicit as per what the caller declared. It’s probably too late to specify that, but we could and something to the @implNote stating the current implementation behaviour and add an @apiNote on guaranteeing parallel execution of the two streams to concat. Paul.
Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
Hello! On Wed, Aug 5, 2015 at 3:53 AM, Stuart Marks stuart.ma...@oracle.com wrote: I've gone ahead and added this to the implementation note section of Stream.concat + * pSubsequent changes to the sequential/parallel execution mode of the + * returned stream are not guaranteed to be propagated to the input streams. + * and I've pushed the change. The situation with Stream.takeWhile/dropWhile seems a bit sticky. On the one hand, the default implementations seem to make a reasonable set of compromises. On the other hand, I can't think of anything useful to add to the description that's already there. If somebody can think of something useful to add, by all means propose it, but I can't think of anything that usefully clarifies the situation without diving into a bunch of implementation details. Agreed. Thanks. Tagir Valeev. s'marks On 8/4/15 12:20 AM, Paul Sandoz wrote: On 4 Aug 2015, at 01:09, Stuart Marks stuart.ma...@oracle.com wrote: Hi Tagir, Interesting issues. Regarding Stream.concat, it may be that, today, changes to the sequential/parallel execution mode aren't propagated to the streams being concatenated. The execution mode is propagated if either stream to concat is parallel i.e. isParallel() The issue here is that Stream.spltierator() is a form of terminal operation that will result in pipeline evaluation if there are non-lazy stateful operations present. What we don’t currently do is propagate the parallelism back to a sequential stream when the other is a parallel stream. We could easily do that as a bug fix. I agree with Stuart it does not require any specification e.g.: public static T StreamT concat(Stream? extends T a, Stream? extends T b) { Objects.requireNonNull(a); Objects.requireNonNull(b); boolean isPar = a.isParallel() || b.isParallel(); @SuppressWarnings(unchecked) SpliteratorT split = new Streams.ConcatSpliterator.OfRef( (SpliteratorT) (isPar ? a.parallel() : a).spliterator(), (SpliteratorT) (isPar ? b.parallel() : b).spliterator()); StreamT stream = StreamSupport.stream(split, isPar); return stream.onClose(Streams.composedClose(a, b)); } But is that something inherent to the specification of concatenation, or is it something that might change in the future? It's currently unspecified, so adding a clarification really sounds more like changing the specification to reflect the current implementation, which I'd prefer not to do. Regarding the default implementations of takeWhile/dropWhile, again, today, they don't propagate the execution mode upstream. But is this just a bug? Granted the API for doing so isn't obvious, but isn't this something that could just be fixed? The default implementations are specified to propagate the execution mode in terms of correctly reporting isParallel() but may choose not to split: * the wrapped spliterator. The returned stream preserves the execution * characteristics of this stream (namely parallel or sequential execution * as per {@link #isParallel()}) but the wrapped spliterator may choose to * not support splitting. When the returned stream is closed, the close * handlers for both the returned and this stream are invoked. So this is a little different from the Stream.concat case. The only way a default implementation can be implemented is to derive functionality from the upstream spliterator(). Rather than choosing to add yet more code to support this (and most likely poorly too) i opted for not splitting and reuse some existing code (the unordered spliterators that are configured not to split). Nor did i want to wrap the non-splitting spliterator around one which copies a prefix, which introduces a different set of poorly splitting characteristics (and would also penalise sequential operation). This is a tradeoff, and seems to me a reasonable compromise for a default. Paul.
Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
I've gone ahead and added this to the implementation note section of Stream.concat + * pSubsequent changes to the sequential/parallel execution mode of the + * returned stream are not guaranteed to be propagated to the input streams. + * and I've pushed the change. The situation with Stream.takeWhile/dropWhile seems a bit sticky. On the one hand, the default implementations seem to make a reasonable set of compromises. On the other hand, I can't think of anything useful to add to the description that's already there. If somebody can think of something useful to add, by all means propose it, but I can't think of anything that usefully clarifies the situation without diving into a bunch of implementation details. s'marks On 8/4/15 12:20 AM, Paul Sandoz wrote: On 4 Aug 2015, at 01:09, Stuart Marks stuart.ma...@oracle.com wrote: Hi Tagir, Interesting issues. Regarding Stream.concat, it may be that, today, changes to the sequential/parallel execution mode aren't propagated to the streams being concatenated. The execution mode is propagated if either stream to concat is parallel i.e. isParallel() The issue here is that Stream.spltierator() is a form of terminal operation that will result in pipeline evaluation if there are non-lazy stateful operations present. What we don’t currently do is propagate the parallelism back to a sequential stream when the other is a parallel stream. We could easily do that as a bug fix. I agree with Stuart it does not require any specification e.g.: public static T StreamT concat(Stream? extends T a, Stream? extends T b) { Objects.requireNonNull(a); Objects.requireNonNull(b); boolean isPar = a.isParallel() || b.isParallel(); @SuppressWarnings(unchecked) SpliteratorT split = new Streams.ConcatSpliterator.OfRef( (SpliteratorT) (isPar ? a.parallel() : a).spliterator(), (SpliteratorT) (isPar ? b.parallel() : b).spliterator()); StreamT stream = StreamSupport.stream(split, isPar); return stream.onClose(Streams.composedClose(a, b)); } But is that something inherent to the specification of concatenation, or is it something that might change in the future? It's currently unspecified, so adding a clarification really sounds more like changing the specification to reflect the current implementation, which I'd prefer not to do. Regarding the default implementations of takeWhile/dropWhile, again, today, they don't propagate the execution mode upstream. But is this just a bug? Granted the API for doing so isn't obvious, but isn't this something that could just be fixed? The default implementations are specified to propagate the execution mode in terms of correctly reporting isParallel() but may choose not to split: * the wrapped spliterator. The returned stream preserves the execution * characteristics of this stream (namely parallel or sequential execution * as per {@link #isParallel()}) but the wrapped spliterator may choose to * not support splitting. When the returned stream is closed, the close * handlers for both the returned and this stream are invoked. So this is a little different from the Stream.concat case. The only way a default implementation can be implemented is to derive functionality from the upstream spliterator(). Rather than choosing to add yet more code to support this (and most likely poorly too) i opted for not splitting and reuse some existing code (the unordered spliterators that are configured not to split). Nor did i want to wrap the non-splitting spliterator around one which copies a prefix, which introduces a different set of poorly splitting characteristics (and would also penalise sequential operation). This is a tradeoff, and seems to me a reasonable compromise for a default. Paul.
Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
Hi Tagir, Interesting issues. Regarding Stream.concat, it may be that, today, changes to the sequential/parallel execution mode aren't propagated to the streams being concatenated. But is that something inherent to the specification of concatenation, or is it something that might change in the future? It's currently unspecified, so adding a clarification really sounds more like changing the specification to reflect the current implementation, which I'd prefer not to do. Regarding the default implementations of takeWhile/dropWhile, again, today, they don't propagate the execution mode upstream. But is this just a bug? Granted the API for doing so isn't obvious, but isn't this something that could just be fixed? s'marks On 7/31/15 9:36 PM, Tagir Valeev wrote: Hello! Probably additional clarifications should be added about Stream.concat method (and primitive analogs). Currently it's unclear that the resulting stream is actually the new pipeline, which do not propagate parallel status to the concatenated streams. See also my StackOverflow question: http://stackoverflow.com/q/30464397/4856258 Also things become more complex in JDK9 with the introduction of takeWhile/dropWhile. For JDK Stream implementation these two are normal intermediate operations, but if it happens that user has his own stream implementation or delegate to JDK stream, then the default takeWhile/dropWhile implementation is used which is actually starts new pipeline at given point, so further changes of parallel/sequential mode are not propagated to the initial pipeline. In my StreamEx library I call such operations as quasi-intermediate, and explicitly clarify this point. See the JavaDoc: http://amaembo.github.io/streamex/javadoc/javax/util/streamex/package-summary.html#StreamOps Probably some clarification should be added to @implSpec section of takeWhile/dropWhile. With best regards, Tagir Valeev. On Sat, Aug 1, 2015 at 3:19 AM, Stuart Marks stuart.ma...@oracle.com mailto:stuart.ma...@oracle.com wrote: Hi all, The sequential() and parallel() methods on a stream set the execution mode for the entire pipeline. Unfortunately this isn't particularly clear from the documentation. This has been a recurring question. Please review this change to the java.util.stream package documentation to help clarify this point. I've also added a couple anchor tags that were missing from various sections. This is intended to be a purely editorial change, not a specification change. Bug: https://bugs.openjdk.java.net/browse/JDK-8132800 Webrev: http://cr.openjdk.java.net/~smarks/reviews/8132800/webrev.0/ http://cr.openjdk.java.net/%7Esmarks/reviews/8132800/webrev.0/ Thanks, s'marks
Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
Hello, Stuart! On Tue, Aug 4, 2015 at 5:09 AM, Stuart Marks stuart.ma...@oracle.com wrote: Regarding Stream.concat, it may be that, today, changes to the sequential/parallel execution mode aren't propagated to the streams being concatenated. But is that something inherent to the specification of concatenation, or is it something that might change in the future? It's currently unspecified, so adding a clarification really sounds more like changing the specification to reflect the current implementation, which I'd prefer not to do. That's fine. In this case it's probably should be explicitly mentioned something like note that further change of the stream mode from parallel to sequential and vice versa is *not guaranteed* to propagate to the source streams. It would still be better than not speaking about this at all. Regarding the default implementations of takeWhile/dropWhile, again, today, they don't propagate the execution mode upstream. But is this just a bug? Granted the API for doing so isn't obvious, but isn't this something that could just be fixed? It would be nice if it can be fixed, but given the current API I see no easy way to do this. Probably Paul can comment on this. With best regards, Tagir Valeev. s'marks On 7/31/15 9:36 PM, Tagir Valeev wrote: Hello! Probably additional clarifications should be added about Stream.concat method (and primitive analogs). Currently it's unclear that the resulting stream is actually the new pipeline, which do not propagate parallel status to the concatenated streams. See also my StackOverflow question: http://stackoverflow.com/q/30464397/4856258 Also things become more complex in JDK9 with the introduction of takeWhile/dropWhile. For JDK Stream implementation these two are normal intermediate operations, but if it happens that user has his own stream implementation or delegate to JDK stream, then the default takeWhile/dropWhile implementation is used which is actually starts new pipeline at given point, so further changes of parallel/sequential mode are not propagated to the initial pipeline. In my StreamEx library I call such operations as quasi-intermediate, and explicitly clarify this point. See the JavaDoc: http://amaembo.github.io/streamex/javadoc/javax/util/streamex/package-summary.html#StreamOps Probably some clarification should be added to @implSpec section of takeWhile/dropWhile. With best regards, Tagir Valeev. On Sat, Aug 1, 2015 at 3:19 AM, Stuart Marks stuart.ma...@oracle.com wrote: Hi all, The sequential() and parallel() methods on a stream set the execution mode for the entire pipeline. Unfortunately this isn't particularly clear from the documentation. This has been a recurring question. Please review this change to the java.util.stream package documentation to help clarify this point. I've also added a couple anchor tags that were missing from various sections. This is intended to be a purely editorial change, not a specification change. Bug: https://bugs.openjdk.java.net/browse/JDK-8132800 Webrev: http://cr.openjdk.java.net/~smarks/reviews/8132800/webrev.0/ Thanks, s'marks
Re: RFR(s): 8132800: clarify stream package documentation regarding sequential vs parallel modes
Hello! Probably additional clarifications should be added about Stream.concat method (and primitive analogs). Currently it's unclear that the resulting stream is actually the new pipeline, which do not propagate parallel status to the concatenated streams. See also my StackOverflow question: http://stackoverflow.com/q/30464397/4856258 Also things become more complex in JDK9 with the introduction of takeWhile/dropWhile. For JDK Stream implementation these two are normal intermediate operations, but if it happens that user has his own stream implementation or delegate to JDK stream, then the default takeWhile/dropWhile implementation is used which is actually starts new pipeline at given point, so further changes of parallel/sequential mode are not propagated to the initial pipeline. In my StreamEx library I call such operations as quasi-intermediate, and explicitly clarify this point. See the JavaDoc: http://amaembo.github.io/streamex/javadoc/javax/util/streamex/package-summary.html#StreamOps Probably some clarification should be added to @implSpec section of takeWhile/dropWhile. With best regards, Tagir Valeev. On Sat, Aug 1, 2015 at 3:19 AM, Stuart Marks stuart.ma...@oracle.com wrote: Hi all, The sequential() and parallel() methods on a stream set the execution mode for the entire pipeline. Unfortunately this isn't particularly clear from the documentation. This has been a recurring question. Please review this change to the java.util.stream package documentation to help clarify this point. I've also added a couple anchor tags that were missing from various sections. This is intended to be a purely editorial change, not a specification change. Bug: https://bugs.openjdk.java.net/browse/JDK-8132800 Webrev: http://cr.openjdk.java.net/~smarks/reviews/8132800/webrev.0/ Thanks, s'marks