Hi Viktor,

Thanks, that is useful to know.

To ensure I understand your recommendation correctly, would you discourage
the use of *Stream.generate(queue::poll).takeWhile(Objects::nonNull)*
because of its unordered nature?

Best,
Jige Yu

On Mon, Mar 9, 2026 at 3:41 AM Viktor Klang <[email protected]> wrote:

> I personally place emphasis on writing code that is correct regardless of
> whether the stream is sequential or parallel, so in that case, using an
> unordered generator where an ordered generator is required for correctness
> would be considered, by me, as a bug.
>
> On 2026-03-08 02:43, Jige Yu wrote:
>
> Viktor, can you be more specific about how you see this statement apply to
> the question of generate()'s unordered-ness?
>
> On Fri, Mar 6, 2026 at 1:07 AM Viktor Klang <[email protected]>
> wrote:
>
>> From that documentation:
>>
>> *«For sequential streams, the presence or absence of an encounter order
>> does not affect performance, only determinism. If a stream is ordered,
>> repeated execution of identical stream pipelines on an identical source
>> will produce an identical result; if it is not ordered, repeated execution
>> might produce different results.»*
>>
>> On 2026-03-06 04:00, Jige Yu wrote:
>>
>> On Thu, Mar 5, 2026 at 3:13 PM Viktor Klang <[email protected]>
>> wrote:
>>
>>> >And if generate() is unordered, is it by-spec safe to depend on the
>>> elements being delivered in the order they are generated, at all?
>>>
>>> Encounter order for unordered streams is described here:
>>> https://docs.oracle.com/en/java/javase/25/docs/api/java.base/java/util/stream/package-summary.html#Ordering
>>>
>>
>> Yeah. Specifically this statement:
>>
>> > if the source of a stream is a List containing [1, 2, 3], then the
>> result of executing map(x -> x*2) must be [2, 4, 6]. However, if the
>> source has no defined encounter order, then any permutation of the values
>> [2, 4, 6] would be a valid result.
>>
>> generate()'s source isn't a List, and it's specified as "an infinite
>> sequential unordered stream", which to me reads as "it has no defined
>> encounter order".
>>
>> And if that reading is correct, then any permutation of the generated
>> results would be a valid result?
>>
>> On 2026-03-06 00:04, Jige Yu wrote:
>>>
>>> From what I can see, many of these supplier lambdas do return null
>>> idempotently, such as generate(queue::poll).
>>>
>>> But with enough usage, I suspect we'll run into scenarios that may trip
>>> if called again after returning null.
>>>
>>> And if generate() is unordered, is it by-spec safe to depend on the
>>> elements being delivered in the order they are generated, at all?
>>>
>>> On Thu, Mar 5, 2026 at 1:33 AM Viktor Klang <[email protected]>
>>> wrote:
>>>
>>>> Is the supplier's get()-method allowed to be invoked *after* it has
>>>> previously returned *null?*
>>>>
>>>> On 2026-03-05 06:54, Jige Yu wrote:
>>>>
>>>> Makes sense.
>>>>
>>>> What do you guys think of the idiom of
>>>> generate(supplierThatEventuallyReturnsNull) + takeWhile()? Should it be
>>>> avoided?
>>>>
>>>> On Wed, Mar 4, 2026 at 6:59 AM Viktor Klang <[email protected]>
>>>> wrote:
>>>>
>>>>> >In our codebase, I see some developers using iterate() + takeWhile()
>>>>> and others using generate() + takeWhile(). I am debating whether to
>>>>> raise a concern about this pattern. Most likely, people won't insert
>>>>> intermediary operations between them, and I worry I might be
>>>>> overthinking it.
>>>>>
>>>>> In this specific case I'd argue that it's more correct (and more
>>>>> performant, and less code) to just use the 3-arg iterate.
>>>>>
>>>>> >or should I reconsider my warnings about side effects being
>>>>> rearranged in sequential streams?
>>>>>
>>>>> Personally I prefer my Streams correct regardless of underlying
>>>>> implementation and regardless of whether the stream isParallel() or not.
>>>>>
>>>>> On 2026-03-03 20:29, Jige Yu wrote:
>>>>>
>>>>> Hi Viktor,
>>>>>
>>>>> Thanks for the explanation!
>>>>>
>>>>> I also experimented with adding parallel() in the middle, and it
>>>>> indeed threw a NullPointerException even without distinct().
>>>>>
>>>>> In our codebase, I see some developers using iterate() + takeWhile()
>>>>> and others using generate() + takeWhile(). I am debating whether to
>>>>> raise a concern about this pattern. Most likely, people won't insert
>>>>> intermediary operations between them, and I worry I might be
>>>>> overthinking it.
>>>>>
>>>>> However, generate(supplierThatMayReturnNull).takeWhile() seems even
>>>>> more precarious. Since generate() is documented as unordered, could it
>>>>> potentially return elements out of encounter order, such as swapping a
>>>>> later null with an earlier non-null return?
>>>>>
>>>>> This brings me back to the rationale I’ve used to discourage side
>>>>> effects in map() and filter(). In a sequential stream, I’ve argued that
>>>>> relying on side effects from an earlier map() to be visible in a
>>>>> subsequent map() is unsafe because the stream is theoretically free to
>>>>> process multiple elements through the first map() before starting the
>>>>> second.
>>>>>
>>>>> Is that view too pedantic? If we can safely assume iterate() +
>>>>> takeWhile() is stable in non-parallel streams, should the same logic
>>>>> apply to subsequent map() calls with side effects (style issues aside)?
>>>>>
>>>>> I’m trying to find a consistent theory. Should I advise my colleagues
>>>>> that iterate() + takeWhile() and generate() + takeWhile() are unsafe, or
>>>>> should I reconsider my warnings about side effects being rearranged in
>>>>> sequential streams?
>>>>>
>>>>> I hope that clarifies the root of my confusion.
>>>>>
>>>>> Best,
>>>>> Jige Yu
>>>>>
>>>>> On Mon, Mar 2, 2026 at 6:08 AM Viktor Klang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Jige,
>>>>>>
>>>>>> I think I understand what you mean.
>>>>>> In this case you're trying to prevent a `null`-return from
>>>>>> `nextOrNull()` from being fed into the next iteration and thus
>>>>>> throwing a NullPointerException.
>>>>>>
>>>>>> Now the answer is going to be a bit more nuanced than you might want
>>>>>> to hear, but in the spirit of providing clarity, the code which you
>>>>>> provided will "work" under the assumption that there is no "buffer"
>>>>>> in between iterate(…) and takeWhile(…).
>>>>>>
>>>>>> TL;DR: use Stream.iterate(seed, e -> e != null, e -> e.nextOrNull())
>>>>>>
>>>>>> Long version:
>>>>>> Imagine we have the following:
>>>>>> ```java
>>>>>> record E(E e) {}
>>>>>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>>>>>       .takeWhile(Objects::nonNull)
>>>>>>       .forEach(IO::println)
>>>>>> ```
>>>>>> We get:
>>>>>> ```java
>>>>>> E[e=E[e=E[e=null]]]
>>>>>> E[e=E[e=null]]
>>>>>> E[e=null]
>>>>>> ```
>>>>>> However, if we do:
>>>>>> ```java
>>>>>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>>>>>       .gather(
>>>>>>           Gatherer.<E,ArrayList<E>,E>ofSequential(
>>>>>>               ArrayList::new,
>>>>>>               (l, e, _) -> l.add(e),
>>>>>>               (l, d) -> l.forEach(d::push)
>>>>>>           )
>>>>>>       )
>>>>>>       .takeWhile(Objects::nonNull)
>>>>>>       .forEach(IO::println)
>>>>>> ```
>>>>>> We get:
>>>>>> ```java
>>>>>> Exception java.lang.NullPointerException: Cannot invoke
>>>>>> "REPL.$JShell$16$E.e()" because "<parameter1>" is null
>>>>>>       at lambda$do_it$$0 (#5:1)
>>>>>>       at Stream$1.tryAdvance (Stream.java:1515)
>>>>>>       at ReferencePipeline.forEachWithCancel (ReferencePipeline.java:147)
>>>>>>       at AbstractPipeline.copyIntoWithCancel (AbstractPipeline.java:588)
>>>>>>       at AbstractPipeline.copyInto (AbstractPipeline.java:574)
>>>>>>       at AbstractPipeline.wrapAndCopyInto (AbstractPipeline.java:560)
>>>>>>       at ForEachOps$ForEachOp.evaluateSequential (ForEachOps.java:153)
>>>>>>       at ForEachOps$ForEachOp$OfRef.evaluateSequential (ForEachOps.java:176)
>>>>>>       at AbstractPipeline.evaluate (AbstractPipeline.java:265)
>>>>>>       at ReferencePipeline.forEach (ReferencePipeline.java:632)
>>>>>>       at (#5:9)
>>>>>> ```
>>>>>> But if we introduce something like `distinct()` in between, it will
>>>>>> "work" under sequential processing,
>>>>>> but under parallel processing it might not, as the distinct operation
>>>>>> will have to buffer *separately* from takeWhile:
>>>>>> ```java
>>>>>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>>>>>       .distinct()
>>>>>>       .takeWhile(Objects::nonNull)
>>>>>>       .forEach(IO::println)
>>>>>> ```
>>>>>> ```java
>>>>>> E[e=E[e=E[e=null]]]
>>>>>> E[e=E[e=null]]
>>>>>> E[e=null]
>>>>>> ```
>>>>>> Parallel:
>>>>>> ```java
>>>>>> Stream.iterate(new E(new E(new E(null))), e -> e.e())
>>>>>>       .parallel()
>>>>>>       .distinct()
>>>>>>       .takeWhile(Objects::nonNull)
>>>>>>       .forEach(IO::println)
>>>>>> ```
>>>>>> ```java
>>>>>> Exception java.lang.NullPointerException: Cannot invoke
>>>>>> "REPL.$JShell$16$E.e()" because "<parameter1>" is null
>>>>>>       at lambda$do_it$$0 (#7:1)
>>>>>>       at Stream$1.tryAdvance (Stream.java:1515)
>>>>>>       at Spliterators$AbstractSpliterator.trySplit (Spliterators.java:1447)
>>>>>>       at AbstractTask.compute (AbstractTask.java:308)
>>>>>>       at CountedCompleter.exec (CountedCompleter.java:759)
>>>>>>       at ForkJoinTask.doExec (ForkJoinTask.java:511)
>>>>>>       at ForkJoinTask.invoke (ForkJoinTask.java:683)
>>>>>>       at ReduceOps$ReduceOp.evaluateParallel (ReduceOps.java:927)
>>>>>>       at DistinctOps$1.reduce (DistinctOps.java:64)
>>>>>>       at DistinctOps$1.opEvaluateParallelLazy (DistinctOps.java:110)
>>>>>>       at AbstractPipeline.sourceSpliterator (AbstractPipeline.java:495)
>>>>>>       at AbstractPipeline.evaluate (AbstractPipeline.java:264)
>>>>>>       at ReferencePipeline.forEach (ReferencePipeline.java:632)
>>>>>>       at (#7:4)
>>>>>> ```
>>>>>>
>>>>>> On 2026-03-01 06:29, Jige Yu wrote:
>>>>>>
>>>>>> Hi @core-libs-dev,
>>>>>> I am looking to validate the following idiom:
>>>>>> Stream.iterate(seed, e -> e.nextOrNull())
>>>>>>       .takeWhile(Objects::nonNull);
>>>>>> The
>>>>>> intent is for the stream to call nextOrNull() repeatedly until it
>>>>>> returns null. However, I am concerned about where the Stream
>>>>>> specification guarantees the correctness of this approach regarding
>>>>>> happens-before relationships.
>>>>>>
>>>>>> The iterate() Javadoc defines happens-before for the function passed
>>>>>> to it, stating that the action of applying f for one element
>>>>>> happens-before the action of applying it for subsequent elements.
>>>>>> However, it seems silent on the happens-before relationship with
>>>>>> downstream operations like takeWhile().
>>>>>>
>>>>>> My concern stems from the general discouragement of side effects in
>>>>>> stream operations. For example, relying on side effects between
>>>>>> subsequent map() calls is considered brittle because a stream might
>>>>>> invoke the first map() on multiple elements before the second map()
>>>>>> processes the first element.
>>>>>>
>>>>>> If this theory holds, is there anything theoretically preventing
>>>>>> iterate() from generating multiple elements before takeWhile() evaluates
>>>>>> the first one? I may be overthinking this, but I would appreciate your
>>>>>> insights into why side effects are discouraged even in ordered,
>>>>>> sequential streams and whether this specific idiom is safe.
>>>>>>
>>>>>> Appreciate your help!
>>>>>>
>>>>>> Best regards,
>>>>>> Jige Yu
>>>>>>
>>>>>> --
>>>>>> Cheers,
>>>>>> √
>>>>>>
>>>>>> Viktor Klang
>>>>>> Software Architect, Java Platform Group
>>>>>> Oracle
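
For readers following the thread, here is a minimal sketch of the three-argument `Stream.iterate` that Viktor recommends in place of `iterate() + takeWhile()`. The `Node` record and the `names` and `drain` helpers are hypothetical names introduced only for illustration. `drain` is an assumption on my part rather than something proposed in the thread: it applies the same ordered source to the `queue::poll` case, with the caveat that the seed is polled eagerly when the stream is built.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Objects;
import java.util.Queue;
import java.util.stream.Stream;

class IterateSketch {
    // Hypothetical element type standing in for the thread's nextOrNull() example.
    record Node(String name, Node next) {
        Node nextOrNull() { return next; }
    }

    // The hasNext predicate is checked on each candidate element before the
    // next function is ever applied to it, so nextOrNull() is not called on null.
    static List<String> names(Node seed) {
        return Stream.iterate(seed, Objects::nonNull, Node::nextOrNull)
                .map(Node::name)
                .toList();
    }

    // Assumption: an ordered way to drain a queue. The seed is polled eagerly,
    // at stream construction time, and poll() is not called again once it has
    // returned null.
    static <T> List<T> drain(Queue<T> queue) {
        return Stream.iterate(queue.poll(), Objects::nonNull, ignored -> queue.poll())
                .toList();
    }

    public static void main(String[] args) {
        Node chain = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(names(chain));                               // [a, b, c]
        System.out.println(names(null));                                // []
        System.out.println(drain(new ArrayDeque<>(List.of(1, 2, 3))));  // [1, 2, 3]
    }
}
```

Because the termination test is part of the source itself, this form does not rely on `takeWhile()` seeing each element before the next one is generated, which is the assumption the buffering examples above break.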
