Re: RFR: 8072727 - add variation of Stream.iterate() that's finite

Stuart Marks Sun, 14 Feb 2016 17:05:47 -0800

Hi Tagir,

Thanks for picking this up.

I'll be at a conference this week and I won't have much time to discuss this in detail until afterward. But here are some quick thoughts about the proposal.

I'd suggest focusing on the API first before worrying about how to track the stream state with booleans, etc. Is the API convenient to use, and how well does it support the use cases we envision for it?

In particular, I can imagine a number of cases where it would be very helpful to be able to support an empty stream, or where the computation to produce the first element is the same as the computation to produce subsequent elements. Requiring a value for the first stream element is at odds with that.


Here are some ideas for use cases to try out:

 - a series of dice rolls representing a round of craps [1]
 - elements drawn from a queue until the queue is empty or until
   a sentinel is reached
 - a sequence of numbers that (probably) terminates but whose length
   isn't necessarily known in advance (e.g. Collatz sequence [2])

[1] https://en.wikipedia.org/wiki/Craps

[2] https://en.wikipedia.org/wiki/Collatz_conjecture

Note that in some cases the sentinel value that terminates the stream should be part of the stream, and in other cases it should not.


I'm sure you can find more use cases by perusing Stack Overflow. :-)

I'm a bit skeptical of the use of "iterate" for producing a finite stream. There are the usual issues with overloading, but there's also potential confusion as some forms of iterate() are infinite and others finite. I'll suggest the name "produce" instead, but there are surely better terms.

One thing to think about is where the state of the producer is stored. Is it expected to be in an argument that's passed to each invocation of the functional argument, or is it expected to be captured? I don't think there's an answer in isolation; examining use cases would probably shed some light here.


Here are a few API ideas (wildcards elided):

--

<T> Stream<T> iterate(T seed, Predicate<T> predicate, UnaryOperator<T> f)

The API from your proposal, for comparison purposes.
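
To try this form against the Collatz use case, here is a rough sketch. It assumes the predicate is tested before each element is emitted, so the element that fails the test is not part of the stream. The class name, the helper collatz(), and the spliterator body are made up for illustration; this is not the code from the webrev.

```java
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class FiniteIterateSketch {
    // Hypothetical stand-in for the proposed method, not the webrev code.
    static <T> Stream<T> iterate(T seed, Predicate<T> predicate, UnaryOperator<T> f) {
        Spliterator<T> sp = new Spliterators.AbstractSpliterator<T>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            T prev;
            boolean started, finished;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                if (finished) return false;
                // Compute the candidate only when asked, so f is never applied "one ahead".
                T t = started ? f.apply(prev) : seed;
                started = true;
                if (!predicate.test(t)) {
                    finished = true;
                    return false;
                }
                prev = t;
                action.accept(t);
                return true;
            }
        };
        return StreamSupport.stream(sp, false);
    }

    // Collatz sequence from start; the stream ends before 1 would be emitted.
    static List<Integer> collatz(int start) {
        Stream<Integer> s = iterate(start, n -> n != 1, n -> n % 2 == 0 ? n / 2 : 3 * n + 1);
        return s.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collatz(6)); // prints [6, 3, 10, 5, 16, 8, 4, 2]
    }
}
```

Under that assumption the terminating 1 never appears in the stream, and collatz(1) is empty even though a seed was supplied.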

--

<T> Stream<T> produce(Supplier<Optional<T>>)

Produces elements until an empty Optional is returned. This boxes and unboxes every element, maybe(?) alleviated by Valhalla.
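
A minimal sketch of how this might look for the "drain a queue" use case. No produce() method exists in the JDK; the class name, the drain() helper, and the spliterator-based implementation are assumptions made only to try the shape of the API.

```java
import java.util.List;
import java.util.Optional;
import java.util.Queue;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class ProduceOptionalSketch {
    // Hypothetical: emits supplier results until an empty Optional is returned.
    static <T> Stream<T> produce(Supplier<Optional<T>> supplier) {
        Spliterator<T> sp = new Spliterators.AbstractSpliterator<T>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            boolean finished;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                if (finished) return false;
                Optional<T> next = supplier.get();
                if (!next.isPresent()) {
                    finished = true;   // never call the supplier again
                    return false;
                }
                action.accept(next.get());
                return true;
            }
        };
        return StreamSupport.stream(sp, false);
    }

    // Queue.poll() returns null when the queue is empty, which
    // Optional.ofNullable turns into the end-of-stream signal.
    static <T> List<T> drain(Queue<T> queue) {
        Supplier<Optional<T>> next = () -> Optional.ofNullable(queue.poll());
        return produce(next).collect(Collectors.toList());
    }
}
```

An empty queue gives an empty stream with no special casing, which is the property the seed-based form lacks.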


--

<T> Stream<T> produce(BooleanSupplier, Supplier<T>)

Calls the BooleanSupplier; if true the next stream element is what's returned by calling the Supplier. If BooleanSupplier returns false, end of stream. If you have an iterator already, this enables


produce(iterator::hasNext, iterator::next)

But if you don't have an iterator already, coming up with the functions to satisfy the iterator-style protocol is sometimes painful.
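
For the iterator case, a sketch along these lines should do. As before, produce() is hypothetical, and the class name and fromIterator() helper are invented for the example.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class ProduceHasNextSketch {
    // Hypothetical: asks hasNext before every element, iterator-style.
    static <T> Stream<T> produce(BooleanSupplier hasNext, Supplier<T> next) {
        Spliterator<T> sp = new Spliterators.AbstractSpliterator<T>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            boolean finished;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                if (finished || !hasNext.getAsBoolean()) {
                    finished = true;
                    return false;
                }
                action.accept(next.get());
                return true;
            }
        };
        return StreamSupport.stream(sp, false);
    }

    // With an existing iterator, both arguments are plain method references.
    static <T> List<T> fromIterator(Iterator<T> iterator) {
        Stream<T> s = produce(iterator::hasNext, iterator::next);
        return s.collect(Collectors.toList());
    }
}
```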


--

<T> Stream<T> produce(Predicate<Consumer<T>> advancer)

This has an odd signature, but the function is like Spliterator.tryAdvance(). It must either call the consumer once and return true, or return false without calling the consumer.
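
Here is a sketch of this form applied to the "until a sentinel is reached" use case, with the sentinel included in the stream. The producer state is captured rather than passed in. The produce() method, the class name, and the untilSentinel() helper are all hypothetical.

```java
import java.util.List;
import java.util.Queue;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class ProduceAdvancerSketch {
    // Hypothetical: the advancer follows the Spliterator.tryAdvance() protocol.
    static <T> Stream<T> produce(Predicate<Consumer<T>> advancer) {
        Spliterator<T> sp = new Spliterators.AbstractSpliterator<T>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            boolean finished;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                if (finished) return false;
                // action::accept adapts Consumer<? super T> to Consumer<T>.
                if (!advancer.test(action::accept)) {
                    finished = true;
                    return false;
                }
                return true;
            }
        };
        return StreamSupport.stream(sp, false);
    }

    // Emits queue elements up to and including the sentinel,
    // or until the queue runs out.
    static List<String> untilSentinel(Queue<String> queue, String sentinel) {
        boolean[] done = {false};            // captured producer state
        Predicate<Consumer<String>> advancer = sink -> {
            if (done[0]) return false;       // sentinel was already emitted
            String s = queue.poll();
            if (s == null) return false;     // queue is empty
            done[0] = s.equals(sentinel);
            sink.accept(s);
            return true;
        };
        return produce(advancer).collect(Collectors.toList());
    }
}
```

Excluding the sentinel instead is a matter of testing it before calling the sink and returning false.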


--

<T> Stream<T> produce(Consumer<Consumer<T>> advancer)

A variation of the above, without a boolean return. The advancer calls the consumer one or more times to add elements to the stream. End of stream occurs when the advancer doesn't call the consumer.
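
Since the advancer may emit several elements per call, an implementation probably needs some buffering. A rough sketch, assuming a simple ArrayDeque buffer (which means null elements aren't supported here); produce(), the class name, and collatzInclusive() are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class ProduceMultiSketch {
    // Hypothetical: the stream ends when an advancer call emits nothing.
    static <T> Stream<T> produce(Consumer<Consumer<T>> advancer) {
        Spliterator<T> sp = new Spliterators.AbstractSpliterator<T>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            // Holds whatever a single advancer call emitted.
            final Deque<T> buffer = new ArrayDeque<>();
            boolean finished;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                if (buffer.isEmpty() && !finished) {
                    advancer.accept(buffer::add);
                    finished = buffer.isEmpty();
                }
                T t = buffer.poll();
                if (t == null) return false;
                action.accept(t);
                return true;
            }
        };
        return StreamSupport.stream(sp, false);
    }

    // Collatz sequence from start (start >= 1), including the terminating 1.
    static List<Integer> collatzInclusive(int start) {
        int[] cur = {start};                 // captured state; 0 means "done"
        Consumer<Consumer<Integer>> advancer = sink -> {
            if (cur[0] == 0) return;         // emit nothing: end of stream
            sink.accept(cur[0]);
            cur[0] = cur[0] == 1 ? 0 : (cur[0] % 2 == 0 ? cur[0] / 2 : 3 * cur[0] + 1);
        };
        return produce(advancer).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collatzInclusive(6)); // prints [6, 3, 10, 5, 16, 8, 4, 2, 1]
    }
}
```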


--

<T> Stream<T> produce(Supplier<Stream<T>>)

A variation of Supplier<Optional<T>> where the supplier returns a stream containing zero or more elements. The stream terminates if the supplier returns an empty stream. There is "boxing" overhead here, but we don't seem to be bothered by this with flatMap().
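
A sketch of this one, using a paged source as the example. Emptiness of each returned stream is detected through its iterator, which is just one way to do it. The produce() method, the class name, and the allPages() helper are hypothetical.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class ProduceStreamsSketch {
    // Hypothetical: concatenates supplied streams until an empty one is returned.
    static <T> Stream<T> produce(Supplier<Stream<T>> supplier) {
        Spliterator<T> sp = new Spliterators.AbstractSpliterator<T>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            Iterator<T> current = Collections.emptyIterator();
            boolean finished;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                if (finished) return false;
                if (!current.hasNext()) {
                    current = supplier.get().iterator();
                    if (!current.hasNext()) {   // empty stream: end of stream
                        finished = true;
                        return false;
                    }
                }
                action.accept(current.next());
                return true;
            }
        };
        return StreamSupport.stream(sp, false);
    }

    // Each supplier call hands back the next page as a stream.
    static <T> List<T> allPages(List<List<T>> pages) {
        Iterator<List<T>> it = pages.iterator();
        Supplier<Stream<T>> nextPage =
                () -> it.hasNext() ? it.next().stream() : Stream.<T>empty();
        return produce(nextPage).collect(Collectors.toList());
    }
}
```

One consequence of the termination rule: an empty page in the middle ends the stream early, so pages after it are never requested.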


--

s'marks


On 2/14/16 6:53 AM, Tagir F. Valeev wrote:

Hello!

I wanted to work on foldLeft, but Brian asked me to take this issue
instead. So here's the webrev:
http://cr.openjdk.java.net/~tvaleev/webrev/8072727/r1/

I don't like iterator-based Stream source implementations, so I made
them AbstractSpliterator-based. I also implemented forEachRemaining
manually as, I believe, this improves the performance in
non-short-circuiting cases.

I also decided to keep two flags (started and finished) to track the
state. The existing implementation of the infinite iterate() does
not use a started flag, but instead reads one element ahead for
primitive streams. This seems wrong to me and may even lead to
unexpected exceptions (*). I could get rid of the "started" flag for
Stream.iterate() using Streams.NONE, but this would make the object
implementation different from the primitive implementations. It would
also be possible to keep a single three-state variable (byte or int,
NOT_STARTED, STARTED, FINISHED), but I doubt that this would improve
the performance or footprint. Having two flags looks more readable to
me.

The existing two-arg iterate methods can now be expressed as a
special case of the new method:

public static<T> Stream<T> iterate(final T seed, final UnaryOperator<T> f) {
     return iterate(seed, x -> true, f);
}
(same for primitive streams). I may do this if you think it's
reasonable.

I created new test class and added new iterate sources to existing
data providers.

Please review and sponsor!

With best regards,
Tagir Valeev.

(*) Consider the following code:

int[] data = {1,2,3,4,-1};
IntStream.iterate(0, x -> data[x])
          .takeWhile(x -> x >= 0)
          .forEach(System.out::println);

Currently this unexpectedly throws an AIOOBE, because
IntStream.iterate unnecessarily tries to read one element ahead.
