I'm talking about the perf difference between stream.forEach and for(var 
element: stream). forEachRemaining may be slower because, for the VM, the ideal 
case is to see the creation of the Stream and the call to the terminal 
operation inside the same inlining horizon, so that the creation of the Stream 
itself can be elided.
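
For concreteness, here is a rough sketch of the two forms being compared. Stream does not implement Iterable today, so the loop form goes through the stream::iterator method reference cast to Iterable. The class and method names are made up for illustration, and nothing here measures performance.

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical holder class, for illustration only.
class ForEachVsLoop {
    // Terminal operation: the stream is created and consumed in one expression.
    static int sumWithForEach(List<String> words) {
        int[] total = {0};
        words.stream().map(String::length).forEach(n -> total[0] += n);
        return total[0];
    }

    // External iteration: elements are pulled through Stream.iterator().
    static int sumWithLoop(List<String> words) {
        int total = 0;
        Stream<Integer> lengths = words.stream().map(String::length);
        for (var n : (Iterable<Integer>) lengths::iterator) {
            total += n;
        }
        return total;
    }
}
```

Both methods compute the same result; the discussion here is only about how well the VM can optimize each shape.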

A bit of history: there have been several prototypes of how to implement the 
Stream API before the current one. One of them (I think it was the first one) 
was based on iterators and iterators of iterators, one for each step of the 
Stream. The perf of that implementation was good until there were too many 
intermediate operations chained on the Stream, at which point perf was really 
bad. That's because the VM has two ways to find the type of something in 
generic code: it can build a profile by remembering which classes were seen at 
a method call, or it can propagate the type of an argument to the type of the 
corresponding parameter. 
Because an iterator stores the element to return in a field, you lose the 
latter way to optimize, and the former only works if you have no more than 2 
different classes in the profile.
So while Stream.iterator() may be optimized, it's not that simple.
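
To illustrate the point about fields, here is a rough sketch of what one stage of an iterator-based pipeline might look like. This is not the prototype's code; FilterIterator and its members are hypothetical names. The element travels from hasNext() to next() through the field lookahead rather than as a method argument, and the call to source.next() is shared by every stacked stage, so its profile would see each upstream iterator class.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical filter stage of an iterator-of-iterators pipeline.
class FilterIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<? super T> predicate;
    private T lookahead;     // element parked in a field between calls
    private boolean ready;   // true when lookahead holds a pending element

    FilterIterator(Iterator<T> source, Predicate<? super T> predicate) {
        this.source = source;
        this.predicate = predicate;
    }

    @Override
    public boolean hasNext() {
        // Advance the upstream iterator until an element passes the filter.
        while (!ready && source.hasNext()) {
            T candidate = source.next();
            if (predicate.test(candidate)) {
                lookahead = candidate;
                ready = true;
            }
        }
        return ready;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T result = lookahead;
        lookahead = null;
        return result;
    }
}
```

Stacking several such stages, each wrapping the previous one, gives the "iterators of iterators" shape described above.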

Yes, I remember this prototype. Sure, iterating from stream.iterator() will likely be slower than stream.forEach(), because of (current) limitations in JIT compilation. This may be important for performance-critical applications. So if you have such an application, you should be aware of possible performance issues using such an iterator(), measure, and recode if necessary.

Is this an argument not to allow Stream in a for-loop? I don't think so. There's a (fairly narrow) set of use cases where it's really necessary, and in most cases performance isn't an issue. After all, people use things like List<Integer> which is known to be terrible for large, performance-critical applications. But most apps are small and aren't performance critical, and for those, it's just fine.

This proposal has the side effect of making Stream more different from its
primitive counterparts IntStream, LongStream and DoubleStream, which may be
problematic because we are trying to introduce reified generics as part of
Valhalla (there is a recent mail from Brian about not adding methods to
OptionalInt for the same reason).

Well, yes, I think that it means that Stream evolves somewhat independently of
Int/Long/DoubleStream, but I don't see that this imposes an impediment on
generic specialization in Valhalla. In that world, Stream<int> should (mostly)
just work. It may also be possible in a specialized world to add the specific
things from IntStream (such as sum() and max()) to Stream<int>.
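
As a small illustration of the current gap, Stream<Integer> has no sum(), so today one converts to IntStream first. The class and method names below are made up; Stream<int> itself is not expressible in current Java and is not shown.

```java
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical holder class, for illustration only.
class SumToday {
    // Stream<Integer> lacks sum(), so the elements are mapped to an IntStream.
    static int sumBoxed(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // IntStream provides sum() directly.
    static int sumPrimitive(int[] values) {
        return IntStream.of(values).sum();
    }
}
```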

We may want more here, like having Stream<int> be a subtype of IntStream so there 
is only one implementation for IntStream and Stream<int>.
Thus adding a method that makes IntStream and Stream<Object> different just makes 
this kind of retrofitting more unlikely.

I think the argument about specialization runs the other way, which is not to add stuff to IntStream.

Adding IterableOnce to Stream shouldn't really affect anything with respect to generic specialization. The type is already Stream<T>. The Iterable<T> methods that are inherited (iterator, spliterator, forEach) all match existing methods on Stream, at least structurally. So I don't see that this would cause a problem.

(Hm, I note that there is a slight semantic disagreement between Iterable::forEach and Stream::forEach. Stream::forEach allows parallelism, which isn't mentioned in Iterable::forEach. Somebody could conceivably call Iterable::forEach with a consumer that's not thread-safe, and if a parallel stream gets passed in, it would break that consumer. This strikes me as an edge case to be filed off, rather than a fatal problem, though.)
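
A rough sketch of that edge case follows. Since Stream is not an Iterable today, the helper methods take a Stream directly, standing in for code that would receive it as an Iterable; all names are made up for illustration. With a parallel stream, forEach may call the consumer from several threads, so the first method uses a thread-safe sink, whereas the loop over iterator() runs its body on the calling thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Stream;

// Hypothetical holder class, for illustration only.
class ParallelForEach {
    // forEach on a parallel stream may invoke the consumer concurrently,
    // so the sink must be thread-safe; a plain ArrayList could lose
    // elements or throw here.
    static int countWithForEach(Stream<Integer> stream) {
        Queue<Integer> sink = new ConcurrentLinkedQueue<>();
        stream.forEach(sink::add);
        return sink.size();
    }

    // The loop body runs on the calling thread, so a plain ArrayList is fine.
    static List<Integer> collectWithLoop(Stream<Integer> stream) {
        List<Integer> sink = new ArrayList<>();
        for (Integer n : (Iterable<Integer>) stream::iterator) {
            sink.add(n);
        }
        return sink;
    }
}
```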


And the real issue is how to deal with checked exceptions inside the Stream
API. I would prefer to fix that issue instead of trying to find a way to
work around it.

Well I'd like to have a solution for checked exceptions as well, but there
doesn't appear to be one on the horizon. I mean, there are some ideas floating
around, but nobody is working on them as far as I know.

As far as I know, there are two of them:
- one is to get rid of checked exceptions. Even Kotlin, which touts itself as a 
language that is safer than Java, doesn't have checked exceptions; basically 
Java is the only language that runs on the JVM and has checked exceptions.
- the other is to automatically wrap checked exceptions into a corresponding 
unchecked exception, by letting the compiler generate the code that users 
currently write when a checked exception appears in some context,
   for example with a keyword autowrap:
   - you have the autowrap block (syntactically like a synchronized block)
       autowrap {
         // IOException is transformed to UncheckedIOException
         // by calling IOException.wrap()
         return Files.newInputStream(path);
       }
   - you can use autowrap on a method declaration
      InputStream foo(Path path) autowrap {
        // IOException is transformed to UncheckedIOException
        // by calling IOException.wrap()
        return Files.newInputStream(path);
      }
   - you can use autowrap with a functional interface
      void runBlock(autowrap Runnable block) { ... }
      ...
      runBlock(() -> {
        // IOException is transformed to UncheckedIOException
        // by calling IOException.wrap()
        Files.newInputStream(path);
      });
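
For reference, the code that users currently write by hand in this situation looks roughly like the following. It uses only existing APIs (UncheckedIOException lives in java.io); the autowrap keyword and IOException.wrap() from the idea above are hypothetical and do not appear here. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical holder class, for illustration only.
class ManualWrap {
    // Files.size throws the checked IOException, which a lambda passed to
    // a Stream operation cannot propagate, so it is wrapped by hand.
    static long sizeOf(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The wrapper can now be used as a method reference in a pipeline.
    static long totalSize(List<Path> paths) {
        return paths.stream().mapToLong(ManualWrap::sizeOf).sum();
    }
}
```

A caller that cares about the original exception can catch UncheckedIOException and call getCause() to get the IOException back.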

I can think of several other approaches but I don't want to discuss them here.

But checked exceptions aren't the only reason to prefer iteration in some cases;
loops offer more flexible control flow (break/continue) and easier handling of
side effects. The Streams+IterableOnce feature benefits these cases as well as
exception handling.

The break/continue equivalents on Stream are 
skip/limit/findFirst/takeWhile/dropWhile, i.e. any short-circuit terminal 
operations.

Right: many, but not all loops with "break" can be rewritten to use streams with a short-circuit terminal operation. But sometimes it's difficult, or you have to contort the stream in a particular way in order to get the result you want. For cases like those, sometimes it's just easier to write a loop.
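
As a sketch of such a case, consider taking elements until a running total would exceed a budget, while skipping invalid entries. The stop condition depends on state accumulated so far, which would need a stateful predicate if expressed with takeWhile. The loop below goes through stream::iterator because Stream is not an Iterable today; the class, method, and parameter names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical holder class, for illustration only.
class LoopWithBreak {
    // Takes costs in order until adding the next one would exceed the budget.
    static List<Integer> takeWithinBudget(Stream<Integer> costs, int budget) {
        List<Integer> taken = new ArrayList<>();
        int total = 0;
        for (int cost : (Iterable<Integer>) costs::iterator) {
            if (cost < 0) {
                continue;            // skip invalid entries
            }
            if (total + cost > budget) {
                break;               // stop at the first cost that does not fit
            }
            total += cost;
            taken.add(cost);
        }
        return taken;
    }
}
```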

s'marks
