dugenkui03 commented on issue #2119:
URL: https://github.com/apache/hudi/issues/2119#issuecomment-699940435


   > Hi @dugenkui03, this `collect` call is useful: it is a terminal 
operation, so it traverses the stream to produce a side-effect.
   > 
   > The operations on `Stream` are divided into intermediate and terminal 
operations. Intermediate operations are lazy: calling one, such as `map()`, 
does not perform any mapping until a terminal operation of the pipeline, 
such as `collect`, is executed.
   > 
   > Details can be found here:
   > https://docs.oracle.com/javase/8/docs/api/
   > 
   > ```
   > Stream operations and pipelines
   > Stream operations are divided into intermediate and terminal operations, 
and are combined to form stream pipelines. A stream pipeline consists of a 
source (such as a Collection, an array, a generator function, or an I/O 
channel); followed by zero or more intermediate operations such as 
Stream.filter or Stream.map; and a terminal operation such as Stream.forEach or 
Stream.reduce.
   > 
   > Intermediate operations return a new stream. They are always lazy; 
executing an intermediate operation such as filter() does not actually perform 
any filtering, but instead creates a new stream that, when traversed, contains 
the elements of the initial stream that match the given predicate. Traversal of 
the pipeline source does not begin until the terminal operation of the pipeline 
is executed.
   > 
   > Terminal operations, such as Stream.forEach or IntStream.sum, may traverse 
the stream to produce a result or a side-effect. After the terminal operation 
is performed, the stream pipeline is considered consumed, and can no longer be 
used; if you need to traverse the same data source again, you must return to 
the data source to get a new stream. In almost all cases, terminal operations 
are eager, completing their traversal of the data source and processing of the 
pipeline before returning. Only the terminal operations iterator() and 
spliterator() are not; these are provided as an "escape hatch" to enable 
arbitrary client-controlled pipeline traversals in the event that the existing 
operations are not sufficient to the task.
   > ```
   
   Thanks for your explanation!
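
For anyone else reading this thread, here is a minimal sketch of the laziness described above. It is not code from Hudi; the class name `LazyStreamDemo`, the `seen` list, and the `buildPipeline` helper are all hypothetical names made up for illustration. The lambda passed to `map()` records each element it sees, so the list stays empty until `collect` runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyStreamDemo {

    // Hypothetical log of elements visited by map(); used only to observe
    // when the mapping function actually runs.
    static final List<String> seen = new ArrayList<>();

    // Builds a pipeline with one intermediate operation and no terminal one.
    static Stream<String> buildPipeline() {
        seen.clear();
        return Stream.of("a", "b", "c")
                .map(s -> {
                    seen.add(s);            // side effect inside map()
                    return s.toUpperCase();
                });
    }

    public static void main(String[] args) {
        Stream<String> pipeline = buildPipeline();

        // map() has been called, but nothing has been traversed yet.
        System.out.println("after map():     " + seen);   // []

        // collect() is a terminal operation: it traverses the source
        // and runs the mapping function on every element.
        List<String> result = pipeline.collect(Collectors.toList());
        System.out.println("after collect(): " + seen);   // [a, b, c]
        System.out.println("result:          " + result); // [A, B, C]

        // The pipeline is now consumed; reusing it is an error.
        try {
            pipeline.collect(Collectors.toList());
        } catch (IllegalStateException e) {
            System.out.println("reuse failed: stream already consumed");
        }
    }
}
```

So if the `collect` were removed, the side effect inside `map()` would never happen, even when the collected result itself is not used.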



