gabotechs commented on issue #22708:
URL: https://github.com/apache/datafusion/issues/22708#issuecomment-4600043507
There's one subtle difference between `EvaluationType::Eager`'s definition and
what `BufferExec` does:
```rust
/// The stream generated by [`execute`](ExecutionPlan::execute) eagerly generates `RecordBatch`
/// in one or more spawned Tokio tasks. Eager evaluation is only started the first time
/// `Stream::poll_next` is called.
/// Examples of eager operators are repartition, coalesce partitions, and sort preserving merge.
///
/// Eager operators are also known as a data-driven operators.
Eager,
```
`BufferExec` does eagerly generate `RecordBatch`es on a spawned Tokio task,
but it does not wait for the first `Stream::poll_next` call to start evaluation.
It starts evaluation whether `Stream::poll_next` is called or not, which is the
main point of the operator and the main reason why it brings performance
benefits at the cost of memory usage.
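
To make the distinction concrete, here is a minimal std-only sketch. It is not DataFusion's implementation: it uses threads and a bounded `std::sync::mpsc` channel in place of Tokio tasks and `RecordBatch` streams, and the names `StartOnFirstPoll`, `StartImmediately`, and `spawn_producer` are hypothetical.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical helper: spawns a producer that pushes three "batches"
/// into a bounded buffer and returns the receiving end.
fn spawn_producer(capacity: usize) -> Receiver<u32> {
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        for batch in 0..3u32 {
            // Blocks when the buffer is full; stops if the consumer is gone.
            if tx.send(batch).is_err() {
                break;
            }
        }
    });
    rx
}

/// Mirrors the documented `Eager` behavior: the producer is spawned
/// only on the first call to `next`.
struct StartOnFirstPoll {
    rx: Option<Receiver<u32>>,
    capacity: usize,
}

impl StartOnFirstPoll {
    fn new(capacity: usize) -> Self {
        Self { rx: None, capacity }
    }
}

impl Iterator for StartOnFirstPoll {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let capacity = self.capacity;
        // No work has happened before this point.
        let rx = self.rx.get_or_insert_with(|| spawn_producer(capacity));
        rx.recv().ok()
    }
}

/// Mirrors what the comment describes for `BufferExec`: the producer is
/// spawned at construction, so the buffer fills even if `next` is never called.
struct StartImmediately {
    rx: Receiver<u32>,
}

impl StartImmediately {
    fn new(capacity: usize) -> Self {
        Self { rx: spawn_producer(capacity) }
    }
}

impl Iterator for StartImmediately {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.rx.recv().ok()
    }
}
```

Both types yield the same items in the same order. The difference is only when production begins: `StartImmediately` holds up to `capacity` buffered items before any consumer asks for them, which is the memory-for-latency trade described above.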
That being said, I agree that it's more accurate if both declare
`EvaluationType::Eager`.