geoffreyclaude commented on issue #22708:
URL: https://github.com/apache/datafusion/issues/22708#issuecomment-4600294872

   Thanks @gabotechs, I agree with both points.
   
   For the first point, the `EvaluationType::Eager` rustdoc was defining the 
category too tightly by saying evaluation starts on the first 
`Stream::poll_next`. I think that startup timing is an implementation detail of 
eager operators. `BufferExec` starts its producer from `execute`, while other 
eager operators may start producer work on first poll. Both are still eager 
from the caller's perspective: once there is downstream demand, the operator 
can drive its child input or produce batches on its own, rather than advancing 
one downstream poll at a time.
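
To make the startup-timing point concrete, here is a rough, standard-library-only sketch. It is not DataFusion code: `BatchStream`, `spawn_producer`, `start_at_execute`, and `start_on_first_poll` are hypothetical names, a thread plus a bounded channel stands in for the async producer, and `u32` stands in for a record batch. Both variants end up with a producer that runs ahead of the consumer; they differ only in when it is started.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for an eager operator's output stream.
pub struct BatchStream {
    /// Receiving end of the producer's channel, once the producer is running.
    rx: Option<Receiver<u32>>,
    /// Child input that has not been handed to a producer yet.
    input: Option<Vec<u32>>,
}

/// Start a background producer that drives the (toy) child input on its own.
/// The bounded channel lets it run at most two batches ahead of the consumer.
fn spawn_producer(input: Vec<u32>) -> Receiver<u32> {
    let (tx, rx) = sync_channel(2);
    thread::spawn(move || {
        for batch in input {
            // Stop early if the consumer dropped the stream.
            if tx.send(batch).is_err() {
                break;
            }
        }
    });
    rx
}

impl BatchStream {
    /// Assumed analogue of starting the producer from `execute`.
    pub fn start_at_execute(input: Vec<u32>) -> Self {
        BatchStream { rx: Some(spawn_producer(input)), input: None }
    }

    /// Assumed analogue of deferring producer startup to the first poll.
    pub fn start_on_first_poll(input: Vec<u32>) -> Self {
        BatchStream { rx: None, input: Some(input) }
    }
}

impl Iterator for BatchStream {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Deferred startup: the first poll launches the producer.
        if self.rx.is_none() {
            let input = self.input.take().unwrap_or_default();
            self.rx = Some(spawn_producer(input));
        }
        // After startup, both variants behave identically: batches arrive
        // from a producer that is not driven one poll at a time.
        self.rx.as_ref().and_then(|rx| rx.recv().ok())
    }
}
```

Collecting either stream yields the same batches in the same order, which is why the sketch treats startup timing as an implementation detail rather than part of the category.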
   
   For `need_data_exchange`, I agree that using `evaluation_type == Eager` 
makes the helper answer the wrong question once `BufferExec` and `AnalyzeExec` 
are classified accurately. The history is useful here: #4585 proposed 
`need_data_exchange` for callers that need to identify physical operators 
requiring exchange-style handling, and listed non-round-robin 
`RepartitionExec`, `CoalescePartitionsExec`, and `SortPreservingMergeExec`. 
#4586 implemented the helper with that exchange/gather meaning, and it even 
moved the logic out of the `ExecutionPlan` trait into a free function. Later, 
#16398 introduced `EvaluationType` for cooperative scheduling and made 
`need_data_exchange` delegate to eager evaluation. That was understandable 
while the eager set effectively matched the exchange/gather set, but it 
conflates two different properties.
   
   I updated the PR so `EvaluationType` is documented as execution/evaluation 
behavior, while `need_data_exchange(...)` is restored to the exchange/gather 
predicate. That lets `BufferExec` and `AnalyzeExec` report eager evaluation 
without being treated as data-exchange operators.
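
As a rough illustration of keeping the two properties separate, here is a self-contained toy model. It is not the PR's code and does not use DataFusion's types: the `Operator` enum, its `round_robin` field, and the free functions below are hypothetical simplifications, and the eager/lazy classification (including `Filter` as the lazy example) is an assumption made for the example.

```rust
/// Toy mirror of the evaluation-behavior property.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvaluationType {
    Lazy,
    Eager,
}

/// Hypothetical, simplified operator descriptions (not DataFusion types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operator {
    Repartition { round_robin: bool },
    CoalescePartitions,
    SortPreservingMerge,
    Buffer,
    Analyze,
    Filter,
}

/// Evaluation behavior: does the operator drive its input on its own?
/// The classification here is assumed for illustration only.
pub fn evaluation_type(op: Operator) -> EvaluationType {
    match op {
        Operator::Filter => EvaluationType::Lazy,
        _ => EvaluationType::Eager,
    }
}

/// Exchange/gather predicate: independent of evaluation behavior.
pub fn need_data_exchange(op: Operator) -> bool {
    match op {
        // Only non-round-robin repartitioning counts as an exchange.
        Operator::Repartition { round_robin } => !round_robin,
        Operator::CoalescePartitions | Operator::SortPreservingMerge => true,
        // Eager or not, these do not move data between partitions.
        Operator::Buffer | Operator::Analyze | Operator::Filter => false,
    }
}
```

In this model `Operator::Buffer` and `Operator::Analyze` report `EvaluationType::Eager` while `need_data_exchange` returns `false` for both, which is the separation described above.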


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

