Dandandan commented on PR #23167: URL: https://github.com/apache/datafusion/pull/23167#issuecomment-4798531452
I agree with the points already made above:

* It should be clear when a non-shuffle / non-push-based execution will benefit from this. For joins, we usually don't want to fully know both sides of the join, because getting exact statistics requires the output of the underlying data source / query to be fully spilled / shuffled. AQE in streaming mode will be different than when data is streamed / pushed.
* We probably want to reuse the existing optimizations as much as possible, e.g. `JoinSelection`.

Also, I am a bit confused to see `RuntimeOptimizerExec`. Reusing the execution plan for some AQE plumbing feels a bit hacky to me 🤔 (but perhaps I need to be convinced).
