NGA-TRAN commented on issue #18595:
URL: https://github.com/apache/datafusion/issues/18595#issuecomment-3522141524

@feniljain
> Issue sounds very similar to https://github.com/apache/datafusion/issues/18513, are they related by any chance?

It may be a similar issue. I wonder whether it will get better after [this fix](https://github.com/apache/datafusion/issues/18341) (see [PR](https://github.com/apache/datafusion/pull/18521#issuecomment-3516585320)). See [the summary of the fix here](https://github.com/apache/datafusion/issues/18341#issuecomment-3505130845).

@LiaCastaneda
> I'm wondering where the fix to this should be, Rather than in the optimizer EnforceDistribution or maybe in the planner? I wonder why its not selecting AggregateMode::Single (not partition) directly.

[This](https://github.com/apache/datafusion/blob/76b4156aff9033680c907430d98de4dd274b1fd0/datafusion/core/src/physical_planner.rs#L801) is where the node is created in the physical planner. I suspect the fix might be tricky: by the time we detect that the file is small, we may have already committed to repartitioning elsewhere. This ticket likely requires a deeper investigation to fully understand the planning flow and its implications.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
