gene-bordegaray commented on issue #18595:
URL: https://github.com/apache/datafusion/issues/18595#issuecomment-3522508493

   Great catch @LiaCastaneda. I read into some of the code, and I think a 
good place to look is the condition that decides whether to repartition for 
aggregates. This condition is in `physical_planner.rs` on line 801: 
   
   ```
   let can_repartition = !groups.is_empty() 
                       && session_state.config().target_partitions() > 1
                       && session_state.config().repartition_aggregations();
   ```
   
   Here it could be useful to add another condition that checks whether we 
actually benefit from repartitioning. This new condition could look at the 
statistics of the data source and decide based on them. Say, as in my example, 
we see that we have one input partition: we would then check whether that 
partition is actually large enough, based on its statistics, to justify the 
AggregatePartitioned. 
   
   I believe catching this at the physical planner level would be best though.
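
A rough sketch of what such a check might look like, written as standalone Rust rather than against the real planner. Everything here is hypothetical and for illustration only: the `AggregateRepartitionInputs` struct, the `MIN_ROWS_TO_REPARTITION` threshold and its value, and the `can_repartition` function are not DataFusion APIs. The first three fields mirror the existing condition; the last two stand in for whatever the planner can learn from the input's partitioning and statistics.

```rust
/// Hypothetical bundle of the facts the planner would consult.
/// None of these names exist in DataFusion.
#[derive(Debug, Clone, Copy)]
struct AggregateRepartitionInputs {
    /// Mirrors `!groups.is_empty()`.
    has_group_by: bool,
    /// Mirrors `session_state.config().target_partitions()`.
    target_partitions: usize,
    /// Mirrors `session_state.config().repartition_aggregations()`.
    repartition_aggregations: bool,
    /// Number of partitions produced by the aggregate's input (assumed known).
    input_partitions: usize,
    /// Estimated input row count; `None` when statistics are unavailable.
    estimated_rows: Option<usize>,
}

/// Hypothetical cutoff, chosen arbitrarily for this sketch.
const MIN_ROWS_TO_REPARTITION: usize = 100_000;

/// Existing condition plus an assumed statistics-based benefit check.
fn can_repartition(inputs: &AggregateRepartitionInputs) -> bool {
    // The three checks the planner already performs.
    let existing = inputs.has_group_by
        && inputs.target_partitions > 1
        && inputs.repartition_aggregations;

    // New check: a single input partition with a known, small row count
    // is assumed not to benefit from repartitioning. When statistics are
    // missing, or there are several input partitions, keep today's behavior.
    let worthwhile = match (inputs.input_partitions, inputs.estimated_rows) {
        (1, Some(rows)) => rows >= MIN_ROWS_TO_REPARTITION,
        _ => true,
    };

    existing && worthwhile
}
```

With these assumptions, a single small input partition is the only case that changes: the check returns `false` there and leaves every other case, including missing statistics, exactly as the current condition decides it.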
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

