Rachelint commented on issue #13548:
URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2504139573

   > > > > When the dataset is 100 million rows, then Polars takes 126 seconds and DataFusion takes 2,100 seconds.
   > > > 
   > > > 
   > > > What version are you working with?
   > > > @Rachelint has some ideas of how to improve this:
   > > > ```
   > > > * [Sketch for aggregation intermediate results blocked management #11943](https://github.com/apache/datafusion/pull/11943)
   > > > 
   > > > * [Manage group values and states by blocks in aggregation #11931](https://github.com/apache/datafusion/issues/11931)
   > > > ```
   > > 
   > > 
   > > 🤔 I guess it may be caused by the similar reason of what we encountered during benchmarking in #11827
   > 
   > Specifically that `power` and `corr` need to support `convert_to_state`?
   
   I think it is possible.
   I am running and profiling it to find the answer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

