alamb commented on issue #13548: URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2501810315
> > > When the dataset is 100 million rows, then Polars takes 126 seconds and DataFusion takes 2,100 seconds.
> >
> > What version are you working with?
> > @Rachelint has some ideas of how to improve this:
> > ```
> > * [Sketch for aggregation intermediate results blocked management #11943](https://github.com/apache/datafusion/pull/11943)
> >
> > * [Manage group values and states by blocks in aggregation #11931](https://github.com/apache/datafusion/issues/11931)
> > ```
>
> 🤔 I guess it may be caused by the similar reason of what we encountered during benchmarking in #11827

Specifically that `power` and `corr` need to support `convert_to_state`?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
