alamb commented on issue #11442: URL: https://github.com/apache/datafusion/issues/11442#issuecomment-2228171938
## Aggregate performance / memory use for high cardinality aggregates

* https://github.com/apache/datafusion/issues/6937

**What**: Improve queries when the number of groups is very high (1 million+)

**Why**: Queries with a high number of groups are significantly slower than DuckDB and use substantially more memory. I think there is at least a factor of 2 in performance to be had here

**What is left**: There are ideas on https://github.com/apache/datafusion/issues/6937, but someone has to try them out, prototype them to see if they would work, and then productionize them