alamb commented on issue #11442:
URL: https://github.com/apache/datafusion/issues/11442#issuecomment-2228171938

   ## Aggregate performance / memory use for high cardinality aggregates
   * https://github.com/apache/datafusion/issues/6937
   
   **What**: Improve queries when the number of groups is very high (1 million+)
   **Why**: Queries with a high number of groups are significantly slower 
than DuckDB and use substantially more memory. I think there is at least a 
factor of 2 in performance to be gained here
   **What is left**: There are ideas on 
https://github.com/apache/datafusion/issues/6937, but someone has to try them 
out, prototype them to see if they would work, and then productionize them
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


