korowa commented on PR #11627: URL: https://github.com/apache/datafusion/pull/11627#issuecomment-2254453848
FWIW, regarding benchmarks: running with `target_partitions=4` shows that this feature is triggered in clickbench Q13 (count distinct is rewritten to a double group by) and tpch Q20 (one of the filters contains a correlated subquery with aggregation). Partial aggregation is also skipped on 1/4 partitions in clickbench Q18 and tpch Q16.

As a result, I'd expect performance improvements only in clickbench Q13 and tpch Q20 (I don't think skipping on 1/4 partitions in the other two queries can have any effect), and I suppose the improvements shown by any other queries are just a matter of luck and fluctuations. I wasn't able to find any stable regressions during local benchmark runs.

Regarding Q32: I've run it separately and got equal runtimes for both branches (because of AVG, it's not able to skip partial aggregation yet).

```
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃     master ┃ skip-partial-aggregation ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │ 19399.38ms │               19424.11ms │ no change │
└──────────────┴────────────┴──────────────────────────┴───────────┘
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: github-h...@datafusion.apache.org