lgbo-ustc commented on PR #9155: URL: https://github.com/apache/incubator-gluten/pull/9155#issuecomment-2771283425
With only two tables in the query, the optimization does not significantly improve execution speed. With three tables, the optimized query is noticeably faster. One reason is that the join-based plan has to build more hash tables as the number of tables in the query grows. In addition, the more joins a single node performs, the more likely it is to trigger memory spills, which slow execution further.

Attached plans:
[three_tables_aggregate_union.pdf](https://github.com/user-attachments/files/19561181/three_tables_aggregate_union.pdf)
[three_tables_join_aggregate.pdf](https://github.com/user-attachments/files/19561182/three_tables_join_aggregate.pdf)
[two_tables_by_aggregate_union.pdf](https://github.com/user-attachments/files/19561183/two_tables_by_aggregate_union.pdf)
[two_tables_by_join_aggregate.pdf](https://github.com/user-attachments/files/19561184/two_tables_by_join_aggregate.pdf)
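
To illustrate the hash-table argument, here is a minimal Python sketch. It assumes, based only on the attachment names, that the two plans being compared are "aggregate each table, then join on the key" and "union the tables, then aggregate once". The function names, the sample rows, and the `sum` aggregate are hypothetical and are not taken from the PR.

```python
from collections import defaultdict


def join_then_aggregate(tables):
    """Aggregate each table by key, then inner-join the results on the key."""
    # One hash table per input: N tables means N hash tables alive at once.
    per_table = []
    for rows in tables:
        sums = defaultdict(int)
        for key, value in rows:
            sums[key] += value
        per_table.append(sums)
    # Inner join: keep only the keys that appear in every table.
    common = set(per_table[0])
    for sums in per_table[1:]:
        common &= set(sums)
    return {key: tuple(sums[key] for sums in per_table) for key in sorted(common)}


def union_then_aggregate(tables):
    """Tag each row with its source table, union all rows, aggregate once."""
    n = len(tables)
    # A single hash table, regardless of how many tables are involved.
    # Each entry holds one running sum and one "seen" flag per source table.
    state = {}
    for idx, rows in enumerate(tables):
        for key, value in rows:
            if key not in state:
                state[key] = ([0] * n, [False] * n)
            sums, seen = state[key]
            sums[idx] += value
            seen[idx] = True
    # Emulate the inner join by dropping keys missing from any source table.
    return {
        key: tuple(sums)
        for key, (sums, seen) in sorted(state.items())
        if all(seen)
    }


# Hypothetical sample data: three tables of (key, value) rows.
tables = [
    [("a", 1), ("a", 2), ("b", 3)],
    [("a", 10), ("b", 20), ("c", 30)],
    [("a", 100), ("b", 200), ("b", 300)],
]

print(join_then_aggregate(tables))
print(union_then_aggregate(tables))
```

Both functions return the same result for these inputs. The join variant keeps one hash table per input table, while the union variant keeps a single one, which is consistent with the gap widening as more tables are added.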
