jiaan.geng created SPARK-44571:
----------------------------------
Summary: Eliminate the Join by Combining Multiple Aggregates
Key: SPARK-44571
URL: https://issues.apache.org/jira/browse/SPARK-44571
Project: Spark
Issue Type: New Feature
Components: SQL
Affects Versions: 3.5.0
Reporter: jiaan.geng
Recently, I investigated q28, one of the TPC-DS queries.
The query contains multiple scalar subqueries with aggregation, connected by
inner joins.
If we can merge the filters and aggregates, we can scan the data source only once
and eliminate the joins, thereby avoiding the shuffle. This change should improve
performance.
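
To make the idea concrete, here is a minimal sketch of the rewrite on a toy table. It is not Spark code and not the proposed optimizer rule: it uses Python's built-in sqlite3 module only to show that the joined form and the merged form return the same row. The table contents, the two quantity ranges, and the column aliases are assumptions made for illustration, loosely modeled on the shape of q28. The subqueries are joined without a condition here.

```python
import sqlite3

# Hypothetical miniature of a q28-style table; rows and ranges are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_sales (ss_quantity INTEGER, ss_list_price REAL)")
conn.executemany(
    "INSERT INTO store_sales VALUES (?, ?)",
    [(1, 10.0), (3, 20.0), (5, 30.0), (7, 40.0), (9, 60.0), (12, 99.0)],
)

# Original shape: one aggregated subquery per filter, combined with a join.
# Each subquery scans store_sales separately.
joined = conn.execute("""
    SELECT b1.avg_price, b1.cnt, b2.avg_price, b2.cnt
    FROM (SELECT AVG(ss_list_price) AS avg_price, COUNT(ss_list_price) AS cnt
          FROM store_sales WHERE ss_quantity BETWEEN 0 AND 5) AS b1
    CROSS JOIN
         (SELECT AVG(ss_list_price) AS avg_price, COUNT(ss_list_price) AS cnt
          FROM store_sales WHERE ss_quantity BETWEEN 6 AND 10) AS b2
""").fetchone()

# Merged shape: a single scan. Each subquery's filter moves into its own
# aggregates (CASE without ELSE yields NULL, which AVG and COUNT ignore),
# and the filters are OR-ed together in the WHERE clause.
merged = conn.execute("""
    SELECT AVG(CASE WHEN ss_quantity BETWEEN 0 AND 5 THEN ss_list_price END),
           COUNT(CASE WHEN ss_quantity BETWEEN 0 AND 5 THEN ss_list_price END),
           AVG(CASE WHEN ss_quantity BETWEEN 6 AND 10 THEN ss_list_price END),
           COUNT(CASE WHEN ss_quantity BETWEEN 6 AND 10 THEN ss_list_price END)
    FROM store_sales
    WHERE ss_quantity BETWEEN 0 AND 5 OR ss_quantity BETWEEN 6 AND 10
""").fetchone()

print(joined)
print(merged)
```

Both queries produce one row with the same values. The merged form reads the table once and needs no join, which is the effect this issue is after.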