Shant Hovsepian created IMPALA-10099:
----------------------------------------
Summary: Push down DISTINCT aggregation for EXCEPT/INTERSECT
Key: IMPALA-10099
URL: https://issues.apache.org/jira/browse/IMPALA-10099
Project: IMPALA
Issue Type: Improvement
Reporter: Shant Hovsepian
Assignee: Shant Hovsepian
The implementation of SetOperations for EXCEPT/INTERSECT in IMPALA-9943 produced
query rewrites that apply the DISTINCT aggregation after exchanges in
distributed plans. In cases where the query can be directly rewritten to apply
the DISTINCT to the set operation operands, doing so would result in better
performance for most large queries.
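As a rough illustration of why the rewrite is safe, the sketch below checks
that applying DISTINCT to each operand does not change the result of
INTERSECT or EXCEPT under set (DISTINCT) semantics. It uses Python's built-in
sqlite3 module as a stand-in engine, not Impala, and the tables a and b and
their contents are hypothetical. It says nothing about the actual plans
Impala generates; it only shows the equivalence and how many rows each
operand would contribute before and after de-duplication.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for illustration;
# tables "a" and "b" are hypothetical and deliberately full of duplicates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (1), (1), (2), (2), (3);
    INSERT INTO b VALUES (2), (2), (3), (3), (3), (4);
""")


def rows(sql):
    """Run a single-column query and return its values as a sorted list."""
    return sorted(r[0] for r in conn.execute(sql))


# Original form: duplicates are removed only by the set operation itself.
intersect_late = rows("SELECT x FROM a INTERSECT SELECT x FROM b")
except_late = rows("SELECT x FROM a EXCEPT SELECT x FROM b")

# Rewritten form: DISTINCT is applied to each operand first.
intersect_early = rows(
    "SELECT DISTINCT x FROM a INTERSECT SELECT DISTINCT x FROM b")
except_early = rows(
    "SELECT DISTINCT x FROM a EXCEPT SELECT DISTINCT x FROM b")

# Rows an operand contributes without and with early de-duplication.
total_a, distinct_a = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT x) FROM a").fetchone()

print("INTERSECT:", intersect_late, intersect_early)
print("EXCEPT:   ", except_late, except_early)
print("rows from a:", total_a, "->", distinct_a)
```

In a distributed plan, the smaller de-duplicated operands are what would
cross the exchange, which is where the expected saving comes from.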
This should help the performance of TPC-DS Q14, which does an INTERSECT of
queries with large result sets that contain many duplicates.
In general, it would be better to have an optimization phase during planning
that moves DISTINCT around, which would handle this case as well as many others.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)