Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16350 )
Change subject: IMPALA-10099: Push down DISTINCT in Set operations ...................................................................... IMPALA-10099: Push down DISTINCT in Set operations INTERSECT/EXCEPT are not duplicate preserving operations. The distinct aggregations can happen in each operand, the leftmost operand only, or after all the operands in a separate aggregation step. Except for a couple special cases we would use the last strategy most often. This change pushes the distinct aggregation down to the leftmost operand in cases where there are no analytic functions, or when a distinct or grouping operation already eliminates duplicates. In general DISTINCT placement such as in this case should be done throughout the entire plan tree in a cost based manner as described in IMPALA-5260 Testing: * TpcdsPlannerTest * PlannerTest * TPC-DS 30TB Perf run for any affected queries - Q14-1 180s -> 150s - Q14-2 109s -> 90s - Q8 no significant change * SetOperation Planner Tests * Analyzer tests * Tpcds Functional Workload Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5 Reviewed-on: http://gerrit.cloudera.org:8080/16350 Tested-by: Impala Public Jenkins <[email protected]> Reviewed-by: Tim Armstrong <[email protected]> --- M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test 6 files changed, 2,049 insertions(+), 1,806 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/16350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5 Gerrit-Change-Number: 16350 Gerrit-PatchSet: 5 Gerrit-Owner: Shant Hovsepian <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: David Rorke <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Shant Hovsepian <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]>
