Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9949 )
Change subject: IMPALA-6822: Add a query option to control shuffling by distinct exprs
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9949/3/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/9949/3/be/src/service/query-options.h@44
PS3, Line 44: TImpalaQueryOptions::SHUFFLE_DISTINCT_EXPRS + 1);\
> Let's call this SHUFFLE_DISTINCT_EXPRS because there could be several such
Done


http://gerrit.cloudera.org:8080/#/c/9949/3/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/9949/3/common/thrift/ImpalaInternalService.thrift@276
PS3, Line 276: // When a query has both grouping and distinct exprs, impala can optionally include the
> * both grouping and distinct exprs
Done


http://gerrit.cloudera.org:8080/#/c/9949/3/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

http://gerrit.cloudera.org:8080/#/c/9949/3/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@886
PS3, Line 886: // When a query has both grouping and distinct exprs, impala can optionally include
> same comments as in .thrift
Done


http://gerrit.cloudera.org:8080/#/c/9949/3/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@889
PS3, Line 889: // grouping exprs in the second phase which is not required when omitting the distinct
> shuffleDistinctExprs
What do you mean by separating them? I suppose we always shuffle if there aren't grouping exprs.


http://gerrit.cloudera.org:8080/#/c/9949/3/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@931
PS3, Line 931: // phase-1 merge step.
> I don't understand why this is correct. If the first phase fragment is suit
L929 adds phase-2 agg.
Partition being compatible && no shuffle by distinct exprs => the child is partitioned by grouping exprs. We only need phase-1 agg and phase-2 agg without any merge agg or exchange.


http://gerrit.cloudera.org:8080/#/c/9949/3/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@952
PS3, Line 952: // step (which is where it should be)
> Move this before L959 for better readability. Also you can just:
We don't need "fragments.add(firstMergeFragment)" if the partition is compatible. How to move it out?


http://gerrit.cloudera.org:8080/#/c/9949/3/testdata/workloads/functional-planner/queries/PlannerTest/no-shuffle-by-distinct.test
File testdata/workloads/functional-planner/queries/PlannerTest/no-shuffle-by-distinct.test:

http://gerrit.cloudera.org:8080/#/c/9949/3/testdata/workloads/functional-planner/queries/PlannerTest/no-shuffle-by-distinct.test@1
PS3, Line 1: # Distinct agg without a grouping expr
> Let's try to distill minimal tests that are needed here.
Done



--
To view, visit http://gerrit.cloudera.org:8080/9949
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icb4b4576fb29edd62cf4b4ba0719c0e0a2a5a8dc
Gerrit-Change-Number: 9949
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Comment-Date: Mon, 09 Apr 2018 22:19:01 +0000
Gerrit-HasComments: Yes
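
Illustrative note on the DistributedPlanner.java@931 reply above: the case analysis can be sketched as a small toy model. This is not Impala's planner code. The class name, method name, and step labels are hypothetical, and only the first case (compatible partition, no shuffle by distinct exprs, so just phase-1 agg and phase-2 agg) is stated in the reply itself. The other two branches are assumptions inferred from the quoted code comments about the extra exchange on grouping exprs. The sketch covers only queries that have both grouping and distinct exprs.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Impala's DistributedPlanner. All names are hypothetical.
final class DistinctAggPlanSketch {
  private DistinctAggPlanSketch() {}

  /**
   * Returns the ordered plan steps for a distinct aggregation that has both
   * grouping and distinct exprs.
   *
   * @param childPartitionedByGroupingExprs whether the input is already
   *     partitioned compatibly with the grouping exprs
   * @param shuffleDistinctExprs the value of the query option under review
   */
  static List<String> planSteps(boolean childPartitionedByGroupingExprs,
      boolean shuffleDistinctExprs) {
    List<String> steps = new ArrayList<>();
    steps.add("phase-1 agg");
    if (shuffleDistinctExprs) {
      // Assumption: shuffling by grouping + distinct exprs in phase 1 needs a
      // second exchange on the grouping exprs in phase 2. For simplicity this
      // branch ignores the child's existing partitioning.
      steps.add("exchange hash(grouping + distinct)");
      steps.add("phase-1 merge agg");
      steps.add("phase-2 agg");
      steps.add("exchange hash(grouping)");
      steps.add("phase-2 merge agg");
      return steps;
    }
    if (!childPartitionedByGroupingExprs) {
      // Assumption: without a compatible partition, one exchange on the
      // grouping exprs plus a phase-1 merge is enough.
      steps.add("exchange hash(grouping)");
      steps.add("phase-1 merge agg");
    }
    // Case from the reply: compatible partition && no shuffle by distinct
    // exprs => no merge agg or exchange before the phase-2 agg.
    steps.add("phase-2 agg");
    return steps;
  }
}
```

Calling DistinctAggPlanSketch.planSteps(true, false) yields only the two aggregation steps, which is the situation the reply describes for the code around L929.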