Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9949 )
Change subject: IMPALA-6822: Add a query option to control shuffling by distinct exprs
......................................................................

Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9949/5/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/9949/5/be/src/service/query-options.h@132
PS5, Line 132: QUERY_OPT_FN(shuffle_distinct_exprs, SHUFFLE_DISTINCT_EXPRS, TQueryOptionLevel::REGULAR)\
This seems about as ADVANCED as, e.g., DEFAULT_JOIN_DISTRIBUTION_MODE, so let's move it to ADVANCED.

http://gerrit.cloudera.org:8080/#/c/9949/5/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/9949/5/be/src/service/query-options.cc@635
PS5, Line 635: case TImpalaQueryOptions::SHUFFLE_DISTINCT_EXPRS: {
I just realized we should also update query-options-test.cc.

http://gerrit.cloudera.org:8080/#/c/9949/5/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/9949/5/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@78
PS5, Line 78: public void testNoShuffleOnDistinct() {
testShuffleByDistinctExprs()

http://gerrit.cloudera.org:8080/#/c/9949/5/testdata/workloads/functional-planner/queries/PlannerTest/shuffle-by-distinct-exprs.test
File testdata/workloads/functional-planner/queries/PlannerTest/shuffle-by-distinct-exprs.test:

http://gerrit.cloudera.org:8080/#/c/9949/5/testdata/workloads/functional-planner/queries/PlannerTest/shuffle-by-distinct-exprs.test@2
PS5, Line 2: ---- QUERY
Is the formatting of these tests correct? I believe we don't recognize "---- QUERY" in planner tests. Or perhaps this works by chance? Other tests don't have this, so it's better to be consistent.

http://gerrit.cloudera.org:8080/#/c/9949/5/testdata/workloads/functional-planner/queries/PlannerTest/shuffle-by-distinct-exprs.test@217
PS5, Line 217: select count(distinct a.int_col) from functional.alltypes a inner join [shuffle]
Add one more test where the input is partitioned by year, int_col (compatible with the desired partitioning with distinct exprs).

http://gerrit.cloudera.org:8080/#/c/9949/5/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

http://gerrit.cloudera.org:8080/#/c/9949/5/tests/query_test/test_aggregation.py@336
PS5, Line 336: class TestDistinctAggregationQueries(ImpalaTestSuite):
TestDistinctAggregation

http://gerrit.cloudera.org:8080/#/c/9949/5/tests/query_test/test_aggregation.py@353
PS5, Line 353: if cls.exploration_strategy() == 'core':
Aren't we adding all formats in L345? I think we need to add a constraint here and not a new dimension.

--
To view, visit http://gerrit.cloudera.org:8080/9949
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icb4b4576fb29edd62cf4b4ba0719c0e0a2a5a8dc
Gerrit-Change-Number: 9949
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Apr 2018 03:49:48 +0000
Gerrit-HasComments: Yes
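
Illustration for the test_aggregation.py@353 comment (constraint vs. new dimension): a minimal, self-contained sketch of the difference. The TestMatrix class, the build_matrix helper, the format names, and the choice of "text" for the core strategy are all hypothetical stand-ins chosen for this example; this is not the Impala test framework's actual code and may not match how the patch should be written.

```python
from itertools import product


class TestMatrix:
    """Toy stand-in for a test matrix (hypothetical, not Impala's class)."""

    def __init__(self):
        self.dimensions = {}   # dimension name -> list of values
        self.constraints = []  # predicates every generated vector must pass

    def add_dimension(self, name, values):
        # A dimension multiplies the number of generated test vectors.
        self.dimensions[name] = list(values)

    def add_constraint(self, predicate):
        # A constraint only filters vectors; it never adds any.
        self.constraints.append(predicate)

    def generate(self):
        names = list(self.dimensions)
        vectors = []
        for combo in product(*(self.dimensions[n] for n in names)):
            vector = dict(zip(names, combo))
            if all(check(vector) for check in self.constraints):
                vectors.append(vector)
        return vectors


def build_matrix(exploration_strategy):
    matrix = TestMatrix()
    # Assumed: all table formats are already added earlier in the class.
    matrix.add_dimension("table_format", ["text", "parquet", "avro"])
    matrix.add_dimension("shuffle_distinct_exprs", [True, False])
    if exploration_strategy == "core":
        # Narrow the existing dimension instead of adding another one.
        matrix.add_constraint(lambda v: v["table_format"] == "text")
    return matrix
```

With this shape, the "core" run keeps only the vectors for a single format, while any other strategy still covers every format that was added up front.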