[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1874/ -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 11 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Feb 2018 02:11:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 11: Code-Review+2 Hit IMPALA-6472, rebased, carrying Alex's +2. -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 11 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Feb 2018 02:10:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 10: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1866/ -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 10 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 03 Feb 2018 01:41:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1866/ -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 10 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Feb 2018 22:04:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 10: Code-Review+2 PS10 is a final rebase before submitting the change. Carrying Alex's +2. -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 10 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Feb 2018 22:04:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Hello Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9153 to look at the new patch set (#10). Change subject: IMPALA-5293: Turn insert clustering on by default .. IMPALA-5293: Turn insert clustering on by default This change enables clustering by default. IMPALA-2521 introduced the 'clustered' hint which inserts a local sort by the partitioning columns to a query plan. The hint is only effective for HDFS and Kudu tables. Like before, the 'noclustered' hint prevents clustering. If a table has ordering columns defined, the 'noclustered' hint is ignored and we issue a warning. This change removes some tests that were added specifically to test that clustering can be enabled using the 'clustered' hint. It changes some tests to use the 'noclustered' hint to make sure that clustering can be disabled. It also adds tests to make sure that we cover the 'noclustered' case properly. Cherry-picks: not for 2.x. Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test 12 files changed, 248 insertions(+), 66 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/10 -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 10 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 8: PS8 makes test_stats_extrapolation resilient against input data order variations. PS9 rebases the change. Carrying Alex's +2. -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 8 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Feb 2018 18:10:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Hello Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9153 to look at the new patch set (#8). Change subject: IMPALA-5293: Turn insert clustering on by default .. IMPALA-5293: Turn insert clustering on by default This change enables clustering by default. IMPALA-2521 introduces the 'clustered' hint which inserts a local sort by the partitioning columns to a query plan. The hint is only effective for HDFS and Kudu tables. Like before, the 'noclustered' hint prevents clustering. If a table has ordering columns defined, the 'noclustered' hint is ignored and we issue a warning. This change removes some tests that were added specifically to test that clustering can be enabled using the 'clustered' hint. It changes some tests to use the 'noclustered' hint to make sure that clustering can be disabled. It also adds tests to make sure that we cover the 'noclustered' case properly. Cherry-picks: not for 2.x. Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test 12 files changed, 248 insertions(+), 66 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/8 -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 8 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 7 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Feb 2018 00:49:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 7: (1 comment) Please see PS7. http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813 PS5, Line 813: > I think we should remove this planner test because it's somewhat misleading Done -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 7 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Feb 2018 00:45:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Hello Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9153 to look at the new patch set (#7). Change subject: IMPALA-5293: Turn insert clustering on by default .. IMPALA-5293: Turn insert clustering on by default This change enables clustering by default. IMPALA-2521 introduces the 'clustered' hint which inserts a local sort by the partitioning columns to a query plan. The hint is only effective for HDFS and Kudu tables. Like before, the 'noclustered' hint prevents clustering. If a table has ordering columns defined, the 'noclustered' hint is ignored and we issue a warning. This change removes some tests that were added specifically to test that clustering can be enabled using the 'clustered' hint. It changes some tests to use the 'noclustered' hint to make sure that clustering can be disabled. It also adds tests to make sure that we cover the 'noclustered' case properly. Cherry-picks: not for 2.x. Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test 12 files changed, 255 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/7 -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 7 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813 PS5, Line 813: # IMPALA-6280: clustered (default) with outer join, inline view, and TupleisNullPredicate > I changed it to use alltypessmall like the other tests in this file. I think we should remove this planner test because it's somewhat misleading. The bug in IMPALA-6280 only shows up at runtime, so this planner test arguably does not cover IMPALA-6280. -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 6 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Feb 2018 00:33:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 6: (4 comments) Thanks for the review, please see my inline comments and PS6. http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG@11 PS5, Line 11: to a query plan. The hint is only effective for HDFS and Kudu tables. > Why does clustered issue a warning? We change defaults all the time and use Leftover from a previous PS. :( Removed it. http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1893 PS5, Line 1893: // Test clustered hint. > Where's the expected warning? Removed in last PS. Fixed the comment. http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813 PS5, Line 813: # IMPALA-6280: clustered (default) with outer join, inline view, and TupleisNullPredicate > Where did this come from? I changed it to use alltypessmall like the other tests in this file. This query is tested in QueryTest/insert.test and in one of the previous PSs, Tim pointed out that it should still have a clustering node. I added this test to make sure that it does so in the plan, too. Let me know if you'd like to remove the test. http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test File testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test: http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test@550 PS5, Line 550: # sure that noclustered hint prevents addition of a sort node before writing to HDFS. > of a sort node Done -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 6 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Feb 2018 00:03:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Hello Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9153 to look at the new patch set (#6). Change subject: IMPALA-5293: Turn insert clustering on by default .. IMPALA-5293: Turn insert clustering on by default This change enables clustering by default. IMPALA-2521 introduces the 'clustered' hint which inserts a local sort by the partitioning columns to a query plan. The hint is only effective for HDFS and Kudu tables. Like before, the 'noclustered' hint prevents clustering. If a table has ordering columns defined, the 'noclustered' hint is ignored and we issue a warning. This change removes some tests that were added specifically to test that clustering can be enabled using the 'clustered' hint. It changes some tests to use the 'noclustered' hint to make sure that clustering can be disabled. It also adds tests to make sure that we cover the 'noclustered' case properly. Cherry-picks: not for 2.x. Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test 12 files changed, 305 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/6 -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 6 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG@11 PS5, Line 11: to a query plan. Specifying 'clustered' will now issue a warning. The Why does clustered issue a warning? We change defaults all the time and users might want to fix the plan with hints. I don't think we should warn if hints have no effect. http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1893 PS5, Line 1893: // Test that clustered hint issues a warning. Where's the expected warning? http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813 PS5, Line 813: # IMPALA-6280: clustered (default) with outer join, inline view, and TupleisNullPredicate Where did this come from? Would be great to avoid using alltypesinsert outside of the end-to-end tests. That table is pretty weird and probably should not be part of our functional_schema_template, but that's a longer story http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test File testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test: http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test@550 PS5, Line 550: # sure that noclustered hint prevents addition of a sort not before writing to HDFS. of a sort node -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 5 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 31 Jan 2018 23:04:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 5: (1 comment) Thanks for the review. The latest PS also fixes some tests that had failed in a private exhaustive run. I'm currently rerunning an exhaustive build with the latest PS. http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@831 PS3, Line 831: hasClusteredHint_ = true; > I'd lean towards re-adding it for Kudu tables, although it is not that usef Sounds good to me. I created IMPALA-6466 to add documentation for the clustered hint for Kudu tables. -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 5 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 31 Jan 2018 19:33:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9153 to look at the new patch set (#5). Change subject: IMPALA-5293: Turn insert clustering on by default .. IMPALA-5293: Turn insert clustering on by default This change enables clustering by default. IMPALA-2521 introduces the 'clustered' hint which inserts a local sort by the partitioning columns to a query plan. Specifying 'clustered' will now issue a warning. The hint is only effective for HDFS and Kudu tables. Like before, the 'noclustered' hint prevents clustering. If a table has ordering columns defined, the 'noclustered' hint is ignored and we issue a warning. This change also changes the behavior for Kudu tables. Specifying 'clustered' will no longer force clustering. Instead, clustering is always enabled except in these cases: * the 'noclustered' hint is present * the plan only involves a single execution node * the target table is unpartitioned. This change removes some tests that were added specifically to test that clustering can be enabled using the 'clustered' hint. It changes some tests to use the 'noclustered' hint to make sure that clustering can be disabled. It also adds tests to make sure that we cover the 'noclustered' case properly. Cherry-picks: not for 2.x. Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test 12 files changed, 302 insertions(+), 77 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/5 -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 5 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@831 PS3, Line 831: analyzer.addWarning("Clustering is enabled by default, use 'noclustered' to " + > Adding the hint should not have any effect since we don't ensure to add a c I'd lean towards re-adding it for Kudu tables, although it is not that useful - ignoring the hint seems a bit weird. I do feel like we shouldn't warn here since it could be annoying and there's no way to disable the warning. I understand that confused users might add it without understanding it's currently a no-op, but that seems fairly harmless. It's a bit of a problem if the user actually wants to retain the hint - either it they might be carried over from a query written when it wasn't a default, or maybe the user is intending to force a particular plan to prevent future plan changes (or ensure consistent behaviour on different versions of Impala). -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 3 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 30 Jan 2018 23:18:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9153 ) Change subject: IMPALA-5293: Turn insert clustering on by default .. Patch Set 4: (5 comments) Thanks for the review. Please see my inline comments and PS4. http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@831 PS3, Line 831: analyzer.addWarning("Clustering is enabled by default, use 'noclustered' to " + > I'm wondering if the warning is necessary. I don't think we generally warn Adding the hint should not have any effect since we don't ensure to add a clustering sort if it is present. Would you prefer to re-add it for Kudu tables (see Planner.java)? http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/planner/Planner.java@615 PS3, Line 615: primar > primary Done http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/empty.test File testdata/workloads/functional-planner/queries/PlannerTest/empty.test: http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/empty.test@302 PS3, Line 302: insert into functional.alltypes partition(year, month) > IMO this seems ok, the benefit of optimising this case would be very margin Thanks for the feedback. I double checked with Alex and removed the todo. http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@101 PS3, Line 101: # IMPALA-5293: noclustered hint prevents adding sort node > Do we also need to test the case when there's a noclustered hint and sort.c Insert tests for tables with sort columns are in their own file, the particular case is covered in testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test:49. http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-query/queries/QueryTest/insert.test File testdata/workloads/functional-query/queries/QueryTest/insert.test: http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-query/queries/QueryTest/insert.test@949 PS3, Line 949: # IMPALA-6280: clustered (default) with outer join, inline view, and TupleisNullPredicate > It looks like this one should still be a clustered insert, since IMPALA-628 Thanks for catching this. I also added a test for this query to PlannerTest/insert.test to make sure we get the plan right. -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 4 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 30 Jan 2018 20:06:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9153 to look at the new patch set (#4). Change subject: IMPALA-5293: Turn insert clustering on by default .. IMPALA-5293: Turn insert clustering on by default This change enables clustering by default. IMPALA-2521 introduces the 'clustered' hint which inserts a local sort by the partitioning columns to a query plan. Specifying 'clustered' will now issue a warning. The hint is only effective for HDFS and Kudu tables. Like before, the 'noclustered' hint prevents clustering. If a table has ordering columns defined, the 'noclustered' hint is ignored and we issue a warning. This change also changes the behavior for Kudu tables. Specifying 'clustered' will no longer force clustering. Instead, clustering is always enabled except in these cases: * the 'noclustered' hint is present * the plan only involves a single execution node * the target table is unpartitioned. This change removes some tests that were added specifically to test that clustering can be enabled using the 'clustered' hint. It changes some tests to use the 'noclustered' hint to make sure that clustering can be disabled. It also adds tests to make sure that we cover the 'noclustered' case properly. Cherry-picks: not for 2.x. Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test M testdata/workloads/functional-query/queries/QueryTest/insert.test 11 files changed, 302 insertions(+), 80 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/4 -- To view, visit http://gerrit.cloudera.org:8080/9153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9 Gerrit-Change-Number: 9153 Gerrit-PatchSet: 4 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Tim Armstrong