[Impala-ASF-CR](2.x) IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9422 ) Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT .. IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT This change adds support for "clustered", "noclustered", "shuffle" and "noshuffle" hints in CTAS statement. Example: create /*+ clustered,noshuffle */ table t partitioned by (year, month) as select * from functional.alltypes The effect of these hints are the same as in insert statements: clustered: Sort locally by partition expression before insert to ensure that only one partition is written at a time. The goal is to reduce the number of files kept open / buffers kept in memory simultaneously. noclustered: Do not sort by primary key before insert to Kudu table. No effect on HDFS tables currently, as this is their default behavior. shuffle: Forces the planner to add an exchange node that repartitions by the partition expression of the output table. This means that a partition will be written only by a single node, which minimizes the global number of simultaneous writes. If only one partition is written (because all partitioning columns are constant or the target table is not partitioned), then the shuffle hint leads to a plan where all rows are merged at the coordinator where the table sink is executed. noshuffle: Do not add exchange node before insert to partitioned tables. The parser needed some modifications to be able to differentiate between CREATE statements that allow hints (CTAS), and CREATE statements that do not (every other type of CREATE statements). As a result, KW_CREATE was moved from tbl_def_without_col_defs to statement rules. Testing: The parser tests mirror the tests of INSERT, while analysis and planner tests are minimal, as the underlying logic is the same as for INSERT. Query tests are not created, as the hints have no effect on the DDL part of CTAS, and the actual query ran is the same as in the insert case. Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0 Reviewed-on: http://gerrit.cloudera.org:8080/8400 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/9422 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test 9 files changed, 211 insertions(+), 65 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/9422 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: merged Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0 Gerrit-Change-Number: 9422 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR](2.x) IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9422 ) Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT .. Patch Set 1: Code-Review+2 Clean pick (just uploading to test in batch) -- To view, visit http://gerrit.cloudera.org:8080/9422 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0 Gerrit-Change-Number: 9422 Gerrit-PatchSet: 1 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 23 Feb 2018 17:49:46 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT
Hello Alex Behm, Impala Public Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9422 to review the following change. Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT .. IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT This change adds support for "clustered", "noclustered", "shuffle" and "noshuffle" hints in CTAS statement. Example: create /*+ clustered,noshuffle */ table t partitioned by (year, month) as select * from functional.alltypes The effect of these hints are the same as in insert statements: clustered: Sort locally by partition expression before insert to ensure that only one partition is written at a time. The goal is to reduce the number of files kept open / buffers kept in memory simultaneously. noclustered: Do not sort by primary key before insert to Kudu table. No effect on HDFS tables currently, as this is their default behavior. shuffle: Forces the planner to add an exchange node that repartitions by the partition expression of the output table. This means that a partition will be written only by a single node, which minimizes the global number of simultaneous writes. If only one partition is written (because all partitioning columns are constant or the target table is not partitioned), then the shuffle hint leads to a plan where all rows are merged at the coordinator where the table sink is executed. noshuffle: Do not add exchange node before insert to partitioned tables. The parser needed some modifications to be able to differentiate between CREATE statements that allow hints (CTAS), and CREATE statements that do not (every other type of CREATE statements). As a result, KW_CREATE was moved from tbl_def_without_col_defs to statement rules. Testing: The parser tests mirror the tests of INSERT, while analysis and planner tests are minimal, as the underlying logic is the same as for INSERT. Query tests are not created, as the hints have no effect on the DDL part of CTAS, and the actual query ran is the same as in the insert case. Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0 Reviewed-on: http://gerrit.cloudera.org:8080/8400 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test 9 files changed, 211 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/9422/1 -- To view, visit http://gerrit.cloudera.org:8080/9422 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0 Gerrit-Change-Number: 9422 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins