[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month)
as select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

Testing:

The parser tests mirror the tests of INSERT, while analysis and planner
tests are minimal, as the underlying logic is the same as for INSERT.

Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Reviewed-on: http://gerrit.cloudera.org:8080/8400
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 211 insertions(+), 65 deletions(-)

Approvals:
  Alex Behm: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 17: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 22 Feb 2018 20:43:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 17: Code-Review+2

Thanks for pointing out the conflicts.


--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 22 Feb 2018 17:04:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 17:

(3 comments)

I have marked the lines where the rebase (patch set 16) led to non-trivial 
conflicts/broken tests. Patch set 17 contains some typo fixes + changes in 
kudu.test, which I forgot to run after rebase.

http://gerrit.cloudera.org:8080/#/c/8400/17/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/17/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@303
PS17, Line 303:   /**
  :* Creates an insert into, an insert overwrite, and a CTAS 
statement with
  :* the given hints and checks that the parsed hints are the 
same as the expected hints.
  :*/
  :   private void TestInsertAndCtasHints(String insertPart, String 
ctasPart,
  : String[] hintStyle, String hints, String... 
expectedHints) {
  : String hintsPart = hintStyle[0] + hints + hintStyle[1];
  : TestInsertStmtHints(String.format("insert %%s into %s %%s 
select * from t",
  : insertPart), hintsPart, expectedHints);
  : TestInsertStmtHints(String.format("insert %%s overwrite %s 
%%s select * from t",
  : insertPart), hintsPart, expectedHints);
  : TestCtasHints(String.format("create %s table %s as select * 
from t",
  : hintsPart, ctasPart), expectedHints);
  :   }
  :
  :   /**
  :* Injects hints into pattern and checks that the injected 
hints match the expected
  :* hints. This function covers both insert and upsert 
statements.
  :*/
  :   private void TestInsertStmtHints(String pattern, String hint, 
String... expectedHints) {
  : for (InsertStmt.HintLocation loc: 
InsertStmt.HintLocation.values()) {
  :   InsertStmt insertStmt = (InsertStmt) 
ParsesOk(InjectInsertHint(pattern, hint, loc));
  :   assertEquals(expectedHints, 
HintsToStrings(insertStmt.getPlanHints()));
  : }
  :   }
  :
  :   /**
  :* Injects hints into pattern and expect parser error on the 
injected hints.
  :* It covers both insert and upsert statements.
  :*/
  :   private void ParserErrorOnInsertStmtHints(String pattern, 
String hint) {
  : for (InsertStmt.HintLocation loc: 
InsertStmt.HintLocation.values()) {
  :   ParserError(InjectInsertHint(pattern, hint, loc));
  : }
  :   }
IMPALA-4168: "Adds Oracle-style hint placement for INSERT/UPSERT" also contains 
similar changes to avoid test duplication ( 
https://gerrit.cloudera.org/#/c/8676/ ).


http://gerrit.cloudera.org:8080/#/c/8400/17/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/17/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@210
PS17, Line 210: # noshuffle hint would lead to a different plan.
  : # Note that noclustered hint is added to ensure consistent 
plans on Impala 2.x and 3.x,
  : # because IMPALA-5293 changed clustered to be the default on 
3.x.
  : create /*+ noclustered */table t partitioned by (year, month) as
These tests were broken by IMPALA-5293: "Turn insert clustering on by default". 
IMPALA-5293 is not merged to Impala 2.x, while this change can be merged in my 
opinion. To avoid extra work, I have changed the tests to statements that 
should lead to the same plan on both branches.


http://gerrit.cloudera.org:8080/#/c/8400/17/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

http://gerrit.cloudera.org:8080/#/c/8400/17/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test@431
PS17, Line 431: primary key(id) partition by hash(id) partitions 3 stored as 
kudu as
  : select * from functional.alltypes;
  :  DISTRIBUTEDPLAN
  : INSERT INTO KUDU [default.t]
  : |
  : 01:EXCHANGE [KUDU(KuduPartition(functional.alltypes.id))]
  : |
  : 00:SCAN HDFS [functional.alltypes]
  :partitions=24/24 files=24 size=478.45KB
  : 
  : create /* +noclustered,noshuffle */ table t
  : primary key(id) partition by hash(id) partitions 3 stored as 
kudu as
These tests broke because of 

[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-22 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#17).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month)
as select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

Testing:

The parser tests mirror the tests of INSERT, while analysis and planner
tests are minimal, as the underlying logic is the same as for INSERT.

Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 211 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/17
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-22 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#16).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month)
as select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

Testing:

The parser tests mirror the tests of INSERT, while analysis and planner
tests are minimal, as the underlying logic is the same as for INSERT.

Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 210 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/16
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 16
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-20 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 15: Code-Review+2

If there are non-trivial conflicts in the rebase, please let me know and I'll 
take another look.


--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 21 Feb 2018 06:18:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-20 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 14:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@82
PS14, Line 82:   Preconditions.checkNotNull(queryStmt);
> combine with member assignment:
Done


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1640
PS14, Line 1640:   public void TestCreateTableAsSelectWithHints() throws 
AnalysisException {
> To avoid duplication with AnalyzeStmts#TestInsertHints let's do the followi
I kept 3 tests to cover ok/warning/error case.


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@304
PS14, Line 304:* Creates an insert, an upsert, and a CTAS statement with 
the given hints
> Upsert isn't tested. Code or comment wrong?
Wrong comment, thanks for spotting it.
This remained from an attempt to merge UPSERT tests to this function, which 
turned out to be non practical, because UPSERT leads to parse failure for 
partition clauses.


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@336
PS14, Line 336:   static private List 
ConvertHintsToStringList(List hints) {
> shorter: hintsToStrings()
Done


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
File fe/src/test/java/org/apache/impala/common/FrontendTestBase.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java@83
PS14, Line 83:  new String[] { "/* +", "*/" }, // traditional commented 
hint
> tabs
Done


http://gerrit.cloudera.org:8080/#/c/8400/14/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/14/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@207
PS14, Line 207: # IMPALA-4167: noclustered hint is ignored if the CTAS has 
partitions and sort columns.
> Let's follow a procedure to trim down these tests similar to what we are do
Done



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 20 Feb 2018 16:44:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-20 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#15).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month)
as select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

Testing:

The parser tests mirror the tests of INSERT, while analysis and planner
tests are minimal, as the underlying logic is the same as for INSERT.

Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 205 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/15
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-17 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 14:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@82
PS14, Line 82:   Preconditions.checkNotNull(queryStmt);
combine with member assignment:
this.createStmt = Preconditions.checkNotNull(queryStmt);


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1640
PS14, Line 1640:   public void TestCreateTableAsSelectWithHints() throws 
AnalysisException {
To avoid duplication with AnalyzeStmts#TestInsertHints let's do the following:
* Keep a single basic test, but remove all other tests that are basically the 
same as in AnalyzeStmts#TestInsertHints
* Keep those tests that are different from the expected behavior in 
AnalyzeStmts#TestInsertHints
* Add a comment that ParserTest ensures hints are properly set in the 
InsertStmt of a Ctas, and that AnalyzeStmts#TestInsertHints covers the analysis 
of those hints. Therefore, the tests here are minimal.


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@304
PS14, Line 304:* Creates an insert, an upsert, and a CTAS statement with 
the given hints
Upsert isn't tested. Code or comment wrong?


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@336
PS14, Line 336:   static private List 
ConvertHintsToStringList(List hints) {
shorter: hintsToStrings()

Easier to return them as a String[], then you don't need to convert 
expectedHints to an List in L323


http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
File fe/src/test/java/org/apache/impala/common/FrontendTestBase.java:

http://gerrit.cloudera.org:8080/#/c/8400/14/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java@83
PS14, Line 83:  new String[] { "/* +", "*/" }, // traditional commented 
hint
tabs


http://gerrit.cloudera.org:8080/#/c/8400/14/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/14/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@207
PS14, Line 207: # IMPALA-4167: noclustered hint is ignored if the CTAS has 
partitions and sort columns.
Let's follow a procedure to trim down these tests similar to what we are doing 
for AnalyzeDDLTest, i.e.:
* Keep 1-2 basic tests.
* Comment the hints only affect the insert portion which is already tested more 
thoroughly elsewhere (add missing overage to the insert tests as necessary if 
you think coverage is missing)
* Keep tests that are truly specific to Ctas (though I think there are none)



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Sat, 17 Feb 2018 17:58:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-16 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#14).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month)
as select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner / parser tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 333 insertions(+), 78 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/14
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 11:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/cup/sql-parser.cup@1065
PS11, Line 1065: create_tbl_as_select_stmt_inner ::=
> let's call this create_tbl_as_select_params to match what is produced
Done


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@69
PS6, Line 69:* Helper class for parsing.
> I looked at this closely and both approaches seem fine to me. Here are my t
Thanks!


http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@76
PS11, Line 76: public CreateTableStmt createStmt_;
> We don't use the "_" for public members.
Done


http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1642
PS11, Line 1642:   private String[][] getHintStyles() {
> Can you move this to FrontendTestBase, make it protected, and use it in the
I have moved the member variable version found in other files instead of the 
function - I prefer the member, because no new arrays have to be created on 
every call.


http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1651
PS11, Line 1651:   public void TestCreateTableAsSelectWithHints() throws 
AnalysisException {
> * I think the more important testing is in ParserTest#TestPlanHints similar
I have added the CTAS version of insert tests to ParserTest#TestPlanHints. I 
think that some of these tests should move there, like the ones about the case 
insensitivity of hints, while the ones with errors or warnings should remain 
here.

I am unsure about the rest, maybe they can be deleted, as planner tests already 
check most of these cases.

About combining with TestInsertHints(): I have created combined 
insert/upsert/ctas tests in ParserTest#TestPlanHints, something similar could 
be done here. What makes this more tricky is that the actual tables are 
checked, so the source table must exist, and the target table in insert or the 
partitioning parameters in CTAS must match with the result of the select + 
there are some cases where insert is possible, but CTAS is not (HBase tables).


http://gerrit.cloudera.org:8080/#/c/8400/11/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/11/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@207
PS11, Line 207: # IMPALA-4167: noclustered hint is ignored if the CTAS has 
partitions and sort columns.
> * Any ideas to trim down the tests? Seems like a lot of new tests and we sh
Maybe create /*+ clustered,noshuffle */ + the default case without hints is 
enough to show that both type of hints can take effect. If the other tests are 
deleted, then some of them should be moved to the insert tests, because they 
cover some scenarios that are not tested there. What do you think about this 
solution?

I agree that both plans are unnecessary - I kept the DISTRIBUTEDPLAN, because 
the effect of (no)shuffle can be only checked there.



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 07 Feb 2018 17:50:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-07 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#13).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month)
as select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner / parser tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 333 insertions(+), 78 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/13
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-07 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#12).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month) as
select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner / parser tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 330 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/12
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-30 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 11:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/cup/sql-parser.cup@1065
PS11, Line 1065: create_tbl_as_select_stmt_inner ::=
let's call this create_tbl_as_select_params to match what is produced


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@69
PS6, Line 69:* Helper class for parsing.
> I don't feel strongly either, lets see what others think.
I looked at this closely and both approaches seem fine to me. Here are my 
thoughts:

* We follow the set() approach in a few places because it's more practical than 
the alternative
* Relying on a post-construction set() can be harder to reason about, so I 
generally prefer the current approach where all info is available at 
construction time

I suggest we proceed with the current version.

Nice work on navigating the parser changes, Csaba!


http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@76
PS11, Line 76: public CreateTableStmt createStmt_;
We don't use the "_" for public members.


http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1642
PS11, Line 1642:   private String[][] getHintStyles() {
Can you move this to FrontendTestBase, make it protected, and use it in the 
other appropriate places. I found this thing in:

AnalyzeDDLTest.java
AnalyzeStmtsTest.java
ParserTest.java
ToSqlTest.java


http://gerrit.cloudera.org:8080/#/c/8400/11/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1651
PS11, Line 1651:   public void TestCreateTableAsSelectWithHints() throws 
AnalysisException {
* I think the more important testing is in ParserTest#TestPlanHints similar to 
what we do for INSERT. We should check that the generated InsertStmt has the 
hints properly set.
* These tests are almost identical to AnalyzeStmts#TestInsertHints. The 
copy+paste worries me.
* Combining this test with TestInsertHints() seems tricky, but might be doable.
* Alternatively, we could shrink this test to the bare minimum and comment that 
coverage is provided by ParserTest#TestPlanHints (tests that hints are set in 
the CTAS' InsertStmt) and AnalyzeStmts#TestInsertHints() (tests analyzing an 
InsertStmt with hints)

What do you think?


http://gerrit.cloudera.org:8080/#/c/8400/11/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/11/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@207
PS11, Line 207: # IMPALA-4167: noclustered hint is ignored if the CTAS has 
partitions and sort columns.
* Any ideas to trim down the tests? Seems like a lot of new tests and we should 
have existing tests for INSERT that cover most parts.
* Do we really need both PLAN and DISTRIBUTEDPLAN for these tests?



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 31 Jan 2018 01:38:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-18 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#11).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month) as
select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
5 files changed, 330 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/11
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8400/10/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/10/fe/src/main/cup/sql-parser.cup@1185
PS10, Line 1185:   e
> nit: indent 2 spaces
Done



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 18 Jan 2018 20:22:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 10:

Please see my previous comment and the comment here: 
http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@69


--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 18 Jan 2018 20:01:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8400/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8400/8//COMMIT_MSG@20
PS8, Line 20:
> nit: wrong word?
Done


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/cup/sql-parser.cup@1055
PS6, Line 1055: plan_hints:hints
> nit: please check that all review comments are marked as "Done" when sendin
Done


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@69
PS6, Line 69:* Helper class for parsing.
> Thanks for the explanation, I think I understand it now. Could we just have
I have no problems with going that way, but I am not sure that it would make 
the change smaller: InsertStmt would also need a Set/AddPlanHints function, and 
it would be probably better to rewrite InsertStmt to always use the setter 
instead of constructor arguments.


http://gerrit.cloudera.org:8080/#/c/8400/8/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/8/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@71
PS8, Line 71:
> nit: "rule that checks" or "rules that check"
Done


http://gerrit.cloudera.org:8080/#/c/8400/8/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@75
PS8, Line 75: > partitio
> nit: we generally seem to use camel case for acronyms, e.g. see createCtasT
Done


http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@344
PS6, Line 344: 01:EXCHANGE [UNPARTITIONED]
> Yes, I think it's intentional. Maybe the docs need to be more clear. Can yo
I have created IMPALA-6408 to track the doc issue.



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 17 Jan 2018 16:04:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-17 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#10).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month) as
select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
5 files changed, 332 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/10
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-17 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#9).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month) as
select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
5 files changed, 331 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/9
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-16 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8400/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8400/8//COMMIT_MSG@20
PS8, Line 20: to
nit: wrong word?


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/cup/sql-parser.cup@1055
PS6, Line 1055: eate_tbl_as_sele
> Does opt_plan_hints not work here?
nit: please check that all review comments are marked as "Done" when sending 
out replies.


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@69
PS6, Line 69:* Helper class for parsing.
> It is still needed, as it is the result value of non-terminal create_tbl_as
Thanks for the explanation, I think I understand it now. Could we just have a 
setter for the plan hints in CreateTableAsSelectStmt and then call 
stmt.AddHints(hints) inside the "create_tbl_as_select_stmt" rule? That way we 
don't have to add another class, the change would be smaller, and the parsing 
code easier to follow.


http://gerrit.cloudera.org:8080/#/c/8400/8/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/8/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@71
PS8, Line 71: rules that checks
nit: "rule that checks" or "rules that check"


http://gerrit.cloudera.org:8080/#/c/8400/8/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@75
PS8, Line 75: CTASParams
nit: we generally seem to use camel case for acronyms, e.g. see 
createCtasTarget below.


http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@344
PS6, Line 344: 01:SORT
> I should add a comment about this, but I am unsure whether it should be a s
Yes, I think it's intentional. Maybe the docs need to be more clear. Can you 
open a docs JIRA?



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 16 Jan 2018 20:35:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 7:

(7 comments)

I have overwritten the wrong test in patch set 7, please ignore it and compare 
patch 6 to 8.

http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG@19
PS6, Line 19: nly
> add "more memory efficient"? That makes it more obvious why it is more effi
I hope it is correct and understandable enough this way.


http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG@27
PS6, Line 27: shuffle:
> I don't think this is true, is it? For unpartitioned tables, what would the
Done


http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG@35
PS6, Line 35:
> Mention how you tested it in the commit message.
I wrote about the current state of tests, but I became unsure during the 
process - should I add tests that actually execute the CTASs with hints? The 
reason why not to add them is that the non-DDL part of query execution is 
really the same as in INSERT tests and can be quite slow.


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@69
PS6, Line 69:* Helper class for parsing.
> Is this class still needed or a left-over from earlier patch-sets? Can you
It is still needed, as it is the result value of non-terminal 
create_tbl_as_select_stmt_inner.

I am not sure that this is the correct place for this explanation, maybe it 
should be moved to the .cup.


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1701
PS6, Line 1701: Multiple non-conflicting
> Can you add one that has multiple distinct hints, like shuffle, clustered?
Done


http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@233
PS6, Line 233: shuffle
> Is that true? Without specifying a hint there should be cases where the cos
Thanks for spotting this, I have relied too much on experimentation instead of 
checking the code... I have split this test into two to test both outcomes.


http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@344
PS6, Line 344: |  order by: year ASC NULLS
> At first I didn't understand what this one does (I saw that insert.test:574
I should add a comment about this, but I am unsure whether it should be a 
simple comment, or a todo about a strange behavior. This looks intentional (see 
https://gerrit.cloudera.org/#/c/4162/ ), but it is not mentioned explicitly in 
the documentation. It is true that this reduces the number of writes overall, 
but as a side effect, it causes the whole table to be written to one node. 
Writing the whole table to the coordinator may be useful in some cases (like a 
temp table?), but I think that shuffle has a different intention. I would ask 
Greg / Alex about their opinion.



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Mon, 15 Jan 2018 18:06:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-15 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#8).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month) as
select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce to files kept
open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
5 files changed, 331 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/8
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-15 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#7).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month) as
select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce to files kept
open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
5 files changed, 330 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/7
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-01-08 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 6:

(8 comments)

Apologies for the delayed review. I had a few more questions. Let's discuss the 
parser changes in person if you feel I misunderstood the approach.

http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG@19
PS6, Line 19: more
add "more memory efficient"? That makes it more obvious why it is more 
efficient to do additional work (sorting).


http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG@27
PS6, Line 27: Add exchange node before insert even in case of unpartitioned 
tables.
I don't think this is true, is it? For unpartitioned tables, what would the 
shuffle / exchange do? Instead "shuffle" forces an exchange, "noshuffle" 
prevents adding one, and leaving it out results in the planner making a 
cost-based decision (DistributedPlanner.java:218).


http://gerrit.cloudera.org:8080/#/c/8400/6//COMMIT_MSG@35
PS6, Line 35: was moved from tbl_def_without_col_defs to statement rules.
Mention how you tested it in the commit message.


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/cup/sql-parser.cup@1055
PS6, Line 1055: plan_hints:hints
Does opt_plan_hints not work here?


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@69
PS6, Line 69:* Helper class for parsing.
Is this class still needed or a left-over from earlier patch-sets? Can you 
expand the class comment to explain why we need it and what it does?


http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1701
PS6, Line 1701: Multiple non-conflicting
Can you add one that has multiple distinct hints, like shuffle, clustered?


http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@233
PS6, Line 233: shuffle
Is that true? Without specifying a hint there should be cases where the cost 
based heuristic decides against adding an exchange node.


http://gerrit.cloudera.org:8080/#/c/8400/6/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@344
PS6, Line 344: 01:EXCHANGE [UNPARTITIONED]
At first I didn't understand what this one does (I saw that insert.test:574 has 
the same). On my machine this exchange collects all rows at the coordinator. 
Maybe you want to add that in a comment?



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Mon, 08 Jan 2018 14:04:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2017-12-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/cup/sql-parser.cup@1167
PS3, Line 1167: tbl_def_without_col_defs ::=
> It's important that it is not accepted: we should only accept supported opt
Done in cup. I did not find another way (that does not add a lot of 
duplication) than moving KW_CREATE to high level rules. The problem was that 
the parser cannot differentiate between the CTAS and CREATE until it reaches 
the AS SELECT part, but it needs this information to decide whether to add an 
opt_plan_hints non-terminal or not if there is no hint after CREATE.


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1633
PS3, Line 1633: Only
> Did you find a good way? If not, please clean up this code with its own int
No, I did not find a really good way. The duplication could be merged, but the 
resulting code would be quite hard to understand, mainly because the format + 
partitions of a table are given in inserts, but have to specified in CTAS. This 
could be done by mapping table names (used in insert) to sql options (used in 
CTAS), and substituting these strings in the statements, but this would ruin 
the readability of the statements.


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1647
PS3, Line 1647:   // HBase tables cannot be tested, as currently Impala 
cannot create HBase tables.
> I see. I think it should go somewhere near the top. See my previous comment
Done


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1653
PS3, Line 1653:
> See above.
Done


http://gerrit.cloudera.org:8080/#/c/8400/4/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/4/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@207
PS4, Line 207: noclustered
> nit: remove the "" or add them in the test below?
Done


http://gerrit.cloudera.org:8080/#/c/8400/4/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@233
PS4, Line 233: +noclustered,shuffle
> nit: spaces
Done


http://gerrit.cloudera.org:8080/#/c/8400/4/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@251
PS4, Line 251: # IMPALA-4167: noclustered hint in CTAS into partitioned HDFS 
table has no effect as it
> I think it's better to not mention the default behavior here. The important
Done



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 22 Dec 2017 15:14:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2017-12-22 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#6).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month) as
select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition columns before insert to make the insert more
efficient.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavour.

shuffle:
Add exchange node before insert even in case of unpartitioned tables.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
5 files changed, 302 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/6
-- 
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2017-11-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8400 )

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..


Patch Set 3:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/8400/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8400/3//COMMIT_MSG@9
PS3, Line 9: Adding support for "clustered", "noclustered", "shuffle" and 
"noshuffle"
> How about "This change adds support..."
Done


http://gerrit.cloudera.org:8080/#/c/8400/3//COMMIT_MSG@12
PS3, Line 12: example:
> Capitalize
Done


http://gerrit.cloudera.org:8080/#/c/8400/3//COMMIT_MSG@19
PS3, Line 19: Sort by partition columns before insert to make the insert more 
efficient.
> Mention where exactly the sort happens (locally vs distributed).
Done


http://gerrit.cloudera.org:8080/#/c/8400/3//COMMIT_MSG@26
PS3, Line 26: exchenge
> spelling
Done


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/cup/sql-parser.cup@1054
PS3, Line 1054: RESULT = new CreateTableAsSelectStmt(new 
CreateTableStmt(tbl_def),
> nit: Can you wrap the lines at 90 chars?
Done


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/cup/sql-parser.cup@1167
PS3, Line 1167: tbl_def_without_col_defs ::=
> Does that now also allow hints like "CREATE /*+ clustered */ TABLE FOO LIKE
It is also accepted in that case, but has no effect - should I disallow it?

I think that it would be the best to enable hints in several cases (even if no 
hint is accepted there currently), and check the hints in normal java code and 
warn if one is not valid for a given statement. I am not sure though.


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/cup/sql-parser.cup@1168
PS3, Line 1168:   KW_CREATE external_val:external opt_plan_hints:hints KW_TABLE 
if_not_exists_val:if_not_exists
> long line
Done


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/main/java/org/apache/impala/analysis/TableDef.java@155
PS3, Line 155:   TableDef(TableName tableName, boolean isExternal, boolean 
ifNotExists, List hints) {
> long line
Done


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1633
PS3, Line 1633: Only
> What does "only" mean here?
This was copy pasted - I guess it means "not error, just warning".

I did not want to deviate too much from the original 
(AnalyzeStmtsTest.TestInsertHints()), to make it easier to merge the two if I 
find a good way to do it.


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1647
PS3, Line 1647:   // HBase tables cannot be tested, as currently Impala 
cannot create HBase tables.
> It's weird that this explanation occurs in the middle of the function. Can
This is also copy paste legacy - AnalyzeStmtsTest.TestInsertHints() had an 
HBase test here.


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1653
PS3, Line 1653:
> Why is there a newline?
Also copy paste legacy.


http://gerrit.cloudera.org:8080/#/c/8400/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1677
PS3, Line 1677: "select * from functional.alltypes");
> Please also add tests that have additional hints in the select clause.
Done


http://gerrit.cloudera.org:8080/#/c/8400/3/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
File testdata/workloads/functional-planner/queries/PlannerTest/ddl.test:

http://gerrit.cloudera.org:8080/#/c/8400/3/testdata/workloads/functional-planner/queries/PlannerTest/ddl.test@207
PS3, Line 207: # IMPALA-4167: clustered CTAS into partitioned table adds sort 
node.
> Can you add tests for noclustered and for sort by()?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 17 Nov 2017 00:09:02 +
Gerrit-HasComments: Yes