[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1874/


--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 03 Feb 2018 02:11:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-02 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 11: Code-Review+2

Hit IMPALA-6472, rebased, carrying Alex's +2.


--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 03 Feb 2018 02:10:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1866/


--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 03 Feb 2018 01:41:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1866/


--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Feb 2018 22:04:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-02 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 10: Code-Review+2

PS10 is a final rebase before submitting the change. Carrying Alex's +2.


--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Feb 2018 22:04:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-02 Thread Lars Volker (Code Review)
Hello Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9153

to look at the new patch set (#10).

Change subject: IMPALA-5293: Turn insert clustering on by default
..

IMPALA-5293: Turn insert clustering on by default

This change enables clustering by default. IMPALA-2521 introduced the
'clustered' hint which inserts a local sort by the partitioning columns
to a query plan. The hint is only effective for HDFS and Kudu tables.

Like before, the 'noclustered' hint prevents clustering. If a table has
ordering columns defined, the 'noclustered' hint is ignored and we
issue a warning.

This change removes some tests that were added specifically to test
that clustering can be enabled using the 'clustered' hint. It changes
some tests to use the 'noclustered' hint to make sure that clustering
can be disabled. It also adds tests to make sure that we cover the
'noclustered' case properly.

Cherry-picks: not for 2.x.

Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
12 files changed, 248 insertions(+), 66 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/10
--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-01 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 8:

PS8 makes test_stats_extrapolation resilient against input data order 
variations. PS9 rebases the change. Carrying Alex's +2.


--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 8
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Feb 2018 18:10:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-02-01 Thread Lars Volker (Code Review)
Hello Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9153

to look at the new patch set (#8).

Change subject: IMPALA-5293: Turn insert clustering on by default
..

IMPALA-5293: Turn insert clustering on by default

This change enables clustering by default. IMPALA-2521 introduces the
'clustered' hint which inserts a local sort by the partitioning columns
to a query plan. The hint is only effective for HDFS and Kudu tables.

Like before, the 'noclustered' hint prevents clustering. If a table has
ordering columns defined, the 'noclustered' hint is ignored and we
issue a warning.

This change removes some tests that were added specifically to test
that clustering can be enabled using the 'clustered' hint. It changes
some tests to use the 'noclustered' hint to make sure that clustering
can be disabled. It also adds tests to make sure that we cover the
'noclustered' case properly.

Cherry-picks: not for 2.x.

Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
12 files changed, 248 insertions(+), 66 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/8
--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 8
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Feb 2018 00:49:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 7:

(1 comment)

Please see PS7.

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813
PS5, Line 813:
> I think we should remove this planner test because it's somewhat misleading
Done



--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Feb 2018 00:45:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Lars Volker (Code Review)
Hello Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9153

to look at the new patch set (#7).

Change subject: IMPALA-5293: Turn insert clustering on by default
..

IMPALA-5293: Turn insert clustering on by default

This change enables clustering by default. IMPALA-2521 introduces the
'clustered' hint which inserts a local sort by the partitioning columns
to a query plan. The hint is only effective for HDFS and Kudu tables.

Like before, the 'noclustered' hint prevents clustering. If a table has
ordering columns defined, the 'noclustered' hint is ignored and we
issue a warning.

This change removes some tests that were added specifically to test
that clustering can be enabled using the 'clustered' hint. It changes
some tests to use the 'noclustered' hint to make sure that clustering
can be disabled. It also adds tests to make sure that we cover the
'noclustered' case properly.

Cherry-picks: not for 2.x.

Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
12 files changed, 255 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/7
--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813
PS5, Line 813: # IMPALA-6280: clustered (default) with outer join, inline view, 
and TupleisNullPredicate
> I changed it to use alltypessmall like the other tests in this file.
I think we should remove this planner test because it's somewhat misleading. 
The bug in IMPALA-6280 only shows up at runtime, so this planner test arguably 
does not cover IMPALA-6280.



--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Feb 2018 00:33:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 6:

(4 comments)

Thanks for the review, please see my inline comments and PS6.

http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG@11
PS5, Line 11: to a query plan. The hint is only effective for HDFS and Kudu 
tables.
> Why does clustered issue a warning? We change defaults all the time and use
Leftover from a previous PS. :(

Removed it.


http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1893
PS5, Line 1893:   // Test clustered hint.
> Where's the expected warning?
Removed in last PS. Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813
PS5, Line 813: # IMPALA-6280: clustered (default) with outer join, inline view, 
and TupleisNullPredicate
> Where did this come from?
I changed it to use alltypessmall like the other tests in this file.

This query is tested in QueryTest/insert.test and in one of the previous PSs, 
Tim pointed out that it should still have a clustering node. I added this test 
to make sure that it does so in the plan, too. Let me know if you'd like to 
remove the test.


http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
File testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test:

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test@550
PS5, Line 550: # sure that noclustered hint prevents addition of a sort node 
before writing to HDFS.
> of a sort node
Done



--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Feb 2018 00:03:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Lars Volker (Code Review)
Hello Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9153

to look at the new patch set (#6).

Change subject: IMPALA-5293: Turn insert clustering on by default
..

IMPALA-5293: Turn insert clustering on by default

This change enables clustering by default. IMPALA-2521 introduces the
'clustered' hint which inserts a local sort by the partitioning columns
to a query plan. The hint is only effective for HDFS and Kudu tables.

Like before, the 'noclustered' hint prevents clustering. If a table has
ordering columns defined, the 'noclustered' hint is ignored and we
issue a warning.

This change removes some tests that were added specifically to test
that clustering can be enabled using the 'clustered' hint. It changes
some tests to use the 'noclustered' hint to make sure that clustering
can be disabled. It also adds tests to make sure that we cover the
'noclustered' case properly.

Cherry-picks: not for 2.x.

Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
12 files changed, 305 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/6
--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9153/5//COMMIT_MSG@11
PS5, Line 11: to a query plan. Specifying 'clustered' will now issue a warning. 
The
Why does clustered issue a warning? We change defaults all the time and users 
might want to fix the plan with hints. I don't think we should warn if hints 
have no effect.


http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/9153/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1893
PS5, Line 1893:   // Test that clustered hint issues a warning.
Where's the expected warning?


http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@813
PS5, Line 813: # IMPALA-6280: clustered (default) with outer join, inline view, 
and TupleisNullPredicate
Where did this come from?

Would be great to avoid using alltypesinsert outside of the end-to-end tests. 
That table is pretty weird and probably should not be part of our 
functional_schema_template, but that's a longer story


http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
File testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test:

http://gerrit.cloudera.org:8080/#/c/9153/5/testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test@550
PS5, Line 550: # sure that noclustered hint prevents addition of a sort not 
before writing to HDFS.
of a sort node



--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 31 Jan 2018 23:04:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 5:

(1 comment)

Thanks for the review. The latest PS also fixes some tests that had failed in a 
private exhaustive run. I'm currently rerunning an exhaustive build with the 
latest PS.

http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@831
PS3, Line 831: hasClusteredHint_ = true;
> I'd lean towards re-adding it for Kudu tables, although it is not that usef
Sounds good to me. I created IMPALA-6466 to add documentation for the clustered 
hint for Kudu tables.



--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 31 Jan 2018 19:33:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-31 Thread Lars Volker (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9153

to look at the new patch set (#5).

Change subject: IMPALA-5293: Turn insert clustering on by default
..

IMPALA-5293: Turn insert clustering on by default

This change enables clustering by default. IMPALA-2521 introduces the
'clustered' hint which inserts a local sort by the partitioning columns
to a query plan. Specifying 'clustered' will now issue a warning. The
hint is only effective for HDFS and Kudu tables.

Like before, the 'noclustered' hint prevents clustering. If a table has
ordering columns defined, the 'noclustered' hint is ignored and we
issue a warning.

This change also changes the behavior for Kudu tables. Specifying
'clustered' will no longer force clustering. Instead, clustering is
always enabled except in these cases:

* the 'noclustered' hint is present
* the plan only involves a single execution node
* the target table is unpartitioned.

This change removes some tests that were added specifically to test
that clustering can be enabled using the 'clustered' hint. It changes
some tests to use the 'noclustered' hint to make sure that clustering
can be disabled. It also adds tests to make sure that we cover the
'noclustered' case properly.

Cherry-picks: not for 2.x.

Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
12 files changed, 302 insertions(+), 77 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/5
--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-30 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@831
PS3, Line 831: analyzer.addWarning("Clustering is enabled by default, 
use 'noclustered' to " +
> Adding the hint should not have any effect since we don't ensure to add a c
I'd lean towards re-adding it for Kudu tables, although it is not that useful - 
ignoring the hint seems a bit weird.

I do feel like we shouldn't warn here since it could be annoying and there's no 
way to disable the warning.

I understand that confused users might add it without understanding it's 
currently a no-op, but that seems fairly harmless. It's a bit of a problem if 
the user actually wants to retain the hint - either it they might be carried 
over from a query written when it wasn't a default, or maybe the user is 
intending to force a particular plan to prevent future plan changes (or ensure 
consistent behaviour on different versions of Impala).



--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Jan 2018 23:18:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-30 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9153 )

Change subject: IMPALA-5293: Turn insert clustering on by default
..


Patch Set 4:

(5 comments)

Thanks for the review. Please see my inline comments and PS4.

http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@831
PS3, Line 831: analyzer.addWarning("Clustering is enabled by default, 
use 'noclustered' to " +
> I'm wondering if the warning is necessary. I don't think we generally warn
Adding the hint should not have any effect since we don't ensure to add a 
clustering sort if it is present. Would you prefer to re-add it for Kudu tables 
(see Planner.java)?


http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

http://gerrit.cloudera.org:8080/#/c/9153/3/fe/src/main/java/org/apache/impala/planner/Planner.java@615
PS3, Line 615: primar
> primary
Done


http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/empty.test
File testdata/workloads/functional-planner/queries/PlannerTest/empty.test:

http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/empty.test@302
PS3, Line 302: insert into functional.alltypes partition(year, month)
> IMO this seems ok, the benefit of optimising this case would be very margin
Thanks for the feedback. I double checked with Alex and removed the todo.


http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@101
PS3, Line 101: # IMPALA-5293: noclustered hint prevents adding sort node
> Do we also need to test the case when there's a noclustered hint and sort.c
Insert tests for tables with sort columns are in their own file, the particular 
case is covered in 
testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test:49.


http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-query/queries/QueryTest/insert.test
File testdata/workloads/functional-query/queries/QueryTest/insert.test:

http://gerrit.cloudera.org:8080/#/c/9153/3/testdata/workloads/functional-query/queries/QueryTest/insert.test@949
PS3, Line 949: # IMPALA-6280: clustered (default) with outer join, inline view, 
and TupleisNullPredicate
> It looks like this one should still be a clustered insert, since IMPALA-628
Thanks for catching this. I also added a test for this query to 
PlannerTest/insert.test to make sure we get the plan right.



--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Jan 2018 20:06:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5293: Turn insert clustering on by default

2018-01-30 Thread Lars Volker (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9153

to look at the new patch set (#4).

Change subject: IMPALA-5293: Turn insert clustering on by default
..

IMPALA-5293: Turn insert clustering on by default

This change enables clustering by default. IMPALA-2521 introduces the
'clustered' hint which inserts a local sort by the partitioning columns
to a query plan. Specifying 'clustered' will now issue a warning. The
hint is only effective for HDFS and Kudu tables.

Like before, the 'noclustered' hint prevents clustering. If a table has
ordering columns defined, the 'noclustered' hint is ignored and we
issue a warning.

This change also changes the behavior for Kudu tables. Specifying
'clustered' will no longer force clustering. Instead, clustering is
always enabled except in these cases:

* the 'noclustered' hint is present
* the plan only involves a single execution node
* the target table is unpartitioned.

This change removes some tests that were added specifically to test
that clustering can be enabled using the 'clustered' hint. It changes
some tests to use the 'noclustered' hint to make sure that clustering
can be disabled. It also adds tests to make sure that we cover the
'noclustered' case properly.

Cherry-picks: not for 2.x.

Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
11 files changed, 302 insertions(+), 80 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9153/4
--
To view, visit http://gerrit.cloudera.org:8080/9153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbf2368cf4415e6ecfa65058daf6ff87ef62f9d9
Gerrit-Change-Number: 9153
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Tim Armstrong