[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 13: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Reviewed-on: http://gerrit.cloudera.org:8080/4745
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 418 insertions(+), 44 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 13: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 13:

Alex, can you +2 this again? The tests got renamed and I didn't catch the 
syntax error when running locally. :(

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Hello Marcel Kornacker, Internal Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#13).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 418 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/13
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 12: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/360/

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 12: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 12:

Rebase done in PS12

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Hello Marcel Kornacker, Internal Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#12).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 417 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/12
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 11:

Fixed AnalyzeStmtsTest, also needs another rebase due to more Kudu related 
changes making it into master before this one. Will push another PS.

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Lars Volker (Code Review)
Hello Marcel Kornacker, Internal Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#11).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 418 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/11
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 11: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/357/

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-25 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/356/

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/355/

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10: Code-Review+1

Thanks!

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10:

(1 comment)

I added a test. I also had to rebase onto Dimitris' change, so now the diff 
between PS 9 and 10 is a bit messy. Sorry for that.

http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

Line 635: |--02:AGGREGATE [FINALIZE]
> Sorry just thought of this. Can you also add a test like this that has a WH
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Lars Volker (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#10).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 417 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

Line 635: # IMPALA-2521: clustered insert into non-partitioned table does not 
add sort node.
Sorry just thought of this. Can you also add a test like this that has a 
WHERE-clause subquery? We should test the reset() + analyze() path when 
subqueries are rewritten.


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 9:

(6 comments)

Thanks for the comments, please see PS9.

http://gerrit.cloudera.org:8080/#/c/4745/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 602:   }
> Thanks. The reason why it works without is that SortNode.init() during plan
Ok


Line 604:   if (matchFound && table_ instanceof KuduTable) {
> I vote for keep
Ok


http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 603:   // Store exprs for Kudu key columns.
> add line breaks between if blocks, it's harder to navigate without those ('
Done


http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 237:* outputs of aggregation.
> something missing there. (i know, not your fault.)
Yes, I couldn't figure out in git blame what that sentence originally meant. 
Since I don't understand enough of the code to complete it, I removed it. Let 
me know if I should replace it with something else.


http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 247: # IMPALA-2521: clustered insert into table adds sort node, correctly 
substituting exprs.
> you can't always see the correct substitution until you run the query (beca
Done


Line 249: select id, name, maxzip as zip
> use fewer columns, add an analytic function in the inline view
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Lars Volker (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#9).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 367 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-23 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 8:

(3 comments)

i won't look at the substitution logic in detail because it gives me a 
headache. :)

but please add that test.

http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 603:   // Store exprs for Kudu key columns.
add line breaks between if blocks, it's harder to navigate without those ('{' 
in vi let's you jump by a 'paragraph'). at least add one at l613


http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 237:* the outputs of aggregation. Invoked for UnionStmt as well since
something missing there. (i know, not your fault.)


http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 247: # IMPALA-2521: clustered insert into table adds sort node, correctly 
substituting exprs.
you can't always see the correct substitution until you run the query (because 
the column labels can be the same even when you're pointing at the wrong tuple).

let's add query tests as well, what you have so far should run (even if you 
won't see any benefit in the hdfs case).


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 8: Code-Review+1

(3 comments)

I'm happy with the patch after the remaining comments. I imagine Marcel will 
want to take another pass, so only giving +1.

http://gerrit.cloudera.org:8080/#/c/4745/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 602:   }
> I added a test for this, however it also seems to work without the substitu
Thanks. The reason why it works without is that SortNode.init() during planning 
will substitute the ordering exprs appropriately. However, it's cleaner to 
leave your changes because the PK exprs do need to be substituted like the 
other ones, but it just happens to work in this case because of how the PK 
exprs are used exactly.


Line 604:   if (matchFound && table_ instanceof KuduTable) {
> Done, though after finding that List supports contains I might as well inli
I vote for keep


http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 249: select id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, 
float_col, double_col,
use fewer columns, add an analytic function in the inline view


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-21 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 8:

(7 comments)

Thanks for the review, please see PS8, as well as my comment regarding expr 
substitution.

http://gerrit.cloudera.org:8080/#/c/4745/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 132:   private ArrayList primaryKeyExprs_ = Lists.newArrayList();
> how about primaryKeyExprs_?
Done


Line 517:* expressions.
> Mention that as point 4
Done


Line 601: }
> Kudu (it's a proper noun)
Done


Line 602:   }
> You also need to substitute the kuduExprs_ in InsertStmt.substituteResultEx
I added a test for this, however it also seems to work without the 
substitution. What am I missing? Also, are c_i and k_j particular columns? I 
tried reordering them, but it still does not fail when omitting the 
substitution.


Line 604:   if (matchFound && table_ instanceof KuduTable) {
> add helper KuduTable.isPrimaryKeyColumn()
Done, though after finding that List supports contains I might as well inline 
it as it is just one line. Inline or Keep?


http://gerrit.cloudera.org:8080/#/c/4745/5/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 499:* (key columns for Kudu tables), so that partitions can be written 
sequentially in the
> Kudu tables
Done


http://gerrit.cloudera.org:8080/#/c/4745/5/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 248: insert into table functional_kudu.alltypes /*+ clustered */
> shuffle/noshuffle don't do anything for Kudu tables, so I don't think this 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-21 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#8).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
8 files changed, 298 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-21 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#7).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
8 files changed, 295 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-21 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#6).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
8 files changed, 295 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-19 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/4745/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 132:   private ArrayList kuduKeyExprs_ = Lists.newArrayList();
how about primaryKeyExprs_?


Line 517:* expressions. Result exprs for key columns of kudu tables are 
stored in kuduKeyExprs_.
Mention that as point 4


Line 601:   // Store exprs for kudu key columns.
Kudu (it's a proper noun)


Line 602:   if (matchFound && table_ instanceof KuduTable) {
You also need to substitute the kuduExprs_ in 
InsertStmt.substituteResultExprs().

You can try something like this (and add a test):

INSERT INTO kudu_table /*+ clustered */
SELECT * FROM (SELECT c1, c2, k1, k2) FROM src_table


Line 604: for (String kuduKeyColumnName: 
kuduTable.getKuduKeyColumnNames()) {
add helper KuduTable.isPrimaryKeyColumn()


http://gerrit.cloudera.org:8080/#/c/4745/5/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 499:* (key columns for kudu tables), so that partitions can be written 
sequentially in the
Kudu tables


http://gerrit.cloudera.org:8080/#/c/4745/5/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 248: insert into table functional_kudu.alltypes /*+ clustered,shuffle */
shuffle/noshuffle don't do anything for Kudu tables, so I don't think this is 
necessary

they might do something in the future


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-19 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#5).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
7 files changed, 288 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 4:

(8 comments)

> (8 comments)
 > 
 > Nice! Current PS looks good to me. Do you intent to add Kudu
 > support in a separate patch?

Thanks for the comments. I had a look and it didn't add too much code, so I 
added kudu support and tests here. It also seemed to fit into the scope of the 
Jira.

http://gerrit.cloudera.org:8080/#/c/4745/4/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 255:   SlotRef origSlotRef = (SlotRef) smap.getLhs().get(i);
> Seems clearer to call these inputSlotRef and outputSlotRef
Done


http://gerrit.cloudera.org:8080/#/c/4745/4/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

Line 149:* output by the sort node. Done by materializing slot refs in the 
order-by and result
> ... and given result expressions...
Done


Line 151:* slot refs into the new tuple. This simplifies sorting logic for 
total and top-n
> the sorting logic
Done


Line 164: // The tuple descriptor for the sort output. It will contain the 
materialized tuples,
> Not clear what "It" refers to in the second sentence, can you rephrase?
Done


Line 185: // ones that point to the slot refs the new, materialized input 
rows.
> .. slot refs into the sort's output tuple.
Done


Line 188: // Update the tuple descriptor used to materialize the input 
tuple of the sort.
> ... used to materialize the input of the sort.
Done


http://gerrit.cloudera.org:8080/#/c/4745/4/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 496:* Insert a sort node into the plan, depending on the 
clustered/noclustered plan hint.
> Insert a sort node on top of the plan, ...
Done


Line 505: List partitionExprs = 
Lists.newArrayList(insertStmt.getPartitionKeyExprs());
> orderingExprs
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-19 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 4:

(8 comments)

Nice! Current PS looks good to me. Do you intent to add Kudu support in a 
separate patch?

http://gerrit.cloudera.org:8080/#/c/4745/4/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 255:   SlotRef origSlotRef = (SlotRef) smap.getLhs().get(i);
Seems clearer to call these inputSlotRef and outputSlotRef


http://gerrit.cloudera.org:8080/#/c/4745/4/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

Line 149:* output by the sort node. Done by materializing slot refs in the 
order-by and result
... and given result expressions...


Line 151:* slot refs into the new tuple. This simplifies sorting logic for 
total and top-n
the sorting logic


Line 164: // The tuple descriptor for the sort output. It will contain the 
materialized tuples,
Not clear what "It" refers to in the second sentence, can you rephrase?


Line 185: // ones that point to the slot refs the new, materialized input 
rows.
.. slot refs into the sort's output tuple.


Need to be careful with the terms 'tuple' and 'row' because they have a very 
specific and different meanings in Impala.


Line 188: // Update the tuple descriptor used to materialize the input 
tuple of the sort.
... used to materialize the input of the sort.


http://gerrit.cloudera.org:8080/#/c/4745/4/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 496:* Insert a sort node into the plan, depending on the 
clustered/noclustered plan hint.
Insert a sort node on top of the plan, ...


Line 505: List partitionExprs = 
Lists.newArrayList(insertStmt.getPartitionKeyExprs());
orderingExprs


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-19 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#4).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
6 files changed, 216 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 4:

(24 comments)

Thanks for the comments, please see PS4. As pointed out in one of the replies I 
will look at Kudu support next.

http://gerrit.cloudera.org:8080/#/c/4745/2//COMMIT_MSG
Commit Message:

Line 11: plan, just before the table sink. This has the effect that data will be
> -distributed (also adds to the single-node plan)
Done


http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 121:   private boolean hasClusteredHint_ = false;
> but how do you capture 'noclustered'? that's not the same as the absence of
Right now the default is 'noclustered' and is not captured explicitly. Once the 
default changes to 'clustered' we can change the code to reflect that.


http://gerrit.cloudera.org:8080/#/c/4745/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 238:*/
> move this TODO into SortInfo.createSortTupleInfo()
Done


Line 253:   Preconditions.checkState(smap.getLhs().get(i) instanceof 
SlotRef);
> these few lines can also be moved into createSortTupleInfo() right?
Excellent, I didn't see this.


http://gerrit.cloudera.org:8080/#/c/4745/2/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

Line 125: 
> refer to the new tuple materialized by this sort instead of the original in
Done


Line 145:   public SortInfo clone() { return new SortInfo(this); }
> shorter: by the sort node
Done


Line 146: 
> comment about result expressions only makes sense if we pass in the result 
True, now it's aligned with the resultExprs actually being passed in.


Line 147:   /**
> we also do this for top-n
Done


Line 152:* sorts. The substitution map is returned.
> I'd suggest removing the 'stmt' parameter and instead returning the ExprEub
Done


Line 153:* TODO: We could do something more sophisticated than simply 
copying input slot refs -
> Isn't this the tuple descriptor for the sort output? The secnd sentence is 
I think it is both, since the sort node operates only on these. However output 
seems to be more intuitive, so I changed it.


Line 163: 
> builds up the tuple operated on and returned by sort node
Done


Line 171: // substOrderBy is the mapping from slot refs in the sort node's 
input to slot refs in
> I think the else case may not be needed/possible, but the explanation is th
Yes, thanks for the explanation.


Line 173: // the tuple operated on and returned by the sort node.
> Can we also move this to the caller? The returned smap can be used to creat
Done


Line 181:   sortTupleExprs.add(origSlotRef);
> let's let the caller make this decision without passing a flag
Done


http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

Line 275:   private PlanFragment createScanFragment(PlanNode node) {
> never mind, the structure looks good as it is.
Ok.


http://gerrit.cloudera.org:8080/#/c/4745/2/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 149:   // Add optional sort node to the plan, based on 
clustered/noclustered plan hint.
> Let's keep it concrete: we're adding a sort node
Done


Line 151:   // set up table sink for root fragment
> there's no ClusteringNode. something like 'clusterInsertInput'/ 'createClus
Done


Line 497:* This will sort the data produced by 'inputFragment' by the 
clustering columns, so
> Mention the ordering of the sort
Done


Line 498:* that partitions can be written sequentially in the table sink.
> Second sentence is kind of vague. I understand that sorting is just an inst
Done


Line 503: if (!insertStmt.hasClusteredHint()) return;
> single line (and elsewhere)
Done


Line 508: 
> partitioning?
I took that comment from DistributedPlanner::createInsertFragment() and I 
thought it made sense to remove all constants during the clustering. Changed 
the word.


http://gerrit.cloudera.org:8080/#/c/4745/2/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

Line 573: # IMPALA-2521: clustered insert into partitioned table adds sort node.
> Let's test these combinations:
Done


Line 575: select * from functional.alltypes
> Let's also test Kudu tables because there we want to sort on the PKs. Let's
I'll try to address this in a subsequent patchset, so we can make progress on 
the rest.


Line 576:  PLAN
> also add  PLAN section
Done


-- 
To view, visit http://gerrit.cloude

[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-19 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#3).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
6 files changed, 217 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-18 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 121:   private boolean hasClusteredHint_ = false;
> We actually only need the ternary logic to detect conflicting hints, so I m
but how do you capture 'noclustered'? that's not the same as the absence of 
'clustered', because we'll want to use 'clustered' eventually by default.


http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

Line 275:   private PlanFragment createScanFragment(PlanNode node) {
> i don't think of the logic as orthogonal, because it modifies the insert pr
never mind, the structure looks good as it is.


http://gerrit.cloudera.org:8080/#/c/4745/2/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 151:   addClusteringNode(insertStmt, rootFragment, 
ctx_.getRootAnalyzer());
there's no ClusteringNode. something like 'clusterInsertInput'/ 
'createClusteringSort'/... would be clearer.


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-18 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

Line 275:   public PlanFragment addClusteringFragment(PlanFragment 
inputFragment,
> Done
i don't think of the logic as orthogonal, because it modifies the insert 
process.

also, i don't see how this is relevant for single-node plans - if you're 
inserting a lot of data with num_nodes = 1, there's something wrong.

you can restructure createInsertFragment to use to two helper functions, for 
distribution and clustering.


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-18 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 2:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/4745/2//COMMIT_MSG
Commit Message:

Line 11: distributed plan, just before the table sink. This has the effect that
-distributed (also adds to the single-node plan)


http://gerrit.cloudera.org:8080/#/c/4745/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 238:* TODO: We could do something more sophisticated than simply 
copying input
move this TODO into SortInfo.createSortTupleInfo()


Line 253: Set sourceSlots = Sets.newHashSet();
these few lines can also be moved into createSortTupleInfo() right?


http://gerrit.cloudera.org:8080/#/c/4745/2/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

Line 125:* the ordering exprs refer to the materialized tuples instead of 
the original input.
refer to the new tuple materialized by this sort instead of the original input.


Line 145:* output by the exec node implementing the sort. Done by 
materializing slot refs in the
shorter: by the sort node


Line 146:* order-by and result expressions. Those slot refs in the ordering 
and result exprs are
comment about result expressions only makes sense if we pass in the result 
expressions to this function, otherwise it's not really clear where those come 
from


Line 147:* substituted with slot refs into the new tuple. This simplifies 
sorting logic for
we also do this for top-n


Line 152:   public void createSortTupleInfo(Set sourceSlots, boolean 
materializeSlots, StatementBase stmt, Analyzer analyzer) {
I'd suggest removing the 'stmt' parameter and instead returning the 
ExprEubstitutionMap. Then the stmt callers can apply that smap to whatever 
exprs they need to change.


Line 153: // The tuple descriptor for the sort input. It will contain the 
materialized tuples,
Isn't this the tuple descriptor for the sort output? The secnd sentence is a 
little imprecise because ("it contains") because a tuple descriptor does not 
contain materialized tuples.


Line 163: // the internal rows of the sort node.
builds up the tuple operated on and returned by sort node


Line 171:   // TODO: I copied this from QueryStmt.java and would like to 
add an explanatory
I think the else case may not be needed/possible, but the explanation is that 
it's ok to move predicates outside of this query block, but it's not ok to move 
predicates in if there is a limit. Consider this query:

select count(*) from t1 inner join
(select id, name from t2
 where id < 10
 order by limit 10) v
on (t1.id = v.id and t1.name = v.name)
where t1.name = 'test'

In the example above it's ok to generate 't1.id < 10' and put it into the scan 
of t1, but it's not ok to generate 't2.name = 'test'' and move it into the scan 
of t2.

Hope this makes sense.


Line 173:   if (stmt.hasLimit()) {
Can we also move this to the caller? The returned smap can be used to create 
this. The reason is that it doesn't really make sense to add these value 
transfers during plan generation (for the clustered hint use case).


Line 181: if (materializeSlots) {
let's let the caller make this decision without passing a flag


http://gerrit.cloudera.org:8080/#/c/4745/2/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 149:   // Add optional clustering node to the plan, based on 
clustered/noclustered plan
Let's keep it concrete: we're adding a sort node


Line 497:* Insert a sort node into the plan, depending on the 
clustered/noclustered plan hint.
Mention the ordering of the sort


Line 498:* This will re-order the data produced by 'inputFragment' in a way 
that partitions can
Second sentence is kind of vague. I understand that sorting is just an instance 
of a more general clustering concept, but let's stick to what we are actually 
doing here :)


Line 503: if (!insertStmt.hasClusteredHint()) {
single line (and elsewhere)


Line 508: // Ignore constants for the sake of partitioning.
partitioning?


http://gerrit.cloudera.org:8080/#/c/4745/2/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

Line 573: # IMPALA-2521: Test to make sure that clustering adds a sort node 
into the plan.
Let's test these combinations:
1. Partitioned tables
1.1. With clustered hint
1.2. With clustered and noshuffle hint
2. Unpartitioned tables
2.1. With clustered hint
2.2. With clustered and shuffle hint


Line 575: select * from functional.alltypes
Let's also test Kudu tables because there we want to sort on the PKs. Let's add 
those tests to a Kudu-specific .test file


Line

[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 1:

(12 comments)

Thanks for the reviews, please see PS2

http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 121:   private boolean hasClusteredHint_ = false;
> might be better to use ternary logic here with a single Boolean (which can 
We actually only need the ternary logic to detect conflicting hints, so I made 
it a boolean and used another local variable in analyzePlanHints() to detect 
that error.


Line 631: if (hasNoShuffleHint_) {
> did you mean noclustered?
Yes, done.


Line 632:   throw new AnalysisException("Conflicting INSERT hint: " + 
hint);
> list both conflicting hints (otherwise it's hard to tell how to fix the pro
Done


http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

Line 68:* sorted. Its source exprs are update to those in tupleSlotExprs.
> second sentence incomprehensible.
Done


Line 123:* the exprs used by the SortNode refer to the materialized tuples 
instead of the
> don't refer to the sort node here, SortInfo shouldn't have to know about th
Done


http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

Line 269:* clustered/noclustered plan hint. The clustering re-order the 
data in 'inputFragment'
> Better to be specific: Say that it adds a sort node that orders on the part
Done


Line 272:   // TODO: this function shares some code with 
QueryStmt::createSortTupleInfo(). However
> You could move most of the code into SortInfo.createSortTupleInfo(List you should move the logic (minus the createSortTupleInfo part that alex bro
createInsertFragment has several early exits (return inputFragment) and it 
isn't that clear to me, how adding this orthogonal functionality would fit in 
nicely. We need to support both cases, adding shuffling and/or clustering 
independently. Should I replace them with nested if-else-clauses?

For now I moved it to the Planner as suggested by Alex and enabled it for 
single-node-plans, too.


Line 275:   public PlanFragment addClusteringFragment(PlanFragment 
inputFragment,
> createInsertFragment() is only called for distributed plans. Should cluster
Done


Line 309: sortTupleDesc.setIsMaterialized(true);
> To indicate that this tuple is a physical tuple needed during runtime execu
Done


Line 324: // TODO Why is this needed here but not in QueryStmt.java? If it 
is omitted, the query
> Before plan generation we call QueryStmt.materializeRequiredSlots() to make
Done


http://gerrit.cloudera.org:8080/#/c/4745/1/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 144: rootFragment = distributedPlanner.addClusteringFragment(
> 1. we should also do this for single-node plans, so adding the clustering d
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-18 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
distributed plan, just before the table sink. This has the effect that
data will be clustered by its partition prior to writing partitions,
which therefore can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
7 files changed, 154 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker