[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..

IMPALA-8718: project out collection slots in analytic's sort tuple

Subplan node is mainly used to extract collection values. It evaluates
its right plan tree (usually a nested loop join) for every row from its
left child (usually a scan producing tuples with collection values), and
returns those rows produced by the right child. Each row (TupleRow)
produced by the join node consists of several tuples from the join
operands. So the scan node tuple that contains collection values will be
part of the output of the join node, then become part of the output of
the subplan node.

When generating analytic plan, a TupleDescriptor for sort is created
based on the materialized slots of the input. If the input comes from a
subplan node, there are collection slots in it. These collection slots
will be picked out into the sort tuple, and occur in the smap of it.
Then the output smap of the analytic plan will contain the collection
slot consequently. This causes IllegalStateException if the analytic
plan is the nullable side of an outer join. The exception is thrown when
we are checking the necessary of adding a TupleIsNullPredicate for each
output slot.

We should project out the collection slots in creating the sort tuple of
analytic plan to avoid causing such an exception. Projecting out them is
safe since outputs of the analytic node must be in the select list of
the block with the analytic, and we don't allow collection types to be
returned from a select block, and also don't support any builtin or UDF
functions that take collection types as an argument.

Tests
 - Add Planner test in analytic-fns.test with VALIDATE_CARDINALITY
 enabled. Also fix some incorrect row-sizes of existing tests.
 - Add e2e test in nested-types-runtime.test to verify that collection
 slots are projected out.

Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Reviewed-on: http://gerrit.cloudera.org:8080/14135
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
5 files changed, 137 insertions(+), 14 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 11 Sep 2019 18:18:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-11 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..


Patch Set 2:

Passed exhaustive tests: https://jenkins.impala.io/job/pre-review-test/430/


--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 11 Sep 2019 13:58:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 11 Sep 2019 13:59:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4936/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 11 Sep 2019 13:59:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 10 Sep 2019 19:56:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4496/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 09 Sep 2019 10:14:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-09 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14135 )

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..


Patch Set 2:

(1 comment)

Thank for looking into this, Tim! Update and publish the patch for review.

http://gerrit.cloudera.org:8080/#/c/14135/1/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

http://gerrit.cloudera.org:8080/#/c/14135/1/fe/src/main/java/org/apache/impala/analysis/SortInfo.java@161
PS1, Line 161:   return e instanceof SlotRef && !e.isBound(sortTid_);
> I think it may be better to fix this in AnalyticPlanner.createSortInfo() in
Thanks for the excellent explanation! Add it in the comment and commit message.

Yes, we can move this into AnalyticPlanner.createSortInfo() so we can limit the 
impact of this change. The bug occurs when the planner creates a Sort node on 
top of the Subplan node, and the output of the Sort node is passed to the 
nullable side of an outer join. Looks like Analytic evaluation is the only way 
to hit these conditions. Without analytic, manually adding a sort clause for 
the collection Join in the inline view won't hit the bug since the sort will be 
ignored. Manually adding a sort-limit clause won't hit the bug too since the 
sort node becomes a top-n node and it already has projection.



--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 09 Sep 2019 09:35:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple

2019-09-09 Thread Quanlong Huang (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14135

to look at the new patch set (#2).

Change subject: IMPALA-8718: project out collection slots in analytic's sort 
tuple
..

IMPALA-8718: project out collection slots in analytic's sort tuple

Subplan node is mainly used to extract collection values. It evaluates
its right plan tree (usually a nested loop join) for every row from its
left child (usually a scan producing tuples with collection values), and
returns those rows produced by the right child. Each row (TupleRow)
produced by the join node consists of several tuples from the join
operands. So the scan node tuple that contains collection values will be
part of the output of the join node, then become part of the output of
the subplan node.

When generating analytic plan, a TupleDescriptor for sort is created
based on the materialized slots of the input. If the input comes from a
subplan node, there are collection slots in it. These collection slots
will be picked out into the sort tuple, and occur in the smap of it.
Then the output smap of the analytic plan will contain the collection
slot consequently. This causes IllegalStateException if the analytic
plan is the nullable side of an outer join. The exception is thrown when
we are checking the necessary of adding a TupleIsNullPredicate for each
output slot.

We should project out the collection slots in creating the sort tuple of
analytic plan to avoid causing such an exception. Projecting out them is
safe since outputs of the analytic node must be in the select list of
the block with the analytic, and we don't allow collection types to be
returned from a select block, and also don't support any builtin or UDF
functions that take collection types as an argument.

Tests
 - Add Planner test in analytic-fns.test with VALIDATE_CARDINALITY
 enabled. Also fix some incorrect row-sizes of existing tests.
 - Add e2e test in nested-types-runtime.test to verify that collection
 slots are projected out.

Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
---
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
5 files changed, 137 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14135/2
--
To view, visit http://gerrit.cloudera.org:8080/14135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655
Gerrit-Change-Number: 14135
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong