[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. IMPALA-8718: project out collection slots in analytic's sort tuple Subplan node is mainly used to extract collection values. It evaluates its right plan tree (usually a nested loop join) for every row from its left child (usually a scan producing tuples with collection values), and returns those rows produced by the right child. Each row (TupleRow) produced by the join node consists of several tuples from the join operands. So the scan node tuple that contains collection values will be part of the output of the join node, then become part of the output of the subplan node. When generating analytic plan, a TupleDescriptor for sort is created based on the materialized slots of the input. If the input comes from a subplan node, there are collection slots in it. These collection slots will be picked out into the sort tuple, and occur in the smap of it. Then the output smap of the analytic plan will contain the collection slot consequently. This causes IllegalStateException if the analytic plan is the nullable side of an outer join. The exception is thrown when we are checking the necessary of adding a TupleIsNullPredicate for each output slot. We should project out the collection slots in creating the sort tuple of analytic plan to avoid causing such an exception. Projecting out them is safe since outputs of the analytic node must be in the select list of the block with the analytic, and we don't allow collection types to be returned from a select block, and also don't support any builtin or UDF functions that take collection types as an argument. Tests - Add Planner test in analytic-fns.test with VALIDATE_CARDINALITY enabled. Also fix some incorrect row-sizes of existing tests. - Add e2e test in nested-types-runtime.test to verify that collection slots are projected out. Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Reviewed-on: http://gerrit.cloudera.org:8080/14135 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test 5 files changed, 137 insertions(+), 14 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 11 Sep 2019 18:18:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. Patch Set 2: Passed exhaustive tests: https://jenkins.impala.io/job/pre-review-test/430/ -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 11 Sep 2019 13:58:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 11 Sep 2019 13:59:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4936/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 11 Sep 2019 13:59:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 10 Sep 2019 19:56:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4496/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 09 Sep 2019 10:14:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14135 ) Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. Patch Set 2: (1 comment) Thank for looking into this, Tim! Update and publish the patch for review. http://gerrit.cloudera.org:8080/#/c/14135/1/fe/src/main/java/org/apache/impala/analysis/SortInfo.java File fe/src/main/java/org/apache/impala/analysis/SortInfo.java: http://gerrit.cloudera.org:8080/#/c/14135/1/fe/src/main/java/org/apache/impala/analysis/SortInfo.java@161 PS1, Line 161: return e instanceof SlotRef && !e.isBound(sortTid_); > I think it may be better to fix this in AnalyticPlanner.createSortInfo() in Thanks for the excellent explanation! Add it in the comment and commit message. Yes, we can move this into AnalyticPlanner.createSortInfo() so we can limit the impact of this change. The bug occurs when the planner creates a Sort node on top of the Subplan node, and the output of the Sort node is passed to the nullable side of an outer join. Looks like Analytic evaluation is the only way to hit these conditions. Without analytic, manually adding a sort clause for the collection Join in the inline view won't hit the bug since the sort will be ignored. Manually adding a sort-limit clause won't hit the bug too since the sort node becomes a top-n node and it already has projection. -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 09 Sep 2019 09:35:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8718: project out collection slots in analytic's sort tuple
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14135 to look at the new patch set (#2). Change subject: IMPALA-8718: project out collection slots in analytic's sort tuple .. IMPALA-8718: project out collection slots in analytic's sort tuple Subplan node is mainly used to extract collection values. It evaluates its right plan tree (usually a nested loop join) for every row from its left child (usually a scan producing tuples with collection values), and returns those rows produced by the right child. Each row (TupleRow) produced by the join node consists of several tuples from the join operands. So the scan node tuple that contains collection values will be part of the output of the join node, then become part of the output of the subplan node. When generating analytic plan, a TupleDescriptor for sort is created based on the materialized slots of the input. If the input comes from a subplan node, there are collection slots in it. These collection slots will be picked out into the sort tuple, and occur in the smap of it. Then the output smap of the analytic plan will contain the collection slot consequently. This causes IllegalStateException if the analytic plan is the nullable side of an outer join. The exception is thrown when we are checking the necessary of adding a TupleIsNullPredicate for each output slot. We should project out the collection slots in creating the sort tuple of analytic plan to avoid causing such an exception. Projecting out them is safe since outputs of the analytic node must be in the select list of the block with the analytic, and we don't allow collection types to be returned from a select block, and also don't support any builtin or UDF functions that take collection types as an argument. Tests - Add Planner test in analytic-fns.test with VALIDATE_CARDINALITY enabled. Also fix some incorrect row-sizes of existing tests. - Add e2e test in nested-types-runtime.test to verify that collection slots are projected out. Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 --- M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test 5 files changed, 137 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14135/2 -- To view, visit http://gerrit.cloudera.org:8080/14135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7edf74ff0f603dfd33ff546e61545bc724990655 Gerrit-Change-Number: 14135 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong